
      Sample-distance partial least squares: PLS optimized for many variables, with application to CoMFA.

      Journal of Computer-Aided Molecular Design
      Computer Simulation, Histamine Antagonists, chemistry, Humans, In Vitro Techniques, Least-Squares Analysis, Models, Molecular, Molecular Structure, Software, Steroids, metabolism, Transcortin


          Abstract

          Three-dimensional molecular modeling can provide an unlimited number m of structural properties. Comparative Molecular Field Analysis (CoMFA), for example, may calculate thousands of field values for each model structure. When m is large, partial least squares (PLS) is the statistical method of choice for fitting and predicting biological responses. Yet PLS is usually implemented in a property-based fashion which is optimal only for small m. We describe here a sample-based formulation of PLS which can be used to fit any single response (bioactivity). SAMPLS reduces all explanatory data to the pairwise 'distances' among n samples (molecules), or equivalently to an n-by-n covariance matrix C. This matrix, unmodified, can be used to fit all PLS components. Furthermore, SAMPLS will validate the model by modern resampling techniques, at a cost independent of m. We have implemented SAMPLS as a Fortran program and have reproduced conventional and cross-validated PLS analyses of data from two published studies. Full (leave-each-out) cross-validation of a typical CoMFA takes 0.2 CPU s. SAMPLS is thus ideally suited to structure-activity analysis based on CoMFA fields or bonded topology. The sample-distance formulation also relates PLS to methods like cluster analysis and nonlinear mapping, and shows how drastically PLS simplifies the information in CoMFA fields.
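The sample-based idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' Fortran program, and it rests on assumptions: C is taken to be the Gram matrix of the column-centered property matrix, scores are built from C and then orthogonalized against earlier scores so that C itself is never deflated, and leave-one-out predictions are formed from sub-blocks of the uncentered Gram matrix G. The names `center_gram`, `sample_pls_fit`, and `loo_press` are hypothetical.

```python
import numpy as np


def center_gram(G):
    """Double-center G = X @ X.T so it equals Xc @ Xc.T for column-centered Xc."""
    return G - G.mean(axis=0, keepdims=True) - G.mean(axis=1, keepdims=True) + G.mean()


def sample_pls_fit(C, y, n_components):
    """Single-response PLS from the centered n-by-n matrix C alone (assumed form).

    Returns (y_mean, alpha) with fitted values y_mean + C @ alpha.
    """
    y = np.asarray(y, dtype=float)
    y_mean = y.mean()
    r = y - y_mean                     # response residual still to be explained
    scores, duals = [], []             # each score t satisfies t = C @ u
    alpha = np.zeros_like(r)
    for _ in range(n_components):
        u = r.copy()
        t = C @ u                      # candidate score; C is used unmodified
        for tj, uj in zip(scores, duals):
            c = (tj @ t) / (tj @ tj)   # remove overlap with earlier scores
            t = t - c * tj
            u = u - c * uj
        tt = t @ t
        if tt <= np.finfo(float).eps:  # nothing left to extract
            break
        b = (t @ r) / tt               # regress the residual on the new score
        alpha += b * u
        r = r - b * t
        scores.append(t)
        duals.append(u)
    return y_mean, alpha


def loo_press(G, y, n_components):
    """Leave-one-out PRESS using only entries of the uncentered Gram matrix G."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    press = 0.0
    for i in range(n):
        tr = np.delete(np.arange(n), i)          # training sample indices
        G_tr = G[np.ix_(tr, tr)]
        y_mean, alpha = sample_pls_fit(center_gram(G_tr), y[tr], n_components)
        g = G[i, tr]
        # Center the left-out sample's products with the training mean.
        k = g - G_tr.mean(axis=0) - g.mean() + G_tr.mean()
        press += (y[i] - (y_mean + k @ alpha)) ** 2
    return press
```

Under these assumptions, the m property columns are touched only once, when `G = X @ X.T` is formed. Every later step, including each leave-one-out refit, works on n-by-n arrays, which is consistent with the abstract's claim that validation cost is independent of m.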
