      Open-Source Machine Learning in Computational Chemistry

      review-article


          Abstract

          The field of computational chemistry has seen a significant increase in the integration of machine learning concepts and algorithms. In this Perspective, we surveyed 179 open-source software projects, with corresponding peer-reviewed papers published within the last 5 years, to better understand the topics within the field being investigated by machine learning approaches. For each project, we provide a short description, the link to the code, the accompanying license type, and whether the training data and resulting models are made publicly available. Based on those deposited in GitHub repositories, the most widely used Python libraries are identified. We hope that this survey will serve as a resource to learn about machine learning or specific architectures thereof by identifying accessible codes with accompanying papers on a topic basis. To this end, we also include computational chemistry open-source software for generating training data and fundamental Python libraries for machine learning. Based on our observations and considering the three pillars of collaborative machine learning work, open data, open source (code), and open models, we provide some suggestions to the community.

                Author and article information

                Journal
                J Chem Inf Model
                Journal of Chemical Information and Modeling
                American Chemical Society
                1549-9596
                1549-960X
                19 July 2023
                14 August 2023
                19 July 2024
                Volume: 63
                Issue: 15
                Pages: 4505-4532
                Affiliations
                Institute of Technology, Resource and Energy-Efficient Engineering (TREE), University of Applied Sciences Bonn-Rhein-Sieg, 53757 Sankt Augustin, Germany
                Department of Electrical Engineering, Mechanical Engineering and Technical Journalism, University of Applied Sciences Bonn-Rhein-Sieg, 53757 Sankt Augustin, Germany
                [§] Department of Computer Science, University of Applied Sciences Bonn-Rhein-Sieg, 53757 Sankt Augustin, Germany
                Author information
                https://orcid.org/0000-0002-8668-1796
                https://orcid.org/0000-0002-4581-920X
                Article
                DOI: 10.1021/acs.jcim.3c00643
                PMC ID: 10430767
                PubMed ID: 37466636
                © 2023 The Authors. Published by American Chemical Society

                Permits non-commercial access and re-use, provided that author attribution and integrity are maintained, but does not permit creation of adaptations or other derivative works (https://creativecommons.org/licenses/by-nc-nd/4.0/).

                History
                04 May 2023
                Categories
                Perspective
                Computational chemistry & Modeling
