      High-confidence structural annotation of metabolites absent from spectral libraries

      research-article

          Abstract

          Untargeted metabolomics experiments rely on spectral libraries for structure annotation, but typically only a small fraction of spectra can be matched. Previous in silico methods search in structure databases but cannot distinguish between correct and incorrect annotations. Here we introduce the COSMIC workflow, which combines in silico structure database generation and annotation with a confidence score consisting of kernel density P value estimation and a support vector machine with enforced directionality of features. On diverse datasets, COSMIC annotates a substantial number of hits at low false discovery rates and outperforms spectral library search. To demonstrate that COSMIC can annotate structures never reported before, we annotated 12 natural bile acids; nine of these annotations were confirmed by manual evaluation and two by synthetic standards. In human samples, we annotated and manually validated 315 molecular structures currently absent from the Human Metabolome Database. Applying COSMIC to data from 17,400 metabolomics experiments yielded 1,715 high-confidence structural annotations that were absent from spectral libraries.

          Abstract

          COSMIC outperforms spectral library search for metabolite annotation and annotates previously unseen structures.
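
          The abstract names kernel density P value estimation as one ingredient of the confidence score. As a rough illustration of that general idea only, and not the COSMIC implementation, the sketch below fits a Gaussian kernel density to a set of null (for example, incorrect-candidate) scores and reports the upper-tail probability of an observed score. The function names, the choice of a Gaussian kernel, and the rule-of-thumb bandwidth are all assumptions made for this example.

```python
import math
from statistics import stdev


def rule_of_thumb_bandwidth(samples):
    """Normal-reference rule of thumb: 1.06 * sd * n^(-1/5). An assumption, not COSMIC's choice."""
    return 1.06 * stdev(samples) * len(samples) ** (-1 / 5)


def kde_upper_tail_p_value(score, null_scores, bandwidth=None):
    """Estimate P(X >= score) under a Gaussian KDE fitted to null_scores.

    Hypothetical helper for illustration. The tail of a Gaussian KDE is the
    average of the tails of its kernels, so no numerical integration is needed.
    """
    if len(null_scores) < 2:
        raise ValueError("need at least two null scores")
    h = bandwidth if bandwidth is not None else rule_of_thumb_bandwidth(null_scores)
    if h <= 0:
        raise ValueError("bandwidth must be positive")
    # Upper tail of one Gaussian kernel centred at x: 0.5 * erfc((score - x) / (h * sqrt(2)))
    tail = sum(0.5 * math.erfc((score - x) / (h * math.sqrt(2))) for x in null_scores)
    return tail / len(null_scores)
```

          A score far above all null scores receives a P value close to 0, and a score at the centre of a symmetric null sample receives a P value of about 0.5. In practice the bandwidth and the definition of the null distribution matter a great deal, which is why they are left as explicit parameters here.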

                Author and article information

                Contributors
                kai.duehrkop@uni-jena.de
                sebastian.boecker@uni-jena.de
                Journal
                Nat Biotechnol
                Nature Biotechnology
                Nature Publishing Group US (New York)
                1087-0156
                1546-1696
                14 October 2021
                2022; 40(3): 411-421
                Affiliations
                [1] GRID grid.9613.d, ISNI 0000 0001 1939 2794, Chair for Bioinformatics, Faculty of Mathematics and Computer Science, Friedrich Schiller University Jena, Jena, Germany
                [2] GRID grid.418160.a, ISNI 0000 0004 0491 7131, International Max Planck Research School ‘Exploration of Ecological Interactions with Molecular and Chemical Techniques’, Max Planck Institute for Chemical Ecology, Jena, Germany
                [3] GRID grid.266100.3, ISNI 0000 0001 2107 4242, Collaborative Mass Spectrometry Innovation Center, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, San Diego, CA, USA
                [4] GRID grid.4567.0, ISNI 0000 0004 0483 2525, Metabolomics and Proteomics Core, Helmholtz Zentrum München, Neuherberg, Germany
                [5] GRID grid.6936.a, ISNI 0000 0001 2322 2966, Chair of Analytical Food Chemistry, TUM School of Life Sciences, Technical University of Munich, Freising-Weihenstephan, Germany
                [6] GRID grid.266100.3, ISNI 0000 0001 2107 4242, Departments of Pharmacology and Pediatrics, University of California, San Diego, San Diego, CA, USA
                [7] GRID grid.8591.5, ISNI 0000 0001 2322 4988, Present Address: School of Pharmaceutical Sciences, University of Geneva, Geneva, Switzerland
                Author information
                http://orcid.org/0000-0002-3554-2710
                http://orcid.org/0000-0001-9981-2153
                http://orcid.org/0000-0001-7557-0831
                http://orcid.org/0000-0002-0016-8132
                http://orcid.org/0000-0002-1462-4426
                http://orcid.org/0000-0002-3003-1030
                http://orcid.org/0000-0002-9056-0540
                http://orcid.org/0000-0002-9304-8091
                Article
                1045
                10.1038/s41587-021-01045-9
                8926923
                34650271
                eaa5bd61-4521-438f-b59a-f6f231aaf63c
                © The Author(s) 2022

                Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

                History
                : 9 April 2021
                : 4 August 2021
                Funding
                Funded by: FundRef https://doi.org/10.13039/100000002, U.S. Department of Health & Human Services | National Institutes of Health (NIH);
                Award ID: R01 GM107550
                Award ID: P41 GM103484
                Award ID: R03 CA211211
                Award ID: U19 AG063744 01
                Funded by: FundRef https://doi.org/10.13039/100000936, Gordon and Betty Moore Foundation (Gordon E. and Betty I. Moore Foundation);
                Award ID: GBMF 7622
                Categories
                Article
                Custom metadata
                © The Author(s), under exclusive licence to Springer Nature America, Inc. 2022

                Biotechnology
                molecular biology, data processing
