      Is Open Access

      A knowledge graph to interpret clinical proteomics data

      research-article


          Abstract

          Implementing precision medicine hinges on the integration of omics data, such as proteomics, into the clinical decision-making process, but the quantity and diversity of biomedical data, and the spread of clinically relevant knowledge across multiple biomedical databases and publications, pose a challenge to data integration. Here we present the Clinical Knowledge Graph (CKG), an open-source platform currently comprising close to 20 million nodes and 220 million relationships that represent relevant experimental data, public databases and literature. The graph structure provides a flexible data model that is easily extendable to new nodes and relationships as new databases become available. The CKG incorporates statistical and machine learning algorithms that accelerate the analysis and interpretation of typical proteomics workflows. Using a set of proof-of-concept biomarker studies, we show how the CKG might augment and enrich proteomics data and help inform clinical decision-making.

          Abstract

          A knowledge graph platform integrates proteomics with other omics data and biomedical databases.


                Author and article information

                Contributors
                alberto.santos@sund.ku.dk
                mmann@biochem.mpg.de
                Journal
                Nat Biotechnol
                Nature Biotechnology
                Nature Publishing Group US (New York)
                ISSN: 1087-0156, 1546-1696
                31 January 2022
                Volume 40, Issue 5, Pages 692-702 (2022)
                Affiliations
                [1] NNF Center for Protein Research, Faculty of Health Sciences, University of Copenhagen, Copenhagen, Denmark (GRID grid.5254.6; ISNI 0000 0001 0674 042X)
                [2] Li-Ka Shing Big Data Institute, University of Oxford, Oxford, UK (GRID grid.4991.5; ISNI 0000 0004 1936 8948)
                [3] Center for Health Data Science, Faculty of Health Sciences, University of Copenhagen, Copenhagen, Denmark (GRID grid.5254.6; ISNI 0000 0001 0674 042X)
                [4] OmicEra Diagnostics GmbH, Planegg, Germany
                [5] Department of Proteomics and Signal Transduction, Max Planck Institute of Biochemistry, Martinsried, Germany (GRID grid.418615.f; ISNI 0000 0004 0491 845X)
                [6] Department of Biomedical Sciences, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark (GRID grid.5254.6; ISNI 0000 0001 0674 042X)
                [7] Department for Clinical Biochemistry, Rigshospitalet, University of Copenhagen, Copenhagen, Denmark (GRID grid.5254.6; ISNI 0000 0001 0674 042X)
                Author information
                http://orcid.org/0000-0002-9163-7730
                http://orcid.org/0000-0002-2244-5081
                http://orcid.org/0000-0003-4230-5753
                http://orcid.org/0000-0001-7885-715X
                http://orcid.org/0000-0003-1292-4799
                Article
                Article number: 1145
                DOI: 10.1038/s41587-021-01145-6
                PMC ID: 9110295
                PMID: 35102292
                d32fe759-25ab-4186-84c9-06c23fd6402d
                © The Author(s) 2022

                Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

                History
                Received: 18 November 2020
                Accepted: 1 November 2021
                Funding
                Funded by: FundRef https://doi.org/10.13039/501100009708, Novo Nordisk Fonden (Novo Nordisk Foundation);
                Award ID: NNF14CC0001
                Award ID: NNF15CC0001
                Funded by: FundRef https://doi.org/10.13039/501100004189, Max-Planck-Gesellschaft (Max Planck Society);
                Funded by: FundRef https://doi.org/10.13039/100010661, EC | Horizon 2020 Framework Programme (EU Framework Programme for Research and Innovation H2020);
                Award ID: 846795
                Categories
                Article
                Custom metadata
                © The Author(s), under exclusive licence to Springer Nature America, Inc. 2022

                Biotechnology
                biomarkers, software, proteomics, data integration, data mining
