0
views
0
recommends
+1 Recommend
0 collections
    0
    shares
      • Record: found
      • Abstract: found
      • Article: not found

      AlphaPept, a modern and open framework for MS-based proteomics

      Preprint

      Read this article at

      ScienceOpenPublisher
      Bookmark
          There is no author summary for this article yet. Authors can add summaries to their articles on ScienceOpen to make them more accessible to a non-specialist audience.

          ABSTRACT

          In common with other omics technologies, mass spectrometry (MS)-based proteomics produces ever-increasing amounts of raw data, making their efficient analysis a principal challenge. There is a plethora of different computational tools that process the raw MS data and derive peptide and protein identification and quantification. During the last decade, there has been dramatic progress in computer science and software engineering, including collaboration tools that have transformed research and industry. To leverage these advances, we developed AlphaPept, a Python-based open-source framework for efficient processing of large high-resolution MS data sets. Using Numba for just-in-time machine code compilation on CPU and GPU, we achieve hundred-fold speed improvements while maintaining clear syntax and rapid development speed. AlphaPept uses the Python scientific stack of highly optimized packages, reducing the code base to domain-specific tasks while providing access to the latest advances in machine learning. We provide an easy on-ramp for community validation and contributions through the concept of literate programming, implemented in Jupyter Notebooks of the different modules. A framework for continuous integration, testing, and benchmarking enforces solid software engineering principles. Large datasets can rapidly be processed as shown by the analysis of hundreds of cellular proteomes in minutes per file, many-fold faster than the data acquisiton. The AlphaPept framework can be used to build automated processing pipelines using efficient HDF5 based file formats, web-serving functionality and compatibility with downstream analysis tools. Easy access for end-users is provided by one-click installation of the graphical user interface, for advanced users via a modular Python library, and for developers via a fully open GitHub repository.

          Related collections

          Author and article information

          Contributors
          (View ORCID Profile)
          (View ORCID Profile)
          (View ORCID Profile)
          (View ORCID Profile)
          (View ORCID Profile)
          (View ORCID Profile)
          (View ORCID Profile)
          (View ORCID Profile)
          (View ORCID Profile)
          Journal
          bioRxiv
          July 26 2021
          Article
          10.1101/2021.07.23.453379
          1b0e43ae-14a9-42da-a3cf-8391ae3371be
          © 2021
          History

          Quantitative & Systems biology,Biophysics
          Quantitative & Systems biology, Biophysics

          Comments

          Comment on this article