24
views
0
recommends
+1 Recommend
0 collections
    0
    shares
      • Record: found
      • Abstract: found
      • Article: not found

      Efficient denoising algorithms for large experimental datasets and their applications in Fourier transform ion cyclotron resonance mass spectrometry.

      Read this article at

      ScienceOpenPublisherPubMed
      Bookmark
          There is no author summary for this article yet. Authors can add summaries to their articles on ScienceOpen to make them more accessible to a non-specialist audience.

          Abstract

          Modern scientific research produces datasets of increasing size and complexity that require dedicated numerical methods to be processed. In many cases, the analysis of spectroscopic data involves the denoising of raw data before any further processing. Current efficient denoising algorithms require the singular value decomposition of a matrix with a size that scales up as the square of the data length, preventing their use on very large datasets. Taking advantage of recent progress on random projection and probabilistic algorithms, we developed a simple and efficient method for the denoising of very large datasets. Based on the QR decomposition of a matrix randomly sampled from the data, this approach allows a gain of nearly three orders of magnitude in processing time compared with classical singular value decomposition denoising. This procedure, called urQRd (uncoiled random QR denoising), strongly reduces the computer memory footprint and allows the denoising algorithm to be applied to virtually unlimited data size. The efficiency of these numerical tools is demonstrated on experimental data from high-resolution broadband Fourier transform ion cyclotron resonance mass spectrometry, which has applications in proteomics and metabolomics. We show that robust denoising is achieved in 2D spectra whose interpretation is severely impaired by scintillation noise. These denoising procedures can be adapted to many other data analysis domains where the size and/or the processing time are crucial.

          Related collections

          Author and article information

          Journal
          Proc. Natl. Acad. Sci. U.S.A.
          Proceedings of the National Academy of Sciences of the United States of America
          1091-6490
          0027-8424
          Jan 28 2014
          : 111
          : 4
          Affiliations
          [1 ] Institut de Génétique et de Biologie Moléculaire et Cellulaire, Institut National de la Santé et de la Recherche, U596.
          Article
          1306700111
          10.1073/pnas.1306700111
          24390542
          7175ae74-7a06-4203-bed0-580aa3bd9118
          History

          FT-ICR MS,big data,matrix rank reduction,noise reduction

          Comments

          Comment on this article