      Is Open Access

      Mathematical insight of species

      poster
      ScienceOpen Posters
      ScienceOpen
      Microbiome R&D and Business Collaboration Forum
      Data, species, algorithm, mathematics, proteomics, mass spectrometry

            Abstract

            The application of mathematics to biology is not new. Just as the invention of the microscope revolutionised biology by revealing otherwise invisible and previously unknown worlds, mathematics can be regarded as a kind of microscope for visualising highly complex data. Now that mass-spectrometry-based proteomic assays are becoming standard for identifying microorganisms at the genus and species level in many clinical microbiology laboratories, it is time to find solutions to the classification problem. In our lab, we are developing one such solution by replacing numerous unrelated single-protein searches with what we call a “proteome fingerprint”. By applying novel mathematical models, we can reveal the invisible worlds hidden within the forests of data generated by mass-spectrometry-based proteomics. By dividing each protein into n-grams and building a vector of n-gram frequencies, we can represent a species' proteome using the term vector model. CUR matrix decomposition, used as a tool for exploratory data analysis, suggests that fewer than 1% of n-grams are particularly important or influential in the low-dimensional classification problem. We therefore propose an efficient dimensionality-reduction-based algorithm to identify microorganisms at both the species and strain level. Singular Value Decomposition (SVD), often used in Latent Semantic Indexing (LSI), is used to obtain a reduced data representation, thereby addressing the sparsity and scalability issues. As a result, we observe significant improvements in computational time, together with higher-quality results and lower memory requirements. To demonstrate the efficiency of our algorithm, we applied it to bacterial proteomes downloaded from NCBI. The algorithm correctly assigned both strain and species in more than 95% of tested samples.
This work was funded by HRZZ (Croatian Science Foundation) research project “Clinical proteomics of microorganisms”.
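
The pipeline outlined in the abstract (n-gram frequency vectors, a term vector model, and an SVD-based reduction in the spirit of LSI) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The n-gram length `n=3`, the rank `k`, the nearest-neighbour cosine rule, the function names, and the toy sequences in `toy` are all assumptions made for illustration. Real input would be full proteomes, for example those downloaded from NCBI.

```python
from collections import Counter

import numpy as np


def ngram_counts(proteins, n=3):
    """Count overlapping n-grams over all protein sequences of one proteome."""
    counts = Counter()
    for seq in proteins:
        for i in range(len(seq) - n + 1):
            counts[seq[i:i + n]] += 1
    return counts


def build_index(proteomes, n=3):
    """Map every n-gram seen in the reference proteomes to a column index."""
    vocab = set()
    for proteins in proteomes:
        vocab.update(ngram_counts(proteins, n))
    return {gram: j for j, gram in enumerate(sorted(vocab))}


def vectorize(proteins, index, n=3):
    """Relative n-gram frequencies of one proteome (term vector model)."""
    v = np.zeros(len(index))
    for gram, c in ngram_counts(proteins, n).items():
        if gram in index:  # n-grams unseen in the references are ignored
            v[index[gram]] += c
    total = v.sum()
    return v / total if total > 0 else v


def fit_lsi(X, k):
    """Top-k right singular vectors of the proteome-by-n-gram matrix."""
    _, _, vt = np.linalg.svd(X, full_matrices=False)
    return vt[:k]


def project(X, basis):
    """Project frequency vectors into the reduced k-dimensional space."""
    return X @ basis.T


def classify(query, reference, labels):
    """Return the label of the reference with the highest cosine similarity."""
    q = query / np.linalg.norm(query)
    r = reference / np.linalg.norm(reference, axis=1, keepdims=True)
    return labels[int(np.argmax(r @ q))]


# Hypothetical toy "proteomes": made-up sequences, not real data.
toy = {
    "species_A": ["MKTAYIAKQRQISFVKSHFSRQ", "MKKLLPTAAAGLLLLAAQPAMA"],
    "species_B": ["MSDNGPQNQRNAPRITFGGPSD", "MGSSHHHHHHSSGLVPRGSHMA"],
    "species_C": ["MALWMRLLPLLALLALWGPDPA", "MVLSPADKTNVKAAWGKVGAHA"],
}
labels = list(toy)
index = build_index(toy.values())
X = np.array([vectorize(p, index) for p in toy.values()])

basis = fit_lsi(X, k=2)        # reduce from len(index) columns to 2
reference = project(X, basis)  # one reduced "fingerprint" per species
query = project(vectorize(toy["species_B"][:1], index), basis)
print(classify(query, reference, labels))
```

The reduced vectors in `reference` act as the fingerprints here. Because every comparison takes place in `k` dimensions rather than over the full n-gram vocabulary, this is where the memory and run-time savings mentioned in the abstract would come from.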


            Author and article information

            Conference
            ScienceOpen Posters
            ScienceOpen
            May 29 2015
            Author information
            https://orcid.org/0000-0002-3437-2887
            Article
            10.14293/P2199-8442.1.SOP-LIFE.P94XJX.v1
            ec8f8be5-c9bb-4913-954d-571150f7e46b

            This work has been published open access under Creative Commons Attribution License CC BY 4.0, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. Conditions, terms of use and publishing policy can be found at www.scienceopen.com.


            Mathematical software,Applied mathematics,Evolutionary Biology
            Data, species, algorithm, mathematics, proteomics, mass spectrometry
