
      Parsers, Data Structures and Algorithms for Macromolecular Analysis Toolkit (MAT): Design and Implementation

      Preprint
      bioRxiv


          Abstract

          The structural information of biological macromolecules is stored in .pdb, .mmcif and, more recently, .mmtf files, so accurate and efficient software tools are required to work with these formats. Here, we describe the Macromolecular Analysis Toolkit (MAT), which parses .pdb, .mmcif and .mmtf files and builds data structures from the input. The program is written in C++ to ensure efficiency and to organize structural information consistently and in an integrated way. Its novelty lies in the addition of new structure-based biological algorithms and applications. The package also stands out from similar libraries by being 1) faster and 2) more accurate. We also provide a detailed comparison of available parsers on the whole PDB database. The MAT parser is designed to allow quick extraction and organized loading of the core data structure. The same data structure is extended to accommodate information from the .mmcif and .mmtf file parsers. Tokenization of the data allows information to be extracted from disordered text, enabling accurate identification of the entities present in the .pdb file. Additionally, we introduce a performance optimization that creates a few derived data structures, namely kD-tree, octree and graphs, for applications that need spatial coordinate calculations. MAT provides an advanced, time-efficient data structure designed for reusability and consistency within a systematic framework. The MAT parser can be accessed online through Bitbucket at https://bitbucket.org/gazalk/pdb_parser/.
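
          As a rough illustration of the pipeline the abstract outlines (parse fixed-column coordinate records into a core structure, then derive a spatial index from it), the following C++ sketch may help. It is not MAT's code: the names Atom, trim, parseAtomLine and KdTree are hypothetical, the kD-tree is a minimal textbook variant, and the column positions are those of the standard fixed-width ATOM/HETATM record of the PDB format.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical atom record for illustration; not MAT's actual data structure.
struct Atom {
    int serial;
    std::string name;
    std::string residue;
    char chain;
    int residueSeq;
    double x, y, z;
};

// Remove leading and trailing spaces from a fixed-width field.
std::string trim(const std::string& s) {
    const std::size_t first = s.find_first_not_of(' ');
    if (first == std::string::npos) return std::string();
    const std::size_t last = s.find_last_not_of(' ');
    return s.substr(first, last - first + 1);
}

// Parse one ATOM/HETATM line using the PDB fixed-column layout.
// Returns false for other record types, short lines, or malformed numbers.
bool parseAtomLine(const std::string& line, Atom& out) {
    if (line.size() < 54) return false;  // z coordinate ends at column 54
    const std::string record = line.substr(0, 6);
    if (record != "ATOM  " && record != "HETATM") return false;
    try {
        Atom a;
        a.serial = std::stoi(line.substr(6, 5));       // columns 7-11
        a.name = trim(line.substr(12, 4));             // columns 13-16
        a.residue = trim(line.substr(17, 3));          // columns 18-20
        a.chain = line[21];                            // column 22
        a.residueSeq = std::stoi(line.substr(22, 4));  // columns 23-26
        a.x = std::stod(line.substr(30, 8));           // columns 31-38
        a.y = std::stod(line.substr(38, 8));           // columns 39-46
        a.z = std::stod(line.substr(46, 8));           // columns 47-54
        out = a;
        return true;
    } catch (const std::exception&) {
        return false;  // blank or non-numeric field
    }
}

// Minimal kD-tree stored implicitly in a reordered vector: the median of each
// sub-range is the node, and the split axis cycles x, y, z with depth.
class KdTree {
public:
    explicit KdTree(std::vector<Atom> atoms) : atoms_(std::move(atoms)) {
        build(0, atoms_.size(), 0);
    }

    // Serial numbers of all atoms within `radius` of the query point.
    std::vector<int> within(double qx, double qy, double qz, double radius) const {
        std::vector<int> hits;
        const std::array<double, 3> q = {{qx, qy, qz}};
        search(0, atoms_.size(), 0, q, radius, hits);
        return hits;
    }

private:
    static double coord(const Atom& a, int axis) {
        return axis == 0 ? a.x : (axis == 1 ? a.y : a.z);
    }

    void build(std::size_t lo, std::size_t hi, int axis) {
        if (hi - lo <= 1) return;
        const std::size_t mid = lo + (hi - lo) / 2;
        std::nth_element(atoms_.begin() + lo, atoms_.begin() + mid, atoms_.begin() + hi,
                         [axis](const Atom& a, const Atom& b) {
                             return coord(a, axis) < coord(b, axis);
                         });
        build(lo, mid, (axis + 1) % 3);
        build(mid + 1, hi, (axis + 1) % 3);
    }

    void search(std::size_t lo, std::size_t hi, int axis, const std::array<double, 3>& q,
                double r, std::vector<int>& hits) const {
        if (lo >= hi) return;
        const std::size_t mid = lo + (hi - lo) / 2;
        const Atom& a = atoms_[mid];
        const double dx = a.x - q[0], dy = a.y - q[1], dz = a.z - q[2];
        if (dx * dx + dy * dy + dz * dz <= r * r) hits.push_back(a.serial);
        // Visit a side only if the query sphere can reach across the split plane.
        const double delta = q[axis] - coord(a, axis);
        if (delta <= r) search(lo, mid, (axis + 1) % 3, q, r, hits);
        if (delta >= -r) search(mid + 1, hi, (axis + 1) % 3, q, r, hits);
    }

    std::vector<Atom> atoms_;
};
```

          In use, each line of a .pdb file would be passed to parseAtomLine, the accepted atoms collected into a vector, and that vector handed to KdTree so that neighbor queries avoid scanning every atom. The order of the returned serial numbers depends on the tree layout, so callers that need a stable order should sort the result.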


          Author and article information

          Journal
          bioRxiv
          April 11 2019
          Article
          10.1101/605907
          © 2019

          Quantitative & Systems biology,Biophysics
