Is Open Access

CABRA: Cluster and Annotate Blast Results Algorithm


BMC Research Notes

BioMed Central

Clustering, BLAST search, Web tool, Computational biology


      Abstract

      Background: Basic local alignment search tool (BLAST) searches are frequently used to look for homologous sequences and to annotate a query protein, but the increasing size of protein databases makes it difficult to review all results from a similarity search.

      Findings: We developed a web tool called Cluster and Annotate Blast Results Algorithm (CABRA), which enables a rapid BLAST search in a variety of updated reference proteomes, and provides a new way to functionally evaluate the results by the subsequent clustering of the hits and annotation of the clusters. The tool can be accessed from the following web-resource: http://cbdm-01.zdv.uni-mainz.de/~munoz/CABRA.

      Conclusions: Cluster and Annotate Blast Results Algorithm simplifies the analysis of the results of a BLAST search by providing an overview of the result’s annotations organized in clusters that can be iteratively modified by the user.
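The cluster-then-annotate idea in the abstract can be illustrated with a minimal sketch. The abstract does not say which clustering method or annotation source CABRA uses, so everything here is an assumption made for illustration: the hit identifiers, the pairwise identity values, the annotation labels, and the choice of single-linkage clustering with a percent-identity threshold. It is not the tool's actual implementation.

```python
from collections import Counter, defaultdict


def cluster_hits(hit_ids, pairwise_identity, threshold):
    """Single-linkage clustering (an assumed stand-in method):
    two hits share a cluster when a chain of pairs with
    identity >= threshold connects them."""
    parent = {h: h for h in hit_ids}

    def find(x):
        # Follow parent links to the cluster representative (path halving).
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (a, b), identity in pairwise_identity.items():
        if identity >= threshold:
            root_a, root_b = find(a), find(b)
            if root_a != root_b:
                parent[root_b] = root_a

    groups = defaultdict(list)
    for h in hit_ids:
        groups[find(h)].append(h)
    # Largest clusters first; ties broken by member names for a stable order.
    return sorted((sorted(m) for m in groups.values()),
                  key=lambda m: (-len(m), m))


def annotate_clusters(clusters, annotations):
    """Summarize each cluster by counting the annotations of its members."""
    return [Counter(annotations.get(h, "unannotated") for h in members)
            for members in clusters]


# Hypothetical toy data: five BLAST hits, percent identity between some pairs,
# and a functional label for four of them.
hits = ["P1", "P2", "P3", "P4", "P5"]
identity = {("P1", "P2"): 92.0, ("P2", "P3"): 85.0,
            ("P3", "P4"): 40.0, ("P4", "P5"): 78.0}
annotations = {"P1": "kinase", "P2": "kinase",
               "P3": "kinase", "P4": "phosphatase"}

clusters = cluster_hits(hits, identity, threshold=70.0)
summary = annotate_clusters(clusters, annotations)
for members, counts in zip(clusters, summary):
    print(members, dict(counts))
```

Calling `cluster_hits` again with a different `threshold` regroups the same hits, which is one simple way to picture clusters being modified iteratively by the user.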


            Author and article information

            Affiliations
            Faculty of Biology, JGU Mainz, Gresemundweg 2, 55128 Mainz, Germany
            Institute of Molecular Biology, Ackermannweg 4, 55128 Mainz, Germany
            Contributors
            munoz@uni-mainz.de
            andrade@uni-mainz.de
            Journal
            BMC Research Notes (BMC Res Notes)
            Publisher: BioMed Central (London)
            ISSN: 1756-0500
            Published: 30 April 2016
            Volume: 9
            PMID: 27129717
            PMCID: 4851773
            Publisher article ID: 2062
            DOI: 10.1186/s13104-016-2062-y
            © Mier and Andrade-Navarro. 2016

            Open Access: This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.

            Categories
            Technical Note
            Custom metadata
            © The Author(s) 2016

            Medicine

