      Contextual Multiple Sequence Alignment

          Abstract

          Efficient algorithms exist for global and local pairwise alignment of protein sequences in a recently proposed contextual alignment model, and preliminary results obtained for biological data are very promising. Our main motivation was to adapt the idea of context dependency to the multiple alignment setting. To this end, we developed a relaxation of the model, which we call averaged contextual alignment, and constructed a new family of amino acid substitution matrices. In this paper we present a contextual multiple alignment algorithm and report the results of experiments performed on the BAliBASE test set. The contextual approach gave much better results for the set of sequences containing orphan genes.
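
The abstract does not spell out how context enters the score, so the following is only a minimal sketch of one way a context-dependent substitution score can be plugged into standard global pairwise alignment. It assumes that the context of a residue is its left and right neighbours in the first sequence. The function names, the toy scoring rule, and the linear gap penalty are all hypothetical and are not taken from the paper.

```python
from typing import Callable

# Hypothetical signature: (left neighbour, residue, right neighbour, replacement) -> score
ContextScore = Callable[[str, str, str, str], float]


def contextual_global_score(a: str, b: str, score: ContextScore, gap: float = -2.0) -> float:
    """Needleman-Wunsch score where substituting a[i] by b[j] depends on a[i]'s neighbours."""
    n, m = len(a), len(b)
    padded = "-" + a + "-"  # sentinels for missing neighbours; padded[i] == a[i - 1]
    dp = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = i * gap
    for j in range(1, m + 1):
        dp[0][j] = j * gap
    for i in range(1, n + 1):
        left, right = padded[i - 1], padded[i + 1]
        for j in range(1, m + 1):
            sub = score(left, a[i - 1], right, b[j - 1])
            dp[i][j] = max(dp[i - 1][j - 1] + sub,  # substitute in context
                           dp[i - 1][j] + gap,      # delete a[i - 1]
                           dp[i][j - 1] + gap)      # insert b[j - 1]
    return dp[n][m]


def toy_score(left: str, x: str, right: str, y: str) -> float:
    """Toy rule (an assumption): a mismatch is not penalised right after a proline."""
    if x == y:
        return 2.0
    return 0.0 if left == "P" else -1.0
```

Because the context is read from a fixed input sequence, the dynamic program keeps the same quadratic shape as ordinary pairwise alignment; only the lookup of the substitution score changes.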

          Most cited references (8)

          Amino acid substitution matrices from protein blocks.

          Methods for alignment of protein sequences typically measure similarity by using a substitution matrix with scores for all possible exchanges of one amino acid with another. The most widely used matrices are based on the Dayhoff model of evolutionary rates. Using a different approach, we have derived substitution matrices from about 2000 blocks of aligned sequence segments characterizing more than 500 groups of related proteins. This led to marked improvements in alignments and in searches using queries from each of the groups.
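
As a rough illustration of how scores can be derived from blocks of aligned segments, the sketch below computes log-odds scores in half-bit units from pair counts in a single ungapped block. It is a simplified version of the general idea: it omits the sequence clustering and rounding used for the published matrices, and the function name and the tiny example block are assumptions made for illustration.

```python
import math
from collections import Counter
from itertools import combinations


def log_odds_from_block(block):
    """Return {(x, y): score} in half-bits for residue pairs observed in an ungapped block."""
    pair_counts = Counter()
    for column in zip(*block):
        for x, y in combinations(column, 2):
            pair_counts[tuple(sorted((x, y)))] += 1  # unordered pair

    total = sum(pair_counts.values())
    observed = {pair: count / total for pair, count in pair_counts.items()}

    # Background frequency of each residue, taken from the observed pairs.
    background = Counter()
    for (x, y), freq in observed.items():
        if x == y:
            background[x] += freq
        else:
            background[x] += freq / 2
            background[y] += freq / 2

    scores = {}
    for (x, y), freq in observed.items():
        expected = background[x] ** 2 if x == y else 2 * background[x] * background[y]
        scores[(x, y)] = 2 * math.log2(freq / expected)
    return scores  # pairs never observed get no entry in this sketch


scores = log_odds_from_block(["AC", "AC", "AC", "AD"])
```

With so little data most pairs are never observed, so a realistic matrix needs many blocks; the example only shows the arithmetic.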

            A comprehensive comparison of multiple sequence alignment programs.

            In recent years improvements to existing programs and the introduction of new iterative algorithms have changed the state-of-the-art in protein sequence alignment. This paper presents the first systematic study of the most commonly used alignment programs using BAliBASE benchmark alignments as test cases. Even below the 'twilight zone' at 10-20% residue identity, the best programs were capable of correctly aligning on average 47% of the residues. We show that iterative algorithms often offer improved alignment accuracy though at the expense of computation time. A notable exception was the effect of introducing a single divergent sequence into a set of closely related sequences, causing the iteration to diverge away from the best alignment. Global alignment programs generally performed better than local methods, except in the presence of large N/C-terminal extensions and internal insertions. In these cases, a local algorithm was more successful in identifying the most conserved motifs. This study enables us to propose appropriate alignment strategies, depending on the nature of a particular set of sequences. The employment of more than one program based on different alignment techniques should significantly improve the quality of automatic protein sequence alignment methods. The results also indicate guidelines for improvement of alignment algorithms.
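
The fraction of correctly aligned residues against a benchmark can be illustrated with a sum-of-pairs style score: the share of residue pairs aligned in the reference that are also aligned in the test alignment. The sketch below is one common way to compute such a score; the abstract does not say exactly how the study scored alignments, so treat the function names and the definition as assumptions.

```python
from itertools import combinations


def aligned_pairs(alignment):
    """Set of ((seq, residue_index), (seq, residue_index)) pairs sharing a column."""
    counters = [0] * len(alignment)  # next residue index per sequence
    pairs = set()
    for column in zip(*alignment):
        present = []
        for seq_index, symbol in enumerate(column):
            if symbol != "-":  # gaps carry no residue
                present.append((seq_index, counters[seq_index]))
                counters[seq_index] += 1
        pairs.update(combinations(present, 2))
    return pairs


def sum_of_pairs_score(test, reference):
    """Fraction of reference residue pairs that the test alignment reproduces."""
    reference_pairs = aligned_pairs(reference)
    if not reference_pairs:
        return 0.0
    return len(aligned_pairs(test) & reference_pairs) / len(reference_pairs)
```

Both alignments must contain the same sequences in the same order, since residues are identified by sequence position and ungapped index.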

              Blocks+: a non-redundant database of protein alignment blocks derived from multiple compilations.

              As databanks grow, sequence classification and prediction of function by searching protein family databases become increasingly valuable. The original Blocks Database, which contains ungapped multiple alignments for families documented in PROSITE, can be searched to classify new sequences. However, PROSITE is incomplete, and families from other databases are now available to expand coverage of the Blocks Database. To take advantage of protein family information present in several existing compilations, we have used five databases to construct Blocks+, a unified database that is built on the PROTOMAT/BLOSUM scoring model and that can be searched using a single algorithm for consistent sequence classification. The LAMA blocks-versus-blocks searching program identifies overlapping protein families, making possible a non-redundant hierarchical compilation. Blocks+ consists of all blocks derived from PROSITE, blocks from Prints not present in PROSITE, blocks from Pfam-A not present in PROSITE or Prints, and so on for ProDom and Domo, for a total of 1995 protein families represented by 8909 blocks, doubling the coverage of the original Blocks Database. A challenge for any procedure aimed at non-redundancy is to retain related but distinct families while discarding those that are duplicates. We illustrate how using multiple compilations can minimize this potential problem by examining the SNF2 family of ATPases, which is detectably similar to distinct families of helicases and ATPases. http://blocks.fhcrc.org/

                Author and article information

                Journal: Journal of Biomedicine and Biotechnology (J Biomed Biotechnol, JBB)
                Publisher: Hindawi Publishing Corporation
                ISSN: 1110-7243
                Year: 2005; Volume: 2005; Issue: 2; Pages: 124-131
                Affiliations
                Institute of Informatics, Warsaw University, Banacha 2, 02-097 Warsaw, Poland
                Article
                DOI: 10.1155/JBB.2005.124
                PMCID: PMC1184041
                PMID: 16046817

                This is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

                History: 30 November 2003; 27 February 2004; 12 March 2004
                Categories
                Research Article

                Molecular medicine