MAFFT Multiple Sequence Alignment Software Version 7: Improvements in Performance and Usability

Abstract

We report a major update of the MAFFT multiple sequence alignment program. This version has several new features, including options for adding unaligned sequences into an existing alignment, adjustment of direction in nucleotide alignment, constrained alignment and parallel processing, which were implemented after the previous major update. This report shows actual examples to explain how these features work, alone and in combination. Some examples incorrectly aligned by MAFFT are also shown to clarify its limitations. We discuss how to avoid misalignments, and our ongoing efforts to overcome such limitations.

Author and article information

Journal

Journal ID (nlm-ta): Mol Biol Evol

Journal ID (iso-abbrev): Mol. Biol. Evol

Journal ID (publisher-id): molbev

Journal ID (hwp): molbiolevol

Title: Molecular Biology and Evolution

Publisher: Oxford University Press

ISSN (Print): 0737-4038

ISSN (Electronic): 1537-1719

Publication date (Print): April 2013

Publication date (Electronic): 16 January 2013

Publication date PMC-release: 16 January 2013

Volume: 30

Issue: 4

Pages: 772-780

Affiliations

¹Immunology Frontier Research Center, Osaka University, Suita, Osaka, Japan

²Computational Biology Research Center, The National Institute of Advanced Industrial Science and Technology (AIST), Tokyo, Japan

Author notes

*Corresponding author: E-mail: kazutaka.katoh@aist.go.jp.

Associate editor: Sudhir Kumar

Article

Publisher ID: mst010

DOI: 10.1093/molbev/mst010

PMC ID: 3603318

PubMed ID: 23329690

SO-VID: d00ded02-1df1-4526-ba21-425067486d34

License:

This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0/), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.

Page count

Pages: 9
