Abstract
We describe MUSCLE, a new computer program for creating multiple alignments of protein
sequences. Elements of the algorithm include fast distance estimation using kmer counting,
progressive alignment using a new profile function we call the log-expectation score,
and refinement using tree-dependent restricted partitioning. The speed and accuracy
of MUSCLE are compared with T-Coffee, MAFFT and CLUSTALW on four test sets of reference
alignments: BAliBASE, SABmark, SMART and a new benchmark, PREFAB. MUSCLE achieves
the highest, or joint highest, rank in accuracy on each of these sets. Without refinement,
MUSCLE achieves average accuracy statistically indistinguishable from T-Coffee and
MAFFT, and is the fastest of the tested methods for large numbers of sequences, aligning
5000 sequences of average length 350 in 7 min on a current desktop computer. The MUSCLE
program, source code and PREFAB test data are freely available at
http://www.drive5.com/muscle.
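
To make the idea of distance estimation by kmer counting concrete, the following is a minimal sketch in Python. It is an illustration under stated assumptions, not the MUSCLE implementation: the function names `kmer_counts` and `kmer_distance`, the default `k=3`, and the particular normalisation (shared kmers divided by the number of kmers in the shorter sequence) are choices made here for the example. The point is that two sequences can be compared without aligning them, which is what makes this kind of estimate fast.

```python
from collections import Counter


def kmer_counts(seq, k):
    """Count every overlapping substring of length k in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def kmer_distance(x, y, k=3):
    """Alignment-free distance in [0, 1] between two sequences.

    Assumption for this sketch: similarity is the number of kmers the
    two sequences share (counting multiplicity), divided by the number
    of kmers in the shorter sequence; distance is one minus that value.
    """
    shorter = min(len(x), len(y))
    if shorter < k:
        raise ValueError("both sequences must have length >= k")
    # Counter intersection keeps the minimum count of each shared kmer.
    shared = sum((kmer_counts(x, k) & kmer_counts(y, k)).values())
    return 1.0 - shared / (shorter - k + 1)


if __name__ == "__main__":
    # Toy peptide strings, invented for illustration only.
    print(kmer_distance("ACDEFG", "ACDEFG"))
    print(kmer_distance("ACDEFG", "ACDXFG"))
```

Computing this value for every pair of input sequences yields a distance matrix from which a guide tree for progressive alignment could be built. The cost of each comparison grows roughly linearly with sequence length, whereas a pairwise alignment grows with the product of the two lengths.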