      SDT: A Virus Classification Tool Based on Pairwise Sequence Alignment and Identity Calculation

      PLoS ONE
      Public Library of Science

          Abstract

          The perpetually increasing rate at which viral full-genome sequences are being determined is creating a pressing demand for computational tools that will aid the objective classification of these genome sequences. Taxonomic classification approaches that are based on pairwise genetic identity measures are potentially highly automatable and are progressively gaining favour with the International Committee on Taxonomy of Viruses (ICTV). There are, however, various issues with the calculation of such measures that could potentially undermine the accuracy and consistency with which they can be applied to virus classification. Firstly, pairwise sequence identities computed based on multiple sequence alignments rather than on multiple independent pairwise alignments can lead to the deflation of identity scores with increasing dataset sizes. Also, when gap-characters need to be introduced during sequence alignments to account for insertions and deletions, methodological variations in the way that these characters are introduced and handled during pairwise genetic identity calculations can cause high degrees of inconsistency in the way that different methods classify the same sets of sequences. Here we present Sequence Demarcation Tool (SDT), a free user-friendly computer program that aims to provide a robust and highly reproducible means of objectively using pairwise genetic identity calculations to classify any set of nucleotide or amino acid sequences. SDT can produce publication quality pairwise identity plots and colour-coded distance matrices to further aid the classification of sequences according to ICTV approved taxonomic demarcation criteria. Besides a graphical interface version of the program for Windows computers, command-line versions of the program are available for a variety of different operating systems (including a parallel version for cluster computing platforms).
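The effect of gap handling on a pairwise identity score can be illustrated with a minimal Python sketch. This is not SDT's implementation: the function name `pairwise_identity`, the `count_gaps` flag, and the two example sequences are hypothetical, and the input is assumed to be an already aligned pair of sequences. Columns in which both sequences carry a gap (which can arise when a pair is extracted from a multiple sequence alignment) are skipped.

```python
def pairwise_identity(aligned_a: str, aligned_b: str, count_gaps: bool = False) -> float:
    """Return the fraction of identical positions in an aligned sequence pair.

    count_gaps=False: columns where exactly one sequence has a gap are ignored.
    count_gaps=True:  such columns are counted as mismatches.
    Columns where both sequences have a gap are always ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    matches = 0
    compared = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" and y == "-":
            continue  # shared gap column: carries no information for this pair
        if x == "-" or y == "-":
            if count_gaps:
                compared += 1  # gap opposite a residue treated as a mismatch
            continue
        compared += 1
        if x == y:
            matches += 1

    if compared == 0:
        raise ValueError("no comparable positions in alignment")
    return matches / compared


# Hypothetical aligned pair: 9 matches, 1 substitution, 2 gap columns.
seq_a = "ACGTACGT--AC"
seq_b = "ACGTTCGTGGAC"

print(pairwise_identity(seq_a, seq_b))                   # 0.9  (gap columns ignored)
print(pairwise_identity(seq_a, seq_b, count_gaps=True))  # 0.75 (gap columns as mismatches)
```

The same alignment yields 90% or 75% identity depending only on the gap convention, which would place the pair on opposite sides of, for example, an 80% demarcation threshold.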

                Author and article information

                Contributors
                Role: Editor
                Journal
                PLoS ONE
                Public Library of Science (San Francisco, USA)
                ISSN: 1932-6203
                Published: 26 September 2014
                Volume: 9
                Issue: 9
                Article number: e108277
                Affiliations
                [1] Department of Clinical Laboratory Sciences, University of Cape Town, Cape Town, South Africa
                [2] School of Biological Sciences and Biomolecular Interaction Centre, University of Canterbury, Christchurch, New Zealand
                [3] Department of Plant Pathology and Emerging Pathogens Institute, University of Florida, Gainesville, Florida, United States of America
                Division of Clinical Research, United States of America
                Author notes

                Competing Interests: The authors have declared that no competing interests exist.

                Conceived and designed the experiments: BMM DPM AV. Performed the experiments: BMM DPM. Analyzed the data: BMM DPM. Contributed reagents/materials/analysis tools: BMM DPM. Contributed to the writing of the manuscript: BMM AV DPM. Software design and programming: BMM.

                Article
                Manuscript number: PONE-D-14-18029
                DOI: 10.1371/journal.pone.0108277
                PMCID: PMC4178126
                PMID: 25259891
                46cedda2-3af5-4ca0-92ff-e0a90945aa73
                Copyright © 2014

                This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

                History
                Received: 22 April 2014
                Accepted: 26 August 2014
                Page count
                Pages: 8
                Funding
                BMM is funded by the University of Cape Town. AV and DPM are supported by the National Research Foundation, South Africa. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
                Categories
                Research Article
                Biology and Life Sciences
                Computational Biology
                Genome Evolution
                Evolutionary Biology
                Molecular Evolution
                Taxonomy
                Custom metadata
                The authors confirm that all data underlying the findings are fully available without restriction. All relevant data are within the paper and its Supporting Information files.
