      Pango lineage designation and assignment using SARS-CoV-2 spike gene nucleotide sequences


          Abstract

          Background

          More than 2 million SARS-CoV-2 genome sequences have been generated and shared since the start of the COVID-19 pandemic and constitute a vital information source that informs outbreak control, disease surveillance, and public health policy. The Pango dynamic nomenclature is a popular system for classifying and naming genetically distinct lineages of SARS-CoV-2, including variants of concern, and is based on the analysis of complete or near-complete virus genomes. However, for several reasons, nucleotide sequences may be generated that cover only the spike gene of SARS-CoV-2. It is therefore important to understand how much information about Pango lineage status is contained in spike-only nucleotide sequences. Here we explore how Pango lineages might be reliably designated and assigned to spike-only nucleotide sequences. We survey the genetic diversity of such sequences and investigate the information they contain about Pango lineage status.

          Results

          Although many lineages, including the main variants of concern, can be identified clearly using spike-only sequences, some spike-only sequences are shared among tens or hundreds of Pango lineages. To facilitate the classification of SARS-CoV-2 lineages using subgenomic sequences we introduce the notion of designating such sequences to a “lineage set”, which represents the range of Pango lineages that are consistent with the observed mutations in a given spike sequence.
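
The "lineage set" idea above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes, for simplicity, that a spike sequence is consistent with a lineage when an identical spike sequence has been observed in a genome designated to that lineage. The function names (`build_lineage_sets`, `assign_lineage_set`), the toy sequences, and the placeholder lineage labels are all hypothetical.

```python
from collections import defaultdict


def build_lineage_sets(designations):
    """Group lineages by the spike sequence they were observed with.

    designations: iterable of (spike_sequence, lineage) pairs, one per
    designated genome. Returns a dict mapping each distinct spike
    sequence to the frozenset of lineages that share it.
    """
    grouped = defaultdict(set)
    for spike_seq, lineage in designations:
        grouped[spike_seq].add(lineage)
    return {seq: frozenset(lineages) for seq, lineages in grouped.items()}


def assign_lineage_set(spike_seq, lineage_sets):
    """Return the lineage set for an exactly matching spike sequence.

    Returns None when the sequence was never seen (assumption: exact
    string matching; a real tool would need to handle novel mutations).
    """
    return lineage_sets.get(spike_seq)


# Toy data: short made-up strings stand in for spike nucleotide sequences,
# and the lineage labels are placeholders, not real Pango designations.
designations = [
    ("ACGTAC", "lineage_A"),
    ("ACGTAC", "lineage_A"),
    ("ACGTTT", "lineage_B"),
    ("ACGTTT", "lineage_C"),
    ("ACGTTT", "lineage_D"),
]
lineage_sets = build_lineage_sets(designations)

unique_hit = assign_lineage_set("ACGTAC", lineage_sets)     # one lineage
ambiguous_hit = assign_lineage_set("ACGTTT", lineage_sets)  # three lineages
unseen = assign_lineage_set("GGGGGG", lineage_sets)         # not observed
```

In this toy example the first spike sequence identifies a single lineage, whereas the second is shared by three lineages and can only be assigned to the set containing all of them, which mirrors the difference in precision described in the Results.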

          Conclusions

          We find that many lineages, including the main variants of concern, can be reliably identified from spike alone, and we define lineage sets to represent the lineage precision that can be achieved using spike-only nucleotide sequences. These data provide a foundation for the development of software tools that can assign newly generated spike nucleotide sequences to Pango lineage sets.

          Supplementary Information

          The online version contains supplementary material available at 10.1186/s12864-022-08358-2.


                Author and article information

                Contributors
                aine.otoole@ed.ac.uk
                Journal
                BMC Genomics
                BioMed Central (London)
                ISSN: 1471-2164
                Published: 11 February 2022
                Volume: 23
                Article number: 121
                Affiliations
                [1] GRID grid.4305.2, ISNI 0000 0004 1936 7988, Institute of Evolutionary Biology, University of Edinburgh, Edinburgh, UK
                [2] GRID grid.4991.5, ISNI 0000 0004 1936 8948, Department of Zoology, University of Oxford, Oxford, UK
                [3] GRID grid.418152.b, ISNI 0000 0004 0543 9493, Microbial Sciences, BioPharmaceuticals R&D, AstraZeneca, Gaithersburg, MD, USA
                Author information
                http://orcid.org/0000-0001-8083-474X
                Article
                8358
                10.1186/s12864-022-08358-2
                8832810
                35148677
                7a374090-65bf-4687-b259-12b4ed5edb2d
                © The Author(s) 2022

                Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

                History
                Received: 11 August 2021
                Accepted: 1 February 2022
                Funding
                Funded by: FundRef http://dx.doi.org/10.13039/501100004211, Oxford Martin School, University of Oxford;
                Funded by: FundRef http://dx.doi.org/10.13039/100004440, Wellcome Trust;
                Award ID: grant.203783/Z/16/Z
                Funded by: Fast grants
                Award ID: 2236
                Categories
                Research Article
                Genetics
                sars-cov-2, genomic surveillance, spike, pango, lineage
