
      The blood DNA virome in 8,000 humans

      research-article


          Abstract

          Characterizing the blood virome is important for the safety of blood-derived transfusion products and for the identification of emerging pathogens. We explored non-human sequence data from whole-genome sequencing of blood from 8,240 individuals, none of whom were ascertained for any infectious disease. Viral sequences were extracted from the pool of sequence reads that did not map to the human reference genome. The analyses processed close to 1 petabyte of sequence data and performed 0.5 trillion similarity searches. With a lower bound for identification of 2 viral genomes per 100,000 cells, we mapped sequences to 94 different viruses, including sequences from 19 human DNA viruses, proviruses and RNA viruses (herpesviruses, anelloviruses, papillomaviruses, three polyomaviruses, adenovirus, HIV, HTLV, hepatitis B, hepatitis C, parvovirus B19, and influenza virus) in 42% of the study participants. Of possible relevance to transfusion medicine, we identified Merkel cell polyomavirus in 49 individuals, papillomavirus in the blood of 13 individuals, parvovirus B19 in 6 individuals, and herpesvirus 8 in 3 individuals. The presence of DNA sequences from two RNA viruses was unexpected: the hepatitis C virus sequence points to an integration event, while the influenza virus sequence resulted from immunization with a DNA vaccine. Age, sex and ancestry contributed significantly to the prevalence of infection. The remaining 75 viruses mostly reflect extensive contamination from commercial reagents and from the environment. These technical problems represent a major challenge for the identification of novel human pathogens. The increasing availability of human whole-genome sequences will contribute substantial amounts of data on the composition of the normal and pathogenic human blood virome. Distinguishing contaminants from real human viruses is challenging.
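The abstract states that viral sequences were drawn from the reads that did not map to the human reference genome. A minimal sketch of that filtering step is given below, assuming the alignments are available as SAM text records; it is an illustration, not the authors' pipeline. The only fact taken from the SAM format is that FLAG bit 0x4 marks an unmapped segment. The function name `unmapped_reads` and the three example records are hypothetical.

```python
from typing import Iterable, Iterator, Tuple

# SAM FLAG bit 0x4 means "segment unmapped" (from the SAM specification).
UNMAPPED_FLAG = 0x4


def unmapped_reads(sam_lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (read_name, sequence) for SAM records flagged as unmapped.

    Hypothetical helper: header lines (starting with '@'), blank lines and
    records with fewer than the 11 mandatory SAM fields are skipped.
    """
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue
        flag = int(fields[1])  # column 2 is the bitwise FLAG
        if flag & UNMAPPED_FLAG:
            yield fields[0], fields[9]  # QNAME and SEQ columns


# Made-up records for illustration only: one mapped read, two unmapped reads.
example = [
    "@SQ\tSN:chr1\tLN:248956422",
    "read1\t0\tchr1\t100\t60\t10M\t*\t0\t0\tACGTACGTAC\tIIIIIIIIII",
    "read2\t4\t*\t0\t0\t*\t*\t0\t0\tTTGACCGTAA\tIIIIIIIIII",
    "read3\t77\t*\t0\t0\t*\t*\t0\t0\tGGGCCCAAAT\tIIIIIIIIII",
]
candidates = list(unmapped_reads(example))
```

In this sketch the reads kept in `candidates` would be the input to a later similarity search against viral references; that search step is not shown here.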

          Author summary

          Novel sequencing technologies offer insight into the virome in human samples. Here, we identify the viral DNA sequences in the blood of over 8,000 individuals undergoing whole-genome sequencing. This approach identifies 94 viruses; however, many are shown to reflect widespread DNA contamination of commercial reagents or to be of environmental origin. While this represents a significant limitation to the reliable identification of novel viruses infecting humans, we could confidently detect sequences and quantify the abundance of 19 human viruses in 42% of individuals. Ancestry, sex, and age were important determinants of viral prevalence. This large study calls attention to the challenge of interpreting next-generation sequencing data for the identification of novel viruses. However, it serves to categorize the abundance of human DNA viruses using an unbiased technique.
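The summary mentions quantifying viral abundance, and the abstract expresses the detection bound in viral genomes per 100,000 cells. One possible way to put read counts on that scale is sketched below. It is an assumption-laden illustration, not the normalization used in the study: it assumes that each diploid human cell contributes two copies of the human genome, so that the mean human sequencing depth divided by two approximates the number of cell equivalents sampled. The function name and all input numbers are hypothetical.

```python
def viral_copies_per_100k_cells(
    viral_reads: int,
    read_length: int,
    viral_genome_length: int,
    human_depth: float,
) -> float:
    """Estimate viral genome copies per 100,000 human cells.

    Assumption (not from the source): a diploid cell carries two human
    genome copies, so human_depth / 2 approximates the cell equivalents
    represented in the sequencing data.
    """
    if viral_genome_length <= 0 or human_depth <= 0:
        raise ValueError("genome length and human depth must be positive")
    # Mean depth over the viral genome: sequenced viral bases / genome length.
    viral_depth = viral_reads * read_length / viral_genome_length
    cell_equivalents = human_depth / 2.0
    return viral_depth / cell_equivalents * 100_000


# Hypothetical inputs: 300 reads of 150 bp on a 150,000 bp viral genome,
# from a sample sequenced to 30x mean human depth.
estimate = viral_copies_per_100k_cells(
    viral_reads=300,
    read_length=150,
    viral_genome_length=150_000,
    human_depth=30.0,
)
```

With these made-up inputs the viral depth is 0.3x against 15 cell equivalents, so the estimate comes to about 2,000 copies per 100,000 cells. Reporting on this scale makes samples with different sequencing depths comparable.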


                Author and article information

                Contributors
                Role: Editor
                Journal
                PLoS Pathogens (PLoS Pathog)
                Public Library of Science (San Francisco, CA, USA)
                ISSN: 1553-7366, 1553-7374
                22 March 2017
                March 2017
                Volume 13, Issue 3, e1006292
                Affiliations
                [1] Human Longevity Inc., San Diego, California, United States of America
                [2] Human Longevity Singapore Pte. Ltd., Singapore
                [3] Blood Systems Research Institute, Department of Laboratory Medicine, University of California San Francisco, San Francisco, California, United States of America
                [4] J. Craig Venter Institute, La Jolla, California, United States of America
                Plymouth University, United Kingdom
                Author notes

                All authors except ED are employees of, or own stock in, Human Longevity Inc.

                • Conceptualization: AT JCV.

                • Data curation: AM CX EK EW.

                • Formal analysis: AM CX EK EW.

                • Funding acquisition: JCV.

                • Investigation: AT AM WB.

                • Methodology: AM CX.

                • Project administration: AT.

                • Resources: CX.

                • Software: AM CX.

                • Supervision: YT KB KEN JCV AT.

                • Validation: ED AT.

                • Visualization: AM.

                • Writing – original draft: AT ED AM.

                • Writing – review & editing: AT ED AM.

                Author information
                http://orcid.org/0000-0002-0111-3555
                http://orcid.org/0000-0002-3564-2498
                http://orcid.org/0000-0001-9480-869X
                http://orcid.org/0000-0002-6296-4484
                http://orcid.org/0000-0001-6290-7677
                Article
                Manuscript ID: PPATHOGENS-D-16-02747
                DOI: 10.1371/journal.ppat.1006292
                PMCID: PMC5378407
                PMID: 28328962
                e3b2796f-ee81-4ec4-90fe-c6d42dcc7ed7
                © 2017 Moustafa et al

                This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

                History
                Received: 8 December 2016
                Accepted: 14 March 2017
                Page count
                Figures: 6, Tables: 1, Pages: 20
                Funding
                Funded by Human Longevity Inc. The funders had a role in study design, data collection and analysis, decision to publish, and preparation of the manuscript.
                Categories
                Research Article
                Biology and Life Sciences > Anatomy > Body Fluids > Blood
                Medicine and Health Sciences > Anatomy > Body Fluids > Blood
                Biology and Life Sciences > Physiology > Body Fluids > Blood
                Medicine and Health Sciences > Physiology > Body Fluids > Blood
                Biology and Life Sciences > Genetics > Genomics > Human Genomics
                Biology and Life Sciences > Genetics > Genomics > Microbial Genomics > Viral Genomics
                Biology and Life Sciences > Microbiology > Microbial Genomics > Viral Genomics
                Biology and Life Sciences > Microbiology > Virology > Viral Genomics
                Research and Analysis Methods > Database and Informatics Methods > Biological Databases > Genomic Databases
                Biology and Life Sciences > Computational Biology > Genome Analysis > Genomic Databases
                Biology and Life Sciences > Genetics > Genomics > Genome Analysis > Genomic Databases
                Biology and life sciences > Organisms > Viruses > RNA viruses > Flaviviruses > Hepacivirus > Hepatitis C virus
                Biology and life sciences > Microbiology > Medical microbiology > Microbial pathogens > Viral pathogens > Flaviviruses > Hepacivirus > Hepatitis C virus
                Medicine and health sciences > Pathology and laboratory medicine > Pathogens > Microbial pathogens > Viral pathogens > Flaviviruses > Hepacivirus > Hepatitis C virus
                Biology and life sciences > Organisms > Viruses > Viral pathogens > Flaviviruses > Hepacivirus > Hepatitis C virus
                Biology and life sciences > Microbiology > Medical microbiology > Microbial pathogens > Viral pathogens > Hepatitis viruses > Hepatitis C virus
                Medicine and health sciences > Pathology and laboratory medicine > Pathogens > Microbial pathogens > Viral pathogens > Hepatitis viruses > Hepatitis C virus
                Biology and life sciences > Organisms > Viruses > Viral pathogens > Hepatitis viruses > Hepatitis C virus
                Biology and life sciences > Organisms > Viruses > DNA viruses
                Biology and Life Sciences > Computational Biology > Genome Analysis > Genomic Libraries
                Biology and Life Sciences > Genetics > Genomics > Genome Analysis > Genomic Libraries
                Biology and Life Sciences > Microbiology > Medical Microbiology > Microbial Pathogens > Viral Pathogens
                Medicine and Health Sciences > Pathology and Laboratory Medicine > Pathogens > Microbial Pathogens > Viral Pathogens
                Biology and Life Sciences > Organisms > Viruses > Viral Pathogens
                Custom metadata
                vor-update-to-uncorrected-proof
                2017-04-03
                Virome reads are available for download at www.HLI-OpenData.com/Virome/. In addition, see the Data Access Statement (www.humanlongevity.com/wp-content/uploads/HLIDataAccessAgreement020416.docx) for information on extended access.

                Infectious disease & Microbiology