
      The Genographic Project Public Participation Mitochondrial DNA Database

      research-article


          Abstract

          The Genographic Project is studying the genetic signatures of ancient human migrations and creating an open-source research database. It allows members of the public to participate in a real-time anthropological genetics study by submitting personal samples for analysis and donating the genetic results to the database. We report our experience from the first 18 months of public participation in the Genographic Project, during which we have created the largest standardized human mitochondrial DNA (mtDNA) database ever collected, comprising 78,590 genotypes. Here, we detail our genotyping and quality assurance protocols including direct sequencing of the mtDNA HVS-I, genotyping of 22 coding-region SNPs, and a series of computational quality checks based on phylogenetic principles. This database is very informative with respect to mtDNA phylogeny and mutational dynamics, and its size allows us to develop a nearest neighbor–based methodology for mtDNA haplogroup prediction based on HVS-I motifs that is superior to classic rule-based approaches. We make available to the scientific community and general public two new resources: a periodically updated database comprising all data donated by participants, and the nearest neighbor haplogroup prediction tool.
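The nearest neighbor idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes each HVS-I haplotype is stored as a set of variant labels relative to a reference sequence, uses a plain unweighted count of differing variants as the distance, and takes a majority vote among the `k` closest reference entries. The names `motif_distance`, `predict_haplogroup`, and `REFERENCE`, and the toy reference entries, are hypothetical and chosen only for illustration.

```python
from collections import Counter


def motif_distance(a, b):
    """Count the variants present in exactly one of the two HVS-I motifs."""
    return len(a ^ b)


def predict_haplogroup(query, reference, k=1):
    """Return the most common haplogroup label among the k nearest entries.

    `reference` is a list of (motif, haplogroup) pairs. The distance here is
    unweighted; a real tool might weight positions by mutation rate.
    """
    ranked = sorted(reference, key=lambda entry: motif_distance(query, entry[0]))
    votes = Counter(haplogroup for _, haplogroup in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy reference panel (illustrative values only, not taken from the database).
REFERENCE = [
    (frozenset({"16126C", "16294T"}), "T"),
    (frozenset({"16126C", "16294T", "16296T"}), "T"),
    (frozenset({"16069T", "16126C"}), "J"),
    (frozenset({"16224C", "16311C"}), "K"),
    (frozenset(), "H"),
]

if __name__ == "__main__":
    query = frozenset({"16126C", "16294T", "16304C"})
    print(predict_haplogroup(query, REFERENCE, k=3))
```

A rule-based classifier would instead require specific diagnostic variants to be present, so a single extra or missing mutation can leave a sample unclassified. The distance-based vote degrades more gracefully, which is one plausible reason a large reference database helps.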

          Author Summary

          The Genographic Project was launched in 2005 to address anthropological questions on a global scale using genetics as a tool. Samples are collected in two ways. First, the project comprises a consortium of ten scientific teams from around the world, united by a core ethical and scientific framework, each responsible for sample collection and analysis in its own region. Second, the project promotes public participation in countries around the world, and anyone can participate by purchasing a participation kit (Video S1). Mitochondrial DNA (mtDNA), typed in female participants, is inherited from the mother without recombination and is therefore particularly informative about maternal ancestry. Over the first 18 months of public participation in the project we have built the largest database of mtDNA variants to date, containing 78,590 entries from around the world. Here, we describe the procedures used to generate, manage, and analyze the genetic data, and the first insights from them. The data reveal new aspects of the structure of the mtDNA tree and allow us to develop much better ways of classifying mtDNA. We therefore now release this dataset and the new methods we have developed, and will continue to update them as more people join the Genographic Project.

          Most cited references (34)

          An enhanced MITOMAP with a global mtDNA mutational phylogeny

          The MITOMAP data system for the human mitochondrial genome has been greatly enhanced by the addition of a navigable mutational mitochondrial DNA (mtDNA) phylogenetic tree of ∼3000 mtDNA coding region sequences plus expanded pathogenic mutation tables and a nuclear-mtDNA pseudogene (NUMT) database. The phylogeny reconstructs the entire mutational history of the human mtDNA, thus defining the mtDNA haplogroups and differentiating ancient from recent mtDNA mutations. Pathogenic mutations are classified by both genotype and phenotype, and the NUMT sequences permit detection of spurious inclusion of pseudogene variants during mutation analysis. These additions position MITOMAP for the implementation of our automated mtDNA sequence analysis system, Mitomaster.

            A nomenclature system for the tree of human Y-chromosomal binary haplogroups.

            The Y chromosome contains the largest nonrecombining block in the human genome. By virtue of its many polymorphisms, it is now the most informative haplotyping system, with applications in evolutionary studies, forensics, medical genetics, and genealogical reconstruction. However, the emergence of several unrelated and nonsystematic nomenclatures for Y-chromosomal binary haplogroups is an increasing source of confusion. To resolve this issue, 245 markers were genotyped in a globally representative set of samples, 74 of which were males from the Y Chromosome Consortium cell line repository. A single most parsimonious phylogeny was constructed for the 153 binary haplogroups observed. A simple set of rules was developed to unambiguously label the different clades nested within this tree. This hierarchical nomenclature system supersedes and unifies past nomenclatures and allows the inclusion of additional mutations and haplogroups yet to be discovered.

              Tracing European founder lineages in the Near Eastern mtDNA pool.

              Founder analysis is a method for analysis of nonrecombining DNA sequence data, with the aim of identification and dating of migrations into new territory. The method picks out founder sequence types in potential source populations and dates lineage clusters deriving from them in the settlement zone of interest. Here, using mtDNA, we apply the approach to the colonization of Europe, to estimate the proportion of modern lineages whose ancestors arrived during each major phase of settlement. To estimate the Palaeolithic and Neolithic contributions to European mtDNA diversity more accurately than was previously achievable, we have now extended the Near Eastern, European, and northern-Caucasus databases to 1,234, 2,804, and 208 samples, respectively. Both back-migration into the source population and recurrent mutation in the source and derived populations represent major obstacles to this approach. We have developed phylogenetic criteria to take account of both these factors, and we suggest a way to account for multiple dispersals of common sequence types. We conclude that (i) there has been substantial back-migration into the Near East, (ii) the majority of extant mtDNA lineages entered Europe in several waves during the Upper Palaeolithic, (iii) there was a founder effect or bottleneck associated with the Last Glacial Maximum, 20,000 years ago, from which derives the largest fraction of surviving lineages, and (iv) the immigrant Neolithic component is likely to comprise less than one-quarter of the mtDNA pool of modern Europeans.

                Author and article information

                Contributors
                Role: Editor
                Journal
                PLoS Genetics (PLoS Genet)
                Public Library of Science (San Francisco, USA)
                ISSN: 1553-7390, 1553-7404
                Published: 29 June 2007
                Volume 3, Issue 6, e104
                Affiliations
                [1] Genomics Research Center, Family Tree DNA, Houston, Texas, United States of America
                [2] Molecular Medicine Laboratory, Rambam Health Care Campus, Haifa, Israel
                [3] Data Analytics Research Group, IBM T. J. Watson Research Center, Yorktown Heights, New York, United States of America
                [4] The Genographic Project, National Geographic Society, Washington, District of Columbia, United States of America
                [5] Research Centre for Medical Genetics, Russian Academy of Medical Sciences, Moscow, Russia
                [6] Unitat de Biologia Evolutiva, Universitat Pompeu Fabra, Barcelona, Spain
                [7] Department of Genetics, La Trobe University, Bundoora, Australia
                [8] Institut Pasteur, Paris, France
                [9] CNRS, URA3012, Paris, France
                [10] The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, United Kingdom
                University of Oxford, United Kingdom
                Author notes
                * To whom correspondence should be addressed. E-mail: genopubs@ngs.org
                Article
                07-PLGE-RA-0093R2 plge-03-06-23
                DOI: 10.1371/journal.pgen.0030104
                PMC ID: 1904368
                PubMed ID: 17604454
                9cad9802-6f4b-4a5b-a8d7-a274d73de5a6
                Copyright: © 2007 Behar et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
                History
                Received: 12 February 2007
                Accepted: 11 May 2007
                Page count
                Pages: 13
                Categories
                Research Article
                Computational Biology
                Genetics and Genomics
                Homo (Human)
                Custom metadata
                Behar DM, Rosset S, Blue-Smith J, Balanovsky O, Tzur S, et al. (2007) The Genographic Project public participation mitochondrial DNA database. PLoS Genet 3(6): e104. doi: 10.1371/journal.pgen.0030104
