The NHGRI-EBI GWAS Catalog of published genome-wide association studies, targeted arrays and summary statistics 2019

Abstract

The GWAS Catalog delivers a high-quality curated collection of all published genome-wide association studies enabling investigations to identify causal variants, understand disease mechanisms, and establish targets for novel therapies. The scope of the Catalog has also expanded to targeted and exome arrays with 1000 new associations added for these technologies. As of September 2018, the Catalog contains 5687 GWAS comprising 71673 variant-trait associations from 3567 publications. New content includes 284 full P-value summary statistics datasets for genome-wide and new targeted array studies, representing 6 × 10⁹ individual variant-trait statistics. In the last 12 months, the Catalog's user interface was accessed by ∼90000 unique users who viewed >1 million pages. We have improved data access with the release of a new RESTful API to support high-throughput programmatic access, an improved web interface and a new summary statistics database. Summary statistics provision is supported by a new format proposed as a community standard for summary statistics data representation. This format was derived from our experience in standardizing heterogeneous submissions, mapping formats and in harmonizing content. Availability: https://www.ebi.ac.uk/gwas/.

Author and article information

Journal

Title: Nucleic Acids Research

Publisher: Oxford University Press (OUP)

ISSN (Print): 0305-1048

ISSN (Electronic): 1362-4962

Publication date (Print): January 08 2019

Publication date (Electronic): November 16 2018

Volume: 47

Issue: D1

Pages: D1005-D1012

Affiliations

[1] European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK

[2] Open Targets, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK

[3] Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SA, UK

[4] JDRF/Wellcome Trust Diabetes and Inflammation Laboratory, Wellcome Centre for Human Genetics, University of Oxford, NIHR Oxford Biomedical Research Centre, Nuffield Department of Medicine, Oxford, UK

[5] Division of Genomic Medicine, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD 20892, USA

Article

DOI: 10.1093/nar/gky1120

SO-VID: 92bea28b-e4ed-4f83-a854-5f867a4998d3

License: http://creativecommons.org/licenses/by/4.0/


Cited by 1,661
