Defining Phenotypes from Clinical Data to Drive Genomic Research

Abstract

The rise in available longitudinal patient information in electronic health records (EHRs) and their coupling to DNA biobanks have resulted in a dramatic increase in genomic research using EHR data for phenotypic information. EHRs have the benefit of providing a deep and broad data source of health-related phenotypes, including drug response traits, expanding the phenomes available to researchers for discovery. The earliest efforts at repurposing EHR data for research involved manual chart review of limited numbers of patients but now typically involve applications of rule-based and machine learning algorithms operating on sometimes huge corpora for both genome-wide and phenome-wide approaches. In this review, we highlight the current methods, impact, challenges, and opportunities for repurposing clinical data to define patient phenotypes for genomic discovery. Use of EHR data has proven a powerful method for elucidating genomic influences on diseases, traits, and drug-response phenotypes and will continue to have increasing applications in large cohort studies.

Most cited references 73

Personal health records: definitions, benefits, and strategies for overcoming barriers to adoption.

Paul Tang, Joan S Ash, David Bates … (2005)

Recently there has been a remarkable upsurge in activity surrounding the adoption of personal health record (PHR) systems for patients and consumers. The biomedical literature does not yet adequately describe the potential capabilities and utility of PHR systems. In addition, the lack of a proven business case for widespread deployment hinders PHR adoption. In a 2005 working symposium, the American Medical Informatics Association's College of Medical Informatics discussed the issues surrounding personal health record systems and developed recommendations for PHR-promoting activities. Personal health record systems are more than just static repositories for patient data; they combine data, knowledge, and software tools, which help patients to become active participants in their own care. When PHRs are integrated with electronic health record systems, they provide greater benefits than would stand-alone systems for consumers. This paper summarizes the College Symposium discussions on PHR systems and provides definitions, system characteristics, technical architectures, benefits, barriers to adoption, and strategies for increasing adoption.

Cited 240 times

Validation of electronic medical record-based phenotyping algorithms: results and lessons learned from the eMERGE network.

Melissa Basford, Peggy L. Peissig, Jodi L. Berg … (2013)

Genetic studies require precise phenotype definitions, but electronic medical record (EMR) phenotype data are recorded inconsistently and in a variety of formats. This paper presents lessons learned about validation of EMR-based phenotypes from the Electronic Medical Records and Genomics (eMERGE) studies. The eMERGE network created and validated 13 EMR-derived phenotype algorithms. Network sites are Group Health, Marshfield Clinic, Mayo Clinic, Northwestern University, and Vanderbilt University. By validating EMR-derived phenotypes we learned that: (1) multisite validation improves phenotype algorithm accuracy; (2) targets for validation should be carefully considered and defined; (3) specifying time frames for review of variables eases validation time and improves accuracy; (4) using repeated measures requires defining the relevant time period and specifying the most meaningful value to be studied; (5) patient movement in and out of the health plan (transience) can result in incomplete or fragmented data; (6) the review scope should be defined carefully; (7) particular care is required in combining EMR and research data; (8) medication data can be assessed using claims, medications dispensed, or medications prescribed; (9) algorithm development and validation work best as an iterative process; and (10) validation by content experts or structured chart review can provide accurate results. Despite the diverse structure of the five EMRs of the eMERGE sites, we developed, validated, and successfully deployed 13 electronic phenotype algorithms. Validation is a worthwhile process that not only measures phenotype performance but also strengthens phenotype algorithm definitions and enhances their inter-institutional sharing.

Cited 148 times

Is Open Access

Evaluating phecodes, clinical classification software, and ICD-9-CM codes for phenome-wide association studies in the electronic health record

Wei-Qi Wei, Lisa Bastarache, Robert J. Carroll … (2017)

Objective: To compare three groupings of Electronic Health Record (EHR) billing codes for their ability to represent clinically meaningful phenotypes and to replicate known genetic associations. The three tested coding systems were the International Classification of Diseases, Ninth Revision, Clinical Modification (ICD-9-CM) codes, the Agency for Healthcare Research and Quality Clinical Classification Software for ICD-9-CM (CCS), and manually curated “phecodes” designed to facilitate phenome-wide association studies (PheWAS) in EHRs. Methods and materials: We selected 100 disease phenotypes and compared the ability of each coding system to accurately represent them without performing additional groupings. The 100 phenotypes included 25 randomly chosen clinical phenotypes pursued in prior genome-wide association studies (GWAS) and another 75 common disease phenotypes mentioned across free-text problem lists from 189,289 individuals. We then evaluated the performance of each coding system to replicate known associations for 440 SNP-phenotype pairs. Results: Out of the 100 tested clinical phenotypes, phecodes exactly matched 83, compared to 53 for ICD-9-CM and 32 for CCS. ICD-9-CM codes were typically too detailed (requiring custom groupings) while CCS codes were often not granular enough. Among 440 tested known SNP-phenotype associations, use of phecodes replicated 153 SNP-phenotype pairs compared to 143 for ICD-9-CM and 139 for CCS. Phecodes also generally produced stronger odds ratios and lower p-values for known associations than ICD-9-CM and CCS. Finally, evaluation of several SNPs via PheWAS identified novel potential signals, some seen only with the phecode approach. Among them, rs7318369 in PEPD was associated with gastrointestinal hemorrhage. Conclusion: Our results suggest that the phecode groupings better align with clinical diseases mentioned in clinical practice or for genomic studies. ICD-9-CM, CCS, and phecode groupings all worked for PheWAS-type studies, though the phecode groupings produced superior results.

Cited 139 times

Author and article information

Journal

Title: Annual Review of Biomedical Data Science

Abbreviated Title: Annu. Rev. Biomed. Data Sci.

Publisher: Annual Reviews

ISSN (Print): 2574-3414

ISSN (Electronic): 2574-3414

Publication date (Created): July 20, 2018

Publication date (Print): July 20, 2018

Volume: 1

Issue: 1

Pages: 69-92

Affiliations

[1] Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, Tennessee 37232, USA

[2] Department of General Surgery, Vanderbilt University Medical Center, Nashville, Tennessee 37232, USA

[3] Department of Medicine, Vanderbilt University Medical Center, Nashville, Tennessee 37232, USA

[4] Department of Pharmacology, Vanderbilt University Medical Center, Nashville, Tennessee 37232, USA

Article

DOI: 10.1146/annurev-biodatasci-080917-013335

PubMed ID: 34109303

SO-VID: 29601c71-8990-4989-afff-5c3401990722

ScienceOpen disciplines: Molecular medicine, Biomedical engineering, Bioinformatics & Computational biology, Biotechnology, Genetics, Public health
