Deep Phenotyping: Embracing Complexity and Temporality—Towards Scalability, Portability, and Interoperability

Abstract

1 Introduction

Clinical data are the basic staple of health learning [1]. The rapidly growing body of interoperable clinical datasets, including electronic health records (EHR), administrative and claims records, and human phenotype data collected from clinical research studies, presents unprecedented opportunities for developing high-throughput methods for electronic phenotyping. The term phenotyping is still young and evolving; it emerged around 2006, when the Electronic Medical Records and Genomics (eMERGE) program was launched in the United States and directed significant effort into phenotyping using EHR data [2]. Meanwhile, the biomedical informatics research community has been exploring electronic phenotyping solutions for decades, given the essential roles electronic phenotyping plays in disease knowledge discovery, application, and clinical research. Early phenotyping research focused primarily on case ascertainment or cohort identification [3]. In contrast, deep phenotyping shifts the focus from identification to characterization: it aims to deliver a precise and comprehensive characterization of observable traits representing unique morphological, biochemical, physiological, or behavioral properties of the identified patient populations [4]. Deep phenotyping brings us a step closer to Precision Medicine via the development of precise disease classification systems.

Consequently, the task of deep phenotyping poses the following new requirements.

First, it requires the extraction of nuanced phenotypic traits, such as “short stature”, “large head”, “poor weight gain”, “depressed nasal bridge”, and “clubbing of toes”. These traits are occasionally available in structured coded data but more often can only be accurately communicated as clinical text. Therefore, natural language processing (NLP) is essential for identifying the rich information needed for deep phenotyping of many phenotypes [5], and it has been playing an increasingly important role in the field.

Second, to achieve richer, deeper, and more precise characterization, deep phenotyping algorithms need to be more expressive, semantically interoperable, and interpretable than the conventional “black box” computational solutions, which are often criticized for lacking explainability [6]. Therefore, the extracted phenotypes need to be normalized using various clinical terminologies or ontologies such as HPO (the Human Phenotype Ontology), ICD-9-CM, ICD-10-CM, and SNOMED-CT. This requirement is at odds with the first requirement because the existing semantic knowledge resources have varying concept coverage and concept granularities, posing challenges for concept normalization in addition to the limitations noted with the first requirement. Consequently, creating code sets for a phenotype has become a foundational building block of, and yet the most time-consuming bottleneck in, the knowledge engineering process for deep phenotyping. Since clinical databases tend to be heterogeneous, empirical knowledge of data sources and clinical processes is critical for identifying useful codes that have high sensitivity and specificity at a given site [7]. This heterogeneity also creates barriers to the portability and interoperability of solutions leveraging distributed big clinical data for clinical phenotyping [8]. Higher-level standards beyond terminologies and ontologies, such as common data models, promise to meet some of these needs, but efforts to improve existing widely adopted standards are still warranted.

Third, deep phenotyping requires the processing of a broad range of data types, which can include voice recordings, videos, images, genome sequences, or biological pathways, to name a few. For example, for the current COVID-19 pandemic, the ground-glass appearance in the lower lungs in chest x-rays of COVID-19 patients, paired with clinical narrative reports of dry cough and dyspnea (labored breathing), are key phenotypes of COVID-19-positive patients [9], [10], [11].

Fourth, deep phenotyping requires more sophisticated analytics beyond case-control classification and may involve the characterization of the temporal trajectory of a phenotype or the identification of disease subtypes based on their differentiating phenotypes, progressions, and clinical outcomes. Incorporation of domain knowledge remains critical. Both supervised and unsupervised methods will be useful for deep phenotyping in different contexts.

Fifth, deep phenotyping requires the identification of connections between diseases and their common phenotypic traits (i.e., phenotypes that may be observed across various disease domains) to enable linking potentially common etiologies across diseases and to facilitate important applications such as drug repurposing based on commonalities in disease physiology. For example, for the current COVID-19 pandemic, doctors internationally have reported that loss of smell or taste is widely observed in infected patients. Similarly, studies have shown that olfactory deficits are prevalent in patients with Alzheimer’s disease [12]. Therefore, the currently active research on calculating disease similarities will play an important role in deep phenotyping to fulfill this need [13].

To address the above requirements for advancing deep phenotyping, this special issue offers twenty original articles presenting novel methodologies for case ascertainment, patient stratification, disease subtyping, and temporal phenotyping.
These novel methods were demonstrated across various disease domains (such as cancer, rare diseases, obesity, acute or chronic kidney diseases, and schizophrenia, to name a few) using a broad range of novel data sources (including clinical narratives, voice, biological pathways, research questionnaire data, and claims data in addition to EHR data) while addressing challenges in data quality, algorithm portability and interoperability, process efficiency, and scalability. Table 1 lists the primary contributions of these twenty articles for deep phenotyping towards the above five topic areas.

Table 1 Contributions of the twenty included papers in response to the five requirements for deep phenotyping

Reference  First author    Summary of Contributions
Requirement 1: Natural Language Processing
[14]       Datta, S        A systematic review of NLP on cancer notes
[15]       Liu, Q          Symptom extraction for patient stratification
[16]       Lyudovyk, O     NLP on pathology notes for subtyping
[17]       Liu, C          Ensemble of NLP for better portability
Requirement 2: Standardization
[18]       Hong, N         A FHIR-based EHR phenotyping framework
[19]       Shang, N        An empirical study of “making phenotyping work visible” that demonstrates the need for standardized processes
[20]       Hripcsak, G     Demonstrate OMOP’s value in improving phenotyping algorithms’ portability
[21]       Ostropolets, A  Adapting EHR phenotypes to claims data using OMOP Common Data Model
[22]       Reps, J         OMOP CDM-based probabilistic phenotyping algorithms using self-reported data
[23]       Swerdel, J      OMOP CDM-based standardized phenotype evaluation algorithms
[24]       Warner, J       Expansion of OMOP CDM to cancer phenotypes
[25]       Shen, F         Extension of HPO using embedding of phenotype knowledge resources
Requirement 3: Novel Data for Phenotyping
[26]       Trace, JM       Using voice to diagnose Parkinson’s disease
Requirement 4: Temporal Phenotyping and Subtyping via Similarity Metrics
[27]       Mate, S         A graphical model of temporal constraints
[28]       Meng, W         Temporal phenotyping of cancer treatment pathways
[29]       Zhao, J         Temporal phenotyping via tensor factorization
[30]       Chen, X         Phenotypic similarity for rare diseases
[31]       Xu, Z           Subtyping for acute kidney injury
Requirement 5: Scalability
[32]       Zhang, L        Automated grouping of medical codes
[33]       Chen, P         Deep representation learning for phenotyping

2 Natural Language Processing for Deep Phenotyping

Four articles in this special issue leveraged NLP on different data resources for phenotyping [14], [15], [16], [17].

Datta et al. contributed a methodology review that gives a frame-semantic overview of NLP-based information extraction from EHR notes [14]. Using cancer as an example, this article contributes a model for identifying important disease-specific information using NLP techniques and serves as a useful resource for future researchers requiring disease-specific information extracted from EHR notes.

Liu et al. extracted symptom concepts from clinical notes to stratify patients with mental illness [15]. Contextual terms were extracted to identify constellations of symptoms in a cohort of patients diagnosed with schizophrenia and related disorders. Topic modeling and dimensionality reduction were applied to identify groups of similar patients, which were further evaluated through visualization and interrogation of clinically interpretable weighted features.

Lyudovyk et al. extracted differentiating clinical phenotypes from pathology reports and combined these phenotypes with biological pathways to identify novel cancer subtypes with prognostic value [16]. This paper shows the value of integrating multi-level biological and clinical observations for deep phenotyping.

Liu et al. address the portability challenge facing NLP phenotyping algorithms by presenting an ensemble-based study [17].
Compared to the previously published ensemble methods, this study is the first to report a comprehensive comparison of the effectiveness of four different ensemble techniques over four widely adopted NLP systems, i.e., MetaMapLite [34], MedLEE [35], ClinPhen [36], and cTAKES [37]. The authors evaluated the performance of the different approaches in identifying generic phenotypic concepts and patient-specific phenotypic concepts, respectively, and demonstrated that both tasks benefit from the ensemble techniques.

3 Standards: Uses and Development

A total of eight papers fall into the category of standards use and development. The first paper leverages the fairly new standard FHIR (Fast Healthcare Interoperability Resources) for phenotyping. Hong et al. contributed a FHIR-based framework for phenotyping [18] and demonstrated its effectiveness in identifying patients with obesity and multiple comorbidities from clinical text. Given the National Institutes of Health’s endorsement of FHIR as the primary data standard for supporting clinical research, this work is timely and relevant.

The next cluster of contributions [19], [20], [21], [22], [23] is from the burgeoning open science community, the Observational Health Data Sciences and Informatics (OHDSI) consortium (www.ohdsi.org), in collaboration with the eMERGE network. All of these papers use the Observational Medical Outcomes Partnership (OMOP) Common Data Model (CDM) to standardize phenotyping algorithms, data models, or evaluation methods. These standards-based solutions enable transparent collaboration and improve the scalability and efficiency of deep phenotyping.

Drawing on firsthand knowledge, retrospective analysis, and user surveys, Shang et al. summarized all of the manual effort required to implement electronic phenotypes within the eMERGE Network [19].
This study introduces a novel Knowledge-Interpretation-Programming (KIP) metric to measure the portability of phenotype algorithms and to quantify such efforts across the eMERGE Network. The authors concluded that the OMOP CDM can be employed to improve portability for some ‘knowledge-oriented’ tasks. Hripcsak et al. conducted a network-wide study in the eMERGE Network and further demonstrated that the OMOP CDM can facilitate phenotype transfer across network sites and minimize manual implementation [20].

Ostropolets et al. reported lessons learned in adapting EHR-derived phenotypes to claims data, which contain relatively limited information [21]. The authors succeeded in improving the generalizability and consistency of the chronic kidney disease (CKD) phenotypes by using data and vocabulary standardized by the OMOP CDM. However, performance varied across datasets, implying that even when using a standardized vocabulary, it is important to identify and address coding and data heterogeneity to improve the performance of electronic phenotypes.

Reps et al. developed a novel generalizable probabilistic phenotype model, Current Risk of Smoking Status (CROSS), for identifying current smoking status using claims data, which often contain limited information [22]. CROSS can be readily applied to any US insurance claims data mapped to the OMOP CDM and will be useful for imputing smoking status in epidemiology studies where smoking is a known confounder but smoking status is not recorded.

A big challenge facing phenotyping research is the lack of rigorous evaluation, which often entails laborious and expensive manual generation of gold standards. To address this challenge, Swerdel et al. from the OHDSI community contributed an open-source evaluation tool called PheValuator to estimate phenotype algorithm performance [23].
A key contribution of this work is that it enables scalable, unsupervised evaluation of electronic phenotypes without laborious manual generation of gold standards.

Existing standards often have limitations in their content coverage. Warner et al. extended the widely adopted OMOP CDM to expand coverage within the cancer domain by defining a new standard vocabulary for chemotherapy regimens [24]. Similarly, Shen et al. developed a scalable knowledge engineering method to enrich node embedding for HPO [25]. The authors parsed disease-phenotype associations contained in heterogeneous knowledge resources such as OMIM and Orphanet to enrich non-inheritance relationships among phenotypic nodes in HPO.

4 Disease Subtyping Using Novel Data Sources and Temporal Reasoning

Trace et al. applied conventional machine learning methods to voice signals to differentiate participants with Parkinson’s disease (PD) who exhibit little to no symptoms from healthy controls. They confirmed that voice may serve as a deep phenotype for Parkinson’s disease [26].

Both conventional machine learning and emerging deep learning methods have been leveraged for identifying disease subtypes based on temporal characterization of diseases and patient similarity measures. Mate et al. addressed the temporal complexity of cohort queries and the limitations of existing query tools by creating a novel graphical model for representing temporal cohort queries [27]. This work is a significant applied extension of Allen’s time interval algebra. The model was demonstrated to be effective in representing typical temporal phenotype queries using the public MIMIC data.

Meng et al. developed temporal phenotyping methods to derive cancer treatment pathways within a large insurance claims dataset [28]. The authors aggregated lines-of-therapy information via clustering, followed by data visualization, to derive temporal cancer phenotypes in support of disease management and progression prediction.

Zhao et al. applied a constrained non-negative tensor-factorization approach to electronic health records to detect temporal phenotypes of complex cardiovascular diseases (CVD) [29]. From a cohort of 12,380 adults with CVD, they identified 14 subphenotypes. Through an association analysis with the estimated CVD risk for each subtype, they found novel phenotypic topics such as Vitamin D deficiency, depression, and urinary infections. Through a survival analysis, the authors found different risks of subsequent myocardial infarction following the diagnosis of CVD among the six most prevalent topics (p < 0.0001), indicating their correspondence to clinically meaningful subphenotypes of CVD. Of note, this study leveraged a coding standard called PheCode [38] to reduce dimensionality.

5 Disease Subtyping Based on Patient Similarity

Chen et al. developed a comprehensive phenotype similarity metric integrating clinical and questionnaire data for subtyping rare diseases and applied it to ciliopathies [30]. The computed similarity was then validated using genomic data.

Similarly, Lyudovyk et al. evaluated the use of genomic test reports ordered for cancer patients to derive cancer subtypes and to identify biological pathways predictive of poor survival outcomes [16]. A novel patient similarity metric based on affected biological pathways was proposed. The authors demonstrated that this approach identified subtypes of prognostic value, linked to survival, with implications for precision treatment selection and a better understanding of the underlying disease.

Xu et al. used a memory network-based deep learning approach to discover three acute kidney injury (AKI) sub-phenotypes using EHR data [31]. Group one had an average age of 63.0 ± 17.3 years and mild loss of kidney excretory function, characterizing patients more likely to develop stage I AKI.
Group two had an average age of 66.8 ± 10.4 years and severe loss of kidney function, characterizing patients more likely to develop stage III AKI. Group three had an average age of 65.1 ± 11.3 years and moderate loss of kidney function, characterizing patients more likely to develop stage II AKI.

6 Overcoming Challenges in Data Quality and Scalability

Zhang et al. address the scalability challenge of developing accurate phenotype algorithms with minimal manual effort. They contribute a data-driven approach that automates the grouping of medical terms into clinically relevant concepts by combining multiple up-to-date data sources in an unbiased manner [32]. The proposed method consists of a banding step that leverages prior knowledge from the existing coding hierarchy, and a combining step that performs spectral clustering on an optimally weighted matrix. The resulting ICD groupings are comparable in interpretability to, and consistent with, the current ICD hierarchy.

In contrast to manually creating unbiased estimators of treatment effects, which can be time-consuming and subjective, Chen et al. contributed a scalable method for deep representation learning [33] and applied it to individualized treatment effect estimation using EHR data. The automatically trained representation revealed findings consistent with existing medical knowledge and generated new clinical hypotheses.

7 Summary

We observed the following trends in deep phenotyping research from this set of articles (Table 2). The collaborative open science consortia OHDSI and eMERGE have both made significant contributions to deep phenotyping research, with each contributing five or six articles to this special issue, and have established best practices for standards-based collaborative phenotyping efforts. The OMOP CDM contributed by OHDSI has demonstrated its promise in facilitating the reusability, efficiency, portability, and reproducibility of electronic phenotyping within the eMERGE network.
Cancer offers more research challenges and opportunities than other diseases for deep phenotyping. Rare diseases may suffer from limited data but still have a huge need for deep phenotyping. One third of the included papers leveraged emerging deep learning technologies. Sixty percent of the included papers use formal methods or standards, implying the significant value of data standards in deep phenotyping. About one third of the studies leveraged various NLP methods to include important clinical text in phenotyping. NLP will continue to play an important role in phenotyping research. Six (30%) articles used novel data sources, including clinical text, biological pathways, voice, public knowledge bases, and claims data.

Table 2 Observed trends (NLP: using NLP; STND: using formal standards or semantic knowledge resources such as UMLS; DL: using deep learning technologies; SRC: using novel phenotype data sources; SUB: developing subtypes)

Reference First author NLP STND DL SRC SUB Disease Focus eMERGE OHDSI
[14] Datta, S X
[15] Liu, Q X X
[16] Lyudovyk, O X X X Cancer X
[17] Liu, C X X X
[18] Hong, N X
[19] Shang, N X X X
[20] Hripcsak, G X X X
[21] Ostropolets, A X X X
[22] Reps, J X X X
[23] Swerdel, J X X
[24] Warner, J X Cancer X
[25] Shen, F X X X X
[26] Trace, JM X Parkinson’s
[27] Mate, S X
[28] Meng, W Cancer
[29] Zhao, J X X Cardiovascular X
[30] Chen, X X X X Rare disease
[31] Xu, Z X X X Acute kidney injury
[32] Zhang, L X X
[33] Chen, P X
TOTAL 6 12 6 6 3 5 6

We expect that broad sharing of phenotype definitions and inclusion of diverse data types in those definitions will continue. We anticipate that the next set of challenges will center on including time information in phenotyping and on defining the criteria for when a phenotype ends. We expect advances in incorporating knowledge, such as physiology and previous evidence, into the phenotype-generating process.
Another fruitful area of investigation will be automated ways of estimating the portability of phenotype definitions and defining conditions under which porting a definition is unlikely to work. The broad sharing of phenotype definitions already occurring positions us well to pursue these research directions.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
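
Illustrative sketch: temporal relations between clinical events

Section 4 notes that the graphical query model of Mate et al. builds on Allen’s time interval algebra. As a reading aid only, the following minimal Python sketch classifies the thirteen basic Allen relations between two intervals. It is not the authors’ model or implementation; the `Interval` class, the `allen_relation` function, and the example events (an admission, an antibiotic course, a dialysis session, with times in days) are hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """A time interval with start < end, in arbitrary units (assumed here: days)."""
    start: float
    end: float

    def __post_init__(self):
        if self.start >= self.end:
            raise ValueError("interval must satisfy start < end")


def allen_relation(a: Interval, b: Interval) -> str:
    """Return the name of the basic Allen relation that holds from a to b."""
    # Disjoint or touching intervals.
    if a.end < b.start:
        return "before"
    if a.end == b.start:
        return "meets"
    if b.end < a.start:
        return "after"
    if b.end == a.start:
        return "met-by"
    # From here on, the two intervals share some interior time.
    if a.start == b.start and a.end == b.end:
        return "equals"
    if a.start == b.start:
        return "starts" if a.end < b.end else "started-by"
    if a.end == b.end:
        return "finishes" if a.start > b.start else "finished-by"
    if b.start < a.start and a.end < b.end:
        return "during"
    if a.start < b.start and b.end < a.end:
        return "contains"
    # Only the two partial-overlap cases remain.
    return "overlaps" if a.start < b.start else "overlapped-by"


if __name__ == "__main__":
    # Hypothetical events for one patient, measured in days since admission.
    admission = Interval(0, 5)
    antibiotic = Interval(1, 3)
    dialysis = Interval(5, 6)
    print(allen_relation(antibiotic, admission))  # during
    print(allen_relation(admission, dialysis))    # meets
```

A temporal cohort criterion such as "antibiotic given during the admission" could then be expressed, under these assumptions, as a check that `allen_relation(antibiotic, admission)` equals `"during"`.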

Most cited references 34

World Health Organization declares global emergency: A review of the 2019 novel coronavirus (COVID-19)

Catrin Sohrabi, Zaid Alsafi, Niamh O'Neill … (2020)

An unprecedented outbreak of pneumonia of unknown aetiology in Wuhan City, Hubei province in China emerged in December 2019. A novel coronavirus was identified as the causative agent and was subsequently termed COVID-19 by the World Health Organization (WHO). Considered a relative of severe acute respiratory syndrome (SARS) and Middle East respiratory syndrome (MERS), COVID-19 is caused by a betacoronavirus named SARS-CoV-2 that affects the lower respiratory tract and manifests as pneumonia in humans. Despite rigorous global containment and quarantine efforts, the incidence of COVID-19 continues to rise, with 90,870 laboratory-confirmed cases and over 3,000 deaths worldwide. In response to this global outbreak, we summarise the current state of knowledge surrounding COVID-19.

Frequency and Distribution of Chest Radiographic Findings in COVID-19 Positive Patients

Ho Yuen Frank Wong, Hiu Yin Sonia Lam, Ambrose Fong … (2020)

Background Current COVID-19 radiological literature is dominated by CT and a detailed description of chest x-ray (CXR) appearances in relation to the disease time course is lacking. Purpose To describe the time course and severity of the CXR findings of COVID-19 and correlate these with real time reverse transcription polymerase chain reaction (RT-PCR) testing for SARS-Cov-2 nucleic acid. Materials and Methods Retrospective study of COVID-19 patients with RT-PCR confirmation and CXRs admitted across 4 hospitals evaluated between January and March 2020. Baseline and serial CXRs (total 255 CXRs) were reviewed along with RT-PCRs. Correlation with concurrent CTs (total 28 CTs) was made when available. Two radiologists scored each CXR in consensus for: consolidation, ground glass opacity (GGO), location and pleural fluid. A severity index was determined for each lung. The lung scores were summed to produce the final severity score. Results There were 64 patients (26 men, mean age 56±19 years). Of these, 58, 44 and 38 patients had positive initial RT-PCR (91%, [CI: 81-96%]), abnormal baseline CXR (69%, [CI: 56-80%]) and positive initial RT-PCR with abnormal baseline CXR (59 [CI:46-71%]) respectively. Six patients (9%) showed CXR abnormalities before eventually testing positive on RT-PCR. Sensitivity of initial RT-PCR (91% [95% CI: 83-97%]) was higher than baseline CXR (69% [95% CI: 56-80%]) (p = 0.009). Radiographic (mean 6 ± 5 days) and virologic recovery (mean 8 ± 6 days) were not significantly different (p= 0.33). Consolidation was the most common finding (30/64, 47%), followed by GGO (21/64, 33%). CXR abnormalities had a peripheral (26/64, 41%) and lower zone distribution (32/64, 50%) with bilateral involvement (32/64, 50%). Pleural effusion was uncommon (2/64, 3%). The severity of CXR findings peaked at 10-12 days from the date of symptom onset. 
Conclusion Chest x-ray findings in COVID-19 patients frequently showed bilateral lower zone consolidation which peaked at 10-12 days from symptom onset.

Chest Radiographic and CT Findings of the 2019 Novel Coronavirus Disease (COVID-19): Analysis of Nine Patients Treated in Korea

Soon Ho Yoon, Kyung Hee Lee, Jin Kim … (2020)

Objective This study presents a preliminary report on the chest radiographic and computed tomography (CT) findings of the 2019 novel coronavirus disease (COVID-19) pneumonia in Korea. Materials and Methods As part of a multi-institutional collaboration coordinated by the Korean Society of Thoracic Radiology, we collected nine patients with COVID-19 infections who had undergone chest radiography and CT scans. We analyzed the radiographic and CT findings of COVID-19 pneumonia at baseline. Fisher's exact test was used to compare CT findings depending on the shape of pulmonary lesions. Results Three of the nine patients (33.3%) had parenchymal abnormalities detected by chest radiography, and most of the abnormalities were peripheral consolidations. Chest CT images showed bilateral involvement in eight of the nine patients, and a unilobar reversed halo sign in the other patient. In total, 77 pulmonary lesions were found, including patchy lesions (39%), large confluent lesions (13%), and small nodular lesions (48%). The peripheral and posterior lung fields were involved in 78% and 67% of the lesions, respectively. The lesions were typically ill-defined and were composed of mixed ground-glass opacities and consolidation or pure ground-glass opacities. Patchy to confluent lesions were primarily distributed in the lower lobes (p = 0.040) and along the pleura (p < 0.001), whereas nodular lesions were primarily distributed along the bronchovascular bundles (p = 0.006). Conclusion COVID-19 pneumonia in Korea primarily manifested as pure to mixed ground-glass opacities with a patchy to confluent or nodular shape in the bilateral peripheral posterior lungs. A considerable proportion of patients with COVID-19 pneumonia had normal chest radiographs.

Author and article information

Contributors

Chunhua Weng: Role: Guest Editors

Nigam Shah

George Hripcsak

Journal

Journal ID (nlm-ta): J Biomed Inform

Journal ID (iso-abbrev): J Biomed Inform

Title: Journal of Biomedical Informatics

Publisher: Elsevier Inc.

ISSN (Print): 1532-0464

ISSN (Electronic): 1532-0480

Publication date PMC-release: 23 April 2020

Publication date (Electronic): 23 April 2020

Electronic Location Identifier: 103433

Affiliations

[a] Department of Biomedical Informatics, Columbia University, New York, NY, USA

[b] Medicine - Biomedical Informatics Research, Stanford University, Stanford, CA, USA

Article

Publisher ID: S1532-0464(20)30061-7 Publisher ID: 103433

DOI: 10.1016/j.jbi.2020.103433

PMC ID: 7179504

PubMed ID: 32335224

SO-VID: f8c42b7c-568e-4738-8cbc-473435e5e82d

License:

Since January 2020 Elsevier has created a COVID-19 resource centre with free information in English and Mandarin on the novel coronavirus COVID-19. The COVID-19 resource centre is hosted on Elsevier Connect, the company's public news and information website. Elsevier hereby grants permission to make all its COVID-19-related research that is available on the COVID-19 resource centre - including this research content - immediately available in PubMed Central and other publicly funded repositories, such as the WHO COVID database with rights for unrestricted research re-use and analyses in any form or by any means with acknowledgement of the original source. These permissions are granted for free by Elsevier for as long as the COVID-19 resource centre remains active.
