Development and multicenter validation of chest X-ray radiography interpretations based on natural language processing


Abstract

Background

Artificial intelligence can assist in interpreting chest X-ray radiography (CXR) data, but large datasets require efficient image annotation. The purpose of this study is to extract CXR labels from diagnostic reports using natural language processing, train convolutional neural networks (CNNs), and evaluate the classification performance of the CNNs using CXR data from multiple centers.

Methods

We collected the CXR images and corresponding radiology reports of 74,082 subjects as the training dataset. The linguistic entities and relationships in the unstructured radiology reports were extracted by the bidirectional encoder representations from transformers (BERT) model, and a knowledge graph was constructed to represent the association between image labels of abnormal signs and the report text of CXR. Then, a 25-label classification system was built to train and test the CNN models with weakly supervised labeling.
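The weak-labeling step can be illustrated with a deliberately simplified sketch. It is not the authors' pipeline: the study extracts entities with BERT and a knowledge graph, whereas this example substitutes plain keyword matching with a crude negation check. The label names, the term lists in `LABEL_TERMS`, and the function `weak_labels` are hypothetical and cover only three of the 25 labels.

```python
import re
from typing import Dict

# Hypothetical vocabulary: the study defines 25 labels; three are shown here.
LABEL_TERMS = {
    "nodule": ["nodule"],
    "pleural_effusion": ["pleural effusion"],
    "cardiomegaly": ["cardiomegaly", "enlarged heart"],
}

# Crude negation cues (an assumption, far simpler than a BERT-based extractor).
NEGATION = re.compile(r"\b(no|without|absent)\b")


def weak_labels(report: str) -> Dict[str, int]:
    """Return a multi-hot label dict for one report (1 = sign asserted)."""
    labels = {name: 0 for name in LABEL_TERMS}
    # Split into sentences so a negation only affects its own sentence.
    for sentence in re.split(r"[.;\n]+", report.lower()):
        for name, terms in LABEL_TERMS.items():
            for term in terms:
                idx = sentence.find(term)
                if idx == -1:
                    continue
                # The label is asserted unless a negation cue precedes the term.
                if not NEGATION.search(sentence[:idx]):
                    labels[name] = 1
    return labels


report = "No pleural effusion. A nodule is seen in the right upper lobe."
print(weak_labels(report))
```

The resulting multi-hot vector is the kind of noisy, report-derived target that a CNN could be trained against in place of manual image annotation.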

Results

In three external test cohorts of 5,996 symptomatic patients, 2,130 screening examinees, and 1,804 community clinic patients, the mean area under the receiver operating characteristic curve (AUC) of the CNN for identifying 25 abnormal signs reaches 0.866 ± 0.110, 0.891 ± 0.147, and 0.796 ± 0.157, respectively. In symptomatic patients, the CNN shows no significant difference from local radiologists in identifying 21 signs (p > 0.05), but is poorer for 4 signs (p < 0.05). In screening examinees, the CNN shows no significant difference for 17 signs (p > 0.05), but is poorer at classifying nodules (p = 0.013). In community clinic patients, the CNN shows no significant difference for 12 signs (p > 0.05), but performs better for 6 signs (p < 0.001).
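A figure reported as a mean ± spread of AUC over labels can be computed as in the minimal sketch below. It assumes the summary is an unweighted mean and sample standard deviation across per-label AUCs; the abstract does not state which spread estimator was used. The function names `auc` and `summarize` and the toy data are illustrative only.

```python
from statistics import mean, stdev


def auc(y_true, y_score):
    """AUC as the probability that a positive case outscores a negative one."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    # Count pairwise wins; ties contribute one half (Mann-Whitney formulation).
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def summarize(per_label):
    """per_label maps label name -> (y_true, y_score); returns AUCs, mean, SD."""
    aucs = {name: auc(y, s) for name, (y, s) in per_label.items()}
    values = list(aucs.values())
    return aucs, mean(values), stdev(values)


# Toy data for two hypothetical labels (not taken from the study).
toy = {
    "nodule": ([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]),
    "pleural_effusion": ([0, 1], [0.2, 0.9]),
}
aucs, avg, sd = summarize(toy)
print(aucs, round(avg, 3), round(sd, 3))
```

The pairwise formulation is quadratic in the number of cases, which is acceptable for a sketch; a rank-based implementation would be preferable for cohorts of several thousand patients.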

Conclusion

We construct and validate an effective CXR interpretation system based on natural language processing.

Plain language summary

Chest X-rays are accompanied by a report from the radiologist, which contains valuable diagnostic information in text format. Extracting and interpreting information from these reports, such as keywords, is time-consuming, but artificial intelligence (AI) can help with this. Here, we use a type of AI known as natural language processing to extract information about abnormal signs seen on chest X-rays from the corresponding report. We develop and test natural language processing models using data from multiple hospitals and clinics, and show that our models achieve similar performance to interpretation from the radiologists themselves. Our findings suggest that AI might help radiologists to speed up interpretation of chest X-ray reports, which could be useful not only in patient triage and diagnosis but also in cataloguing and searching radiology datasets.

Abstract

Zhang et al. develop a natural language processing approach, based on the BERT model, to extract linguistic information from chest X-ray radiography reports. The authors establish a 25-label classification system for abnormal findings described in the reports and validate their model using data from multiple sites.


Author and article information

Contributors

Xueqian Xie:

ORCID: http://orcid.org/0000-0002-6669-0097

xiexueqian@hotmail.com

Journal

Journal ID (nlm-ta): Commun Med (Lond)

Journal ID (iso-abbrev): Commun Med (Lond)

Title: Communications Medicine

Publisher: Nature Publishing Group UK (London)

ISSN (Electronic): 2730-664X

Publication date (Electronic): 28 October 2021

Publication date PMC-release: 28 October 2021

Publication date Collection: 2021

Volume: 1

Electronic Location Identifier: 43

Affiliations

[1] GRID grid.16821.3c, ISNI 0000 0004 0368 8293, Radiology Department, Shanghai General Hospital, Shanghai Jiao Tong University School of Medicine; Haining Rd. 100, Shanghai, 200080 China

[2] GRID grid.412478.c, ISNI 0000 0004 1760 4628, Radiology Department, Shanghai General Hospital of Nanjing Medical University; Haining Rd. 100, Shanghai, 200080 China

[3] Winning Health Technology Ltd., Shouyang Rd., Lane 99, No. 9, Shanghai, 200072 China

[4] GRID grid.16821.3c, ISNI 0000 0004 0368 8293, Radiology Department, Shanghai Sixth People Hospital, Shanghai Jiao Tong University School of Medicine; Yishan Rd. 600, Shanghai, 200233 China

[5] GRID grid.16821.3c, ISNI 0000 0004 0368 8293, Department of Computer Science and Engineering, Shanghai Jiao Tong University; Dongchuan Rd. 800, Shanghai, 200240 China

[6] GRID grid.4494.d, ISNI 0000 0000 9558 4598, University of Groningen, University Medical Center Groningen, Department of Epidemiology; Hanzeplein 1, 9713 GZ Groningen, The Netherlands

[7] GRID grid.4494.d, ISNI 0000 0000 9558 4598, University of Groningen, University Medical Center Groningen, Department of Radiology; Hanzeplein 1, 9713 GZ Groningen, The Netherlands

Author information

Geertruida H. de Bock http://orcid.org/0000-0003-3104-4471

Xueqian Xie http://orcid.org/0000-0002-6669-0097

Article

Publisher ID: 43

DOI: 10.1038/s43856-021-00043-x

PMC ID: 9053275

SO-VID: bb7c24b7-5d92-4b21-8763-cdd8ad172bb7

License:

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

History

Date received: 28 May 2021

Date accepted: 23 September 2021

Funding

Funded by: FundRef https://doi.org/10.13039/501100001809, National Natural Science Foundation of China (National Science Foundation of China);

Award ID: 82001809

Award ID: 81971612

Award ID: 81471662

Award Recipient: Yaping Zhang, Xueqian Xie

Funded by: FundRef https://doi.org/10.13039/501100004921, Shanghai Jiao Tong University (SJTU);

Award ID: ZH2018ZDB10

Award Recipient : Xueqian Xie

Funded by: FundRef https://doi.org/10.13039/501100002855, Ministry of Science and Technology of the People’s Republic of China (Chinese Ministry of Science and Technology);

Award ID: 2016YFE0103000

Award Recipient : Xueqian Xie

Custom metadata

Keywords: computational biology and bioinformatics, imaging

