Design and Selection of Machine Learning Methods Using Radiomics and Dosiomics for Normal Tissue Complication Probability Modeling of Xerostomia

Abstract

Purpose

The purpose of this study is to investigate whether machine learning with dosiomic, radiomic, and demographic features allows more precise xerostomia risk assessment than normal tissue complication probability (NTCP) models based on the mean radiation dose to the parotid glands.

Material and methods

A cohort of 153 head-and-neck cancer patients was used to model xerostomia at 0–6 months (early), 6–15 months (late), 15–24 months (long-term), and at any time (a longitudinal model) after radiotherapy. Predictive power of the features was evaluated by the area under the receiver operating characteristic curve (AUC) of univariate logistic regression models. The multivariate NTCP models were tuned and tested with single and nested cross-validation, respectively. We compared predictive performance of seven classification algorithms, six feature selection methods, and ten data cleaning/class balancing techniques using the Friedman test and the Nemenyi post hoc analysis.
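The univariate screening step described above can be sketched in a few lines. A univariate logistic model is a monotone function of its single feature, so its in-sample AUC equals the rank-based AUC of the raw feature, or one minus that value when the fitted coefficient is negative. The sketch below computes that rank-based AUC directly. The function name `rank_auc` and the toy numbers are assumptions made for illustration; they are not taken from the study, which may also have evaluated the AUC on held-out data rather than in-sample.

```python
from typing import Sequence


def rank_auc(feature: Sequence[float], label: Sequence[int]) -> float:
    """Rank-based AUC of one feature against a binary label.

    Returns the fraction of (positive, negative) pairs in which the
    positive case has the larger feature value; ties count as half.
    """
    pos = [x for x, y in zip(feature, label) if y == 1]
    neg = [x for x, y in zip(feature, label) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (NOT from the paper): a feature value per patient
# and a binary outcome label (1 = complication observed).
toy_feature = [18.0, 22.5, 25.0, 30.0, 31.5, 35.0]
toy_label = [1, 1, 0, 1, 0, 0]

auc = rank_auc(toy_feature, toy_label)
# An AUC below 0.5 means smaller values go with the positive class;
# 1 - auc is then the AUC of the feature with its sign flipped.
print(auc, 1.0 - auc)
```

The pairwise loop is quadratic in the number of patients, which is harmless for a cohort of this size; a sort-based rank sum would give the same value for larger data sets.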

Results

NTCP models based on the parotid mean dose failed to predict xerostomia (AUCs < 0.60). The most informative predictors were found for late and long-term xerostomia. Late xerostomia correlated with the contralateral dose gradient in the anterior–posterior (AUC = 0.72) and the right–left (AUC = 0.68) direction, whereas long-term xerostomia was associated with parotid volumes (AUCs > 0.85), dose gradients in the right–left (AUCs > 0.78), and the anterior–posterior (AUCs > 0.72) direction. Multivariate models of long-term xerostomia were typically based on the parotid volume, the parotid eccentricity, and the dose–volume histogram (DVH) spread with the generalization AUCs ranging from 0.74 to 0.88. On average, support vector machines and extra-trees were the top performing classifiers, whereas the algorithms based on logistic regression were the best choice for feature selection. We found no advantage in using data cleaning or class balancing methods.
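Comparisons of this kind, where several methods are ranked across several modeling tasks, rest on the Friedman test followed by the Nemenyi post hoc analysis. The following is a minimal sketch of the standard textbook formulas, without tie correction of the test statistic. The helper names, the score table, and the choice of three methods over four tasks are hypothetical and do not reproduce the study's data or its implementation. The critical value `q_alpha` must be supplied by the caller from a published table for the Nemenyi test; the value 2.343 used in the example is the commonly tabulated one for three methods at a significance level of 0.05 and should be checked against the table you use.

```python
import math
from typing import List, Sequence, Tuple


def average_ranks(row: Sequence[float]) -> List[float]:
    """Rank the scores in one row (1 = highest); ties share the mean rank."""
    order = sorted(range(len(row)), key=lambda j: -row[j])
    ranks = [0.0] * len(row)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of scores tied with position i.
        while j + 1 < len(order) and row[order[j + 1]] == row[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for pos in range(i, j + 1):
            ranks[order[pos]] = mean_rank
        i = j + 1
    return ranks


def friedman_statistic(scores: Sequence[Sequence[float]]) -> Tuple[float, List[float]]:
    """Friedman chi-square and mean rank per method.

    scores[i][j] is the score of method j on task i.
    """
    n, k = len(scores), len(scores[0])
    rank_rows = [average_ranks(r) for r in scores]
    mean_ranks = [sum(r[j] for r in rank_rows) / n for j in range(k)]
    chi2 = 12.0 * n / (k * (k + 1)) * (
        sum(r * r for r in mean_ranks) - k * (k + 1) ** 2 / 4.0
    )
    return chi2, mean_ranks


def nemenyi_critical_difference(q_alpha: float, k: int, n: int) -> float:
    """Two methods differ if their mean ranks differ by more than this value."""
    return q_alpha * math.sqrt(k * (k + 1) / (6.0 * n))


# Hypothetical AUC table (NOT from the paper): 4 tasks x 3 methods.
toy_scores = [
    [0.80, 0.75, 0.70],
    [0.85, 0.78, 0.72],
    [0.77, 0.79, 0.65],
    [0.90, 0.82, 0.81],
]
chi2, mean_ranks = friedman_statistic(toy_scores)
cd = nemenyi_critical_difference(q_alpha=2.343, k=3, n=4)
print(chi2, mean_ranks, cd)
```

The Friedman statistic is compared against a chi-square distribution with k − 1 degrees of freedom; only if it is significant are pairwise mean-rank differences checked against the critical difference.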

Conclusion

We demonstrated that incorporating organ- and dose-shape descriptors is beneficial for xerostomia prediction in highly conformal radiotherapy treatments. Because the models rely strongly on patient-specific, dose-independent factors, our results underscore the need to develop personalized data-driven risk profiles for NTCP models of xerostomia. The machine learning pipeline used in this work is described in detail and can serve as a valuable reference for future work in radiomic and dosiomic NTCP modeling.

Author and article information

Contributors

Hubert S. Gabryś: URI : http://frontiersin.org/people/u/487590

Mark Bangert: URI : http://frontiersin.org/people/u/508375

Journal

Journal ID (nlm-ta): Front Oncol

Journal ID (iso-abbrev): Front Oncol

Journal ID (publisher-id): Front. Oncol.

Title: Frontiers in Oncology

Publisher: Frontiers Media S.A.

ISSN (Electronic): 2234-943X

Publication date (Electronic): 05 March 2018

Publication date Collection: 2018

Volume: 8

Electronic Location Identifier: 35

Affiliations

[1] Department of Medical Physics in Radiation Oncology, German Cancer Research Center (DKFZ), Heidelberg, Germany

[2] Medical Faculty of Heidelberg, Heidelberg University, Heidelberg, Germany

[3] Heidelberg Institute for Radiation Oncology (HIRO), Heidelberg, Germany

[4] Institute of Computational Biology, Helmholtz Zentrum München, Neuherberg, Germany

[5] Clinical Cooperation Unit Radiation Oncology, German Cancer Research Center (DKFZ), Heidelberg, Germany

[6] Department of Radiation Oncology, Heidelberg University Hospital, Heidelberg, Germany

Author notes

Edited by: Issam El Naqa, University of Michigan, United States

Reviewed by: John C. Roeske, Loyola University Medical Center, United States; John Austin Vargo, West Virginia University Hospitals, United States

*Correspondence: Hubert S. Gabryś, h.gabrys@dkfz.de; Mark Bangert, m.bangert@dkfz.de

Specialty section: This article was submitted to Radiation Oncology, a section of the journal Frontiers in Oncology

Article

DOI: 10.3389/fonc.2018.00035

PMC ID: 5844945

PubMed ID: 29556480

License:

This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

History

Date received: 21 November 2017

Date accepted: 01 February 2018

Page count

Figures: 8, Tables: 7, Equations: 17, References: 67, Pages: 20, Words: 12286
