Identification of Factors Associated With Variation in US County-Level Obesity Prevalence Rates Using Epidemiologic vs Machine Learning Models

Key Points

Question

Which factors are associated with county-level variation in obesity prevalence, and how can they be identified using epidemiologic and machine learning methods?

Findings

This cross-sectional study of 3138 US counties found significant county-level variation in obesity prevalence, with US Census region, median household income, and percentage of population with some college education being most strongly associated with obesity prevalence. Machine learning models explained about two-thirds of the variation in obesity prevalence, more than multivariate linear regression models, but were less interpretable.

Meaning

Machine learning models of county-level demographic, socioeconomic, health care, and environmental factors explain significantly more variation in obesity prevalence while being less interpretable.

Abstract

This cross-sectional study uses summarized statistical data and US Census data to compare epidemiologic and machine learning methods to examine associations of US county-level demographic, socioeconomic, health care, and environmental factors with regional variance in obesity prevalence.

Importance

Obesity is a leading cause of high health care expenditures, disability, and premature mortality. Previous studies have documented geographic disparities in obesity prevalence.

Objective

To identify county-level factors associated with obesity using traditional epidemiologic and machine learning methods.

Design, Setting, and Participants

Cross-sectional study using linear regression models and machine learning models to evaluate the associations between county-level obesity and county-level demographic, socioeconomic, health care, and environmental factors from summarized statistical data extracted from the 2018 Robert Wood Johnson Foundation County Health Rankings and merged with US Census data from each of 3138 US counties. The explanatory power of the linear multivariate regression and the top-performing machine learning model were compared using mean R² measured in 30-fold cross-validation.
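
The evaluation metric described above, mean R² over 30 held-out folds, can be illustrated with a minimal sketch. It is not the authors' code: it fits a one-predictor least-squares line rather than the study's multivariate regression or gradient boosting machine, and the data are synthetic. The names `income` and `obesity`, the coefficients, and the noise level are all hypothetical stand-ins chosen only to make the example run.

```python
import random
from statistics import mean


def fit_simple_ols(xs, ys):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def r_squared(ys, preds):
    """R^2 = 1 - SS_res / SS_tot, computed on the observations passed in."""
    y_bar = mean(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
    ss_tot = sum((y - y_bar) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


def cross_validated_r2(xs, ys, k=30, seed=0):
    """Mean held-out R^2 over k folds for the one-predictor model."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k disjoint, near-equal folds
    scores = []
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        a, b = fit_simple_ols([xs[i] for i in train], [ys[i] for i in train])
        preds = [a + b * xs[i] for i in fold]
        scores.append(r_squared([ys[i] for i in fold], preds))
    return mean(scores)


# Synthetic stand-in data (hypothetical, NOT the County Health Rankings data):
# one predictor in thousands of dollars, one outcome in percent.
rng = random.Random(42)
income = [rng.uniform(25, 120) for _ in range(3000)]
obesity = [38.0 - 0.08 * x + rng.gauss(0, 3.0) for x in income]

mean_r2 = cross_validated_r2(income, obesity, k=30)
print(f"mean 30-fold R^2: {mean_r2:.3f}")
```

Each fold's R² is computed only on counties the model did not see during fitting, so the averaged score reflects out-of-sample explanatory power rather than in-sample fit.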

Exposures

County-level demographic factors (population; rural status; census region; and race/ethnicity, sex, and age composition), socioeconomic factors (median income, unemployment rate, and percentage of population with some college education), health care factors (rate of uninsured adults and primary care physicians), and environmental factors (access to healthy foods and access to exercise opportunities).

Main Outcomes and Measures

County-level obesity prevalence in 2018, its association with each county-level factor, and the percentage of variation in county-level obesity prevalence explained by linear multivariate and gradient boosting machine regression measured with R².

Results

Among the 3138 counties studied, the mean (range) obesity prevalence was 31.5% (12.8%-47.8%). In multivariate regressions, demographic factors explained 44.9% of variation in obesity prevalence; socioeconomic factors, 33.0%; environmental factors, 15.5%; and health care factors, 9.1%. The county-level factors with the strongest association with obesity were census region, median household income, and percentage of population with some college education. R² values of univariate regressions of obesity prevalence were 0.238 for census region, 0.218 for median household income, and 0.160 for percentage of population with some college education. Multivariate linear regression and gradient boosting machine regression (the best-performing machine learning model) of obesity prevalence using all county-level demographic, socioeconomic, health care, and environmental factors had R² values of 0.58 and 0.66, respectively (P < .001).
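
As a rough reading of the two full-model R² values reported above (both are rounded figures from the abstract, so the result is approximate), the gap between the models can be expressed in absolute and relative terms:

```latex
\Delta R^2 = 0.66 - 0.58 = 0.08,
\qquad
\frac{\Delta R^2}{R^2_{\text{linear}}} = \frac{0.08}{0.58} \approx 0.14
```

That is, gradient boosting machine regression explained about 8 percentage points more of the variation in obesity prevalence, roughly a 14% relative gain over the multivariate linear model.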

Conclusions and Relevance

Obesity prevalence varies significantly between counties. County-level demographic, socioeconomic, health care, and environmental factors explain the majority of variation in county-level obesity prevalence. Using machine learning models may explain significantly more of the variation in obesity prevalence.

Author and article information

Journal

Journal ID (nlm-ta): JAMA Netw Open

Journal ID (iso-abbrev): JAMA Netw Open

Journal ID (pmc): JAMA Netw Open

Title: JAMA Network Open

Publisher: American Medical Association

ISSN (Electronic): 2574-3805

Publication date (Electronic): 26 April 2019

Publication date Collection: April 2019

Publication date PMC-release: 26 April 2019

Volume: 2

Issue: 4

Electronic Location Identifier: e192884

Affiliations

[1] Department of Management Science and Engineering, Stanford University School of Engineering, Stanford, California

[2] Department of Preoperative Services, Lucile Packard Children’s Hospital Stanford, Stanford, California

[3] Medical Student, Stanford University School of Medicine, Stanford, California

[4] Division of Cardiovascular Medicine, Stanford University School of Medicine, Stanford, California

Author notes

Article Information

Accepted for Publication: March 9, 2019.

Published: April 26, 2019. doi:10.1001/jamanetworkopen.2019.2884

Corresponding Author: Fatima Rodriguez, MD, MPH, Division of Cardiovascular Medicine, Stanford University, 870 Quarry Rd, Falk CVRC, Stanford, CA 94305-5406 (frodrigu@stanford.edu).

Author Contributions: Dr Scheinker had full access to all of the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.

Concept and design: Scheinker, Rodriguez.

Acquisition, analysis, or interpretation of data: All authors.

Drafting of the manuscript: All authors.

Critical revision of the manuscript for important intellectual content: All authors.

Statistical analysis: Scheinker, Valencia.

Obtained funding: Rodriguez.

Administrative, technical, or material support: Scheinker, Rodriguez.

Supervision: Scheinker, Rodriguez.

Conflict of Interest Disclosures: Dr Scheinker reported being an advisor to Carta Healthcare with equity. Dr Rodriguez reported receiving compensation from Novo Nordisk for event adjudication and stock from HealthPals outside the submitted work. No other disclosures were reported.

Funding/Support: Dr Rodriguez received funding from the McCormick Faculty Fellowship from Stanford University and career development award 1K01HL144607 from the National Heart, Lung, and Blood Institute.

Role of the Funder/Sponsor: The funders had no role in the design and conduct of the study; collection, management, analysis, and interpretation of the data; preparation, review, or approval of the manuscript; and decision to submit the manuscript for publication.

Article

Publisher ID: zoi190128

DOI: 10.1001/jamanetworkopen.2019.2884

PMC ID: 6487629

PubMed ID: 31026030

SO-VID: 944cb196-d6dc-4fe4-be76-3788342f8d03

License:

This is an open access article distributed under the terms of the CC-BY License.

History

Date received : 25 November 2018

Date: 8 March 2019

Date accepted : 9 March 2019
