      Is Open Access

      Data-driven discovery of changes in clinical code usage over time: a case-study on changes in cardiovascular disease recording in two English electronic health records databases (2001–2015)

      research-article


          Abstract

          Objectives

          To demonstrate how data-driven variability methods can be used to identify changes in disease recording in two English electronic health records databases between 2001 and 2015.

          Design

          Repeated cross-sectional analysis that applied data-driven temporal variability methods to assess month-by-month changes in routinely collected medical data. A measure of difference between months was calculated based on joint distributions of age, gender, socioeconomic status and recorded cardiovascular diseases. Distances between months were used to identify temporal trends in data recording.
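A minimal sketch of the kind of month-to-month comparison described in the Design, under stated assumptions. The abstract does not name the distance measure, so the Jensen-Shannon distance is used here only as one common choice for comparing probability distributions. The record tuples, month keys and function names are hypothetical and are not taken from the study.

```python
import math
from collections import Counter
from itertools import combinations


def monthly_distribution(records, categories):
    """Convert one month's records into a probability vector over fixed categories."""
    counts = Counter(records)
    total = sum(counts[c] for c in categories)
    if total == 0:
        raise ValueError("month has no records in the given categories")
    return [counts[c] / total for c in categories]


def jensen_shannon_distance(p, q):
    """Square root of the Jensen-Shannon divergence (log base 2); lies in [0, 1]."""
    def kl(a, b):
        # Terms with a zero numerator contribute nothing to the divergence.
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)

    m = [(x + y) / 2 for x, y in zip(p, q)]
    divergence = 0.5 * kl(p, m) + 0.5 * kl(q, m)
    return math.sqrt(max(0.0, divergence))  # guard against tiny negative rounding


def pairwise_distances(records_by_month):
    """Distance between every pair of months, keyed by (month_a, month_b)."""
    categories = sorted({r for recs in records_by_month.values() for r in recs})
    dists = {
        month: monthly_distribution(recs, categories)
        for month, recs in records_by_month.items()
    }
    return {
        (a, b): jensen_shannon_distance(dists[a], dists[b])
        for a, b in combinations(sorted(dists), 2)
    }


# Hypothetical records: (age band, gender, recorded disease) per admission.
example = {
    "2010-02": [("60-69", "F", "angina"), ("70-79", "M", "MI"), ("70-79", "F", "stroke")],
    "2010-03": [("60-69", "F", "angina"), ("70-79", "M", "MI"), ("70-79", "F", "stroke")],
    "2010-04": [("60-69", "F", "chronic CHD"), ("70-79", "M", "MI"), ("70-79", "M", "MI")],
}
distances = pairwise_distances(example)
```

In this toy data the first two months have identical joint distributions, so their distance is zero, while the third month differs from both. A real analysis would add socioeconomic status to each tuple and use many more records per month.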

          Setting

          400 English primary care practices from the Clinical Practice Research Datalink (CPRD GOLD) and 451 hospital providers from the Hospital Episode Statistics (HES).

          Main outcomes

          The proportion of patients (CPRD GOLD) and hospital admissions (HES) with a recorded cardiovascular disease (CPRD GOLD: coronary heart disease, heart failure, peripheral arterial disease, stroke; HES: International Classification of Diseases codes I20-I69/G45).

          Results

          Both databases showed gradual changes in cardiovascular disease recording between 2001 and 2008. The recorded prevalence of included cardiovascular diseases in CPRD GOLD increased by 47%–62%, which partially reversed after 2008. For hospital records in HES, there was a relative decrease in angina pectoris (−34.4%) and unspecified stroke (−42.3%) over the same time period, with a concomitant increase in chronic coronary heart disease (+14.3%). Multiple abrupt changes in the use of myocardial infarction codes in hospital were found in March/April 2010, 2012 and 2014, possibly linked to updates of clinical coding guidelines.
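The two kinds of change reported in the Results, relative changes over a period and abrupt changes between adjacent months, can be illustrated with a short sketch. The thresholding rule (median plus a multiple of the median absolute deviation) is an assumption chosen for illustration and is not described in the abstract; the input values and function names are hypothetical.

```python
from statistics import median


def relative_change(before, after):
    """Relative change from `before` to `after`, in percent."""
    if before == 0:
        raise ValueError("relative change is undefined for a zero baseline")
    return (after - before) / before * 100.0


def flag_abrupt_changes(step_distances, k=5.0):
    """Indices of consecutive-month distances above median + k * MAD.

    `step_distances[i]` is the distance between month i and month i + 1.
    The rule and the default k are illustrative assumptions.
    """
    med = median(step_distances)
    mad = median([abs(d - med) for d in step_distances])
    threshold = med + k * mad
    return [i for i, d in enumerate(step_distances) if d > threshold]


# Hypothetical consecutive-month distances with one abrupt step at index 3.
steps = [0.020, 0.025, 0.030, 0.250, 0.022, 0.028, 0.024]
flagged = flag_abrupt_changes(steps)
```

A flagged index marks a candidate change point only; whether it reflects a coding guideline update or a real change in disease burden still has to be checked against external information.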

          Conclusions

          Identified temporal variability could be related to potentially non-medical causes such as updated coding guidelines. These artificial changes may introduce temporal correlation among diagnoses inferred from routine data, violating the assumptions of frequently used statistical methods. Temporal variability measures provide an objective and robust technique to identify, and subsequently account for, those changes in electronic health records studies without any prior knowledge of the data collection process.


                Author and article information

                Journal
                BMJ Open
                BMJ Publishing Group (BMA House, Tavistock Square, London, WC1H 9JR )
                2044-6055
                2020
                13 February 2020
                Volume: 10
                Issue: 2
                Article number: e034396
                Affiliations
                [1] Institute of Health Informatics, University College London, London, UK
                [2] Health Data Research UK, London, UK
                [3] Instituto de Aplicaciones de las Tecnologías de la Información y de las Comunicaciones Avanzadas (ITACA), Universitat Politècnica de València, Valencia, Spain
                Author notes
                Correspondence to: Patrick Rockenschaub; patrick.rockenschaub.15@ 123456ucl.ac.uk
                Author information
                http://orcid.org/0000-0002-6499-7933
                http://orcid.org/0000-0003-0542-0816
                http://orcid.org/0000-0003-2817-0178
                Article
                bmjopen-2019-034396
                DOI: 10.1136/bmjopen-2019-034396
                PMCID: 7045100
                PMID: 32060159
                c26907b5-e807-4adc-abab-1c8764443efe
                © Author(s) (or their employer(s)) 2020. Re-use permitted under CC BY. Published by BMJ.

                This is an open access article distributed in accordance with the Creative Commons Attribution 4.0 Unported (CC BY 4.0) license, which permits others to copy, redistribute, remix, transform and build upon this work for any purpose, provided the original work is properly cited, a link to the licence is given, and indication of whether changes were made. See: https://creativecommons.org/licenses/by/4.0/.

                History
                18 September 2019
                08 January 2020
                17 January 2020
                Funding
                Funded by: Economic and Social Research Council (FundRef http://dx.doi.org/10.13039/501100000269)
                Award ID: ES/P008321/1
                Funded by: Spanish National Plan for Scientific and Technical Research and Innovation
                Award ID: DPI2016-80054-R
                Funded by: Wellcome Trust (FundRef http://dx.doi.org/10.13039/100004440)
                Award ID: 206602/Z/17/Z
                Funded by: H2020 Health (FundRef http://dx.doi.org/10.13039/100010677)
                Award IDs: 727560, 825750
                Categories
                Health Informatics
                Original Research

                Medicine
                Keywords: electronic health records, data quality, clinical coding, cardiovascular disease, statistics & research methods
