      Methods and dimensions of electronic health record data quality assessment: enabling reuse for clinical research




          Objective

          To review the methods and dimensions of data quality assessment in the context of electronic health record (EHR) data reuse for research.

          Materials and methods

          A review of the clinical research literature discussing data quality assessment methodology for EHR data was performed. Using an iterative process, the aspects of data quality being measured and the methods of assessment used were abstracted and categorized.


          Results

          Five dimensions of data quality were identified: completeness, correctness, concordance, plausibility, and currency. Seven broad categories of data quality assessment methods were also identified: comparison with gold standards, data element agreement, data source agreement, distribution comparison, validity checks, log review, and element presence.
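As a rough illustration of two of the seven method categories named above, the sketch below applies element presence (a proxy for completeness) and validity checks (a proxy for plausibility) to a few made-up records. The record layout, the field names, and the plausible ranges are assumptions chosen for this example; they are not taken from the reviewed studies.

```python
from datetime import date

# Hypothetical patient records; None marks an element that was never recorded.
records = [
    {"id": 1, "birth_date": date(1950, 3, 2), "systolic_bp": 128, "smoker": "no"},
    {"id": 2, "birth_date": date(1985, 7, 19), "systolic_bp": None, "smoker": "yes"},
    {"id": 3, "birth_date": date(2090, 1, 1), "systolic_bp": 640, "smoker": None},
]


def element_presence(rows, field):
    """Element presence: return (rows with `field` recorded, total rows)."""
    present = sum(1 for row in rows if row.get(field) is not None)
    return present, len(rows)


def validity_failures(rows, today):
    """Validity checks: return ids of rows holding implausible values.

    The ranges below are illustrative assumptions, not clinical guidance.
    """
    failed = []
    for row in rows:
        bp = row.get("systolic_bp")
        born = row.get("birth_date")
        bp_implausible = bp is not None and not (50 <= bp <= 300)
        born_in_future = born is not None and born > today
        if bp_implausible or born_in_future:
            failed.append(row["id"])
    return failed


if __name__ == "__main__":
    for field in ("birth_date", "systolic_bp", "smoker"):
        present, total = element_presence(records, field)
        print(f"{field}: {present}/{total} present")
    print("failed validity checks:", validity_failures(records, date(2013, 1, 1)))
```

Reporting presence as a count pair rather than a bare percentage keeps the numerator and denominator visible. Missing values are deliberately not counted as validity failures, so the two measures stay separate.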


          Discussion

          Examination of the methods by which clinical researchers have investigated the quality and suitability of EHR data for research shows that there are fundamental features of data quality, which may be difficult to measure, as well as proxy dimensions. Researchers interested in the reuse of EHR data for clinical research should consider adopting a consistent taxonomy of EHR data quality, remain aware of the task-dependence of data quality, integrate work on data quality assessment from other fields, and adopt systematic, empirically driven, statistically based methods of data quality assessment.


          Conclusion

          There is currently little consistency or potential generalizability in the methods used to assess EHR data quality. If the reuse of EHR data for clinical research is to become accepted, researchers should adopt validated, systematic methods of EHR data quality assessment.


          Most cited references (94)


          Validation of information recorded on general practitioner based computerised data resource in the United Kingdom.

           L E Derby,  S Jick,  H Jick (1991)
          To determine the extent of agreement between clinical information recorded on surgery computers of selected general practitioners and similar information in manual records of letters received from hospital consultants and kept in the general practitioners' files. Hospital consultants' letters in the manual records of selected general practitioners were photocopied and the consultants' clinical diagnoses were compared with diagnoses recorded on computer. General practices in the United Kingdom using computers provided by VAMP Health for recording clinical information. 2491 patients who received one of three non-steroidal anti-inflammatory drugs and who attended 58 practices whose computer recorded data were considered after a preliminary review to be of satisfactory quality. Among 1191 patients for whom consultants' letters were forwarded a clinical diagnosis reflecting the diagnosis noted on a consultant letter was present on the computer record for 1038 (87%). Clinical information available on the computer records of the general practitioners who participated in this study is satisfactory for many clinical studies.

            Review: electronic health records and the reliability and validity of quality measures: a review of the literature.

            Previous reviews of research on electronic health record (EHR) data quality have not focused on the needs of quality measurement. The authors reviewed empirical studies of EHR data quality, published from January 2004, with an emphasis on data attributes relevant to quality measurement. Many of the 35 studies reviewed examined multiple aspects of data quality. Sixty-six percent evaluated data accuracy, 57% data completeness, and 23% data comparability. The diversity in data element, study setting, population, health condition, and EHR system studied within this body of literature made drawing specific conclusions regarding EHR data quality challenging. Future research should focus on the quality of data from specific EHR components and important data attributes for quality measurement such as granularity, timeliness, and comparability. Finally, factors associated with poor or variable data quality need to be better understood and effective interventions developed.

              Systematic review of scope and quality of electronic patient record data in primary care.

              To systematically review measures of data quality in electronic patient records (EPRs) in primary care. Systematic review of English language publications, 1980-2001. Bibliographic searches of medical databases, specialist medical informatics databases, conference proceedings, and institutional contacts. Studies selected according to a predefined framework for categorising review papers. Reference standards and measurements used to judge quality. Bibliographic searches identified 4589 publications. After primary exclusions 174 articles were classified, 52 of which met the inclusion criteria for review. Selected studies were primarily descriptive surveys. Variability in methods prevented meta-analysis of results. Forty eight publications were concerned with diagnostic data, 37 studies measured data quality, and 15 scoped EPR quality. Reliability of data was assessed with rate comparison. Measures of sensitivity were highly dependent on the element of EPR data being investigated, while the positive predictive value was consistently high, indicating good validity. Prescribing data were generally of better quality than diagnostic or lifestyle data. The lack of standardised methods for assessment of quality of data in electronic patient records makes it difficult to compare results between studies. Studies should present data quality measures with clear numerators, denominators, and confidence intervals. Ambiguous terms such as "accuracy" should be avoided unless precisely defined.
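The reporting recommendation in the abstract above (clear numerators, denominators, and confidence intervals) can be sketched as follows. This is only an illustration under stated assumptions: the counts are invented, the function names are hypothetical, and the Wilson score interval is one reasonable choice of interval rather than the method used by the cited review.

```python
import math


def wilson_interval(numerator, denominator, z=1.96):
    """Wilson score interval for a proportion (z=1.96 gives roughly 95%)."""
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    p = numerator / denominator
    z2 = z * z
    scale = 1 + z2 / denominator
    centre = (p + z2 / (2 * denominator)) / scale
    half_width = (
        z
        * math.sqrt(p * (1 - p) / denominator + z2 / (4 * denominator**2))
        / scale
    )
    return centre - half_width, centre + half_width


def sensitivity_and_ppv(true_pos, false_pos, false_neg):
    """Report sensitivity and positive predictive value with explicit counts.

    true_pos:  condition in the reference standard and in the EHR
    false_pos: condition in the EHR but not in the reference standard
    false_neg: condition in the reference standard but missing from the EHR
    """
    measures = {
        "sensitivity": (true_pos, true_pos + false_neg),
        "ppv": (true_pos, true_pos + false_pos),
    }
    report = {}
    for name, (num, den) in measures.items():
        low, high = wilson_interval(num, den)
        report[name] = {
            "numerator": num,
            "denominator": den,
            "estimate": num / den,
            "ci": (low, high),
        }
    return report


if __name__ == "__main__":
    # Hypothetical counts, invented for this example.
    for name, m in sensitivity_and_ppv(true_pos=80, false_pos=5, false_neg=20).items():
        low, high = m["ci"]
        print(f"{name}: {m['numerator']}/{m['denominator']} "
              f"= {m['estimate']:.3f} (CI {low:.3f} to {high:.3f})")
```

Carrying the numerator and denominator alongside each estimate lets readers recompute the figures and compare results across studies, which is the concern the abstract raises.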

                Author and article information

                J Am Med Inform Assoc
                Journal of the American Medical Informatics Association : JAMIA
                BMJ Group (BMA House, Tavistock Square, London, WC1H 9JR )
                Jan-Feb 2013
                1 January 2013
                Volume: 20
                Issue: 1
                Pages: 144-151
                Department of Biomedical Informatics, Columbia University, New York, New York, USA
                Author notes
                Correspondence to: Nicole Gray Weiskopf, Department of Biomedical Informatics, Columbia University, 622 W 168th Street, VC-5, New York, NY 10032, USA; nicole.weiskopf@
                Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to

                This is an open-access article distributed under the terms of the Creative Commons Attribution Non-commercial License, which permits use, distribution, and reproduction in any medium, provided the original work is properly cited, the use is non commercial and is otherwise in compliance with the license. See: and

                Focus on Data Sharing