
Statistical Methods Used to Test for Agreement of Medical Instruments Measuring Continuous Variables in Method Comparison Studies: A Systematic Review


      Abstract

      Background

      Accurate measurements are essential in medicine. An important criterion for judging the quality of a medical instrument is its agreement with a gold standard. Various statistical methods have been used to test for agreement, and some of them have been shown to be inappropriate, which can lead to misleading conclusions about an instrument's validity. Judging by the many citations of the article that proposed it, the Bland-Altman method is the most popular. However, a high citation count does not necessarily mean that the method has actually been applied in agreement research, and no previous study has examined this. This is the first systematic review to identify the statistical methods used to test for agreement of medical instruments. The proportions of the statistical methods found in this review also reflect the proportions of medical instruments in current clinical practice that have been validated using those methods.

      Methodology/Findings

      Five electronic databases were searched between 2007 and 2009 to look for agreement studies. A total of 3,260 titles were initially identified. Of these, 412 were potentially relevant and 210 met the inclusion criteria. The Bland-Altman method was the most popular, used in 178 studies (85%), followed by the correlation coefficient (27%) and comparison of means (18%). Some of the inappropriate methods highlighted by Altman and Bland since the 1980s are still in use.
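The reason a correlation coefficient is a poor test of agreement can be illustrated with a minimal Python sketch. The readings are invented for illustration and are not data from the review, and `pearson_r` is a hypothetical helper written here with the standard library only. The sketch assumes a device that reads exactly 10 units higher than the reference on every subject: the correlation is perfect, yet the two methods never agree.

```python
import statistics

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

# Hypothetical readings (assumption, not from the article).
reference = [98.0, 105.0, 112.0, 120.0, 131.0, 140.0]
device = [value + 10.0 for value in reference]  # constant bias of +10 units

r = pearson_r(reference, device)
mean_difference = statistics.fmean([d - v for d, v in zip(device, reference)])

print(f"correlation r   = {r:.3f}")                # 1.000
print(f"mean difference = {mean_difference:.1f}")  # 10.0
```

A correlation near 1 here says only that the two sets of readings rise and fall together. It says nothing about whether one instrument could replace the other.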

      Conclusions

      This study finds that the Bland-Altman method is the most popular method used in agreement research. There are still inappropriate applications of statistical methods in some studies. It is important for a clinician or medical researcher to be aware of this issue because misleading conclusions from inappropriate analyses will jeopardize the quality of the evidence, which in turn will influence quality of care given to patients in the future.


      Most cited references 55


      Guidelines for Reporting Reliability and Agreement Studies (GRRAS) were proposed.

      Results of reliability and agreement studies are intended to provide information about the amount of error inherent in any diagnosis, score, or measurement. The level of reliability and agreement among users of scales, instruments, or classifications is widely unknown. Therefore, there is a need for rigorously conducted interrater and intrarater reliability and agreement studies. Information about sample selection, study design, and statistical analysis is often incomplete. Because of inadequate reporting, interpretation and synthesis of study results are often difficult. Widely accepted criteria, standards, or guidelines for reporting reliability and agreement in the health care and medical field are lacking. The objective was to develop guidelines for reporting reliability and agreement studies. Eight experts in reliability and agreement investigation developed guidelines for reporting. Fifteen issues that should be addressed when reliability and agreement are reported are proposed. The issues correspond to the headings usually used in publications. The proposed guidelines intend to improve the quality of reporting. Copyright © 2011 Elsevier Inc. All rights reserved.

        Validation and reproducibility of food frequency questionnaire for Korean genome epidemiologic study.

        To evaluate the validity and reliability of the food-frequency questionnaire (FFQ) developed for the Korean Genome Epidemiologic Study (KoGES). The FFQ was administered twice at a 1-year interval (the first FFQ (FFQ1) at the beginning and the second FFQ (FFQ2) at the end of the study), and diet records (DRs) were collected for 3 days during each of the four seasons from December 2002 to May 2004 from those who attended the health examination center. At the end of the study period, we had collected the 12-day DRs of 124 participants. The nutrient intakes from the DRs were compared with both FFQ1 and FFQ2. The intakes of energy and some nutrients estimated from FFQ1 and FFQ2 differed from those assessed by the DRs. In particular, the consumption of carbohydrates was higher in FFQ1 and FFQ2 than in the DRs. The de-attenuated correlation coefficients, adjusted for age, sex and energy intake, between FFQ2 and the 12-day DRs in this Korean population ranged from 0.23 (vitamin A) to 0.64 (carbohydrate), with a median of 0.39 across all nutrients. The correlations were similar when we compared nutrient densities from both methods. In the joint classification by quartile of calorie-adjusted nutrient intakes assessed by FFQ2 and the 12-day DRs, exact concordance ranged from 25.8% (vitamin A) to 39.5% (carbohydrate, iron). Except for vitamin A, the proportion of subjects classified into distant quartiles was less than 7% for all nutrients. The median correlation between the two FFQs administered 1 year apart was 0.45 for all nutrient intakes and 0.39 for nutrient densities. We conclude that the FFQ we have developed appears to be an acceptable tool for assessing nutrient intakes in this population. Further studies are needed to calibrate the FFQ collected from the multiple centers participating in the KoGES. This study was supported by the budget of the National Genome Research Institute, Korea National Institute of Health (2002-347-6111-221).

          A note on the use of the intraclass correlation coefficient in the evaluation of agreement between two methods of measurement.

          The intraclass correlation coefficient (rI) has been advocated as a statistic for assessing agreement or consistency between two methods of measurement, in conjunction with a significance test of the difference between means obtained by the two methods. We show that neither technique is appropriate for assessing the interchangeability of measurement methods. We describe an alternative approach based on estimation of the mean and standard deviation of differences between measurements by the two methods.
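The approach described above, based on the mean and standard deviation of the between-method differences, can be sketched in a few lines of Python. This is a minimal illustration and not the authors' code: the function name `limits_of_agreement`, the sample readings, and the multiplier of 1.96 (the conventional choice for 95% limits, assuming the differences are roughly normally distributed) are all assumptions introduced here.

```python
import statistics

def limits_of_agreement(method_a, method_b, z=1.96):
    """Return (bias, lower limit, upper limit) for paired measurements.

    bias is the mean of the paired differences (a - b); the limits are
    bias +/- z times the sample standard deviation of those differences.
    z = 1.96 is an assumed, conventional multiplier for 95% limits.
    """
    if len(method_a) != len(method_b):
        raise ValueError("both methods need one reading per subject")
    differences = [a - b for a, b in zip(method_a, method_b)]
    bias = statistics.fmean(differences)
    sd = statistics.stdev(differences)  # sample SD (n - 1 denominator)
    return bias, bias - z * sd, bias + z * sd

# Hypothetical paired readings from two methods on four subjects.
method_a = [10.0, 12.0, 14.0, 16.0]
method_b = [9.0, 12.0, 13.0, 16.0]

bias, lower, upper = limits_of_agreement(method_a, method_b)
print(f"bias  = {bias:.2f}")   # 0.50
print(f"lower = {lower:.2f}")  # -0.63
print(f"upper = {upper:.2f}")  # 1.63
```

Whether limits of this width are acceptable is a clinical judgment about how large a difference between methods matters. The calculation itself does not answer that question.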

            Author and article information

            Affiliations
            [1] Julius Centre University of Malaya, Department of Social and Preventive Medicine, Faculty of Medicine, University of Malaya, Kuala Lumpur, Malaysia
            [2] Department of Applied Statistics, Faculty of Economics and Administration, University of Malaya, Kuala Lumpur, Malaysia
            University of East Piedmont, Italy
            Author notes

            Conceived and designed the experiments: RZ AB RI NAI. Performed the experiments: RZ AB RI NAI. Analyzed the data: RZ AB RI NAI. Contributed reagents/materials/analysis tools: RZ AB RI NAI. Wrote the paper: RZ AB RI NAI.

            Contributors
            Role: Editor
            Journal
            PLoS ONE
            Public Library of Science (San Francisco, USA)
            ISSN: 1932-6203
            Published: 25 May 2012
            Volume: 7
            Issue: 5
            PMCID: 3360667
            PMID: 22662248
            Manuscript ID: PONE-D-12-05103
            DOI: 10.1371/journal.pone.0037908
            (Editor)
            Zaki et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
            Counts
            Pages: 7
            Categories
            Research Article
            Mathematics
            Statistics
            Biostatistics
            Statistical Methods
            Medicine
            Clinical Research Design
            Statistical Methods
            Systematic Reviews
            Diagnostic Medicine
            Test Evaluation
            Drugs and Devices
            Medical Devices
            Epidemiology
            Clinical Epidemiology
            Epidemiological Methods
            Non-Clinical Medicine
            Evidence-Based Medicine
            Public Health
            Preventive Medicine
