
      A comparison of Cohen’s Kappa and Gwet’s AC1 when calculating inter-rater reliability coefficients: a study conducted with personality disorder samples

      research-article


          Abstract

          Background

          Rater agreement is important in clinical research, and Cohen’s Kappa is a widely used method for assessing inter-rater reliability; however, there are well-documented statistical problems associated with the measure. To assess its utility, we evaluated it against Gwet’s AC1 and compared the results.

          Methods

          This study was carried out across 67 patients (56% male) aged 18 to 67 years, with a mean ± SD age of 44.13 ± 12.68 years. Nine raters (seven psychiatrists, a psychiatry resident, and a social worker) participated as interviewers, conducting either the first or the second interviews, which were held 4 to 6 weeks apart. The interviews were held in order to establish a personality disorder (PD) diagnosis using DSM-IV criteria. Cohen’s Kappa and Gwet’s AC1 were used, and the level of agreement between raters was assessed in terms of a simple categorical diagnosis (i.e., the presence or absence of a disorder). The data were also compared with a previous analysis in order to evaluate the effects of trait prevalence.

          Results

          Gwet’s AC1 was shown to have higher inter-rater reliability coefficients for all the PD criteria, ranging from 0.752 to 1.000, whereas Cohen’s Kappa ranged from 0 to 1.000. Cohen’s Kappa values were high and close to the percentage of agreement when the prevalence was high, whereas Gwet’s AC1 values appeared not to change much with a change in prevalence and remained close to the percentage of agreement. For example, a Schizoid sample yielded a mean Cohen’s Kappa of 0.726 and a Gwet’s AC1 of 0.853, values that fall into different levels of agreement according to the criteria developed by Landis and Koch, and by Altman and Fleiss.

          Conclusions

          Based on the different formulae used to calculate the level of chance-corrected agreement, Gwet’s AC1 was shown to provide a more stable inter-rater reliability coefficient than Cohen’s Kappa. It was also found to be less affected by prevalence and marginal probability than Cohen’s Kappa, and should therefore be considered for use in inter-rater reliability analysis.
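
          To make the comparison concrete, the sketch below (a minimal illustration, not taken from the article; the patient counts are invented) computes both chance-corrected coefficients for two raters assigning a binary present/absent diagnosis. Cohen’s Kappa takes its chance-agreement term from the product of the two raters’ marginal proportions, whereas Gwet’s AC1 uses the average propensity of a “present” rating; with a rare diagnosis the Kappa chance term inflates and pulls Kappa well below the percentage of agreement, while AC1 stays close to it.

          def chance_corrected(ratings):
              """ratings: list of (rater_a, rater_b) pairs, each 0 (absent) or 1 (present)."""
              n = len(ratings)
              p_o = sum(a == b for a, b in ratings) / n   # observed proportion of agreement
              p_a1 = sum(a for a, _ in ratings) / n       # rater A's marginal proportion of "present"
              p_b1 = sum(b for _, b in ratings) / n       # rater B's marginal proportion of "present"

              # Cohen's Kappa: chance agreement from the product of the raters' marginals.
              p_e_kappa = p_a1 * p_b1 + (1 - p_a1) * (1 - p_b1)
              kappa = (p_o - p_e_kappa) / (1 - p_e_kappa)

              # Gwet's AC1 (two categories): chance agreement 2*pi*(1-pi), where pi is the
              # average of the two raters' marginal proportions of "present".
              pi = (p_a1 + p_b1) / 2
              p_e_ac1 = 2 * pi * (1 - pi)
              ac1 = (p_o - p_e_ac1) / (1 - p_e_ac1)

              return round(p_o, 3), round(kappa, 3), round(ac1, 3)

          # Rare diagnosis: both raters say "absent" for 18 of 20 patients, agree that one
          # patient has the disorder, and disagree on the remaining patient.
          ratings = [(0, 0)] * 18 + [(1, 1)] + [(1, 0)]
          print(chance_corrected(ratings))   # agreement 0.95, Kappa ~0.64, AC1 ~0.94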


          Most cited references (16)


          Inter-rater reliability of the Structured Clinical Interview for DSM-IV Axis I Disorders (SCID I) and Axis II Disorders (SCID II).

          This study simultaneously assessed the inter-rater reliability of the Structured Clinical Interview for the Diagnostic and Statistical Manual of Mental Disorders Axis I (SCID I) and Axis II disorders (SCID II) in a mixed sample of n = 151 inpatients and outpatients, and non-patient controls. Audiotaped interviews were assessed by independent second raters blind to the first raters' scores and diagnoses. Categorical inter-rater reliability was assessed for 12 Axis I disorders of the SCID I, while both categorical and dimensional inter-rater reliability was tested for all Axis II disorders. Results revealed moderate to excellent inter-rater agreement for the Axis I disorders, while most categorically and dimensionally measured personality disorders showed excellent inter-rater agreement. Copyright © 2010 John Wiley & Sons, Ltd.

            High agreement but low kappa: II. Resolving the paradoxes.

            An omnibus index offers a single summary expression for a fourfold table of binary concordance among two observers. Among the available other omnibus indexes, none offers a satisfactory solution for the paradoxes that occur with p0 and kappa. The problem can be avoided only by using ppos and pneg as two separate indexes of proportionate agreement in the observers' positive and negative decisions. These two indexes, which are analogous to sensitivity and specificity for concordance in a diagnostic marker test, create the paradoxes formed when the chance correction in kappa is calculated as a product of the increment in the two indexes and the increment in marginal totals. If only a single omnibus index is used to compare different performances in observer variability, the paradoxes of kappa are desirable since they appropriately "penalize" inequalities in ppos and pneg. For better understanding of results and for planning improvements in the observers' performance, however, the omnibus value of kappa should always be accompanied by separate individual values of ppos and pneg.
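
            The paradox described above can be reproduced with a small invented fourfold table (a numeric sketch, not data from the cited paper): two observers agree on 85 of 100 cases, yet because nearly all of the agreements are negative decisions, kappa is modest, and ppos and pneg show exactly where the disagreement lies.

            # 2x2 table of two observers' positive/negative decisions (invented counts):
            # a = both positive, b and c = the two kinds of disagreement, d = both negative.
            a, b, c, d = 5, 5, 10, 80
            n = a + b + c + d

            p_o = (a + d) / n                                      # observed agreement: 0.85
            p_e = ((a + b) * (a + c) + (c + d) * (b + d)) / n**2   # kappa's chance term from the marginals
            kappa = (p_o - p_e) / (1 - p_e)                        # ~0.32 despite 85% raw agreement

            p_pos = 2 * a / (2 * a + b + c)                        # agreement on positive decisions: 0.40
            p_neg = 2 * d / (2 * d + b + c)                        # agreement on negative decisions: ~0.91

            print(round(kappa, 3), round(p_pos, 3), round(p_neg, 3))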

              The Kappa Statistic: A Second Look


                Author and article information

                Journal
                BMC Med Res Methodol (BMC Medical Research Methodology)
                Publisher: BioMed Central
                ISSN: 1471-2288
                Publication date: 29 April 2013
                Volume: 13, Article: 61
                Affiliations
                [1] Department of Psychiatry, Faculty of Medicine, Chiang Mai University, Chiang Mai 50200, Thailand
                [2] California School of Professional Psychology, Alliant International University, San Francisco, California, USA
                [3] Statistical Consultant, Advanced Analytics, LLC, PO Box 2696, Gaithersburg, Maryland, USA
                Article
                Publisher ID: 1471-2288-13-61
                DOI: 10.1186/1471-2288-13-61
                PMC ID: 3643869
                PMID: 23627889
                7a27e84b-551c-424b-912b-f4a3d83d6933
                Copyright © 2013 Wongpakaran et al.; licensee BioMed Central Ltd.

                This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

                History
                Received: 31 August 2012
                Accepted: 26 April 2013
                Categories
                Research Article

                Medicine
                inter-rater reliability, coefficients, Cohen’s Kappa, Gwet’s AC1, personality disorders
