The 12-item General Health Questionnaire (GHQ-12): translation and validation study of the Iranian version

      The objective of this study was to translate and to test the reliability and validity of the 12-item General Health Questionnaire (GHQ-12) in Iran.


      Using a standard 'forward-backward' translation procedure, the English-language version of the questionnaire was translated into Persian (the Iranian language). A sample of young people aged 18 to 25 years then completed the questionnaire. In addition, a short questionnaire containing demographic questions and a single measure of global quality of life was administered. To test reliability, internal consistency was assessed using Cronbach's alpha coefficient. Validity was assessed through convergent validity. Finally, the factor structure of the questionnaire was extracted by principal component analysis with an oblique factor solution.
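      As an illustration of the internal-consistency check named above, the following is a minimal sketch of Cronbach's alpha using the standard formula alpha = k/(k-1) * (1 - sum of item variances / variance of total scores). The function name `cronbach_alpha` and the toy data are assumptions made for this example; they are not taken from the study.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of k item scores.

    Illustrative sketch only; the name and input layout are assumptions.
    """
    if len(responses) < 2:
        raise ValueError("need at least two respondents")
    k = len(responses[0])
    if k < 2 or any(len(row) != k for row in responses):
        raise ValueError("each respondent needs the same number (>= 2) of items")

    # Sample variance of each item (column) across respondents.
    item_variance_sum = sum(variance(column) for column in zip(*responses))
    # Sample variance of each respondent's total score.
    total_variance = variance([sum(row) for row in responses])
    if total_variance == 0:
        raise ValueError("total scores have zero variance")

    return k / (k - 1) * (1 - item_variance_sum / total_variance)


# Toy data (hypothetical): four respondents, three items.
toy = [[0, 1, 1], [1, 1, 2], [2, 2, 3], [3, 3, 3]]
print(round(cronbach_alpha(toy), 3))
```

      Values closer to 1 indicate that the items vary together, which is the sense in which alpha is read as internal consistency.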


      In all, 748 young people entered the study. The mean age of respondents was 21.1 (SD = 2.1) years. Using the recommended method of scoring (range 0 to 12), the mean GHQ score was 3.7 (SD = 3.5). Reliability analysis showed a satisfactory result (Cronbach's alpha coefficient = 0.87). As expected, convergent validity analysis indicated a significant negative correlation between the GHQ-12 and global quality of life scores (r = -0.56, P < 0.0001). Principal component analysis with an oblique rotation solution showed that the GHQ-12 was a measure of psychological morbidity with a two-factor structure, the two factors jointly accounting for 51% of the variance.
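      The 0 to 12 score range reported above corresponds to the bimodal GHQ scoring method (0-0-1-1), in which each of the 12 items counts as 1 only when the respondent picks one of the two more symptomatic response options. The sketch below assumes responses are coded 0 to 3 with higher values meaning more distress; that coding and the function name `ghq12_bimodal_score` are assumptions made for illustration, not details given in the abstract.

```python
def ghq12_bimodal_score(responses):
    """Score one GHQ-12 respondent with the bimodal (0-0-1-1) method.

    Assumes each response is coded 0..3, higher meaning more distress
    (a hypothetical coding chosen for this sketch).
    """
    if len(responses) != 12:
        raise ValueError("GHQ-12 requires exactly 12 item responses")
    if any(r not in (0, 1, 2, 3) for r in responses):
        raise ValueError("responses must be coded 0, 1, 2 or 3")
    # Options 0 and 1 score 0; options 2 and 3 score 1.
    return sum(1 for r in responses if r >= 2)


# Hypothetical respondent: three items at each response level.
print(ghq12_bimodal_score([0, 1, 2, 3] * 3))
```

      Summing the recoded items gives a total between 0 and 12, matching the range quoted in the results.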


      The study findings showed that the Iranian version of the GHQ-12 has good structural characteristics and is a reliable and valid instrument that can be used for measuring psychological well-being in Iran.

      Most cited references (26)

      The validity of two versions of the GHQ in the WHO study of mental illness in general health care.

      In recent years the 12-item General Health Questionnaire (GHQ-12) has been extensively used as a short screening instrument, producing results that are comparable to longer versions of the GHQ. The validity of the GHQ-12 was compared with the GHQ-28 in a World Health Organization study of psychological disorders in general health care. Results are presented for 5438 patients interviewed in 15 centres using the primary care version of the Composite International Diagnostic Instrument, or CIDI-PC. Results were uniformly good, with an average area under the ROC curve of 88 (range 83 to 95). Minor variations in the criteria used for defining a case made little difference to the validity of the GHQ, and complex scoring methods offered no advantages over simpler ones. The GHQ was translated into 10 other languages for the purposes of this study, and validity coefficients were almost as high as in the original language. There was no tendency for the GHQ to work less efficiently in developing countries. Finally, gender, age and educational level are shown to have no significant effect on the validity of the GHQ. If investigators wish to use a screening instrument as a case detector, the shorter GHQ is remarkably robust and works as well as the longer instrument. The latter should only be preferred if there is an interest in the scaled scores provided in addition to the total score.

      Why GHQ threshold varies from one place to another.

      No convincing explanation has been forthcoming for the variation in the best threshold to adopt for the GHQ in different settings. Data dealing with the GHQ and the CIDI in 15 cities from a recent WHO study were subjected to further analysis. The mean number of CIDI symptoms for those with single diagnoses, or those with multiple diagnoses, does not vary between cities. However, the best threshold is found to be related to the prevalence both of single and of multiple diagnoses in a centre. Variations in the diagnoses to be included in the 'gold standard' did not account for the variation observed. There was a strong relationship between area under the ROC curve (as a measure of the discriminatory power of the GHQ) and the best threshold, with higher thresholds being associated with superior performance of the GHQ. The items on the GHQ-12 that provided most discrimination between cases and non-cases varied from one centre to another. The GHQ threshold is partly determined by the prevalence of multiple diagnoses, with higher thresholds being associated with higher rates of both single and multiple diagnoses. The mean GHQ score for the whole population of respondents provides a rough guide to the best threshold. In those centres where the discriminatory power of the GHQ is lowest, it is necessary to use a low threshold as a way of ensuring that sensitivity is protected, but the positive predictive value of the GHQ is then lower. Some of the variation between centres is due to variation in the discriminatory power of different items.

      The stability of the factor structure of the General Health Questionnaire.

      Different versions of the General Health Questionnaire (GHQ), including the GHQ-12 and GHQ-28, have been subjected to factor analysis in a variety of countries. The World Health Organization study of psychological disorders in general health care offered the opportunity to investigate the factor structure of both GHQ versions in 15 different centres. The factor structures of the GHQ-12 and GHQ-28 extracted by principal component analysis were compared in participating centres. The GHQ-12 was completed by 26,120 patients and 5,273 patients completed the GHQ-28. The factor structure of the GHQ-28 found in Manchester in this study was compared with that found in the earlier study in 1979. For the GHQ-12, substantial factor variation between centres was found. After rotation, two factors expressing depression and social dysfunction could be identified. For the GHQ-28, factor variance was less. In general, the original C (social dysfunction) and D (depression) scales of the GHQ-28 were more stable than the A (somatic symptoms) and B (anxiety) scales. Multiple cross-loadings occurred in both versions of the GHQ, suggesting correlation of the extracted factors. In Manchester, the factor structure of the GHQ had changed since its development. Validity as a case detector was not affected by factor variance. These findings confirm that despite factor variation for the GHQ-12, two domains, depression and social dysfunction, appear across the 15 centres. In the scaled GHQ-28, two of the scales were remarkably robust between the centres. The cross-correlation between the other two subscales probably reflects the strength of the relationship between anxiety and somatic symptoms existing in different locations.

            Author and article information

            [1] Iranian Institute for Health Sciences Research, Tehran, Iran
            Health Qual Life Outcomes
            Health and Quality of Life Outcomes
            BioMed Central (London)
            13 November 2003
            Volume: 1
            Article: 66
            Copyright © 2003 Montazeri et al; licensee BioMed Central Ltd. This is an Open Access article: verbatim copying and redistribution of this article are permitted in all media for any purpose, provided this notice is preserved along with the article's original URL.

