Although the PHQ-9 is widely used in primary care, little is known about its performance in quantifying improvement. The original validation study of the PHQ-9 defined clinically significant change as a post-treatment score of ≤9 combined with an improvement of 50%, but it is unclear how this relates to other theoretically informed methods of defining a successful outcome. We compared a range of definitions of clinically significant change (the original, or standard, definition; an asymptomatic criterion; and reliable and clinically significant change criteria a, b and c) using data from a randomised controlled trial of collaborative care for depression, a community-level depression intervention. Levels of agreement were calculated between the standard definition, the other definitions, and a gold-standard diagnostic interview. The standard definition showed good agreement (kappa > 0.60) with the other definitions and moderate, though acceptable, agreement with the diagnostic interview (kappa = 0.58). The standard definition corresponded closely to reliable and clinically significant change criterion c, the recommended method of quantifying improvement when clinical and non-clinical distributions overlap. Because follow-up data were not available, an asymptomatic criterion was used rather than remission or recovery criteria. The close agreement between the standard definition and reliable and clinically significant change criterion c provides some support for the standard definition of improvement. However, it may be preferable to use a reliable change index rather than 50% improvement. Remission status, based on the asymptomatic range and a lower PHQ-9 score, may provide a useful additional category of clinical change. Copyright © 2010 Elsevier B.V. All rights reserved.
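As an illustration of the two calculations the abstract relies on, the following is a minimal Python sketch of the standard definition (post-treatment score of ≤9 combined with 50% improvement) and of Cohen's kappa for agreement between two binary classifications. The function names `standard_improvement` and `cohen_kappa` are hypothetical, "improvement of 50%" is assumed here to mean a reduction of at least 50% from baseline, and nothing below reproduces the study's own analysis code or data.

```python
from typing import Sequence


def standard_improvement(baseline: int, post: int) -> bool:
    """Standard PHQ-9 definition of clinically significant change.

    Assumption: "improvement of 50%" is read as a reduction of at least
    50% from the baseline score. A baseline of 0 cannot improve.
    """
    if baseline <= 0:
        return False
    # 2 * (baseline - post) >= baseline is the integer form of
    # (baseline - post) / baseline >= 0.5, which avoids float rounding.
    return post <= 9 and 2 * (baseline - post) >= baseline


def cohen_kappa(a: Sequence[bool], b: Sequence[bool]) -> float:
    """Cohen's kappa for two binary classifications of the same cases."""
    n = len(a)
    if n == 0 or n != len(b):
        raise ValueError("inputs must be non-empty and of equal length")
    # Observed proportion of cases on which the two classifications agree.
    observed = sum(x == y for x, y in zip(a, b)) / n
    # Agreement expected by chance, from each classification's base rate.
    p_a = sum(a) / n
    p_b = sum(b) / n
    expected = p_a * p_b + (1 - p_a) * (1 - p_b)
    if expected == 1.0:
        # Both classifications are constant and identical.
        return 1.0
    return (observed - expected) / (1 - expected)
```

With purely illustrative scores, a patient moving from 18 to 8 meets the standard definition, whereas one moving from 18 to 10 does not, because the post-treatment score remains above 9. Applying `standard_improvement` to each patient and passing the result to `cohen_kappa` alongside another definition's classifications gives an agreement statistic of the kind reported above.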