Although the severity of haemophilic arthropathy is commonly assessed using established radiographic scoring systems, little information is available about their inter- and intra-observer reliability. The purpose of the present study was to establish the inter-observer reliability (IEOR) and intra-observer reliability (IAOR) of three methods for classifying haemophilic arthropathy: the Arnold and Hilgartner classification, a modification of the Arnold and Hilgartner system described by Luck et al., and the classification described by Pettersson et al. Antero-posterior and lateral radiographs of 54 haemophilic joints were included in the analysis. To determine the IEOR for each of the three radiographic systems, the radiographs were randomly evaluated by four observers: two orthopaedic surgeons, one orthopaedic resident and one haematologist. To determine the IAOR, all four reviewers repeated the assessment in a similar fashion after a period of at least 2 weeks. IEOR and IAOR for the three classification systems were established using kappa (κ) statistics. A Spearman rank correlation was used to determine the similarities between each reviewer's own interpretative scales. The IEOR was low for the Arnold and Hilgartner system (κ = 0.35, P ≤ 0.001) and the Luck system (κ = 0.38, P ≤ 0.001), but even lower for the Pettersson system (κ = 0.06, P = 0.1). For the Pettersson system, particularly low κ values were observed for the presence or absence of osteoporosis (κ = 0.11, P = 0.0027), enlarged epiphysis (κ = 0.10, P = 0.0039), erosion of joint margins (κ = 0.11, P = 0.0018), and joint deformity (κ = 0.16, P = 0.00001).
However, a relatively high Spearman rank correlation for all three scales [r(s) = 0.75 (P < 0.001) for the Arnold and Hilgartner system, r(s) = 0.74 (P < 0.001) for the Luck system and r(s) = 0.81 (P < 0.001) for the Pettersson system] indicated overall general agreement among the reviewers with regard to the severity of the haemophilic arthropathy. There was a moderate IAOR value for both the Arnold and Hilgartner system (κ = 0.57, P = 0.00001) and the Luck system (κ = 0.62, P = 0.00001), with a low IAOR value for the Pettersson system (κ = 0.22, P = 0.00001). Currently available radiographic scoring systems for haemophilic arthropathy have low inter- and intra-observer reliability rates. Improvements, either through education or modification of the scoring systems, are critical in an era where correlations between clinical and radiographic scores have received significant attention.
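
As an illustration of how a chance-corrected agreement value can be computed when several observers grade the same joints, the following is a minimal sketch of Fleiss' kappa, one common multi-rater form of the kappa statistic. The abstract does not state which kappa variant the study used, so the choice of Fleiss' kappa is an assumption. The function names `rating_counts` and `fleiss_kappa` and the example grades are hypothetical and are not study data.

```python
from collections import Counter


def rating_counts(ratings, categories):
    """Convert per-joint lists of grades into per-joint category counts.

    ratings: one list per joint, holding the grade given by each observer.
    categories: every grade the scale allows, in a fixed order.
    """
    table = []
    for joint in ratings:
        tally = Counter(joint)
        table.append([tally[c] for c in categories])
    return table


def fleiss_kappa(table):
    """Fleiss' kappa for a table of category counts (rows are joints).

    Every row must sum to the same number of observers (at least 2).
    """
    n_subjects = len(table)
    n_raters = sum(table[0])
    if n_raters < 2 or any(sum(row) != n_raters for row in table):
        raise ValueError("each joint needs the same number (>= 2) of ratings")

    n_categories = len(table[0])
    total = n_subjects * n_raters

    # Share of all ratings that fell into each category.
    p_cat = [sum(row[j] for row in table) / total for j in range(n_categories)]

    # Observed agreement per joint: fraction of observer pairs that agree.
    p_subj = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in table
    ]

    p_obs = sum(p_subj) / n_subjects   # mean observed agreement
    p_exp = sum(p * p for p in p_cat)  # agreement expected by chance

    if p_exp == 1.0:
        raise ValueError("kappa is undefined when only one category is used")
    return (p_obs - p_exp) / (1.0 - p_exp)


if __name__ == "__main__":
    # Made-up grades (0 to 5) from four hypothetical observers for five joints.
    example = [
        [2, 2, 3, 2],
        [4, 4, 4, 5],
        [0, 1, 0, 0],
        [3, 3, 2, 4],
        [5, 5, 5, 5],
    ]
    table = rating_counts(example, categories=[0, 1, 2, 3, 4, 5])
    print(f"Fleiss' kappa: {fleiss_kappa(table):.2f}")
```

A value of 1 means the observers always agree, 0 means agreement no better than chance, and negative values mean agreement worse than chance. Because this statistic treats the grades as unordered categories, observers who differ by only one grade count as disagreeing outright, which is consistent with a low kappa alongside a high rank correlation.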