Bristol Rabbit Pain Scale (BRPS): clinical utility, validity and reliability

Abstract

Background

The Bristol Rabbit Pain Scale (BRPS) was developed using a combination of methods, focus groups and behavioural observation, that led to a composite pain scale of six categories (Demeanour, Locomotion, Posture, Ears, Eyes and Grooming) with four intensities of pain (0, 1, 2, and 3), and a total score of 0–18. The aim of this study was to assess the clinical utility, validity and reliability of the BRPS.

Materials and methods

The clinical utility of the BRPS was tested using a questionnaire composed of ten questions, each on a five-point Likert scale ranging from one (strongly disagree) to five (strongly agree). The respondents (veterinary surgeons and veterinary nurses) were asked to assess up to four rabbits in acute pain using the novel pain scale. They then completed the questionnaire, which asked whether the BRPS was easy and quick to use and whether it provided information that was clinically useful. The questionnaire was tested for internal reliability using the Cronbach’s alpha reliability coefficient. The construct validity (how well the tool measures the concept it was designed for) was measured by observers blindly rating 20 rabbits pre- and post-surgery, whilst the criterion validity (the degree to which the tool correlates with a gold standard) was assessed by correlating BRPS scores with scores on a numerical rating scale (NRS) with a total score of 0–10. Inter-rater reliability was tested by quantifying the agreement in the pain scores given by nine participants when assessing the same 40 video clips. The intra-rater reliability was measured by testing how consistent the participants were when rating the same clips one month later.
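
As an illustration only, the internal-consistency check described above could be sketched as follows. This is a minimal sketch that assumes the usual variance-based form of Cronbach's alpha; the function name `cronbach_alpha` and the response matrix are hypothetical and are not taken from the study.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a respondents-by-items matrix of scores.

    responses: list of rows, one row per respondent, one score per item.
    """
    n_items = len(responses[0])
    if n_items < 2 or len(responses) < 2:
        raise ValueError("need at least two items and two respondents")
    # Sample variance of each item (column) across respondents.
    item_vars = [variance([row[i] for row in responses]) for i in range(n_items)]
    # Sample variance of each respondent's total score.
    total_var = variance([sum(row) for row in responses])
    if total_var == 0:
        raise ValueError("total scores have no variance; alpha is undefined")
    return (n_items / (n_items - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical five-point Likert responses (5 respondents x 10 questions).
hypothetical_responses = [
    [4, 5, 4, 4, 5, 4, 3, 4, 5, 4],
    [3, 3, 4, 3, 3, 2, 3, 3, 4, 3],
    [5, 5, 5, 4, 5, 5, 4, 5, 5, 5],
    [2, 3, 2, 3, 2, 2, 3, 2, 3, 2],
    [4, 4, 3, 4, 4, 4, 4, 3, 4, 4],
]
print(f"alpha = {cronbach_alpha(hypothetical_responses):.3f}")
```

The sketch uses sample variances (n - 1 denominator) throughout; statistical packages may differ in details such as the handling of missing responses.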

Results

The median score of the ten questions of the clinical utility test was 4 (range 2–5). The Cronbach’s alpha reliability coefficient of the clinical utility test was α = 0.811, demonstrating good internal consistency. The median (range) pain scores on the BRPS and the NRS were 3 (0–14) and 0 (0–8) before surgery and 12 (1–18) and 7 (0–10) after surgery, respectively. The BRPS demonstrated high construct validity (Z = -11.452; p < 0.001) and there was a strong correlation between the BRPS and the NRS (Rho = 0.851; p < 0.001), indicating high criterion validity. The inter-rater and the intra-rater agreements were α = 0.863 and α = 0.861 respectively, which is considered good.
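
For readers unfamiliar with rank correlation, the criterion-validity comparison between two scales can be sketched roughly as below. This is a hedged illustration, not the authors' analysis: it assumes Spearman's rho computed as the Pearson correlation of average ranks, the helper names are hypothetical, and the paired scores are invented for demonstration.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of the 1-based positions in the run
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return cov / (sx * sy)


def spearman_rho(x, y):
    """Spearman's rho as the Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical paired scores for the same observations (not study data).
hypothetical_brps = [3, 0, 12, 7, 14, 1, 9, 18]   # 0-18 scale
hypothetical_nrs = [1, 0, 7, 4, 8, 0, 5, 10]      # 0-10 scale
print(f"rho = {spearman_rho(hypothetical_brps, hypothetical_nrs):.3f}")
```

The sketch reports only the coefficient; a p-value such as the one quoted above would come from a statistical package.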

Conclusions

This study showed that the BRPS is a suitable tool for quantifying pain in rabbits in a clinically useful, valid and reliable way.

Most cited references 26

Making sense of Cronbach's alpha

Mohsen Tavakol, Reg Dennick (2011)

Medical educators attempt to create reliable and valid tests and questionnaires in order to enhance the accuracy of their assessment and evaluations. Validity and reliability are two fundamental elements in the evaluation of a measurement instrument. Instruments can be conventional knowledge, skill or attitude tests, clinical simulations or survey questionnaires. Instruments can measure concepts, psychomotor skills or affective values. Validity is concerned with the extent to which an instrument measures what it is intended to measure. Reliability is concerned with the ability of an instrument to measure consistently [1]. It should be noted that the reliability of an instrument is closely associated with its validity. An instrument cannot be valid unless it is reliable. However, the reliability of an instrument does not depend on its validity [2].

It is possible to objectively measure the reliability of an instrument and in this paper we explain the meaning of Cronbach’s alpha, the most widely used objective measure of reliability. Calculating alpha has become common practice in medical education research when multiple-item measures of a concept or construct are employed. This is because it is easier to use in comparison to other estimates (e.g. test-retest reliability estimates) [3] as it only requires one test administration. However, in spite of the widespread use of alpha in the literature, the meaning, proper use and interpretation of alpha is not clearly understood [2, 4, 5]. We feel it is important, therefore, to further explain the underlying assumptions behind alpha in order to promote its more effective use. It should be emphasised that the purpose of this brief overview is just to focus on Cronbach’s alpha as an index of reliability. Alternative methods of measuring reliability based on other psychometric methods, such as generalisability theory or item-response theory, can be used for monitoring and improving the quality of OSCE examinations [6-10], but will not be discussed here.

What is Cronbach alpha?

Alpha was developed by Lee Cronbach in 1951 [11] to provide a measure of the internal consistency of a test or scale; it is expressed as a number between 0 and 1. Internal consistency describes the extent to which all the items in a test measure the same concept or construct and hence it is connected to the inter-relatedness of the items within the test. Internal consistency should be determined before a test can be employed for research or examination purposes to ensure validity. In addition, reliability estimates show the amount of measurement error in a test. Put simply, this interpretation of reliability is the correlation of the test with itself. Squaring this correlation and subtracting from 1.00 produces the index of measurement error. For example, if a test has a reliability of 0.80, there is 0.36 error variance (random error) in the scores (0.80×0.80 = 0.64; 1.00 – 0.64 = 0.36) [12]. As the estimate of reliability increases, the fraction of a test score that is attributable to error will decrease [2]. It is of note that the reliability of a test reveals the effect of measurement error on the observed score of a student cohort rather than on an individual student. To calculate the effect of measurement error on the observed score of an individual student, the standard error of measurement (SEM) must be calculated [13].

If the items in a test are correlated to each other, the value of alpha is increased. However, a high coefficient alpha does not always mean a high degree of internal consistency. This is because alpha is also affected by the length of the test. If the test length is too short, the value of alpha is reduced [2, 14]. Thus, to increase alpha, more related items testing the same concept should be added to the test. It is also important to note that alpha is a property of the scores on a test from a specific sample of testees. Therefore investigators should not rely on published alpha estimates and should measure alpha each time the test is administered [14].

Use of Cronbach’s alpha

Improper use of alpha can lead to situations in which either a test or scale is wrongly discarded or the test is criticised for not generating trustworthy results. To avoid this situation an understanding of the associated concepts of internal consistency, homogeneity or unidimensionality can help to improve the use of alpha. Internal consistency is concerned with the interrelatedness of a sample of test items, whereas homogeneity refers to unidimensionality. A measure is said to be unidimensional if its items measure a single latent trait or construct. Internal consistency is a necessary but not sufficient condition for measuring homogeneity or unidimensionality in a sample of test items [5, 15]. Fundamentally, the concept of reliability assumes that unidimensionality exists in a sample of test items [16] and if this assumption is violated it does cause a major underestimate of reliability. It has been well documented that a multidimensional test does not necessarily have a lower alpha than a unidimensional test. Thus a more rigorous view of alpha is that it cannot simply be interpreted as an index for the internal consistency of a test [5, 15, 17].

Factor Analysis can be used to identify the dimensions of a test [18]. Other reliable techniques have been used and we encourage the reader to consult the paper “Applied Dimensionality and Test Structure Assessment with the START-M Mathematics Test” and to compare methods for assessing the dimensionality and underlying structure of a test [19]. Alpha, therefore, does not simply measure the unidimensionality of a set of items, but can be used to confirm whether or not a sample of items is actually unidimensional [5]. On the other hand, if a test has more than one concept or construct, it may not make sense to report alpha for the test as a whole as the larger number of questions will inevitably inflate the value of alpha. In principle therefore, alpha should be calculated for each of the concepts rather than for the entire test or scale [2, 3]. The implication for a summative examination containing heterogeneous, case-based questions is that alpha should be calculated for each case.

More importantly, alpha is grounded in the ‘tau equivalent model’ which assumes that each test item measures the same latent trait on the same scale. Therefore, if multiple factors/traits underlie the items on a scale, as revealed by Factor Analysis, this assumption is violated and alpha underestimates the reliability of the test [17]. If the number of test items is too small it will also violate the assumption of tau-equivalence and will underestimate reliability [20]. When test items meet the assumptions of the tau-equivalent model, alpha approaches a better estimate of reliability. In practice, Cronbach’s alpha is a lower-bound estimate of reliability because heterogeneous test items would violate the assumptions of the tau-equivalent model [5]. If the calculation of “standardised item alpha” in SPSS is higher than “Cronbach’s alpha”, a further examination of the tau-equivalent measurement in the data may be essential.

Numerical values of alpha

As pointed out earlier, the number of test items, item inter-relatedness and dimensionality affect the value of alpha [5]. There are different reports about the acceptable values of alpha, ranging from 0.70 to 0.95 [2, 21, 22]. A low value of alpha could be due to a low number of questions, poor inter-relatedness between items or heterogeneous constructs. For example, if a low alpha is due to poor correlation between items then some should be revised or discarded. The easiest method to find them is to compute the correlation of each test item with the total test score; items with low correlations (approaching zero) are deleted. If alpha is too high it may suggest that some items are redundant as they are testing the same question but in a different guise. A maximum alpha value of 0.90 has been recommended [14].

Summary

High quality tests are important to evaluate the reliability of data supplied in an examination or a research study. Alpha is a commonly employed index of test reliability. Alpha is affected by the test length and dimensionality. Alpha as an index of reliability should follow the assumptions of the essentially tau-equivalent approach. A low alpha appears if these assumptions are not met. Alpha does not simply measure test homogeneity or unidimensionality as test reliability is a function of test length. A longer test increases the reliability of a test regardless of whether the test is homogeneous or not. A high value of alpha (> 0.90) may suggest redundancies and show that the test length should be shortened.

Conclusions

Alpha is an important concept in the evaluation of assessments and questionnaires. It is mandatory that assessors and researchers should estimate this quantity to add validity and accuracy to the interpretation of their data. Nevertheless alpha has frequently been reported in an uncritical way and without adequate understanding and interpretation. In this editorial we have attempted to explain the assumptions underlying the calculation of alpha, the factors influencing its magnitude and the ways in which its value can be interpreted. We hope that investigators in future will be more critical when reporting values of alpha in their studies.
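
Two of the calculations described in this editorial, the index of measurement error and the correlation of each item with the total score, can be sketched as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, the item-total correlation is the uncorrected version (each item is included in the total), and the score matrix is invented.

```python
from statistics import mean


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either variable is constant.

    Returning 0.0 for a constant variable is a convention chosen for this
    sketch, not a statistical standard.
    """
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    if sx == 0 or sy == 0:
        return 0.0
    return cov / (sx * sy)


def error_index(reliability):
    """Index of measurement error as described in the text: 1 - reliability^2."""
    return 1.0 - reliability ** 2


def item_total_correlations(responses):
    """Correlation of each item (column) with the respondents' total scores."""
    totals = [sum(row) for row in responses]
    n_items = len(responses[0])
    return [pearson([row[i] for row in responses], totals) for i in range(n_items)]


# The worked example from the text: a reliability of 0.80.
print(f"error index = {error_index(0.80):.2f}")

# Hypothetical scores (4 respondents x 3 items), invented for illustration.
hypothetical_scores = [
    [1, 2, 3],
    [2, 2, 1],
    [3, 4, 2],
    [4, 5, 2],
]
for i, r in enumerate(item_total_correlations(hypothetical_scores), start=1):
    print(f"item {i}: r = {r:.2f}")
```

Items whose correlation with the total approaches zero would be the candidates for revision or removal that the text describes.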

Sample size used to validate a scale: a review of publications on newly-developed patient reported outcomes measures

Emmanuelle Anthoine, Leïla Moret, Antoine Regnault … (2014)

Purpose

New patient reported outcome (PRO) measures are regularly developed to assess various aspects of the patients’ perspective on their disease and treatment. For these instruments to be useful in clinical research, they must undergo a proper psychometric validation, including demonstration of cross-sectional and longitudinal measurement properties. This quantitative evaluation requires a study to be conducted on an appropriate sample size. The aim of this research was to list and describe practices in PRO and proxy PRO primary psychometric validation studies, focusing primarily on the practices used to determine sample size.

Methods

A literature review of articles published in PubMed between January 2009 and September 2011 was conducted. Three selection criteria were applied including a search strategy, an article selection strategy, and data extraction. Agreements between authors were assessed, and practices of validation were described.

Results

Data were extracted from 114 relevant articles. Within these, reporting of sample size determination was low (9.6%, 11/114); it was reported as either an arbitrary minimum sample size (n = 2) or a subject to item ratio (n = 4), or the method was not explicitly stated (n = 5). Very few articles (4%, 5/114) compared a posteriori their sample size to a subject to item ratio. Content validity, construct validity, criterion validity and internal consistency were the most frequently assessed measurement properties in the validation studies. Approximately 92% of the articles reported a subject to item ratio greater than or equal to 2, whereas 25% had a ratio greater than or equal to 20. About 90% of articles had a sample size greater than or equal to 100, whereas 7% had a sample size greater than or equal to 1000.

Conclusions

The sample size determination for psychometric validation studies is rarely justified a priori. This emphasizes the lack of clear scientifically sound recommendations on this topic. Existing methods to determine the sample size needed to assess the various measurement properties of interest should be made more easily available.

Electronic supplementary material

The online version of this article (doi:10.1186/s12955-014-0176-2) contains supplementary material, which is available to authorized users.

Validity and reliability in quantitative studies.

Alison Twycross, Roberta Heale (2015)

Author and article information

Contributors

L. Benato: livia.benato@bristol.ac.uk

Journal

Journal ID (nlm-ta): BMC Vet Res

Journal ID (iso-abbrev): BMC Vet Res

Title: BMC Veterinary Research

Publisher: BioMed Central (London)

ISSN (Electronic): 1746-6148

Publication date (Electronic): 9 September 2022

Publication date PMC-release: 9 September 2022

Publication date Collection: 2022

Volume: 18

Electronic Location Identifier: 341

Affiliations

[1] Animal Welfare and Behaviour, School of Veterinary Sciences, University of Bristol, Langford, UK (GRID grid.5337.2; ISNI 0000 0004 1936 7603)

[2] Highcroft Veterinary Referrals, 615 Wells Road, Whitchurch, Bristol, BS14 9BE, UK

Article

Publisher ID: 3434

DOI: 10.1186/s12917-022-03434-x

PMC ID: 9461217

PubMed ID: 36085033

SO-VID: 0e01e15d-a1cb-4588-a082-7ee7b962f4e1

License:

Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

History

Date received : 23 March 2022

Date accepted : 26 July 2022

Custom metadata

ScienceOpen disciplines: Veterinary medicine

Keywords: rabbit, pain, pain scale, validity, reliability, clinical utility
