The reliability paradox: Why robust cognitive tasks do not produce reliable individual differences

Hedge, Craig; Powell, Georgina; Sumner, Petroc

doi:10.3758/s13428-017-0935-1


Author(s): Craig Hedge, Georgina Powell, Petroc Sumner

Publication date (Electronic): 19 July 2017

Journal: Behavior Research Methods

Publisher: Springer US

Keywords: Reliability, Individual differences, Reaction time, Difference scores, Response control


Abstract

Individual differences in cognitive paradigms are increasingly employed to relate cognition to brain structure, chemistry, and function. However, such efforts are often unfruitful, even with the most well established tasks. Here we offer an explanation for failures in the application of robust cognitive paradigms to the study of individual differences. Experimental effects become well established – and thus those tasks become popular – when between-subject variability is low. However, low between-subject variability causes low reliability for individual differences, destroying replicable correlations with other factors and potentially undermining published conclusions drawn from correlational relationships. Though these statistical issues have a long history in psychology, they are widely overlooked in cognitive psychology and neuroscience today. In three studies, we assessed test-retest reliability of seven classic tasks: Eriksen Flanker, Stroop, stop-signal, go/no-go, Posner cueing, Navon, and Spatial-Numerical Association of Response Code (SNARC). Reliabilities ranged from 0 to .82, being surprisingly low for most tasks given their common use. As we predicted, this emerged from low variance between individuals rather than high measurement variance. In other words, the very reason such tasks produce robust and easily replicable experimental effects – low between-participant variability – makes their use as correlational tools problematic. We demonstrate that taking such reliability estimates into account has the potential to qualitatively change theoretical conclusions. The implications of our findings are that well-established approaches in experimental psychology and neuropsychology may not directly translate to the study of individual differences in brain structure, chemistry, and function, and alternative metrics may be required.
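The argument in the abstract can be illustrated with a small simulation. The sketch below is not the authors' analysis: it assumes a simple model in which each observed score is a stable true score plus independent measurement error, and it uses a Pearson correlation between two simulated sessions as a stand-in for test-retest reliability. The function names, the millisecond values, and the sample size are all hypothetical choices made for illustration.

```python
import math
import random


def expected_reliability(sd_between, sd_error):
    """Share of observed-score variance due to stable individual differences."""
    var_between = sd_between ** 2
    return var_between / (var_between + sd_error ** 2)


def pearson(xs, ys):
    """Plain Pearson correlation, written out to avoid extra dependencies."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def simulate_test_retest(sd_between, sd_error, n_subjects=5000, seed=0):
    """Simulate two sessions per subject and correlate them.

    Assumed model (not taken from the article):
    observed = true score + independent Gaussian error.
    """
    rng = random.Random(seed)
    true_scores = [rng.gauss(0.0, sd_between) for _ in range(n_subjects)]
    session1 = [t + rng.gauss(0.0, sd_error) for t in true_scores]
    session2 = [t + rng.gauss(0.0, sd_error) for t in true_scores]
    return pearson(session1, session2)


def attenuated_correlation(true_r, reliability_x, reliability_y):
    """Classical attenuation: observed r shrinks with the reliabilities."""
    return true_r * math.sqrt(reliability_x * reliability_y)


if __name__ == "__main__":
    sd_error = 20.0  # hypothetical measurement error (ms), held constant
    for sd_between in (30.0, 10.0):  # hypothetical between-subject SDs (ms)
        theory = expected_reliability(sd_between, sd_error)
        simulated = simulate_test_retest(sd_between, sd_error)
        print(f"between SD={sd_between:>4}: "
              f"expected={theory:.2f}, simulated={simulated:.2f}")
```

With measurement error held constant, shrinking the between-subject spread lowers the expected reliability under this model, which is the pattern the abstract describes. The `attenuated_correlation` helper shows the related classical result that an observed correlation between two measures is scaled down by the square root of the product of their reliabilities, one way in which low reliability can obscure a real relationship.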

Electronic supplementary material

The online version of this article (doi:10.3758/s13428-017-0935-1) contains supplementary material, which is available to authorized users.


Author and article information

Contributors

Craig Hedge: hedgec@cardiff.ac.uk

Journal

Journal ID (nlm-ta): Behav Res Methods

Journal ID (iso-abbrev): Behav Res Methods

Title: Behavior Research Methods

Publisher: Springer US (New York)

ISSN (Print): 1554-351X

ISSN (Electronic): 1554-3528

Publication date (Electronic): 19 July 2017

Publication date PMC-release: 19 July 2017

Publication date (Print): 2018

Volume: 50

Issue: 3

Pages: 1166-1186

Affiliations

ISNI 0000 0001 0807 5670, GRID grid.5600.3, School of Psychology, Cardiff University, Park Place, Cardiff, CF10 3AT, UK

Article

Publisher ID: 935

DOI: 10.3758/s13428-017-0935-1

PMC ID: 5990556

PubMed ID: 28726177

SO-VID: 53f150c9-8997-4e4f-9b8d-e260f2634c15

License:

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

Funding

Funded by: FundRef http://dx.doi.org/10.13039/100004440, Wellcome Trust;

Award ID: 104943/Z/14/Z

Funded by: FundRef http://dx.doi.org/10.13039/501100000269, Economic and Social Research Council;

Award ID: ES/K002325/1

Custom metadata

ScienceOpen disciplines: Clinical Psychology & Psychiatry

