Cryptic multiple hypotheses testing in linear models: overestimated effect sizes and the winner's curse

Abstract

Fitting generalised linear models (GLMs) with more than one predictor has become the standard method of analysis in evolutionary and behavioural research. Often, GLMs are used for exploratory data analysis, where one starts with a complex full model including interaction terms and then simplifies by removing non-significant terms. While this approach can be useful, it is problematic if significant effects are interpreted as if they arose from a single a priori hypothesis test. This is because model selection involves cryptic multiple hypothesis testing, a fact that has only rarely been acknowledged or quantified. We show that the probability of finding at least one ‘significant’ effect is high, even if all null hypotheses are true (e.g. 40% when starting with four predictors and their two-way interactions). This probability is close to theoretical expectations when the sample size (N) is large relative to the number of predictors including interactions (k). In contrast, type I error rates strongly exceed even those expectations when model simplification is applied to models that are over-fitted before simplification (low N/k ratio). The increase in false-positive results arises primarily from an overestimation of effect sizes among significant predictors, leading to upward-biased effect sizes that often cannot be reproduced in follow-up studies (‘the winner's curse’). Despite having their own problems, full model tests and P value adjustments can be used as a guide to how frequently type I errors arise by sampling variation alone. We favour the presentation of full models, since they best reflect the range of predictors investigated and ensure a balanced representation also of non-significant results.
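The 40% figure quoted in the abstract can be checked with a short calculation. The sketch below is not the authors' simulation code. It assumes that the k tests are independent and each uses a significance threshold of 0.05, which is the usual basis for the "theoretical expectation". The function names (`n_terms`, `familywise_error_rate`, `bonferroni_threshold`) are hypothetical and introduced only for illustration.

```python
from math import comb


def n_terms(n_predictors: int) -> int:
    """Count main effects plus all two-way interactions (assumed model structure)."""
    return n_predictors + comb(n_predictors, 2)


def familywise_error_rate(k: int, alpha: float = 0.05) -> float:
    """Probability of at least one p < alpha among k tests when every null is true.

    Assumes the k tests are independent, which is an idealisation.
    """
    return 1.0 - (1.0 - alpha) ** k


def bonferroni_threshold(k: int, alpha: float = 0.05) -> float:
    """Per-test threshold that keeps the family-wise rate at or below alpha."""
    return alpha / k


k = n_terms(4)                      # 4 main effects + 6 two-way interactions = 10
fwer = familywise_error_rate(k)     # 1 - 0.95**10, roughly 0.40
adjusted = bonferroni_threshold(k)  # 0.05 / 10 = 0.005

print(f"terms tested: {k}")
print(f"chance of at least one false positive: {fwer:.3f}")
print(f"Bonferroni per-test threshold: {adjusted:.4f}")
```

With four predictors and their six two-way interactions, k = 10 and the probability of at least one false positive is about 0.40, in line with the abstract. The abstract also notes that observed error rates exceed this expectation when N/k is low, which this simple formula does not capture.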

Author and article information

Contributors

Wolfgang Forstmeier: +46-18-4712827, forstmeier@orn.mpg.de

Holger Schielzeth: holger.schielzeth@ebc.uu.se

Journal

Journal ID (nlm-ta): Behav Ecol Sociobiol

Title: Behavioral Ecology and Sociobiology

Publisher: Springer-Verlag (Berlin/Heidelberg)

ISSN (Print): 0340-5443

ISSN (Electronic): 1432-0762

Publication date (Electronic): 19 August 2010

Publication date PMC-release: 19 August 2010

Publication date (Print): January 2011

Volume: 65

Issue: 1

Pages: 47-55

Affiliations

[1] Max Planck Institute for Ornithology, Eberhard-Gwinner-Str., 82319 Seewiesen, Germany

[2] Department of Evolutionary Biology, Evolutionary Biology Centre, Uppsala University, Norbyvägen 18D, SE-752 36 Uppsala, Sweden

Author notes

Communicated by L. Garamszegi

Article

Publisher ID: 1038

DOI: 10.1007/s00265-010-1038-5

PMC ID: 3015194

PubMed ID: 21297852

SO-VID: f941ed47-d0d3-469f-8c57-709c5227e156

History

Date received: 12 May 2010

Date revision received: 23 July 2010

Date accepted: 29 July 2010

Custom metadata

ScienceOpen disciplines: Ecology

Keywords: model selection, multiple regression, Bonferroni correction, generalised linear models, effect size estimation, multiple testing, publication bias, parameter estimation

