
Significance of differences of significance: Erroneous statistics in neuroscience

September 10, 2011

When experimental work turns up a difference between conditions, that difference must be analysed statistically to show that it is significant. But according to a new paper in Nature Neuroscience, when the effect observed in one group of tests is compared with the effect observed in another group, the significance of the difference between the two effects is often analysed incorrectly.

The authors analysed 513 behavioral, systems and cognitive neuroscience articles in five top-ranking journals (Science, Nature, Nature Neuroscience, Neuron and The Journal of Neuroscience). Of the 157 papers in which this error could have been made, 78 used the correct procedure and 79 used the incorrect one. Suspecting that the problem could be more widespread, they “reviewed an additional 120 cellular and molecular neuroscience articles published in Nature Neuroscience in 2009 and 2010 (the first five Articles in each issue)”. They did not find a single study that used the correct statistical procedure to compare effect sizes; in contrast, they found at least 25 studies that used the erroneous procedure and explicitly or implicitly compared significance levels.

Erroneous analyses of interactions in neuroscience: a problem of significance by Sander Nieuwenhuis, Birte U Forstmann & Eric-Jan Wagenmakers, Nature Neuroscience 14, 1105–1107 (2011) doi:10.1038/nn.2886

(These) statements illustrate a statistical error that is common in the neuroscience literature. The researchers who made these statements wanted to claim that one effect (for example, the training effect on neuronal activity in mutant mice) was larger or smaller than the other effect (the training effect in control mice). To support this claim, they needed to report a statistically significant interaction (between amount of training and type of mice), but instead they reported that one effect was statistically significant, whereas the other effect was not. Although superficially compelling, the latter type of statistical reasoning is erroneous because the difference between significant and not significant need not itself be statistically significant.
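The fallacy is easy to reproduce in simulation. Below is a minimal Python sketch (not from the paper; the group labels, effect sizes and sample size are invented for illustration, and scipy/numpy are assumed) that contrasts the erroneous procedure, two separate significance tests, with the correct test on the difference between the effects:

```python
# Simulate the "significant vs not significant" fallacy (illustrative values only).
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n, n_sims = 20, 10_000          # hypothetical animals per group, simulation runs
pattern, diff_sig = 0, 0

for _ in range(n_sims):
    mutant = rng.normal(0.6, 1.0, n)    # hypothetical training effect, mutant mice
    control = rng.normal(0.2, 1.0, n)   # hypothetical training effect, control mice

    # Erroneous procedure: two separate one-sample tests against zero.
    p_mut = stats.ttest_1samp(mutant, 0.0).pvalue
    p_ctl = stats.ttest_1samp(control, 0.0).pvalue

    # Correct procedure: test the difference between the effects (the interaction),
    # here via a two-sample t-test on the per-animal effects.
    p_diff = stats.ttest_ind(mutant, control).pvalue

    if p_mut < 0.05 and p_ctl >= 0.05:  # the tempting "significant vs not" pattern
        pattern += 1
        if p_diff < 0.05:
            diff_sig += 1

print(f"runs showing 'significant vs not significant': {pattern / n_sims:.1%}")
print(f"...of which the difference itself is significant: {diff_sig / pattern:.1%}")
```

In runs like these, the "one significant, one not" pattern appears routinely even though the direct test of the difference often fails to reach significance, which is exactly the error the authors describe.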

Full paper is here: Nieuwenhuis et al. (PDF)

Abstract: In theory, a comparison of two experimental effects requires a statistical test on their difference. In practice, this comparison is often based on an incorrect procedure involving two separate tests in which researchers conclude that effects differ when one effect is significant (P < 0.05) but the other is not (P > 0.05). We reviewed 513 behavioral, systems and cognitive neuroscience articles in five top-ranking journals (Science, Nature, Nature Neuroscience, Neuron and The Journal of Neuroscience) and found that 78 used the correct procedure and 79 used the incorrect procedure. An additional analysis suggests that incorrect analyses of interactions are even more common in cellular and molecular neuroscience. We discuss scenarios in which the erroneous procedure is particularly beguiling.

The authors conclude:

It is interesting that this statistical error occurs so often, even in journals of the highest standard. Space constraints and the need for simplicity may be the reasons why the error occurs in journals such as Nature and Science. Reporting interactions in an analysis of variance design may seem overly complex when one is writing for a general readership. Perhaps, in some cases, researchers choose to report the difference between significance levels because the corresponding interaction effect is not significant. Peer reviewers should help authors avoid such mistakes. … Indeed, people are generally tempted to attribute too much meaning to the difference between significant and not significant. For this reason, the use of confidence intervals may help prevent researchers from making this statistical error. Whatever the reasons for the error, its ubiquity and potential effect suggest that researchers and reviewers should be more aware that the difference between significant and not significant is not itself necessarily significant.
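For completeness, here is what the recommended direct comparison looks like when only two effect estimates and their standard errors are reported. This is a standard z-test on the difference, with the 95% confidence interval the authors suggest; it is a sketch with made-up numbers, not code from the paper:

```python
# Direct test of the difference between two independent effect estimates.
# The estimates and standard errors below are invented for illustration.
import math
from scipy import stats

b1, se1 = 0.50, 0.20   # effect 1: z = 2.5, two-sided p ~ 0.012 ("significant")
b2, se2 = 0.25, 0.20   # effect 2: z = 1.25, two-sided p ~ 0.21 ("not significant")

diff = b1 - b2
se_diff = math.sqrt(se1**2 + se2**2)   # SE of a difference of independent estimates
z = diff / se_diff
p = 2 * stats.norm.sf(abs(z))
lo, hi = diff - 1.96 * se_diff, diff + 1.96 * se_diff

print(f"difference = {diff:.2f}, z = {z:.2f}, p = {p:.3f}")
print(f"95% CI for the difference: ({lo:.2f}, {hi:.2f})")
# Output: z ~ 0.88, p ~ 0.38, CI ~ (-0.30, 0.80). One effect is "significant"
# and the other is not, yet their difference is far from significant.
```

The wide confidence interval straddling zero makes the authors' point visible at a glance: the difference between significant and not significant is not itself necessarily significant.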

 
