Posts Tagged ‘Statistics’

Statistician’s challenge for proof of global warming still stands

September 6, 2016

I have posted earlier in November 2015 about Doug Keenan’s challenge.

Nobody has taken up the challenge as yet.

Instead the hierarchy have merely tried to ignore his challenge, or to challenge the challenge itself.

Doug Keenan now has an addendum to his challenge:

18 August 2016
A paper by Lovejoy et al. was published in Geophysical Research Letters. The paper is about the Contest.

The paper is based on the assertion that “Keenan claims to have used a stochastic model with some realism”; the paper then argues that the Contest model has inadequate realism. The paper provides no evidence that I have claimed that the Contest model has adequate realism; indeed, I do not make such a claim. Moreover, my critique of the IPCC statistical analyses (discussed above) argues that no one can choose a model with adequate realism. Thus, the basis for the paper is invalid. The lead author of the paper, Shaun Lovejoy, was aware of that, but published the paper anyway.

When doing statistical analysis, the first step is to choose a model of the process that generated the data. The IPCC did indeed choose a model. I have only claimed that the model used in the Contest is more realistic than the model chosen by the IPCC. Thus, if the Contest model is unrealistic (as it is), then the IPCC model is even more unrealistic. Hence, the IPCC model should not be used. Ergo, the statistical analyses in the IPCC Assessment Report are untenable, as the critique argues.

For an illustration, consider the following. Lovejoy et al. assert that the Contest model implies a typical temperature change of 4 °C every 6400 years—which is too large to be realistic. Yet the IPCC model implies a temperature change of about 41 °C every 6400 years. (To confirm this, see Section 8 of the critique and note that 0.85×6400/133 = 41.) Thus, the IPCC model is far more unrealistic than the Contest model, according to the test advocated by Lovejoy et al. Hence, if the test advocated by Lovejoy et al. were adopted, then the IPCC statistical analyses are untenable.
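Keenan's arithmetic in the passage above is easy to verify: scale the IPCC's quoted warming (0.85 °C over 133 years) up to the 6400-year window used by Lovejoy et al. A minimal check:

```python
# Scale the IPCC AR5 warming estimate (0.85 degC over 133 years, per
# Section 8 of Keenan's critique) to the 6400-year window that
# Lovejoy et al. use as their realism test.
ipcc_change_degC = 0.85
ipcc_period_yr = 133
window_yr = 6400

implied_change = ipcc_change_degC * window_yr / ipcc_period_yr
print(round(implied_change))  # → 41
```

This is the 41 °C figure in the addendum, against the 4 °C that Lovejoy et al. attribute to the Contest model.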


Rosling’s health, wealth and statistics

January 31, 2014

Hans Rosling’s tour de force: 200 countries, 200 years, 4 minutes.

This is not new and I think I first saw it about 3 years ago.

But it is worth looking at not just for the content but also for the power of the presentation.

Just a reminder that the world is feeding more people than ever before, we are living longer than ever before, and things are not as black as some alarmists would have us think. And by 2100 total population will be declining.

The glass is more than half-full.

A much longer (20 minutes) presentation is also well worth watching.

Why averaging climate models is meaningless

June 14, 2013

This comment/essay by rgbatduke on WUWT is well worth reading and digesting.

“this is a point that is stunningly ignored — there are a lot of different models out there, all supposedly built on top of physics, and yet no two of them give anywhere near the same results!”

A professional taking amateurs to task!

(Note: see also his follow-up comments here and here. rgbatduke would seem to be Professor R. G. Brown of Duke University.)

rgbatduke says:

Saying that we need to wait for a certain interval in order to conclude that “the models are wrong” is dangerous and incorrect for two reasons. First — and this is a point that is stunningly ignored — there are a lot of different models out there, all supposedly built on top of physics, and yet no two of them give anywhere near the same results!

This is reflected in the graphs Monckton publishes above, where the AR5 trend line is the average over all of these models and in spite of the number of contributors the variance of the models is huge. It is also clearly evident if one publishes a “spaghetti graph” of the individual model projections (as Roy Spencer recently did in another thread) — it looks like the frayed end of a rope, not like a coherent spread around some physics supported result.

Note the implicit swindle in this graph — by forming a mean and standard deviation over model projections and then using the mean as a “most likely” projection and the variance as representative of the range of the error, one is treating the differences between the models as if they are uncorrelated random variates causing deviation around a true mean!

Say what?

This is such a horrendous abuse of statistics that it is difficult to know how to begin to address it. One simply wishes to bitch-slap whoever it was that assembled the graph and ensure that they never work or publish in the field of science or statistics ever again. One cannot generate an ensemble of independent and identically distributed models that have different code. One might, possibly, generate a single model that generates an ensemble of predictions by using uniform deviates (random numbers) to seed “noise” (representing uncertainty) in the inputs.

What I’m trying to say is that the variance and mean of the “ensemble” of models is completely meaningless, statistically, because the inputs do not possess the most basic properties required for a meaningful interpretation. They are not independent, their differences are not based on a random distribution of errors, there is no reason whatsoever to believe that the errors or differences are unbiased (given that the only way humans can generate unbiased anything is through the use of e.g. dice or other objectively random instruments).
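rgbatduke's point can be illustrated with a toy simulation (my own sketch, not from his comment; the numbers are invented): if the "models" in an ensemble share a common bias, the ensemble mean does not converge to the truth, and the ensemble spread says nothing about how far off it is.

```python
import random

random.seed(0)
truth = 1.0  # the "true" value the models are meant to estimate

# Toy ensemble: every member shares the same structural bias (+0.5),
# plus a small individual quirk. The members are NOT independent draws
# around the truth, so averaging them does not recover it.
shared_bias = 0.5
models = [truth + shared_bias + random.gauss(0, 0.1) for _ in range(40)]

mean = sum(models) / len(models)
var = sum((m - mean) ** 2 for m in models) / len(models)
sd = var ** 0.5

print(f"ensemble mean = {mean:.2f} (truth = {truth})")
print(f"ensemble sd   = {sd:.2f}")
```

The ensemble spread comes out tight (sd around 0.1) while the mean sits a full 0.5 away from the truth: treating the spread as "the range of the error" is exactly the swindle described above.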


“Irreproducible results and spurious claims” in neuroscience

April 26, 2013

The practice of science in today’s “publish or die” world together with the headlong pursuit of funding leaves me somewhat cynical.

My gut feeling has always been that the “social sciences” are the disciplines most plagued by the “irreproducible study” sickness, but it seems to be prevalent across many more fields than I would have thought. In neuroscience, it would seem, poor studies are followed by “meta-studies” summarising the poor studies, which are in turn followed by analyses showing that the studies are not significant. And poor studies with irreproducible results would seem to be the norm, not the exception.

Gary Stix blogs at The Scientific American:

(Image: brain lobes, Scientific American)

New Study: Neuroscience Research Gets an “F” for Reliability

Brain studies are the current darling of the sciences, research capable of garnering tens or even hundreds of millions in new funding for ambitious new projects, the kind of money that was once reserved only for big physics projects.

Except the house of neuroscience, which attracts tens of thousands of attendees each year to the annual meeting of the Society for Neuroscience, may be built on a foundation of clay. Those are the implications of an analysis published online April 10 in Nature Reviews Neuroscience, which questions the reliability of much of the research in the field.

The study—led by researchers at the University of Bristol—looked at 48 neuroscience meta-analyses (studies of studies) from 2011 and found that their statistical power reaches only 21 percent, meaning that there is only about a one in five chance that any effect being investigated by the researchers—whether a compound acts as an anti-depressant in rat brains, for instance—will be discovered. Anything that does turn up, moreover, is more likely to be false. …

John Ioannidis of Stanford University School of Medicine says: … “Neuroscience has tremendous potential and it is a very exciting field. However, if it continues to operate with very small studies, its results may not be as credible as one would wish. A combination of small studies with the high popularity of a highly-funded, bandwagon-topic is a high-risk combination and may lead to a lot of irreproducible results and spurious claims for discoveries that are out of proportion.”
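The “one in five” figure quoted above can be made concrete with a quick simulation. This is a toy sketch, not the Bristol analysis: the effect size and sample size here are my own assumptions, chosen so that a small study of a modest effect lands near 21 percent power.

```python
import math
import random

random.seed(1)

def z_test_power(effect, n, trials=4000):
    """Monte-Carlo power of a two-sided one-sample z-test (sigma = 1)."""
    z_crit = 1.96  # critical value for alpha = 0.05
    hits = 0
    for _ in range(trials):
        sample = [random.gauss(effect, 1.0) for _ in range(n)]
        z = (sum(sample) / n) / (1.0 / math.sqrt(n))
        if abs(z) > z_crit:
            hits += 1
    return hits / trials

# A modest true effect measured with a very small sample:
power = z_test_power(effect=0.5, n=5)
print(f"power ≈ {power:.2f}")  # roughly 0.2, i.e. about one chance in five
```

In other words, at this power a real effect is missed four times out of five, and the studies that do clear the significance bar will tend to overstate it.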

Update: Moses Chao, a former president of the Society for Neuroscience and a professor of cell biology at New York University Medical School, got back to me with a comment after I posted the blog, which is excerpted here:

“I agree that many published papers in neuroscience are based upon small effects or changes. One issue is that many studies have not been blinded. There have been numerous reports in my field which have not been reproduced, some dealing with small molecule receptor agonists. This has set back progress. The lack of reproducibility is one of the reasons that pharmaceutical companies have reduced their effort in neuroscience research. But irreproducibility also applies to other fields, such as cancer…

Significance of differences of significance: Erroneous statistics in neuroscience

September 10, 2011

Experimental work that finds a difference between tests must be analysed statistically to show that the observed difference is significant. But when the significant difference found in one group of tests is compared with that found in another group, the significance of the difference between those differences is often analysed incorrectly, according to a new paper in Nature Neuroscience.

The authors analysed 513 behavioral, systems and cognitive neuroscience articles in five top-ranking journals (Science, Nature, Nature Neuroscience, Neuron and The Journal of Neuroscience). Of the 157 papers where this error could have been made, 78 used the correct procedure and 79 used the incorrect procedure. Suspecting that the problem could be more widespread, they “reviewed an additional 120 cellular and molecular neuroscience articles published in Nature Neuroscience in 2009 and 2010 (the first five Articles in each issue)”. They did not find a single study that used the correct statistical procedure to compare effect sizes. In contrast, they found at least 25 studies that used the erroneous procedure and explicitly or implicitly compared significance levels.

Erroneous analyses of interactions in neuroscience: a problem of significance by Sander Nieuwenhuis, Birte U Forstmann & Eric-Jan Wagenmakers, Nature Neuroscience 14, 1105–1107 (2011) doi:10.1038/nn.2886

(These) statements illustrate a statistical error that is common in the neuroscience literature. The researchers who made these statements wanted to claim that one effect (for example, the training effect on neuronal activity in mutant mice) was larger or smaller than the other effect (the training effect in control mice). To support this claim, they needed to report a statistically significant interaction (between amount of training and type of mice), but instead they reported that one effect was statistically significant, whereas the other effect was not. Although superficially compelling, the latter type of statistical reasoning is erroneous because the difference between significant and not significant need not itself be statistically significant.

Full paper is here: PDF Nieuwenhuis et al 

Abstract: In theory, a comparison of two experimental effects requires a statistical test on their difference. In practice, this comparison is often based on an incorrect procedure involving two separate tests in which researchers conclude that effects differ when one effect is significant (P < 0.05) but the other is not (P > 0.05). We reviewed 513 behavioral, systems and cognitive neuroscience articles in five top-ranking journals (Science, Nature, Nature Neuroscience, Neuron and The Journal of Neuroscience) and found that 78 used the correct procedure and 79 used the incorrect procedure. An additional analysis suggests that incorrect analyses of interactions are even more common in cellular and molecular neuroscience. We discuss scenarios in which the erroneous procedure is particularly beguiling.

The authors conclude

It is interesting that this statistical error occurs so often, even in journals of the highest standard. Space constraints and the need for simplicity may be the reasons why the error occurs in journals such as Nature and Science. Reporting interactions in an analysis of variance design may seem overly complex when one is writing for a general readership. Perhaps, in some cases, researchers choose to report the difference between significance levels because the corresponding interaction effect is not significant. Peer reviewers should help authors avoid such mistakes. … Indeed, people are generally tempted to attribute too much meaning to the difference between significant and not significant. For this reason, the use of confidence intervals may help prevent researchers from making this statistical error. Whatever the reasons for the error, its ubiquity and potential effect suggest that researchers and reviewers should be more aware that the difference between significant and not significant is not itself necessarily significant.
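The error the paper describes is easy to reproduce in a toy simulation (my own sketch, not from Nieuwenhuis et al.; the effect and sample sizes are invented). Two experiments measuring the *same* underlying effect frequently yield one significant and one non-significant result, yet a direct test of the difference between the two effects — the correct procedure — finds nothing:

```python
import math
import random

random.seed(2)

def p_two_sided(z):
    """Two-sided p-value for a standard-normal test statistic."""
    return 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))

def one_experiment(n=20, effect=0.45):
    """Two groups with the SAME true effect; return (p_a, p_b, p_diff)."""
    a = [random.gauss(effect, 1) for _ in range(n)]
    b = [random.gauss(effect, 1) for _ in range(n)]
    se = 1 / math.sqrt(n)
    p_a = p_two_sided((sum(a) / n) / se)
    p_b = p_two_sided((sum(b) / n) / se)
    # Correct procedure: test the DIFFERENCE of the two effects directly.
    p_diff = p_two_sided((sum(a) / n - sum(b) / n) / (se * math.sqrt(2)))
    return p_a, p_b, p_diff

trials = 2000
fallacy = sum(
    1 for p_a, p_b, p_diff in (one_experiment() for _ in range(trials))
    if (p_a < 0.05) != (p_b < 0.05) and p_diff > 0.05
)
print(f"'one significant, one not, but difference not significant': "
      f"{fallacy / trials:.0%} of runs")
```

With these toy numbers the pattern appears in a substantial fraction of runs, even though both groups share an identical true effect: concluding that the effects differ because one is significant and the other is not would be wrong every single time.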

