Like many, I read the recent paper about declining explanatory power in ecology with skeptical interest; it's a cool paper and I guarantee it will make you think. The authors scraped a load of r2 and p-values from ecology papers over the last few decades, and plotted the average r2 and the count of p-values through time. They find that the average r2 (the explanatory power) of ecological studies is declining. I'm a bit of a fan of Bayesian stats, so I find the idea of p-values as a measure of support for a hypothesis a bit galling, but I decided to take a look at the data for myself.

Below is their figure 3, which shows the trend in mean r2 through time, and has an r2 of 62%.

…which seems fine, until you open up the data it’s from and plot the data underlying those mean yearly numbers:

…which, to me, contains an awful lot of scatter that isn’t otherwise apparent. Let’s ignore the data before 1969 (although including them changes nothing substantial). Take a look at a density plot of the same data with a best-fit line through it:

Good news! The regression is still significant (it should be: there are 10,759 points in this plot…), but the r2 is 4%. 4% is quite a lot less than the 62% the authors obtained when they averaged away their variation at the year level. The within-year variation is so large that I don't think this decline, while statistically real, is something we could use to make predictions. The authors tried to control for this (I suspect) by weighting their regression according to how many values made up each average. I don't think that goes far enough: we have the original data, so why average at all? And sample size is not the same as precision – if you must average, it would be better to weight by the means' standard errors. I'm also not convinced a mean (or a linear regression) describes this kind of bounded data very well, but I could be convinced otherwise.
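To see how averaging at the year level can manufacture a high r2, here's a quick simulation (all numbers invented for illustration – this is not the paper's data): a weak true decline buried in large within-year scatter gives a tiny r2 on the raw values, but a much bigger one on the yearly means.

```python
import numpy as np

rng = np.random.default_rng(42)

# 40 hypothetical "years" with 50 papers each
years = np.repeat(np.arange(1970, 2010), 50)
# a weak true decline buried in large within-year scatter, kept inside [0, 1]
r2 = np.clip(0.55 - 0.004 * (years - 1970) + rng.normal(0, 0.25, years.size), 0, 1)

def r_squared(x, y):
    """r^2 of a simple linear regression of y on x."""
    return np.corrcoef(x, y)[0, 1] ** 2

raw = r_squared(years, r2)                        # one point per paper
yearly_means = np.array([r2[years == y].mean() for y in np.unique(years)])
agg = r_squared(np.unique(years), yearly_means)   # one point per year

print(f"r^2 on raw values:   {raw:.2f}")
print(f"r^2 on yearly means: {agg:.2f}")
```

Same underlying trend both times – averaging just throws away the within-year scatter that makes the raw-data r2 honest.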

In summary, there is a decline in explanatory power in ecology, but the explanatory power of that decline (…) is small and so I don’t think we should get too worked up about it. By all means talk about what this decline means, but if the r2 of the r2s is 4%… do we need to freak out?


It’s exactly this kind of tinkering by kids like you that’s making our r2’s go down over time, part of the general erosion of *values* in our society. Get a high r2 and stick to it, dammit! And where are my glasses???!!

AIC? AICc? Where will it end? The deviance of it all! :p

This just means you can’t say a lot about a random R^2 drawn from a random paper, based just on the year of publication. This would be asking a lot I think. On the other hand, you can say a surprising amount about what the mean R^2 for a year will be, based solely on the year. It does seem fair to ask whether the yearly mean is the best way to summarize this statistic, based on its distribution (it actually looks vaguely bimodal, which is interesting).

Anyway, it was good to see the distribution of the data. That was the first thing I wondered about when I read the paper. I expect many different re-analyses of this data, since it was made publicly available, and it is of broad interest to a very data-curious group of researchers (ecologists). I am working on my own in fact.

I think you’re right in asking the question whether the mean is really the thing we should be trying to model here. I also think you’re right to draw attention to how forward-thinking they have been in releasing all the data. Good luck with your analyses…!

I think Russell above has a good point, and I want to follow up on something I mentioned in my twitter response to the post. It's easier now to run statistical tests than it was even ten years ago, so we're likely to see more tests overall. That means people will report all their negative tests in support of a central hypothesis (this didn't work, and this didn't work, but this did). I think you can see it as well: look at the ballooning of near-zero values in the last decade. To me that seems like the driving trend. I suspect that if you drop everything below 0.1 you'd be looking at a nonsignificant trend.
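Here's a quick simulated sketch of that truncation idea (the numbers are made up, not the paper's data): if the only thing changing through time is a growing share of near-zero reported values, dropping everything below 0.1 should flatten the trend.

```python
import numpy as np

rng = np.random.default_rng(7)
year = np.repeat(np.arange(1970, 2010), 50)

# constant "real" effects, plus a share of near-zero reports that grows over time
base = rng.beta(3, 3, year.size)                  # stable mid-range r^2 values
p_zero = (year - 1970) / 120.0                    # rises from 0 to ~0.33
is_zero = rng.random(year.size) < p_zero
r2 = np.where(is_zero, rng.beta(1, 30, year.size), base)

def slope(x, y):
    """Slope of a simple linear regression of y on x."""
    return np.polyfit(x, y, 1)[0]

slope_all = slope(year, r2)
keep = r2 >= 0.1
slope_kept = slope(year[keep], r2[keep])

print(f"all values: slope = {slope_all:+.5f}")
print(f"r2 >= 0.1:  slope = {slope_kept:+.5f}")
```

In this toy version the decline vanishes once the near-zero values are excluded – which is roughly the commenter's prediction for the real data.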

Hmm, how many of those r2’s are 1.0 (and 0.0)? Model selection anyone?

I can’t read the original paper – did they pick a standard set of journals publishing at the same rate across the whole time period?

Yeah, three journals – they were pretty standardised, but as you can see from the second figure not that standardised very early on.

I am having a look at the raw data right now. I was also curious about this. It looks like the proportion of papers published by the three journals is roughly constant across the period, so the declining R^2 cannot be explained by between-journal differences in average R^2 combined with an increasing share of publications from a low-R^2 journal. Interestingly, Journal of Animal Ecology has a considerably lower average R^2 than Journal of Ecology and Ecology do, but it hasn't increased its publication rate relative to the other two journals in recent times. I will post my own blog post with these results soon.

Pingback: Friday links: is ecology’s explanatory power really declining that much, John Harte vs. Tony Ives, and more | Dynamic Ecology

Pingback: Stuff online, conservation and consternation edition | Denim & Tweed

This is a great discussion. I just want to point out that this discussion was made possible by the fact that the authors of the original study put out an interesting idea and analysis, and then shared their data. Even if people end up deciding their conclusions were wrong, the discussion they have launched is greatly advancing the field through a healthy scientific process. Making data available like this should be the norm and people who do it should be given more credit (and people who don’t should get less).

I agree!

Nice one, Will!

I wonder now how many studies found an r2 > 0.9 with a p-value > 0.05.

I think this speaks to a larger sample size issue – I imagine the number of data points has been increasing over time (though I’ve no idea how you would reliably pull that information out of a large set of papers)

Great post! I love the forensic analyses and statistics.

I have a question about when you said "I'm also not convinced a mean (or linear regression) describes this kind of bounded data very well". I did a Google search for "bounded data" and I'm not finding much. Can anyone point me towards literature that explains bounded vs. unbounded data and which statistical analyses are (or are not) appropriate?

I think a good starting point would be logistic-style regression – try something like "glm(y ~ x, family=quasibinomial)" in R (quasibinomial rather than binomial, since r2 values are continuous proportions rather than counts of successes). I hope that helps!
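To make that a bit more concrete, here's a sketch (in Python for illustration, on simulated numbers rather than the paper's data) of one simple way to respect the [0, 1] bounds: logit-transform the r2 values and fit an ordinary regression. Beta regression (e.g. R's betareg package) would be the more principled tool; the small epsilon and the simulated decline here are both assumptions for illustration.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
year = np.repeat(np.arange(1970, 2010), 50)
# simulated bounded r^2 values with a gentle decline, kept strictly inside (0, 1)
r2 = np.clip(rng.beta(2, 3, year.size) - 0.003 * (year - 1970), 0.001, 0.999)

eps = 1e-6                                         # keeps the logit finite at the bounds
logit_r2 = np.log((r2 + eps) / (1 - r2 + eps))
fit = stats.linregress(year, logit_r2)

print(f"slope on the logit scale: {fit.slope:+.4f} (p = {fit.pvalue:.3g})")
```

The fitted line on the logit scale can never predict an r2 outside (0, 1) once back-transformed, which is the whole point of modelling bounded data this way.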

…Another issue is not the analysis per se but the data used. Does the pattern (let's say it is whatever it is in fact) help us say something about data quality, the quality of the questions we ask, or even the match between data and questions? I can imagine (and I blame myself here as well) that with the increasing amount of scientific research (we have nowadays a hell of a lot more papers, PhDs, PostDocs = slaves than decades ago), the pressure to publish might also lead to an increasing decoupling between scientific questions and suitable data… a totally different story but still a possible reason for this figure… which I believe in any way is true… see you soon Will!…

Pingback: Spatially varying selection shapes life history clines among populations of Drosophila melanogaster from sub-Saharan Africa | PEGE Journal Club