Typically, our goal is to control the false positive rate at 5 percent. This is quite easy to do when you’re running a single analysis with standard statistics. In the case of fMRI data, the brain is divided into 100,000 to 200,000 subunits called voxels, and each voxel contains a set of data consisting of each participant’s brain activation at that point in space. If you’re running 100,000 analyses, each with a 5 percent false positive rate, you can expect a very large number of false positives (around 5,000) by chance alone. Therefore, fMRI analyses require special strategies for determining where statistically significant effects are located in the brain while controlling the false positive rate.
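To make that arithmetic concrete, here is a minimal sketch (my own illustration, not part of the interview) that simulates pure-noise data for 100,000 voxels and tests each one at the conventional 5 percent threshold. The sample size and random data are hypothetical; the point is simply that about 5 percent of null tests come out "significant" by chance.

```python
# Illustration of the multiple comparisons problem described above:
# under the null hypothesis, roughly alpha * n_voxels tests are
# "significant" purely by chance.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

n_voxels = 100_000        # rough number of voxels in a whole brain analysis
n_participants = 30       # hypothetical sample size
alpha = 0.05              # conventional per-test false positive rate

# Simulate pure-noise data: no true activation anywhere.
null_data = rng.standard_normal((n_voxels, n_participants))

# One-sample t-test at every voxel (one "analysis" per voxel).
t_vals, p_vals = stats.ttest_1samp(null_data, popmean=0.0, axis=1)

n_false_positives = np.sum(p_vals < alpha)
print(f"Expected ~{int(n_voxels * alpha)} false positives; got {n_false_positives}")
```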
A paper came out recently suggesting that conclusions from many neuroscience studies could be wrong because of a statistical error found in a commonly used computer program. How has this factored into the Center’s research?
The paper, “Cluster failure: Why fMRI inferences for spatial extent have inflated false-positive rates” by Eklund, Nichols and Knutsson, which appeared in the journal PNAS early this year, really caused quite a stir. This is a really important paper for our field, and I think it has had a positive impact. First, it is worth mentioning that some splashy wording in the original manuscript overstated the problem, and the authors have submitted a correction. The original wording implied that the validity of 40,000 fMRI studies was questionable, while the revised wording instead focuses on questioning weakly significant findings. In fact, Tom Nichols (a coauthor of that work) estimates that the number of questionable results is closer to 3,500.
An important finding in the paper was that one of the most popular software packages used to analyze fMRI data, AFNI, had an error in it that increased the number of false positives. This error was quickly fixed by the AFNI developers. Another finding estimated very large false positive rates for a commonly used ad hoc method that has no statistical foundation for determining significance. The rest of the findings provide the information we need to control false positives according to the software we’re using.
So, what does this mean for the fMRI work in this lab? First, focusing on the past. The Cluster Failure results apply when a whole brain analysis is used, requiring 100,000 tests or more, but many of our analyses focus on a specific region of the brain, or “region of interest,” where only a single model is run and controlling false positives is much easier. Have there been whole brain analyses published by our lab that used the flawed version of AFNI? Since I joined the lab in July of 2014, I have not seen anybody use that flawed approach on any projects I have collaborated on.
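For readers less familiar with the region-of-interest approach mentioned above, here is a hedged sketch (hypothetical data and variable names, not our actual pipeline). The signal is averaged within one predefined region for each participant, so only a single test is run and the 5 percent threshold applies directly, with no multiple comparisons correction needed.

```python
# Sketch of a region-of-interest analysis: collapse the ROI to one value
# per participant, then run a single statistical test.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Hypothetical contrast estimates for 30 participants across 200 voxels
# inside one predefined region of interest.
roi_data = rng.standard_normal((30, 200)) + 0.3   # small simulated effect

roi_means = roi_data.mean(axis=1)                 # one value per participant
t_val, p_val = stats.ttest_1samp(roi_means, popmean=0.0)

print(f"ROI t = {t_val:.2f}, p = {p_val:.4f}")    # one test, no correction needed
```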
Moving forward, how does this change things? This new information gives us a couple of easy-to-implement strategies for ensuring that false positive rates are controlled. We were already employing some of these strategies in parts of our work, but now all of our whole brain analyses will have carefully controlled false positive rates.
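The interview doesn’t spell out which strategies, but as an illustration of the general idea, here is a minimal sketch (my assumptions, not the lab’s actual methods) of two standard whole-brain corrections: Bonferroni control of the familywise error rate, and Benjamini-Hochberg control of the false discovery rate. With pure-noise p-values, both leave essentially nothing significant, which is the behavior we want.

```python
# Two standard corrections that keep whole-brain false positives in check.
import numpy as np
from statsmodels.stats.multitest import multipletests

rng = np.random.default_rng(2)

alpha = 0.05
p_vals = rng.uniform(size=100_000)            # stand-in for voxelwise p-values

# Bonferroni: test each voxel at alpha divided by the number of tests.
bonferroni_hits = np.sum(p_vals < alpha / p_vals.size)

# Benjamini-Hochberg: limits the expected proportion of false discoveries.
reject, p_fdr, _, _ = multipletests(p_vals, alpha=alpha, method="fdr_bh")
fdr_hits = np.sum(reject)

print(f"Bonferroni-significant voxels: {bonferroni_hits}")
print(f"FDR-significant voxels: {fdr_hits}")
```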