Bayes strikes again (sort of): Psych journal bans p-values and NHST
The journal Basic and Applied Social Psychology has announced a "ban" on p-values. The specific rules for publication are:
- p-values and classical null hypothesis significance testing (NHST) procedures are deemed inadequate and are to be replaced with rigorous descriptive statistics and effect sizes.
- You can still submit manuscripts with NHST statistics included, but if they are to go out for review, you must remove all mention of those statistics before they reach the reviewers.
- Confidence intervals, sometimes proposed as a "replacement" for NHST, fall under the same scrutiny from the BASP editorial board, so they are not advocated either.
- Bayesian statistics were the only inferential method mentioned that wasn't totally crapped on. The editorial board believes the Bayesian approach has weaknesses of its own -- you are approximating data and outcomes precisely when you lack data or outcomes -- but holds that these can be addressed with different Bayesian methods, and even allows that the Bayesian method (which they seem to interpret as testing the probability of the alternative hypothesis) could indeed be valid in some cases.
Since inferential stats are no longer required (and are seemingly viewed as unimportant), does that mean it will be easier to publish papers? Trafimow and Marks believe "no" -- and even say the opposite should happen. What should happen, they argue, is that we stop p-hacking our way to a significant finding just to please the stats-heathens, while tossing away our amazing-but-not-significant findings (which may have strong effect sizes).
They also argue that in the world of psychology, the p-value and NHST have distorted what is actually important in psychological science -- the science itself. In social psychology, things aren't necessarily clear-cut, and the stats shouldn't pretend to be clear-cut either. That sort of makes sense -- why should we be looking for reds and blues in a sea of purple?
In psychology, we use fancy statistical techniques because the differences are often so small that it's hard to say with certainty an effect is occurring at all. It's unlike cellular biology, where you excite a neuron and see an immediate response -- you don't need heavy stats, effect sizes, and competing inferential approaches for cellular-level research, because the cell is clearly reacting to stimulation. For social psychology, it's a bit different. For example, one Prisoner's Dilemma study found female-male partners cooperating 15% more than male-male partners. For psychology, that's a great finding. For other sciences, a 15% difference is extremely weak.
And yet, we have all been subjected to NHST and p-value thresholds. Whether your finding is 15% different, 50% different, or 99.999999% different, if you didn't cross the p-value threshold of 0.05, your finding is essentially worthless. "Anecdotal". A mere mention.
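The disconnect between effect size and significance is easy to demonstrate. Here's a minimal sketch using made-up numbers in the spirit of the Prisoner's Dilemma example above (a hypothetical 65% vs. 50% cooperation rate -- these rates and sample sizes are my own illustration, not the study's actual data): the same 15-point difference fails the 0.05 threshold with small samples and sails past it with larger ones.

```python
from math import erf, sqrt

def two_prop_z(p1, p2, n1, n2):
    """Two-sided two-proportion z-test; returns (z, p_value)."""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided p-value from the standard normal tail
    p = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))
    return z, p

# Identical 15-point effect (0.65 vs 0.50), different sample sizes
for n in (20, 200):
    z, p = two_prop_z(0.65, 0.50, n, n)
    print(f"n={n} per group: z={z:.2f}, p={p:.4f}")
```

With 20 pairs per group the p-value lands well above 0.05 ("worthless"); with 200 per group the exact same effect becomes "real". The effect size never changed -- only our confidence in measuring it did.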
As I wrote about before, we have been doing NHST for over a century now, and only in the last 20 years have Bayesian statistics begun to come into the light. It's important to know that Bayesian statistics aren't a panacea for psychological statistics, but they should definitely be more strongly studied and understood if the world of psychological research is headed away from classical statistical testing and toward modern statistical testing.
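To give a flavor of what the Bayesian alternative looks like, here is a minimal sketch of a Bayes factor for a single proportion -- comparing a point null (theta = 0.5, "no preference") against a uniform prior on theta. This is a textbook toy comparison, not what BASP or Trafimow and Marks prescribe, and the counts are hypothetical numbers of my own choosing.

```python
from math import lgamma, log, exp

def log_binom(n, k):
    """Log of the binomial coefficient C(n, k), via log-gamma."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)

def bf10(k, n):
    """Bayes factor for H1: theta ~ Uniform(0,1) vs H0: theta = 0.5,
    given k successes in n trials.

    Under the uniform prior, the marginal likelihood integrates to
    C(n,k) * Beta(k+1, n-k+1) = 1 / (n + 1)."""
    log_m1 = -log(n + 1)                       # marginal under H1
    log_m0 = log_binom(n, k) + n * log(0.5)    # likelihood under H0
    return exp(log_m1 - log_m0)

# Same 65% observed rate, two hypothetical sample sizes
print(f"13/20:   BF10 = {bf10(13, 20):.2f}")
print(f"130/200: BF10 = {bf10(130, 200):.1f}")
```

Rather than a binary significant/not-significant verdict, the Bayes factor says how much the data shift the odds between hypotheses: 13/20 gives a BF near 1 (the data barely discriminate), while 130/200 gives strong evidence for an effect -- a graded answer instead of a threshold.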
I'm personally a fan of moving away from p-values, because it puts the emphasis back on methodology and control and, as Trafimow and Marks put it, creativity. If we are training a new generation of scientists to be even more rigorous than previous generations, take away the gold-standard p-value and see how well we can break down a study's methodology and critique its controls. Stronger methods should lead the p-value-centric to their precious <0.05 anyway, right?