Remember: p-values Are Not Effect Sizes

Authors: John Mount and Nina Zumel. The p-value is a valid frequentist statistical concept that is much abused and mis-used in practice. In this article I would like to call out a few features of p-values that can cause problems in evaluating summaries. Keep in mind: p-values are useful and […]

Be careful evaluating model predictions

One thing I teach is: when evaluating the performance of regression models you should not use correlation as your score. This is because correlation tells you if a re-scaling of your result is useful, but you want to know if the result in your hand is in fact useful. For […]

The unfortunate one-sided logic of empirical hypothesis testing

I’ve been thinking a bit on statistical tests, their absence, abuse, and limits. I think much of the current “scientific replication crisis” stems from the fallacy that “failing to fail” is the same as success (in addition to the forces of bad luck, limited research budgets, statistical naiveté, sloppiness, pride, […]

Proofing statistics in papers

Recently saw a really fun article making the rounds: “The prevalence of statistical reporting errors in psychology (1985–2013)”, Nuijten, M.B., Hartgerink, C.H.J., van Assen, M.A.L.M. et al., Behav Res (2015), doi:10.3758/s13428-015-0664-2. The authors built an R package to check psychology papers for statistical errors. Please read on for how that […]

Drowning in insignificance

Some researchers (in both science and marketing) abuse a slavish view of p-values to try and falsely claim credibility. The incantation is: “we achieved p = x (with x ≤ 0.05) so you should trust our work.” This might be true if the published result had been performed as a […]

Worry about correctness and repeatability, not p-values

In data science work you often run into cryptic sentences like the following: Age adjusted death rates per 10,000 person years across incremental thirds of muscular strength were 38.9, 25.9, and 26.6 for all causes; 12.1, 7.6, and 6.6 for cardiovascular disease; and 6.1, 4.9, and 4.2 for cancer (all […]