Our group has done a lot of work with non-standard calling conventions in R. Our tools work hard to eliminate non-standard calling (as is the purpose of wrapr::let()), or at least make it cleaner and more controllable (as is done in the wrapr dot pipe). And even so, we still […]
Some days I see R as an eclectic programming language preferred by scientists. [Image: “Programming languages as people,” from Leftover Salad (David Marino).] Other days I see it more like the following.
Kudos to Professor Andrew Gelman for telling a great joke at his own expense: “Stupid-ass statisticians don’t know what a goddam confidence interval is.” He brilliantly burlesqued a frustratingly common occurrence that many people say they “have never seen happen.” One of the pains of writing about data science is there […]
dplyr is one of the most popular R packages. It is powerful and important. But is it in fact easily comprehensible?
Here is an absolutely horrible way to confuse yourself and get an inflated reported R-squared on a simple linear regression model in R. We have written about this before, but we found a new twist on the problem (interactions with categorical variable encoding) which we would like to call out […]
There are a number of statistical principles that are perhaps more honored in the breach than in the observance. For fun I am going to name a few, and show why they are not always the “precision surgical knives of thought” one would hope for (working more like large hammers).
R picked up a nifty way to organize sequential calculations in May of 2014: magrittr by Stefan Milton Bache and Hadley Wickham. magrittr is now quite popular and also has become the backbone of current dplyr practice. If you read my last article on assignment carefully you may have noticed […]
R has a number of assignment operators (at least “<-”, “=”, and “->”; plus “<<-” and “->>”, which have different semantics). The R style guides routinely insist on “<-” as the only preferred form. In this note we are going to try to make the case for “->” when using […]
I’ve been thinking a bit on statistical tests, their absence, abuse, and limits. I think much of the current “scientific replication crisis” stems from the fallacy that “failing to fail” is the same as success (in addition to the forces of bad luck, limited research budgets, statistical naiveté, sloppiness, pride, […]