A bit more on the let wrapper from our replyr R package.
Consider the common following problem: compute for a data set (say the infamous iris example data set) per-group ranks. Suppose we want the rank of iris Sepal.Lengths on a per-Species basis. Frankly this is an “ugh” problem for many analysts: it involves all at the same time grouping, ordering, and […]
Statisticians and data scientists want a neat world where data is arranged in a table such that every row is an observation or instance, and every column is a variable or measurement. Getting to this state of “ready to model format” (often called a denormalized form by relational algebra types) […]
Imagine that in the course of your analysis, you regularly require summaries of numerical values. For some applications you want the mean of that quantity, plus/minus a standard deviation; for other applications you want the median, and perhaps an interval around the median based on the interquartile range (IQR). In […]
When writing reusable code or packages you often do not know the names of the columns or variables you need to work over. This is what I call “parametric treatment of variables.” This can be a problem when using R libraries that assume you know the variable names. The R […]
It is a bit of a shock when R dplyr users switch from using a tbl implementation based on R in-memory data.frames to one based on a remote database or service. A lot of the power and convenience of the dplyr notation is hard to maintain with these more restricted […]