One thing that is sure to get lost in my long note on macros in R is just how concise and powerful macros are. The problem is macros are concise, but they do a lot for you. So you get bogged down when you explain the joke. Let’s try to […]
Introduction This note shares an experiment comparing the performance of a number of data processing systems available in R. Our notional or example problem is finding the top ranking item per group (group defined by three string columns, and order defined by a single numeric column). This is a common […]
The data.table R package is really good at sorting. Below is a comparison of it versus dplyr for a range of problem sizes.
Not a full R article, but a quick note demonstrating by example the advantage of being able to collect many expressions and pack them into a single extend_se() node.
Introduction In this note we will show how to speed up work in R by partitioning data and process-level parallelization. We will show the technique with three different R packages: rqdatatable, data.table, and dplyr. The methods shown will also work with base-R and other packages. For each of the above […]
We are pleased to announce that seplyr version 0.5.8 is now available on CRAN. seplyr is an R package that provides a thin wrapper around elements of the dplyr package and (now with version 0.5.8) the tidyr package. The intent is to give the part time R user the ability […]
R tip: first organize your tasks in terms of data, values, and desired transformation of values, not initially in terms of concrete functions or code. I know I write a lot about coding in R. But it is in the service of supporting statistics, analysis, predictive analytics, and data science. […]
Another R tip. Need to replace a name in some R code or make R code re-usable? Use wrapr::let().
There are a number of easy ways to avoid illegible code nesting problems in R. In this R tip we will expand upon the above statement with a simple example.
I think this is the R Tip that is going to be the most controversial yet. Its potential pitfalls include: it is a style prescription (which makes it different than and less immediately useful than something of the nature of R Tip: Force Named Arguments), and it is heterodox (this […]