To make getting started with rquery (an advanced query generator for R) easier we have re-worked the package README for various data-sources (including SparkR!).
Recently Hadley Wickham prescribed pronouncing the magrittr pipe as “then” and using right-assignment as follows: I am not sure if it is a good or bad idea. But let’s play with it a bit, and perhaps readers can submit their experience and opinions in the comments section.
While developing the RcppDynProg R package I took a little extra time to port the core algorithm from C++ to both R and Python. This means I can time the exact same algorithm implemented nearly identically in each of these three languages. So I can extract some comparative “apples to […]
In our last note we used wrapr::qe() to help quote expressions. In this note we will discuss quoting and code-capturing interfaces (interfaces that capture user source code) a bit more.
Pipelines in R are popular, the most popular one being magrittr as used by dplyr. This note will discuss the advanced re-usable piping systems: rquery/rqdatatable operator trees and wrapr function object pipelines. In each case we have a set of objects designed to extract extra power from the wrapr dot-arrow […]
This note is a comment on some of the timings shared in the dplyr-0.8.0 pre-release announcement. The original published timings were as follows: With performance metrics: measurements are marketing. So let’s dig in the above a bit.
In August of 2003 Thomas Lumley added bquote() to R 1.8.1. This gave R and R users an explicit Lisp-style quasiquotation capability. bquote() and quasiquotation are actually quite powerful. Professor Thomas Lumley should get, and should continue to receive, a lot of credit and thanks for introducing the concept into […]
According to a KDD poll fewer respondents (by rate) used only R in 2017 than in 2016. At the same time more respondents (by rate) used only Python in 2017 than in 2016. Let’s take this as an excuse to take a quick look at what happens when we try […]
We recently saw a great recurring R question: “how do you use one column to choose a different value for each row?” That is: how do you use a column as an index? Please read on for some idiomatic base R, data.table, and dplyr solutions.
R tip: consider using radix sort.