Rule 42 Software
Continuing (and hopefully ending) our quick series on software pathologies I would like to follow-up The Hyper Dance with “Rule 42 Software.”
Continuing (and hopefully ending) our quick series on software pathologies I would like to follow-up The Hyper Dance with “Rule 42 Software.”
A lot of machine learning, statistical, plotting, and analytics algorithms over-sell a small evil trick I call “the hyper dance.”
Here are a few isolation inspired “applications” (in the theoretical or mathematical sense of the term) of the spicy soup combinatorial design.
In our recent note What is new for rquery December 2019 we mentioned an ugly processing pipeline that translates into SQL of varying size/quality depending on the query generator we use. In this note we try a near-relative of that query in the data_algebra.
I was working through Kyle Miller‘s excellent note: “Tail call recursion in Python”, and decided to experiment with variations of the techniques. The idea is: one may want to eliminate use of the Python language call-stack in the case of a “tail calls” (a function call where the result is […]
It usually gives us a chuckle when we find some natural and seemingly easy data science question is NP-hard. For instance we have written that variable pruning is NP-hard when one insists on finding a minimal sized set of variables (and also why there are no obvious methods for exact […]
Here at Win-Vector LLC we like permutation tests. Our team has written on them (for example: How Do You Know if Your Data Has Signal?) and they are used to estimate significances in our sigr and WVPlots R packages. For example permutation methods are used to estimate the significance reported […]
“Base R” (call it “Pure R”, “Good Old R”, just don’t call it “Old R” or late for dinner) can be fast for in-memory tasks. This is despite the commonly repeated claim that: “packages written in C/C++ are (edit: “always”) faster than R code.” The benchmark results of “rquery: Fast […]
Win-Vector LLC recently announced the rquery R package, an operator based query generator. In this note I want to share some exciting and favorable initial rquery benchmark timings.
In this article I will discuss array indexing, operators, and composition in depth. If you work through this article you should end up with a very deep understanding of array indexing and the deep interpretation available when we realize indexing is an instance of function composition (or an example of […]