I was working through Kyle Miller‘s excellent note: “Tail call recursion in Python”, and decided to experiment with variations of the techniques. The idea is: one may want to eliminate use of the Python language call-stack in the case of a “tail calls” (a function call where the result is […]
We at Win-Vector LLC have some big news. We are finally porting a streamlined version of our R vtreat variable preparation package to Python. vtreat is a great system for preparing messy data for supervised machine learning. The new implementation is based on Pandas, and we are experimenting with pushing […]
Here is simple modeling problem in R. We want to fit a linear model where the names of the data columns carrying the outcome to predict (y), the explanatory variables (x1, x2), and per-example row weights (wt) are given to us as string values in variables.
What R users now call piping, popularized by Stefan Milton Bache and Hadley Wickham, is inline function application (this is notationally similar to, but distinct from the powerful interprocess communication and concurrency tool introduced to Unix by Douglas McIlroy in 1973). In object oriented languages this sort of notation for […]
The (matter of opinion) claim: “When the use of C++ is very limited and easy to avoid, perhaps it is the best option to do that […]” (source discussed here) got me thinking: does our own RcppDynProg package actually use C++ in a significant way? Could/should I port it to […]
There is a lot of unnecessary worry over “Non Standard Evaluation” (NSE) in R versus “Standard Evaluation” (SE, or standard “variables names refer to values” evaluation). This very author is guilty of over-discussing the issue. But let’s give this yet another try. The entire difference between NSE and regular evaluation […]
From https://twitter.com/sharon000/status/1107771331012108288: From https://tidyr.tidyverse.org/dev/articles/pivot.html (text by Hadley Wickham): For some time, it’s been obvious that there is something fundamentally wrong with the design of spread() and gather(). Many people don’t find the names intuitive and find it hard to remember which direction corresponds to spreading and which to gathering. It […]
We recently commented on excess package dependencies as representing risk in the R package ecosystem. The question remains: how much risk? Is low dependency a mere talisman, or is there evidence it is a good practice (or at least correlates with other good practices)?
I would like to once again recommend our readers to our note on wrapr::let(), an R function that can help you eliminate many problematic NSE (non-standard evaluation) interfaces (and their associate problems) from your R programming tasks. The idea is to imitate the following lambda-calculus idea: let x be y […]
Dirk Eddelbuettel just shared an important point on software and analyses: dependencies are hard to manage risks. If your software or research depends on many complex and changing packages, you have no way to establish your work is correct. This is because to establish the correctness of your work, you […]