My opinion on “5” == 5

Every programmer should have an opinion on what the outcome of expressions like “5” == 5 should be, and perhaps even a guess as to what the answer is in their most familiar programming language. In my opinion, SQL gets it right. For example, we get the following in […]
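As one concrete data point for the question above, here is a minimal Python sketch. It is not taken from the post; the variable names are illustrative only. It shows that Python allows the mixed comparison without an error and simply reports the two values as unequal, so an explicit conversion is needed to compare by numeric value.

```python
# Illustrative names; this is Python's behavior, not the SQL result the post discusses.
string_five = "5"
int_five = 5

# Comparing a str to an int with == is allowed and evaluates to False.
mixed_equal = (string_five == int_five)

# Converting explicitly makes the intent clear and compares by numeric value.
converted_equal = (int(string_five) == int_five)

print(mixed_equal, converted_equal)  # False True
```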

Bilingual Data Science

I’d like to share a new talk on bilingual data science. It is limited to R and Python, so it is a bit of a “we play all kinds of music, both Country and Western.” It has what I feel is a really neat example of how I used JetBrains IntelliJ […]

Variable Utility is not Intrinsic

There is much ado about variable selection or variable utility valuation in supervised machine learning. In this note we will try to disarm some possibly common fallacies, and to set reasonable expectations about how variable valuation can work.

Introduction

In general, variable valuation is estimating the utility that a column […]

The Data Scientist as The Bus Driver

Let’s please stop saying somebody isn’t a data scientist if they haven’t memorized the innards of one obscure machine learning algorithm, or blown the right smoke during an interoo (“Kangaroo interview”, thanks Jim Ruppert for this term!). Let us, instead, think of the data scientist as the bus driver. It […]

My Opinion on R’s Upcoming Pipe

R’s upcoming pipe appears to be currently proposed as a syntactic transform of the form:

a |> f(…) -> f(a, …)
a |> f() -> f(a)

There is currently an active discussion on this prototype, and some interesting points have come up. Note the current proposal appears to disallow a |> […]
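To make the idea of a purely syntactic transform concrete, here is a toy sketch in Python that rewrites the text of a single pipe stage into the equivalent call. The function name rewrite_pipe is hypothetical, and this is not how R implements the pipe. It only handles the two forms listed above, treats the argument text as an opaque string, and rejects everything else as a limitation of the toy.

```python
import re


def rewrite_pipe(expr: str) -> str:
    """Rewrite the text 'lhs |> f(args)' as 'f(lhs, args)'.

    Toy illustration only: handles one pipe stage and does no real parsing.
    """
    # Split on the first pipe; raises ValueError if there is no '|>' at all.
    lhs, rhs = (part.strip() for part in expr.split("|>", 1))

    # The right-hand side must look like a call: name(...)
    match = re.fullmatch(r"([\w.]+)\((.*)\)", rhs)
    if match is None:
        raise ValueError("right-hand side must be a call such as f(...)")

    name, args = match.group(1), match.group(2).strip()
    # Insert the left-hand side as the first argument of the call.
    return f"{name}({lhs}, {args})" if args else f"{name}({lhs})"


print(rewrite_pipe("a |> f(x)"))  # f(a, x)
print(rewrite_pipe("a |> f()"))   # f(a)
```

Because the rewrite happens on the expression text before anything is evaluated, no pipe function is ever called at run time; the result is just an ordinary call to f.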