# Author Archives

### jmount

Data Scientist and trainer at Win Vector LLC. One of the authors of Practical Data Science with R.

Let’s please stop saying somebody isn’t a data scientist if they haven’t memorized the innards of one obscure machine learning algorithm, or blow the right smoke during an interoo (“Kangaroo interview”, thanks Jim Ruppert for this term!). Let us, instead, think of the data scientist as the bus driver. It […]

Estimated reading time: 1 minute

I am sharing some rough notes (in R and Python) here on how, while dot(a, b) fulfills “Mercer’s condition” (by definition!, and I’ll just informally call these beasts a “Mercer Kernel”), the seemingly harmless variations abs(dot(a, b)) and relu(dot(a, b)) are not Mercer Kernels (relu(x) = max(0, x) = (abs(x) + […]

Estimated reading time: 2 minutes
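
A minimal sketch of the claim in the excerpt above, in plain Python. A Mercer kernel must give a positive semidefinite Gram matrix on every finite set of points, so one vector v with v^T G v < 0 is enough to rule a candidate out. The four points and the `witness` vector below are hand-picked for illustration and are assumptions of this sketch, not necessarily the counterexample used in the post's notes.

```python
def dot(a, b):
    """Ordinary inner product of two equal-length tuples."""
    return sum(x * y for x, y in zip(a, b))


def relu(x):
    return max(0, x)


def gram(kernel, points):
    """Gram matrix G[i][j] = kernel(points[i], points[j])."""
    return [[kernel(p, q) for q in points] for p in points]


def quad_form(matrix, v):
    """Quadratic form v^T G v; a negative value means G is not PSD."""
    n = len(v)
    return sum(v[i] * matrix[i][j] * v[j] for i in range(n) for j in range(n))


# Illustrative points in the plane and a witness vector (both assumptions).
points = [(1, 0), (0, 1), (1, 1), (1, -1)]
witness = (-3, -1, 2, 1)

q_dot = quad_form(gram(dot, points), witness)
q_abs = quad_form(gram(lambda a, b: abs(dot(a, b)), points), witness)
q_relu = quad_form(gram(lambda a, b: relu(dot(a, b)), points), witness)

print("dot:      ", q_dot)   # non-negative, as a Mercer kernel requires
print("abs(dot): ", q_abs)   # negative, so this Gram matrix is not PSD
print("relu(dot):", q_relu)  # negative, so this Gram matrix is not PSD
```

The witness is chosen so that the weighted sum of the points is the zero vector, which makes the plain dot-product quadratic form vanish and leaves only the contribution of the absolute value.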

I am sharing a new free video where I work through a great common argument that bounds expected excess generalization error as a ratio of model complexity (in rows) over training set size (again in rows), independent of problem dimension. (link) For more of my notes on support vector machines […]

Estimated reading time: 34 seconds

I have a new math chalk talk to share: “The Real Numbers.” Here I go into some of the terrifying true nature of our common model for continuous quantities. (link)

Estimated reading time: 20 seconds

In addition to adding a base-R pipe, it appears a new base-R function builder is in the works (in addition to “function”). R is a very versatile language, with a great ability to accept user-level or package extensions. What I mean by this is, user code and package code (which […]

Estimated reading time: 2 minutes

R’s upcoming pipe appears to be currently proposed as a syntactic transform of the form: a |> f(…) -> f(a, …) and a |> f() -> f(a). There is an active discussion on this prototype, and some interesting points come up. Note the current proposal appears to disallow a |> […]

Estimated reading time: 8 minutes