I am sharing some rough notes (in R and Python) here on how, while dot(a, b) fulfills “Mercer’s condition” (by definition, and I’ll just informally call these beasts “Mercer Kernels”), the seemingly harmless variations abs(dot(a, b)) and relu(dot(a, b)) are not Mercer Kernels (relu(x) = max(0, x) = (abs(x) + […]

Estimated reading time: 2 minutes
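
A minimal Python sketch of the kind of check involved (not the notes themselves): for a Mercer Kernel, the quadratic form sum over i, j of c[i] * c[j] * k(x[i], x[j]) must be nonnegative for every finite set of points and every choice of coefficients, so a single negative value is enough to rule a kernel out. The four points, the coefficient vector c, and the helper name quadratic_form are illustrative assumptions picked by hand, not taken from the post.

```python
import math

# Four unit vectors in the plane at 0, 45, 90 and 135 degrees
# (an illustrative choice, not taken from the original notes).
points = [
    (math.cos(math.radians(d)), math.sin(math.radians(d)))
    for d in (0, 45, 90, 135)
]


def dot(a, b):
    """Ordinary inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def quadratic_form(kernel, points, c):
    """Return sum over i, j of c[i] * c[j] * kernel(points[i], points[j])."""
    return sum(
        ci * cj * kernel(pi, pj)
        for ci, pi in zip(c, points)
        for cj, pj in zip(c, points)
    )


kernels = {
    "dot": dot,
    "abs": lambda a, b: abs(dot(a, b)),
    "relu": lambda a, b: max(0.0, dot(a, b)),
}

# Hand-picked witness coefficients (hypothetical, chosen for this example).
c = (1.0, -2.0, 2.0, -1.0)

forms = {name: quadratic_form(k, points, c) for name, k in kernels.items()}
for name, value in forms.items():
    # A negative value means the kernel's Gram matrix on these points is not
    # positive semidefinite, so the kernel fails Mercer's condition.
    print(f"{name}: {value:.4f}")
```

With these points and coefficients the dot form stays nonnegative (it is the squared length of the weighted sum of the points), while the abs and relu forms come out negative.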

I am sharing a new free video where I work through a great common argument that bounds expected excess generalization error as a ratio of model complexity (in rows) over training set size (again in rows), independent of problem dimension. (link) For more of my notes on support vector machines […]

Estimated reading time: 34 seconds

In addition to adding a base-R pipe, it appears a new base-R function builder is in the works (in addition to “function”). R is a very versatile language, with a great ability to accept user-level or package extensions. What I mean by this is, user code and package code (which […]

Estimated reading time: 2 minutes

R’s upcoming pipe appears to be currently proposed as a syntactic transform of the form:

a |> f(…) -> f(a, …)
a |> f() -> f(a)

There is an active discussion on this prototype and some interesting points come up. Note the current proposal appears to disallow a |> […]

Estimated reading time: 8 minutes

It looks like R is getting an official pipe operator (ref). R doesn’t work under an RFC process, so we hear about these things and they are discussed on the R-devel mailing list. I’ve written on this topic before (ref), and I have taped some new comments. This sort of […]

Estimated reading time: 1 minute

Our book, Practical Data Science with R, just had its first anniversary! The book is doing great; if you are working with R and data, I recommend you check it out. (link)

Estimated reading time: 22 seconds

Introduction

We’ve been writing on the distribution density shapes expected for probability models in ROC (receiver operating characteristic) plots, double density plots, and normal/logit-normal density frameworks. I thought I would re-approach the issue with a specific family of examples.

Estimated reading time: 12 minutes