I felt a bit guilty explaining a Kelly/Thorp style card betting system without discussing why these ideas don’t work on fair coin games. So I have “writeup for engineers” on the martingale theory of such games. This has example code, so one could try to come up with a betting […]

Estimated reading time: 34 seconds

I have up what I think is a really neat tutorial on how to plot multiple curves on a graph in Python, using seaborn and data_algebra. It is great way to show some data shaping theory convenience functions we have developed. Please check it out.

Estimated reading time: 23 seconds

Introduction Teaching basic data science, machine learning, and statistics is great due to the questions. Students ask brilliant questions, as they see what holes are present in your presentation and scaffolding. The students are not yet conditioned to ask only what you feel is easy to answer or present. They […]

Estimated reading time: 23 minutes

I’d like to share a new talk on bilingual data science. It is limited to R and Python, so it is a bit of a “we play all kinds of music, both Country and Western.” It has what I feel is a really neat example how I used Jetbrains Intellij […]

Estimated reading time: 51 seconds

I am sharing some rough notes (in R and Python) here on how while dot(a, b) fulfills “Mercer’s condition” (by definition!, and I’ll just informally call these beasts a “Mercer Kernel”), the seemingly harmless variations abs(dot(a, b)) relu(dot(a, b)) are not Mercer Kernels (relu(x) = max(0, x) = (abs(x) + […]

Estimated reading time: 2 minutes

I am sharing a new free video where I work through a great common argument that bounds expected excess generalization error as a ratio of model complexity (in rows) over training set size (again in rows), independent of problem dimension. (link) For more of my notes on support vector machines […]

Estimated reading time: 34 seconds

I am working on a promising new series of notes: common data science fallacies and pitfalls. (Probably still looking for a good name for the series!) I thought I would share a few thoughts on it, and hopefully not jinx it too badly.

Estimated reading time: 4 minutes

Win Vector LLC has been developing and delivering a lot of “statistics, machine learning, and data science for engineers” intensives in the past few years. These are bootcamps, or workshops, designed to help software engineers become more comfortable with machine learning and artificial intelligence tools. The current thinking is: not […]

Estimated reading time: 2 minutes