I’ve been tinkering a lot recently with the data_algebra, and just released version 0.7.0 to PyPI. In this note I’ll touch on what the data algebra is, what the new features are, and my plans going forward.
Statistics is the science of relating summaries of observable samples to the unobserved summaries of the populations they are drawn from. I try to explain that with an example in this video. (link)
I’d like to share my latest “data science bite”: A/B Testing.
Nina Zumel and John Mount will be speaking at the online University of San Francisco Seminar Series in Data Science! The talk: “How and why to use probability models to outperform decision rules,” Friday, April 30, 2021, 12:30pm – 2pm Pacific Time. See here for full details and to RSVP. In this […]
I am trying a new idea: “data science bites.” Data science bites are small articles and videos explaining only one idea each. This first one explains what supervised machine learning is, without going into the details of how it is realized. (link)
I felt a bit guilty explaining a Kelly/Thorp style card betting system without discussing why these ideas don’t work on fair coin games. So I have a “writeup for engineers” on the martingale theory of such games. This has example code, so one could try to come up with a betting […]
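As a rough illustration of the fair-coin claim (this is not the code from the writeup), here is a minimal Python sketch that enumerates every outcome of a short fair-coin game and computes the exact expected final bankroll for a betting rule. The helper `expected_final_bankroll`, the example rule `double_after_loss`, and the starting bankroll of 100 are all hypothetical names and values chosen for this sketch.

```python
from fractions import Fraction
from itertools import product


def expected_final_bankroll(strategy, n_flips, start=Fraction(100)):
    """Exact expected bankroll after n_flips even-money bets on a fair coin.

    `strategy(history, bankroll)` returns the stake for the next flip; it may
    look at past wins and losses but not at the upcoming flip.
    """
    total = Fraction(0)
    # Every sequence of wins and losses is equally likely for a fair coin.
    for flips in product((True, False), repeat=n_flips):
        bankroll = start
        history = []
        for won in flips:
            # Clamp the stake to [0, bankroll]: no negative bets, no borrowing.
            stake = min(max(Fraction(strategy(history, bankroll)), 0), bankroll)
            bankroll += stake if won else -stake
            history.append(won)
        total += bankroll
    return total / 2 ** n_flips


def double_after_loss(history, bankroll):
    """Hypothetical example rule: stake 1, doubled for each loss in the current losing streak."""
    stake = 1
    for won in reversed(history):
        if won:
            break
        stake *= 2
    return stake


expected = expected_final_bankroll(double_after_loss, n_flips=10)
print(expected)
```

Swapping in any other rule for `double_after_loss` should give the same expected value as the starting bankroll, since each stake is fixed before the fair flip it rides on.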
I demonstrate a Kelly/Thorp betting system for the simple card game of guessing if the next card from a standard deck is red or black. I have a video of the play here. And a derivation of the betting strategy in R is here. A derivation of the proof you […]
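A minimal Python sketch of how such a game might be played, assuming the strategy is the standard even-money Kelly fraction: guess the majority color among the remaining cards and stake |red − black| / (red + black) of the bankroll. This is an assumption for illustration, not necessarily the exact formulation in the linked derivation. The function names `play_deck` and `shuffled_deck` are hypothetical.

```python
import random
from fractions import Fraction
from math import comb


def play_deck(deck, bankroll=Fraction(1)):
    """Play through `deck` (a sequence of 'R'/'B'), betting the Kelly fraction each turn."""
    red = sum(1 for card in deck if card == "R")
    black = len(deck) - red
    for card in deck:
        total = red + black
        # Guess the majority color; the stake is zero when the colors are tied.
        guess = "R" if red >= black else "B"
        stake = bankroll * Fraction(abs(red - black), total)
        bankroll += stake if card == guess else -stake
        # Remove the revealed card from the remaining counts.
        if card == "R":
            red -= 1
        else:
            black -= 1
    return bankroll


def shuffled_deck(rng, n_red=26, n_black=26):
    """Return a shuffled list of 'R' and 'B' cards."""
    deck = ["R"] * n_red + ["B"] * n_black
    rng.shuffle(deck)
    return deck


rng = random.Random(2021)
results = {play_deck(shuffled_deck(rng)) for _ in range(20)}
print(results)
```

Under this assumed rule each turn multiplies the bankroll by 2 × (remaining cards of the revealed color) / (remaining cards), so the final bankroll is 2^52 / C(52, 26), about 9.08 times the starting stake, no matter how the deck is shuffled.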
I have up what I think is a really neat tutorial on how to plot multiple curves on a graph in Python, using seaborn and data_algebra. It is a great way to show some of the data shaping theory and convenience functions we have developed. Please check it out.
I have a new math chalk talk up: The Game of Infinity Questions. This is back to establishing the “reasonableness” of Kolmogorov’s Axiom of continuity (in his actual formulation of his axiomatization of probability). Remember, his argument is “it is a bit off to have strong opinions on infinite processes, […]
In R it has always been incorrect to call order() on a data.frame. Such a call doesn’t return a sort-order of the rows, and previously did not return an error. For example: d <- data.frame( x = c(2, 2, 3, 3, 1, 1), y = 6:1) knitr::kable(d) x y 2 […]