Author Archives
jmount
Data Scientist and trainer at Win Vector LLC. One of the authors of Practical Data Science with R.
For an article on A/B testing that I am preparing, I asked my partner Dr. Nina Zumel if she could do me a favor and write some code to produce the diagrams. She prepared an excellent parameterized diagram generator. However, being the author of the book Practical Data Science with […]
Estimated reading time: 5 minutes
I’d like to share a great new feature in the wvpy package (available at PyPI). This package is useful in converting Jupyter notebooks to/from Python, and also in rendering many parameterized notebooks. The idea is to make Jupyter notebooks easier to use in production. The latest feature is an extension […]
Estimated reading time: 2 minutes
(Still on my math streak.) 1994 had an exciting moment when Fred Galvin solved the 1979 Jeff Dinitz conjecture on list-coloring Latin squares. Latin squares are a simple predecessor to puzzles such as Sudoku. A Latin square is an n by n grid of the integers 0 through n-1 (called […]
Estimated reading time: 9 minutes
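To make the Latin square definition in the excerpt above concrete, here is a minimal Python sketch. It assumes the standard definition (every row and every column contains each of the integers 0 through n-1 exactly once). The function names `is_latin_square` and `cyclic_latin_square` are illustrative and are not taken from the linked article.

```python
def is_latin_square(grid):
    """Return True if grid is an n by n Latin square on the symbols 0..n-1."""
    n = len(grid)
    symbols = set(range(n))
    # Every row must have exactly n entries.
    if any(len(row) != n for row in grid):
        return False
    # Each row must contain every symbol exactly once.
    rows_ok = all(set(row) == symbols for row in grid)
    # Each column must contain every symbol exactly once.
    cols_ok = all({grid[i][j] for i in range(n)} == symbols for j in range(n))
    return rows_ok and cols_ok


def cyclic_latin_square(n):
    """Build the cyclic square whose (i, j) entry is (i + j) mod n."""
    return [[(i + j) % n for j in range(n)] for i in range(n)]


if __name__ == "__main__":
    for row in cyclic_latin_square(4):
        print(row)
    print(is_latin_square(cyclic_latin_square(4)))
```

The cyclic construction is only the simplest way to produce an example of any size; it says nothing about the list-coloring question that the Dinitz conjecture addresses.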
Statistics is the science of relating summaries of observable samples to the unobserved summaries of the populations they are drawn from. I try to explain that with an example in this video. (link)
Estimated reading time: 22 seconds
I am trying a new idea: “data science bites.” Data science bites are small articles and videos explaining only one idea each. This first one explains what supervised machine learning is, without going into the details of how it is realized. (link)
Estimated reading time: 27 seconds
I felt a bit guilty explaining a Kelly/Thorp style card betting system without discussing why these ideas don’t work on fair coin games. So I have a “writeup for engineers” on the martingale theory of such games. This has example code, so one could try to come up with a betting […]
Estimated reading time: 34 seconds
I demonstrate a Kelly/Thorp betting system for the simple card game of guessing if the next card from a standard deck is red or black. I have a video of the play here. And a derivation of the betting strategy in R is here. A derivation of the proof you […]
Estimated reading time: 42 seconds
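For the red/black card game in the excerpt above, the following Python sketch plays one well-known proportional betting rule. It is an illustration under stated assumptions, not necessarily the strategy derived in the linked R write-up. The assumptions are an even-money payout, a freely divisible bankroll, and a bet on the majority color of the fraction |red - black| / (red + black) of current wealth, where the counts are of cards still unseen. The name `play_deck` is hypothetical.

```python
from fractions import Fraction
from math import comb
import random


def play_deck(deck, wealth=Fraction(1)):
    """Play through deck (a sequence of "R"/"B") and return final wealth.

    Assumes even-money bets and exact knowledge of the remaining counts.
    """
    red = sum(1 for card in deck if card == "R")
    black = len(deck) - red
    for card in deck:
        total = red + black
        # Bet on whichever color is more plentiful among the unseen cards.
        guess = "R" if red >= black else "B"
        fraction = Fraction(abs(red - black), total)
        stake = wealth * fraction
        wealth += stake if card == guess else -stake
        # Update the counts of unseen cards.
        if card == "R":
            red -= 1
        else:
            black -= 1
    return wealth


if __name__ == "__main__":
    deck = ["R"] * 26 + ["B"] * 26
    random.Random(2023).shuffle(deck)
    final = play_deck(deck)
    print(float(final))
    # Under these assumptions the result does not depend on the shuffle.
    print(final == Fraction(2**52, comb(52, 26)))
```

Under these assumptions each card multiplies wealth by twice the share of its color among the unseen cards, so the product over the whole deck is the same for every ordering: 2^52 divided by the binomial coefficient C(52, 26).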
I have up what I think is a really neat tutorial on how to plot multiple curves on a graph in Python, using seaborn and data_algebra. It is a great way to show some data shaping theory convenience functions we have developed. Please check it out.
Estimated reading time: 23 seconds
I have a new math chalk talk up: The Game of Infinity Questions. This is back to establishing the “reasonableness” of Kolmogorov’s Axiom of continuity (in his actual formulation of his axiomatization of probability). Remember, his argument is “it is a bit off to have strong opinions on infinite processes, […]
Estimated reading time: 1 minute