Author Archives
jmount
Data Scientist and trainer at Win Vector LLC. One of the authors of Practical Data Science with R.
I demonstrate a Kelly/Thorp betting system for the simple card game of guessing if the next card from a standard deck is red or black. I have a video of the play here. And a derivation of the betting strategy in R is here. A derivation of the proof you […]
Estimated reading time: 42 seconds
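For readers who want to experiment with the red/black game, here is a minimal Python sketch. It is not taken from the linked derivation. It assumes even-money bets and the standard Kelly fraction for this game: with r red and b black cards left, stake |r - b| / (r + b) of the bankroll on the more numerous color. The function name kelly_red_black and the "R"/"B" deck encoding are hypothetical choices made for illustration.

```python
from fractions import Fraction
from math import comb
import random


def kelly_red_black(deck, stake=Fraction(1)):
    """Play through `deck` (a sequence of "R"/"B") and return the final stake.

    Assumption: even-money payoff, and the bet fraction |r - b| / (r + b)
    placed on whichever color has more cards remaining.
    """
    red = deck.count("R")
    black = deck.count("B")
    for card in deck:
        total = red + black
        # Fraction of the bankroll to risk; zero when the colors are tied.
        frac = Fraction(abs(red - black), total)
        guess = "R" if red >= black else "B"
        bet = stake * frac
        stake += bet if card == guess else -bet
        # Update the count of remaining cards.
        if card == "R":
            red -= 1
        else:
            black -= 1
    return stake


if __name__ == "__main__":
    deck = ["R"] * 26 + ["B"] * 26
    random.shuffle(deck)
    final = kelly_red_black(deck)
    # Under these assumptions the result is the same for every shuffle.
    print(float(final), final == Fraction(2**52, comb(52, 26)))
```

Under these assumptions the final bankroll does not depend on the order of the cards: a unit stake always ends at 2^52 / C(52, 26), or about 9.08. Exact fractions are used so that this equality can be checked without floating-point error.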
I have up what I think is a really neat tutorial on how to plot multiple curves on a graph in Python, using seaborn and data_algebra. It is a great way to show some data shaping theory convenience functions we have developed. Please check it out.
Estimated reading time: 23 seconds
I have a new math chalk talk up: The Game of Infinity Questions. This is back to establishing the “reasonableness” of Kolmogorov’s Axiom of continuity (in his actual formulation of his axiomatization of probability). Remember, his argument is “it is a bit off to have strong opinions on infinite processes, […]
Estimated reading time: 1 minute
In R it has always been incorrect to call order() on a data.frame. Such a call doesn’t return a sort-order of the rows, and previously did not return an error. For example:

d <- data.frame(x = c(2, 2, 3, 3, 1, 1), y = 6:1)
knitr::kable(d)

x y 2 […]
Estimated reading time: 2 minutes
The wrapr R package supplies a number of substantial programming tools, including the S3/S4 compatible dot-pipe, unpack/pack object tools, and many more. It also supplies a number of formatting and parsing convenience tools: qc() (“quoting concatenate”): quotes strings, giving value-oriented interfaces much of the incidental convenience of non-standard evaluation (NSE) […]
Estimated reading time: 2 minutes
Introduction Teaching basic data science, machine learning, and statistics is great due to the questions. Students ask brilliant questions, as they see what holes are present in your presentation and scaffolding. The students are not yet conditioned to ask only what you feel is easy to answer or present. They […]
Estimated reading time: 23 minutes
Continuing (and hopefully ending) our quick series on software pathologies, I would like to follow up The Hyper Dance with “Rule 42 Software.”
Estimated reading time: 4 minutes
I’d like to share a new talk on bilingual data science. It is limited to R and Python, so it is a bit of a “we play all kinds of music, both Country and Western.” It has what I feel is a really neat example of how I used JetBrains IntelliJ […]
Estimated reading time: 51 seconds