I’d like to share an introduction to my data science chalk talk series (video link, series link).
I have a new short video lecture to share: “Classification as Censored Regression.”
I recently shared a bit of the history of The Science of Data Analysis. I thought I would follow that up with a quick chalk talk titled “What is Statistics?” (link)
I am re-reading from the great statistician John W. Tukey’s paper: Tukey, John W. “The Future of Data Analysis.” Ann. Math. Statist. 33 (1962), no. 1, pp. 1–67. doi:10.1214/aoms/1177704711. https://projecteuclid.org/euclid.aoms/1177704711 I’ve taken the liberty of pulling out some quotes that are very relevant to the usual “data science is not […]
I am excited to share my new free video lecture: Estimating the Odds with Bayes’ Law. (link)
I’d like to share a new free mini-lecture on avoiding Simpson’s Paradox when analyzing A/B test results.
Introduction: We’ve been writing about the shapes of the distribution densities expected for probability models in ROC (receiver operating characteristic) plots, double density plots, and normal/logit-normal density frameworks. I thought I would re-approach the issue with a specific family of examples.
Our group has written a lot on calibration of models and even conditional calibration of models. In our last note we mentioned the possibility of “fully calibrated models.” This note is an example of a probability model that is calibrated in the traditional sense, but not fully calibrated in a […]
The double density plot contains a lot of useful information. This is a plot that shows the distribution of a continuous model score, conditioned on the binary categorical outcome to be predicted. As with most density plots: the y-axis is an abstract quantity called density picked such that the area […]
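As a rough illustration of the idea described above, here is a minimal sketch of the numbers behind a double density plot: the density of a model score, estimated separately for each outcome class. Everything in it is an assumption made for illustration and not taken from the original post: the synthetic beta-distributed scores, the helper names `histogram_density` and `double_density`, and the use of a simple histogram instead of a smoothed kernel density estimate. It checks the one property any density has: the area under each curve is 1.

```python
import random


def histogram_density(scores, bins=20, lo=0.0, hi=1.0):
    """Histogram-based density estimate on [lo, hi].

    Returns (edges, densities), scaled so that
    sum(density * bin_width) equals 1.
    """
    width = (hi - lo) / bins
    counts = [0] * bins
    for s in scores:
        # Clamp the index so a score equal to hi lands in the last bin.
        idx = min(int((s - lo) / width), bins - 1)
        counts[idx] += 1
    n = len(scores)
    densities = [c / (n * width) for c in counts]
    edges = [lo + i * width for i in range(bins + 1)]
    return edges, densities


def double_density(scores, outcomes, bins=20):
    """Estimate the score density separately for each binary outcome."""
    result = {}
    for label in (False, True):
        group = [s for s, y in zip(scores, outcomes) if y == label]
        result[label] = histogram_density(group, bins=bins)
    return result


# Synthetic example data (hypothetical, for illustration only):
# positives tend to score high, negatives tend to score low.
rng = random.Random(2020)
outcomes = [rng.random() < 0.3 for _ in range(5000)]
scores = [rng.betavariate(4, 2) if y else rng.betavariate(2, 4)
          for y in outcomes]

dd = double_density(scores, outcomes)
for label, (edges, dens) in dd.items():
    width = edges[1] - edges[0]
    area = sum(d * width for d in dens)
    print(f"outcome={label}: area under density = {area:.6f}")
```

Plotting the two density lists against the bin midpoints on shared axes would give the double density plot itself. Because each curve is normalized on its own, the plot shows the shape of each class's score distribution and not the relative class sizes.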