Author Archives

jmount

Data Scientist and trainer at Win Vector LLC. One of the authors of Practical Data Science with R.

On The Decomposition of Variance

I am conducting another machine learning / AI bootcamp this week. Starting one of these always makes me want to get more statistical commentaries down, just in case I need one. These classes have to move fast, and also move correctly. In this case I want to write about decomposition […]

A Gruesome Example of Bayes’ Law

Here is an incredibly clear, but unfortunately gruesome, example of a variation of Bayes’ Law. A good teachable point. Consider the recent CDC article “Community and Close Contact Exposures Associated with COVID-19 Among Symptomatic Adults ≥18 Years in 11 Outpatient Health Care Facilities.” It states: Adults with positive SARS-CoV-2 test […]

The Intercept Fallacy

A common misunderstanding of linear regression and logistic regression is that the intercept encodes the unconditional mean or the training data prevalence. This is easily seen to not be the case. Consider the following example in R. library(wrapr) We set up our example data. # build our […]
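
As a rough illustration of the claim in this excerpt (not the post's own example, which is truncated here), the sketch below fits a linear regression on made-up data. The data, seed, and variable names are all assumptions chosen for illustration. It covers only the linear regression case and uses base R only.

```r
# Hypothetical data (an assumption, not taken from the post):
# x has a nonzero mean, and y depends linearly on x.
set.seed(2020)
d <- data.frame(x = rnorm(100, mean = 5))
d$y <- 3 + 2 * d$x + rnorm(100)

# Fit y ~ x and pull out the fitted intercept.
model <- lm(y ~ x, data = d)
intercept <- unname(coef(model)["(Intercept)"])

# The unconditional mean of the outcome on the training data.
mean_y <- mean(d$y)

# The intercept is the prediction at x = 0, so it only matches
# mean(y) when the explanatory variable is centered at zero.
print(c(intercept = intercept, mean_y = mean_y))

# For contrast: an intercept-only model does recover the mean.
null_model <- lm(y ~ 1, data = d)
print(unname(coef(null_model)))
```

Under these assumptions the fitted intercept should land near 3 while the mean of y is near 13, since the mean of x is about 5. Centering x before fitting would bring the two into agreement.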

New WVPlot: ROCPlotPairList

We have a new R WVPlots plot: ROCPlotPairList. It is useful for comparing the ROC/AUC of multiple models on the same data set. library(WVPlots) set.seed(34903490) x1 <- rnorm(50) x2 <- rnorm(length(x1)) x3 <- rnorm(length(x1)) y <- 0.2*x2^2 + 0.5*x2 + x1 + rnorm(length(x1)) frm <- data.frame( x1 = x1, x2 […]