
Doing Better than the Average

The standard way to estimate the expected value of a population from a sample of values $v_1, \ldots, v_n$ is to compute the average $\frac{1}{n} \sum_{i=1}^{n} v_i$. It is well known in statistics that, for grouped data, there are other estimators that can have smaller expected squared error. […]

How Much Data Do You Need?

Introduction: A common question in analytics, statistics, and data science projects is: how much data do you need? This question actually has very specific and clear answers! A good first answer is “it is good to have a lot.” Let’s dig deeper and get some more detailed, quantitative answers. […]

Thinking About Linear Regression

Introduction: I want to spend some time thinking out loud about linear regression. As a data science consultant and teacher, I spend a lot of time using and teaching linear regression. I have found that each of these pursuits can degenerate into mere doctrine or instructions: “do this,” “expect […]