Data science project planning

Given the range of wants, the diverse data sources, and the required innovation and methods, it often feels like data science projects are immune to planning, scoping, and tracking. Without a system to break a data science project into smaller observable components, you greatly increase your risk of failure. As a follow-up to […]

Added worked example to logistic regression project

We have added a worked example to the README of our experimental logistic regression code. The Logistic codebase is designed to support experimentation on variations of logistic regression including: A pure Java implementation (thus directly usable in Java server environments). A simple multinomial implementation (that allows more than two possible […]

On Being a Data Scientist

When people asked me what it means to be a data scientist, I used to answer, “it means you don’t have to hold my hand.” By which I meant that as a data scientist (a consulting data scientist), I can handle the data collection, the data cleaning and wrangling, the […]

How robust is logistic regression?

Logistic regression is a popular and effective technique for modeling categorical outcomes as a function of both continuous and categorical variables. The question is: how robust is it? Or: how robust are the common implementations? (Note: we are using “robust” in a more standard English sense of performs well for […]