Wondering How To Think About Data Science

I just got back from a workshop meeting called Digital Transformation of Decision Analysis.

This was a workshop organized by Eyas Raddad, David Matheson, and John-Mark Agosta. It was sponsored by The Society of Decision Professionals and Microsoft. Microsoft generously hosted at their new Experience Center at the Microsoft Silicon Valley Campus.

It was a great experience, and I learned a lot. The workshop was fairly brilliant in concentrating on guided exercises, instead of presentations and debates.

I was invited based on my data science background and visibility. I, of course, arrogantly went in thinking how data science would fix decision analysis and improve decision quality. I came out wondering how decision analysis could fix data science.

One of the suggestions I came away with was to read Spetzler, Winter, Meyer “Decision Quality” (Wiley 2016). I am only starting on the book, but it comes across as a very good and very effective advocacy argument for decision analysis and decision quality as social and business methodologies. It appears to be a book that teaches how to think about decision analysis, without having to learn all of the field first.

This got me thinking: is there a similar book for data science? This would be a book that gives management the tools to reason about data science, in the way that “Decision Quality” gives management the tools to reason about decision analysis. Not a book that merely introduces or teaches the topic, but a book that allows one to think about the topic.

Some good candidate books include O’Neil, Shutt “Doing Data Science” (O’Reilly 2013), Howard, Gugger “Deep Learning for Coders with fastai & PyTorch” (O’Reilly 2020), or even O’Neil “Weapons of Math Destruction” (Crown 2016). However, these books have different goals than the one I described above.

I think perhaps the best example of the desired type of book applied to data science (or statistics) topics may in fact be Kohavi, Tang, Xu “Trustworthy Online Controlled Experiments” (Cambridge 2020). This book teaches the business implications of what data scientists call A/B testing, and comes as close to the goals of “Decision Quality” as any book I am aware of (though it is presented on a more technical level). It doesn’t address the breadth of common data science tasks, or even many applications of supervised machine learning. However, it does discuss what are probably the most important points of data science as it is currently practiced: the backward-looking nature of the practice, cleanliness of data, and Goodhart-style issues.

Given the current pro-data-science environment, we don’t so much need a book that sells the idea of data science, but one that helps one think about data science. Put another way, I am more and more interested in tools to perform decision analysis around the use of data science.

Categories: Opinion

John Mount

1 reply

  1. Statistical decision theory was a big research thread in statistics through the 1980s, and has since mostly been kept alive in the reinforcement learning niche. Maybe it’s time to revisit it more generally?