## Short Data Science Video: Parameterized Jupyter Notebooks

I am sharing a new short data science video: Parameterized Jupyter Notebooks. It is an example from the wvpy package showing how to programmatically re-run the same notebook with many different inputs. If you are doing data science in Python, this may help you with your projects.
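As a rough illustration of the general idea (not the wvpy API, which is not shown here), one way to re-use a notebook with different inputs is to treat it as the JSON document it is and prepend a cell that assigns the parameters. The function name `make_parameterized_notebook`, the parameter names `city` and `year`, and the tiny in-memory template are all hypothetical choices for this sketch.

```python
import copy
import json


def make_parameterized_notebook(template: dict, params: dict) -> dict:
    """Return a copy of a notebook dict with a parameter cell prepended.

    Hypothetical helper for illustration; this is not part of wvpy.
    """
    nb = copy.deepcopy(template)  # leave the template untouched
    # One assignment line per parameter; repr() yields a valid
    # Python literal for simple values such as str, int, and float.
    source = [f"{name} = {value!r}\n" for name, value in params.items()]
    param_cell = {
        "cell_type": "code",
        "execution_count": None,
        "metadata": {},
        "outputs": [],
        "source": source,
    }
    nb["cells"].insert(0, param_cell)
    return nb


# A minimal stand-in for a notebook loaded with json.load().
template = {
    "nbformat": 4,
    "nbformat_minor": 4,
    "metadata": {},
    "cells": [
        {
            "cell_type": "code",
            "execution_count": None,
            "metadata": {},
            "outputs": [],
            "source": ["print(city, year)\n"],
        }
    ],
}

# Build one notebook variant per input value.
variants = [
    make_parameterized_notebook(template, {"city": city, "year": 2023})
    for city in ["Boston", "Denver"]
]

# Each variant serializes back to notebook JSON.
serialized = [json.dumps(nb) for nb in variants]
```

This sketch only builds the notebook variants. Actually running them would need a separate step, for example `jupyter nbconvert --execute`.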

## Yet Another Data Transform Tutorial

I am sharing yet another data transform tutorial here! It is about coordinatized data, the larger theory encompassing pivot and un-pivot. The example is in Python, but we also supply a similar package for R users.
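For readers new to the terms, here is a small sketch of pivot and un-pivot on plain Python records. It is not the tutorial's code or the package's API; the function names `unpivot` and `pivot` and the column names `measure` and `value` are assumptions made for this illustration.

```python
def unpivot(rows, id_col, value_cols, key_name="measure", value_name="value"):
    """Wide to long: one output row per (record, measurement column)."""
    return [
        {id_col: row[id_col], key_name: col, value_name: row[col]}
        for row in rows
        for col in value_cols
    ]


def pivot(rows, id_col, key_name="measure", value_name="value"):
    """Long to wide: gather all measurements for an id into one row."""
    wide = {}
    for row in rows:
        record = wide.setdefault(row[id_col], {id_col: row[id_col]})
        record[row[key_name]] = row[value_name]
    return list(wide.values())


wide_rows = [
    {"id": 1, "height": 1.8, "weight": 70},
    {"id": 2, "height": 1.6, "weight": 55},
]

long_rows = unpivot(wide_rows, "id", ["height", "weight"])
round_trip = pivot(long_rows, "id")
```

Here `long_rows` holds four records, one per measurement, and `round_trip` recovers the original wide records, which shows the two operations acting as inverses on this data.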

## Data Algebra over Polars Ready for Production Use

The data algebra is a system for composing data manipulation tasks in Python. In the data algebra, operator pipelines (or even directed acyclic graphs) are the primary objects. Applying operations composes small data pipelines into larger ones. This allows the fluid specification, inspection, and sharing of data processing and data […]
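To make "pipelines as the primary objects" concrete, here is a toy sketch in plain Python. It does not use the data algebra package and makes no claim about its API; the `Pipeline` class, the `step` helper, and the `>>` composition operator are hypothetical choices for this illustration.

```python
class Pipeline:
    """A toy pipeline: an inspectable sequence of named row transforms."""

    def __init__(self, steps=()):
        self.steps = tuple(steps)

    def __rshift__(self, other):
        # Composing two pipelines yields a new, larger pipeline.
        return Pipeline(self.steps + other.steps)

    def __call__(self, rows):
        # Apply each step in order to a list of row dicts.
        for _name, fn in self.steps:
            rows = fn(rows)
        return rows

    def __repr__(self):
        return " >> ".join(name for name, _fn in self.steps) or "identity"


def step(name, fn):
    """Wrap a single function as a one-step pipeline."""
    return Pipeline([(name, fn)])


keep_positive = step(
    "keep_positive", lambda rows: [r for r in rows if r["x"] > 0]
)
add_double = step(
    "add_double", lambda rows: [{**r, "y": 2 * r["x"]} for r in rows]
)

# The composed pipeline is itself an object that can be printed,
# passed around, and applied later to any compatible data.
combined = keep_positive >> add_double
result = combined([{"x": -1}, {"x": 3}])
```

Because `combined` is built before any data is seen, it can be inspected with `repr(combined)` or re-used across data sets, which is the property the paragraph above describes.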

## Experimenting with Polars for Data in Python

I’ve just started experimenting with the Polars data frame library in Python. I really like the programmable API it exposes. In fact I am starting an experimental adapter from the data algebra to Polars. When this is complete one can use the data algebra to run the same data transform […]

## An Effective Personal Jupyter Data Science Workflow

I would like to share what I have found to be a very effective personal Jupyter workflow for data science development. Jupyter (née IPython) workbooks are JSON documents that allow a data scientist to mix code, markdown, results, images, and graphs. They […]

## Worry Over Columns, not Rows

I say: if you are a data scientist or working on an analytics project, worry over columns not rows. In analytics “rows” are instances, and “columns” are possible measurements. For example: each click on a website might generate a row recording the visit, and this row would be populated with […]

## I am recommending a Data Science book I just started reading

I (John Mount) am recommending a book that I just started reading. The publisher Manning recently reached out to me and asked if I would accept a free copy of Effective Data Science Infrastructure by Ville Tuulos in exchange for considering helping to promote it. No obligation to promote it, […]

## Wondering How To Think About Data Science

I just got back from a workshop meeting called Digital Transformation of Decision Analysis. This was a workshop organized by Eyas Raddad, David Matheson, and John-Mark Agosta. It was sponsored by The Society of Decision Professionals and Microsoft. Microsoft generously hosted at their new Experience Center at the Microsoft Silicon […]

## Plotting Multiple Curves in Python

I have posted what I think is a really neat tutorial on how to plot multiple curves on a graph in Python, using seaborn and data_algebra. It is a great way to show some of the data shaping theory convenience functions we have developed. Please check it out.