Nina Zumel and I have two new tutorials on fluid data wrangling/shaping. They are written in a parallel structure, with the R version of the tutorial being almost identical to the Python version.

This reflects our opinion on the question “which is better for data science, R or Python?” They are both great. So start with one, and expect to eventually work with both (if you are lucky).

Each of these tutorials includes a link to our new “design a fluid data transform in under 2 minutes” instructional video.

The video is unlikely to make sense without reading the articles (and possibly some of the linked backing tutorials). But for the prepared mind this video can be an “Aha!” moment.

Once you get your head around the concept (which takes much longer than a minute!), you can see how we take an example input/output pair and annotate it to become the data transform specification. This can be groundbreaking, as it encourages you to spend all of your time thinking about the data. Once the specification is in place, it is easy to copy/paste the specific detailed commands.
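To make the "annotated example becomes the specification" idea a bit more concrete, here is a minimal sketch in plain pandas. It is not the API used in the tutorials; the data, the column names (`id`, `measure`, `value`), and the helper functions `rows_to_blocks` and `blocks_to_rows` are all hypothetical and exist only for illustration. The point is that the small `control` table is just a picture of what one record looks like in the tall form, with wide column names written into the cells where their values should land.

```python
import pandas as pd

# Hypothetical example data: one row per model, one column per measurement.
wide = pd.DataFrame({
    "id": [1, 2],
    "AUC": [0.7, 0.8],
    "R2": [0.4, 0.5],
})

# The annotated example: the shape of ONE record in the tall form.
# 'measure' holds literal key values; 'value' holds the NAMES of the
# wide columns whose contents belong in that cell.
control = pd.DataFrame({
    "measure": ["AUC", "R2"],
    "value": ["AUC", "R2"],
})


def rows_to_blocks(wide, control, record_keys, key_col, value_col):
    """Expand each wide row into a block of tall rows, guided by control."""
    pieces = []
    for _, spec in control.iterrows():
        piece = wide[record_keys].copy()
        piece[key_col] = spec[key_col]  # literal key for this block row
        piece[value_col] = wide[spec[value_col]].to_numpy()  # look up named column
        pieces.append(piece)
    tall = pd.concat(pieces, ignore_index=True)
    return tall.sort_values(record_keys + [key_col]).reset_index(drop=True)


def blocks_to_rows(tall, control, record_keys, key_col, value_col):
    """Inverse transform: collapse each block of tall rows into one wide row."""
    result = tall[record_keys].drop_duplicates().reset_index(drop=True)
    for _, spec in control.iterrows():
        mask = tall[key_col] == spec[key_col]
        piece = tall.loc[mask, record_keys + [value_col]]
        piece = piece.rename(columns={value_col: spec[value_col]})
        result = result.merge(piece, on=record_keys, how="left")
    return result


tall = rows_to_blocks(wide, control, ["id"], "measure", "value")
back = blocks_to_rows(tall, control, ["id"], "measure", "value")
print(tall)
print(back)
```

In this sketch the same `control` table drives both directions of the transform, so all of the design effort goes into drawing the example record and none into the mechanics of the conversion.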


### jmount

Data Scientist and trainer at Win Vector LLC. One of the authors of Practical Data Science with R.