Menu Home

cdata Update

The R package cdata now has version 0.7.0 available from CRAN.

cdata is a data manipulation package that subsumes many higher order data manipulation operations including pivot/un-pivot, spread/gather, or cast/melt. The record to record transforms are specified by drawing a table that expresses the record structure (called the “control table” and also the link between the key concepts of row-records and block-records).

What can be quickly specified and achieved using these concepts and notations is amazing and quite teachable. These transforms can be run in-memory or in remote database or big-data systems (such as Spark).

The concepts are taught in Nina Zumel’s excellent tutorial.

Untitled

And in John Mount’s quick screencast/lecture.

link, slides

The 0.7.0 update adds local versions of the operators in addition to the Spark and database implementations. These methods should now be a bit safer for in-memory complex/annotated types such as dates and times.

Categories: data science Opinion Practical Data Science Pragmatic Data Science Pragmatic Machine Learning Statistics

Tagged as:

jmount

Data Scientist and trainer at Win Vector LLC. One of the authors of Practical Data Science with R.

%d bloggers like this: