Data science is often a case of bringing the tools to the problems and data, instead of insisting on bringing the problems and data to the tools.
To support cross-language data science we have been working on cross-language tools, documentation, and training.
For example:
vtreat: a data preparation package for supervised machine learning, available both for R users and for Python users.
Video lectures: advanced data preparation for R users video, and advanced data preparation for Python users video.
We have task-oriented cross-linked documentation:
- Regression: R regression example (fit/prepare interface), R regression example (design/prepare/experiment interface), Python regression example.
- Classification: R classification example (fit/prepare interface), R classification example (design/prepare/experiment interface), Python classification example.
- Unsupervised tasks: R unsupervised example (fit/prepare interface), R unsupervised example (design/prepare/experiment interface), Python unsupervised example.
- Multinomial classification: R multinomial classification example (fit/prepare interface), R multinomial classification example (design/prepare/experiment interface), Python multinomial classification example.
jmount
Data Scientist and trainer at Win Vector LLC. One of the authors of Practical Data Science with R.