
wvpy Clean Up

Just a quick administrative note. To lower the number of dependencies in our Jupyter to Python converter (text and video tutorial here), I have moved the other data science tools (and their dependencies) out of the wvpy package and into a new package named wvu (“Win Vector University”). This will, […]

Method Warnings

Introduction: The data algebra is a Python system for designing data transformations that can be used in Pandas or SQL. The new 1.3.0 version introduces a lot of early checking and warnings to make designing data transforms more convenient and safer. An Example: I’d like to demonstrate some of these […]

How to Re-Map Many Columns in a Database

Introduction: A surprisingly tricky problem in doing data science or analytics in the database is the situation where one has to re-map a large number of columns. This occurs, for example, in the vtreat data preparation system. In the vtreat case, a large number of the variable encodings reduce to table-lookup […]

Data Algebra Method Catalog

The data algebra is a system for specifying data transformations in Pandas or SQL databases. To use it, we advise checking out the README and introduction. These document what data operators are the basis of data algebra transformation construction and composition. I have now added a catalog of what expression […]