The data algebra is a system for specifying data transformations in Pandas or SQL databases.
To use it, we advise starting with the README and the introduction. These document the data operators that form the basis for constructing and composing data algebra transformations.
I have now added a catalog of the available expression methods. This is a “Pandas first” list (method names and semantics stay close to Pandas), and it shows which databases have a SQL translation for each method. Lately our primary SQL target is BigQuery, as targets that emphasize scale are a good complement to Pandas functionality.
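To make the idea of such a catalog concrete, here is a minimal toy sketch. It is not the data algebra's API or its actual catalog: the `CATALOG` dictionary, the helper functions, and the two method entries are all hypothetical and use only the Python standard library. It pairs a Pandas-style method name with a local implementation and per-database SQL templates, then checks that the local and SQLite results agree.

```python
import sqlite3

# Hypothetical mini-catalog (NOT the data algebra's real catalog): each
# Pandas-style method name maps to a plain-Python implementation and to
# SQL templates keyed by database dialect.
CATALOG = {
    "abs": {
        "python": lambda x: None if x is None else abs(x),
        "sql": {"sqlite": "ABS({col})", "bigquery": "ABS({col})"},
    },
    "isnull": {
        "python": lambda x: x is None,
        "sql": {"sqlite": "({col} IS NULL)", "bigquery": "({col} IS NULL)"},
    },
}


def supported_dialects(method):
    """List the dialects that have a SQL template for this method."""
    return sorted(CATALOG[method]["sql"])


def apply_local(method, values):
    """Apply a catalog method to a list of values in plain Python."""
    fn = CATALOG[method]["python"]
    return [fn(v) for v in values]


def to_sql(method, col, table, dialect):
    """Render a catalog method as a SELECT statement for one dialect."""
    templates = CATALOG[method]["sql"]
    if dialect not in templates:
        raise ValueError(f"no {dialect} translation for {method!r}")
    expr = templates[dialect].format(col=col)
    return f"SELECT {expr} AS result FROM {table}"


def apply_sqlite(method, values):
    """Run the SQLite translation against an in-memory table."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE d (x REAL)")
        conn.executemany("INSERT INTO d (x) VALUES (?)", [(v,) for v in values])
        # Order by rowid so results line up with the input order.
        query = to_sql(method, "x", "d", "sqlite") + " ORDER BY rowid"
        return [row[0] for row in conn.execute(query)]
    finally:
        conn.close()


values = [-2.0, 3.5, None]
local_abs = apply_local("abs", values)
sqlite_abs = apply_sqlite("abs", values)
bigquery_query = to_sql("abs", "x", "d", "bigquery")
```

In this sketch the same method name yields matching results locally and in SQLite, and `supported_dialects` plays the role of the "which databases are covered" column. The real catalog is the authoritative list of method names and translations.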
With this catalog, I think the data algebra and its worked examples become more approachable.