I’ve just started experimenting with the Polars data frame library in Python.
I really like the programmable API it exposes. In fact, I am starting an experimental adapter from the data algebra to Polars. When this is complete, one will be able to use the data algebra to run the same data transform in Pandas, SQL, or Polars.
Here is my first experiment, de-duping a data frame using window functions: DupRows.ipynb. It depends on an incomplete version of the data algebra adapter that has not yet been released to PyPI. It was a very positive initial experiment.
Hi John, I’m very impressed by your data algebra project. I haven’t used R much in a while, but one thing I always miss about it is the method chaining in dplyr. These days I mostly write SQL, so I was very excited to see that your project supports output to SQL code. I look forward to playing with this project to see how it helps me “write” complex SQL by thinking more sequentially. Thanks for your gift to the world!
What a kind thing to say. I use the tool a lot myself, and hopefully it can evolve into a benefit for others.