What does HackerNews think of modin?
Modin: Scale your Pandas workflows by changing a single line of code
Language: Python
Yeah, I tried Polars a couple of times, and the API seems worse than Pandas to me too. E.g., the decision to support only autoincrementing integer indexes seems like it would make debugging "hmmm, that answer is wrong, what exactly did I select?" bugs much more annoying. The Polars docs have "blazingly fast" written all over them, but I doubt that is a compelling point for people using single-node dataframe libraries. It isn't for me.
Modin (https://github.com/modin-project/modin) seems more promising at this point, particularly since a migration path for existing Pandas code is highly desirable.
I'd give https://github.com/modin-project/modin a shot.
Since this is an article about speed and data processing on a Python blog, I'll just point out that the RISELab team that created Ray and RLlib also has a library called Modin, which is distributed Pandas. You just import modin.pandas as pd and leave the rest of the code unchanged: https://github.com/modin-project/modin
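As a rough sketch of what that one-line swap could look like in practice, the snippet below picks Modin when it is installed and falls back to plain pandas otherwise. The helper names (dataframe_backend, load_backend, total_by_group) and the sample data are made up for illustration and are not part of Modin; the sketch also assumes that a Modin install comes with a working engine such as Ray or Dask.

```python
import importlib
import importlib.util


def dataframe_backend() -> str:
    """Name the module to import: Modin if it is installed, otherwise pandas."""
    # find_spec returns None when the top-level package cannot be found.
    if importlib.util.find_spec("modin") is not None:
        return "modin.pandas"
    return "pandas"


def load_backend():
    """Import and return the chosen module; use it wherever `pd` was used."""
    return importlib.import_module(dataframe_backend())


def total_by_group(pd):
    """Ordinary pandas-style code; nothing in here is Modin-specific."""
    # Hypothetical sample data, only to have something to aggregate.
    df = pd.DataFrame({"group": ["a", "b", "a"], "value": [1, 2, 3]})
    return df.groupby("group")["value"].sum().to_dict()
```

Calling total_by_group(load_backend()) runs the same groupby on whichever library was found. In a script that always has Modin available, the whole thing collapses to replacing "import pandas as pd" with "import modin.pandas as pd". Modin also reads the MODIN_ENGINE environment variable (for example "ray" or "dask") if you want to choose the engine explicitly before importing it.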
Modin is an alternative pandas implementation for distributed processing using Ray or Dask:
To be clear, this doesn't speed up pandas, per se. It uses a different library (Modin) as a drop-in replacement:
https://github.com/modin-project/modin
Modin uses Ray, a distributed computation library. There was a similar article on HN a year ago that hyped "making pandas faster" by replacing it with Ray:
Why link to a blog post instead of the Modin [1] project directly, which is the reason for the speed improvement?
Also the title "Pandas got 3X faster" seems to contradict the conclusion in the article, which reports the result was < 2x faster.