What does HackerNews think of huey?

a little task queue for python

Language: Python

#61 in Python

I've considered using Nextflow for bioinformatics pipelines but have yet to take the plunge.

At work, I develop a proteomics pipeline that is composed of huey¹ tasks (Python library; simple alternative to Celery) which either use subprocess to call out to some external tool, or are just pure Python. It runs in a worker container which is managed by Docker Swarm, and all containers pull jobs from Redis. For our scale, it works great. However, I don't have control over the resource utilization of individual steps, and in the past I've had issues with the pipeline blocking as a result of how I was chaining tasks together. I think something like Nextflow would remove these limitations, but one thing I think I would miss is the ability to debug individual pipeline steps locally with an interactive debugger. As far as I can tell, Nextflow has logging/tracing facilities but nothing quite like an interactive debugger. I'd be happy to be told I'm wrong, or even that I'm doing it wrong.

Other reasons I'd like to start using Nextflow:

- my homebrew pipeline would be easier to set up and share

- there are some efforts in the proteomics community to develop Nextflow pipelines (e.g. QuantMS²). I think it would help to have a shared language to express pipelines, and it would make benchmarking simpler.

___

¹ https://github.com/coleifer/huey/

² https://docs.quantms.org/en/latest/

You probably don't need a message queue if you have Redis and quite a lot of code surrounding it, which also makes it a message queue. Example: [0]

What you might mean is that you might not need a complicated server setup e.g. Kafka for simple message queues?

[0]: https://github.com/coleifer/huey/

I love Huey [1]. I switched everything over from Celery to it. It's so simple it's a godsend, and it even supports a SQLite-backed queue.

[1] https://github.com/coleifer/huey

You might also like huey [1].

[1] https://github.com/coleifer/huey

I was looking into Python task queuing recently; here are some other choices from my notes:

https://github.com/malthe/pq - postgres queue based on rq and ruby queue_classic

https://github.com/coleifer/huey - sqlite, redis and in memory by coleifer (peewee creator). Possible to implement postgres storage layer simply.

https://github.com/closeio/tasktiger - flexible redis-based python task queue alternative to celery

https://dramatiq.io/ - actor based python job queue on redis/rabbitMQ

https://github.com/GoogleCloudPlatform/psq - gcp pub/sub based task queue

Interesting that coleifer wrote the comment. I really respect their open source code. It's no BS in their task queue program https://github.com/coleifer/huey

> These kinds of dismissive comments make me wonder if the writer ever created anything of substance in their life.

Ad hominem [0].

And he has. I use the things he has built in production. He has contributed to the Python community (and SQLite) to a great extent. Some of the amazing things he built: [1], [2], [3].

[0] - https://yourlogicalfallacyis.com/ad-hominem

[1] - https://github.com/coleifer/peewee

[2] - https://github.com/coleifer/huey

[3] - https://github.com/coleifer/sqlite-web

Why, my own of course! Peewee[1] is a lightweight ORM, and Huey[2] is a task queue.

[1]: https://github.com/coleifer/peewee

[2]: https://github.com/coleifer/huey