What does HackerNews think of citus?

Distributed PostgreSQL as an extension

Language: C

#3 in Database
#4 in PostgreSQL
#1 in MySQL
#2 in Kubernetes
#6 in SQL
Citus: https://github.com/citusdata/citus

BTW, the Citus license is the GNU Affero General Public License (GitHub lists “conditions: same license”) and Hydra is Apache. How is that possible if the latter is a fork? There’s probably something about these licenses I’m not aware of and I’m curious.

As always: it depends. For some workloads something like Citus [1] might allow you to stay within the PostgreSQL ecosystem even when you are trying to do OLAP.

[1] https://github.com/citusdata/citus

To be accurate, it's not completely Microsoft-agnostic: it makes use of a PostgreSQL extension, Citus [1], and the company behind this extension was acquired by Microsoft two years ago [2].

[1]: https://github.com/citusdata/citus

[2]: https://blogs.microsoft.com/blog/2019/01/24/microsoft-acquir...

Maybe I'm reading it wrong, but that looks like yet another multi-master high-availability system, which is also an important property for a system to have but more or less unrelated to what I was talking about.

As far as open-source add-ons for true horizontal scaling, I think [Citus](https://github.com/citusdata/citus) is the most well-known and sophisticated, but I don't have enough experience with it in production to have a particularly strong opinion yet. It might work quite well, but I still fundamentally doubt that it will ever be as good as a system that was designed to support horizontal sharding from the ground up.

True, it is developed by Microsoft and available as a service on Azure. It is also open source, actively maintained and improved, and it's a PG13-compatible Postgres extension that adds both distributed database capabilities and columnar storage. :)

https://github.com/citusdata/citus

(Citus engineer)

Citus, cited at the end, is a column store (single node or distributed)

https://github.com/citusdata/citus

> Leaves me more time to work on tuning our database to keep up!

As you're using Postgres, have you looked at Citus [0] at all?

[0] https://github.com/citusdata/citus

The Citus Data extension can help you scale horizontally.

https://github.com/citusdata/citus

I work at Citus (now Microsoft) so my opinion is biased, but I think the Citus [1] codebase is a really good example.

It borrows all the best practices from PostgreSQL, and the naming of variables and functions is more self-explanatory in general.

I also believe that the practices around PRs and code reviews are good examples.

[1] https://github.com/citusdata/citus

> Yes, PostgreSQL is a mature project. But that has little or no bearing by itself on what degree it can scale; Linux is just about as old, yet it still drives most of the infrastructure we're talking about.

I knew someone here would misinterpret why I mentioned its age. As I said, it's not what it was designed for. Perhaps it has bolted on things in the 20+ years since it was first conceived, but it wasn't purpose-built for scale.

> To conflate project age with current robustness is to indulge in fallacious thinking

It only makes sense to talk about "robustness" in the context something was designed to work in. I can't abuse a system and call it unrobust. Of course Postgres is robust. You're twisting what I was saying.

> There are good examples of PostgreSQL running at scale, including at companies operating at scale such as Instagram

I'd actually be keen to learn more about how Postgres works at Instagram scale without their own customisations to make it work in that setting.

If Postgres works well in that setting why do things like this exist? https://github.com/citusdata/citus

Citus is all open source now: https://www.citusdata.com/blog/2016/03/24/citus-unforks-goes...

https://github.com/citusdata/citus

Which is why Postgres is creeping up on both scylladb and cassandra.

Plus PG 10 will have declarative partitioning built in. Pretty cool.

Roadmaps ( PostgreSQL ) 2016-2017-...

* Postgres Professional roadmap ( Pluggable storages, Multimaster cluster with sharding, Effective partitioning, Adaptive query planning, Page-level data compression, Connection pooling, Native querying for jsonb with indexing support, ....) https://wiki.postgresql.org/wiki/Postgres_Professional_roadm...

* EnterpriseDB database server roadmap ( Parallelism, Replication, Vertical Scalability, Performance ) https://wiki.postgresql.org/wiki/EnterpriseDB_database_serve...

====

And "Scalable PostgreSQL for real-time workloads https://www.citusdata.com " --> https://github.com/citusdata/citus

NEW: Scalable & Open Source PostgreSQL extension https://www.citusdata.com ( based on PG9.4 / PG9.5 )

Github: https://github.com/citusdata/citus

HN: https://news.ycombinator.com/item?id=11353322 "Citus Unforks from PostgreSQL, Goes Open Source (citusdata.com)" ( 24th March, 2016 )

"What is Citus?

- Open-source PostgreSQL extension (not a fork)

- Scalable across multiple hosts through sharding and replication

- Distributed engine for query parallelization

- Highly available in the face of host failures "
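For readers unfamiliar with the sharding and query-parallelization ideas in that list, here is a minimal in-memory sketch in Python. It is a toy illustration only, not how Citus is implemented: the `shard_for`, `distribute`, and `parallel_count` helpers, the shard count, and the sample `events` data are all hypothetical names made up for this example.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor

NUM_SHARDS = 4  # hypothetical shard count chosen for the example


def shard_for(key: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a distribution-key value to a shard index using a stable hash."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def distribute(rows, key_column, num_shards=NUM_SHARDS):
    """Place each row in the shard selected by hashing its key column."""
    shards = [[] for _ in range(num_shards)]
    for row in rows:
        shards[shard_for(str(row[key_column]), num_shards)].append(row)
    return shards


def parallel_count(shards, predicate):
    """Count matching rows per shard concurrently, then sum the partial counts."""
    def count_one(shard):
        return sum(1 for row in shard if predicate(row))

    with ThreadPoolExecutor(max_workers=len(shards)) as pool:
        return sum(pool.map(count_one, shards))


# Hypothetical event data: 1000 events spread over 10 devices.
events = [
    {"device_id": i % 10, "kind": "click" if i % 3 == 0 else "view"}
    for i in range(1000)
]
shards = distribute(events, "device_id")
total_clicks = parallel_count(shards, lambda r: r["kind"] == "click")
```

All rows for a given `device_id` land in the same shard, so a per-device query would only need to touch one shard, while an aggregate such as `total_clicks` fans out to every shard and combines the partial results. A real distributed database also has to handle replication, failures, and network transfer, none of which this sketch attempts.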

"Citus provides users real-time responsiveness over large datasets, most commonly seen in rapidly growing event systems or with time series data . Common uses include powering real-time analytic dashboards, exploratory queries on events as they happen, session analytics, and large data set archival and reporting." https://www.citusdata.com/blog/17-ozgun-erdogan/403-citus-un...