This is pretty interesting, because the effects of migrating from lz4 to zstd were:
- Total storage usage reduced by ~21% (for our dataset, this is on the order of petabytes)
- Average write operation duration decreased by 50% on our fullest machines
- No observable query performance effects
It seems like the better compression ratio and resulting reduced IO more than makes up for increased CPU compared to lz4. I wish they had mentioned the actual effect on CPU.
Compare to the recent thread "The LZ4 introduced in PostgreSQL 14 provides faster compression" [0] where the loudest voices were saying that zstd would not work due to increased CPU. This is a different layer (filesystem compression vs db compression), but this article represents an interesting data point in the conversation.
Author here - it's difficult to provide a single number to summarize what we've observed re: CPU, but one data point is that average CPU utilization across our cluster increased from ~40% to ~50%. This effect is more pronounced during NA daylight hours.
Worth noting that part of the reason this is relatively low impact for our read queries is that the hot portion of our dataset is usually in Postgres page cache where the data is already decompressed (we see a 95-98% cache hit rate under normal conditions). We've noticed the impact more for operations that involve large scans - in particular, backups and index builds have become more expensive.
how/why did you choose Postgres over MariaDB? I am facing such a decision now.
After working for years with both, I'd say that PostgreSQL is much more friendlier to the developer and more pleasant to work with. In any area: documentation, features, error messages, available SQL features, available extensions, available docs, available books.
One tiny example: I prefer to work with databases using CLI interfaces (mysql and psql).
psql CLI is a tool which is pleasant to use, has no bugs in the interface and it even gets improvements from time to time.
mysql CLI is awful to use (e.g. doesn't display long lines properly, has difficulties with history editing, etc) and looks like there wasn't a single improvement since 1996 (I'm sure there were, I just never felt the effect of such improvements).
Actually, there was a significant regression. Many many years ago, Oracle decided to drop the support for the gpl-licensed readline altogether, likely because they can't ship it with MySQL Enterprise. To this day, Percona still carries a small patch to add that functionality back, which is great because I wouldn't touch any CLI without readline.
https://github.com/hanslub42/rlwrap
Note that I've never tried it myself with the mysql/mariadb CLI, but I have used it with other tools, and it's brilliant.