The major news is:
- SQL/PGQ - A Graph Query Language
- JSON improvements (a JSON data type, simplified accessor notation)
Peter Eisentraut gives a nice overview here: https://peter.eisentraut.org/blog/2023/04/04/sql-2023-is-fin...
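To make that a bit more concrete, here's a rough sketch of what the two additions look like in the standard's syntax. The graph, table, and column names are made up, exact keywords vary between implementations, and Peter's post has fuller examples:

    -- SQL/PGQ: declare a property graph over ordinary tables...
    CREATE PROPERTY GRAPH social
      VERTEX TABLES (person KEY (id))
      EDGE TABLES (
        follows KEY (follower_id, followee_id)
          SOURCE KEY (follower_id) REFERENCES person (id)
          DESTINATION KEY (followee_id) REFERENCES person (id)
      );

    -- ...then query it with a MATCH pattern inside GRAPH_TABLE.
    SELECT follower, followee
    FROM GRAPH_TABLE (social
      MATCH (a IS person)-[IS follows]->(b IS person)
      COLUMNS (a.name AS follower, b.name AS followee)
    );

    -- JSON: a real JSON data type plus simplified dot-style accessors.
    CREATE TABLE events (id integer, payload JSON);
    SELECT e.payload.customer.name FROM events e;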
I'm a pretty average SQL user, but I've heard expert consultants say that they can do many things with SQL databases that developers like me would have reached for a different tool for, like a graph database. So this makes me wonder: once SQL/PGQ sees broader adoption, is that going to be a killer for niche databases like Neo4j, in favor of, say, Postgres?
Graph databases are about as different from RDBMSes, storage-wise, as column-stores are from row-stores. It comes down to how you plan to shard data and distribute queries when data doesn't fit on a single node.
Using a graph DB with many underlying KV-store nodes, you can have a single graph spread over many machines, representing e.g. Facebook's social graph, and run a query that "chases around" edges between vertices that live on different nodes. A good graph DB ensures as little of that cross-node chasing happens as possible: it rebalances vertices so that data is sharded at low-connection-degree points in the graph, and it consolidates the steps of a query that occur on the same node into single batch queries, so that the whole thing becomes (close to) a single map/reduce step.
There's nothing in Postgres that knows how to do that; if you had e.g. a graph stored in a Citus distributed table and ran a recursive CTE over it to do graph search, you'd get pretty dang bad perf.
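For reference, "graph search with a recursive CTE" looks roughly like the sketch below (the edges(src, dst) table and the starting node are made up). Each recursive step is another self-join, and nothing about it is shard-aware, so on a distributed table every step can fan out across the whole cluster:

    -- BFS-style reachability from node 42, capped at 4 hops,
    -- over a hypothetical edges(src, dst) table.
    WITH RECURSIVE reachable (node, depth) AS (
        SELECT dst, 1
        FROM edges
        WHERE src = 42
      UNION
        SELECT e.dst, r.depth + 1
        FROM edges e
        JOIN reachable r ON e.src = r.node
        WHERE r.depth < 4
    )
    SELECT DISTINCT node FROM reachable;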
> It comes down to how you plan to shard data and distribute queries when data doesn't fit on a single node.
A problem everybody would love to have but pretty much nobody actually has.
> A problem everybody would love to have
Except the people who do have it and need to keep their business running off of one postgres instance.
For example, at one place I worked just over a year ago, they were well into this territory. One of the weird problems for them was pgBadger's memory usage (https://github.com/darold/pgbadger). It's written in Perl, which doesn't seem to do garbage collection well. So even on a reporting node with a few hundred GB of RAM, it could take more than 24 hours to do a "monthly" reporting run to analyse PG usage for the time period.
There wasn't a solution in place at the time I left, so they're probably still having the issue... ;)