Materialize is a really interesting solution, and I love for what it stands.

But the documentation is missing more details about the architecture overview. A single update could cause many gigabytes of data to shift on the materialize view, and I do not understand how Materialize would handle that scale.

What you're asking about is the magic at the heart of Materialize. We're built atop an open-source incremental compute framework called Differential Dataflow [0] that one of our co-founders has been working on for ten years or so.

The basic insight is that for many computations, when an update arrives, the amount of incremental compute that must be performed is tiny. If you're computing `SELECT count(1) FROM relation`, a new row arriving just increments the count by one. If you're computing a `WHERE` clause, you just need to check whether the update satisfies the predicate or not. Of course, things get more complicated with operators like `JOIN`, and that's where Differential Dataflow's incremental join algorithms really shine.

It's true that there are some computations that are very expensive to maintain incrementally. For example, maintaining an ordered query like

    SELECT * FROM relation ORDER BY col
would be quite expensive, because the arrival of a new value will change the ordering of all values that sort greater than the new value.

Materialize can still be quite a useful tool here, though! You can use Materialize to incrementally-maintain the parts of your queries that are cheap to incrementally maintain, and execute the other parts of your query ad hoc. This is in fact how `ORDER BY` already works in Materialize. A materialized view never maintains ordering, but you can request a sort when you fetch the contents of that view by using an `ORDER BY` clause in your `SELECT` statement. For example:

    CREATE MATERIALIZED VIEW v AS SELECT complicated FROM t1, t2, ... -- incrementally maintained
    SELECT * FROM v ORDER BY col LIMIT 5                              -- order and limit computed ad hoc, but still fast
[0]: https://github.com/TimelyDataflow/differential-dataflow