Nitpick, but relational does not mean joins; it means tables/rows of tuples. A "relational document database", which seems to be Fauna's slogan, is a contradiction in terms.
That’s technically correct, and I think the author would say he’s aware of that definition.

The article as I read it is trying to make a broader point, that there are underlying mathematical principles that inspired Codd’s relational model.

I’ve never had cause to explore it, but my understanding is that there’s nothing in those principles that requires tables/rows of tuples.

One goal of the article seems to be to inspire a curiosity in knowledgeable readers: what happens if you build a document database that also supports the same mathematical principles that inspired the relational model?

A relation is by definition a set of tuples (informally called a table where the tuples are the rows).
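To make that definition concrete, here is a minimal sketch in Python of a relation as a set of tuples. The heading, the sample rows and the `project` helper are illustrative assumptions, not anything from the article.

```python
# A relation modelled as a set of tuples over a fixed heading.
# All names and rows here are made up for illustration.
heading = ("id", "name", "dept")

employees = {
    (1, "Ada", "eng"),
    (2, "Grace", "eng"),
    (3, "Edgar", "research"),
}

# A set has no duplicates and no ordering, which matches the definition:
# adding a tuple that is already present changes nothing.
employees.add((1, "Ada", "eng"))


def project(relation, heading, attrs):
    """Projection: keep only the named attributes of every tuple."""
    idx = [heading.index(a) for a in attrs]
    return {tuple(row[i] for i in idx) for row in relation}


# The result is again a set of tuples, i.e. again a relation.
depts = project(employees, heading, ["dept"])
```

Note that the projection collapses the two "eng" rows into one, because the result is a set too.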

Codd's relational database model adds the further constraint that nested tables are not allowed (first normal form), instead representing relationships through foreign keys.

Codd's motivation for disallowing nested tables is that it makes query languages much simpler. He developed relational algebra, which is the foundation of SQL, and that is why SQL does not allow nested tables.

Document databases do not follow first normal form and allow nested structures, so they cannot be queried with relational algebra, since it doesn't have a way to “drill down” into nested structures.

It is unclear to me what “mathematical principles” remain if you remove the notion of relations from the relational model.

Out of my depth here (no experience) but "Codds relational database model adds the further constraint that nested tables are not allowed" may be wrong. He allowed nested stuff, it's just that SQL didn't support it.

Can anyone elucidate? Please don't shout that I'm wrong because there was something there in his first paper.

No, he explicitly disallows nested relations. This is the definition of first normal form.

Hierarchical databases (which predate relational) can be understood as nested relations, and Codd's first example of normalization is how to extract the nested relations in such a database into separate tables and instead express the relationships through foreign keys.
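A rough sketch of that kind of normalization, in Python. The nested records, the field names and the `normalize` helper are all hypothetical; this is not Codd's own example, just the same idea on toy data.

```python
# Hypothetical hierarchical data: each department nests its employees.
nested = [
    {"dept_id": 10, "dept_name": "eng",
     "employees": [{"emp_id": 1, "name": "Ada"},
                   {"emp_id": 2, "name": "Grace"}]},
    {"dept_id": 20, "dept_name": "research",
     "employees": [{"emp_id": 3, "name": "Edgar"}]},
]


def normalize(records):
    """Pull the nested relation out into its own flat relation.

    The parent/child relationship is kept as a foreign key (dept_id)
    stored on each employee tuple.
    """
    departments = set()
    employees = set()
    for dept in records:
        departments.add((dept["dept_id"], dept["dept_name"]))
        for emp in dept["employees"]:
            employees.add((emp["emp_id"], emp["name"], dept["dept_id"]))
    return departments, employees


departments, employees = normalize(nested)

# The nesting can be recovered with a join on the foreign key.
joined = {
    (name, dept_name)
    for (_emp_id, name, fk) in employees
    for (dept_id, dept_name) in departments
    if fk == dept_id
}
```

After this step both relations are flat, and the query that "drilled down" into a department becomes an ordinary join.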

Thanks for a polite disagreement, but I believe you are wrong (not saying you are!). IIRC Codd defined relation valued attributes and also associated operators Group and Ungroup. https://www.oreilly.com/library/view/sql-and-relational/9781...

also https://shark.armchair.mb.ca/~erwin/RA_Intro.htm

"

Relations are, themselves, values too, and relation attributes can therefore be declared to be of another relation type. Such attributes are called 'Relation-valued attributes' (RVA's for short).

In the RA, two operators are available that allow us to manipulate relations in connection with RVA's : GROUP and UNGROUP

"

Like I said, I'm a bit out of my depth here so take the above as evidence rather than proof that such things existed, but I'm pretty sure I saw this, hand-drawn, in one of Codd's original papers.
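For anyone who wants to see what GROUP and UNGROUP might do, here is a simplified Python sketch restricted to two-attribute relations. The function names and signatures are my own assumptions and are not the exact operator definitions from either link; a `frozenset` stands in for the relation-valued attribute.

```python
def group(relation, key_index, value_index):
    """Simplified GROUP: gather one attribute into a relation-valued
    attribute (a frozenset of 1-tuples) per distinct key value."""
    grouped = {}
    for row in relation:
        grouped.setdefault(row[key_index], set()).add((row[value_index],))
    return {(key, frozenset(values)) for key, values in grouped.items()}


def ungroup(relation):
    """Simplified UNGROUP: flatten the relation-valued attribute back out."""
    return {(key, inner[0]) for key, rva in relation for inner in rva}


flat = {("eng", "Ada"), ("eng", "Grace"), ("research", "Edgar")}

# One tuple per department, each carrying a nested relation of names.
nested = group(flat, 0, 1)

# In this simple case, ungrouping gives back the original flat relation.
round_trip = ungroup(nested)
```

In this sketch the grouped relation has exactly one tuple per department, and the round trip returns the flat relation unchanged.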

.

Edit: you are right

"Codd proposed a normal form that he called first normal form (1NF), and he included a requirement for 1NF in his definitions for 2NF, 3NF, and subsequently BCNF. Under 1NF as he defined it, relation-valued attributes were “outlawed”; that is to say, a relvar having such an attribute was not in 1NF."

https://fliphtml5.com/qprz/cxon/basic/201-235

No, it doesn't mean he's right. The "normal forms" could merely be suggestions for a database designer, not a technical limitation enforced by the software itself.

No one has provided convincing evidence that Codd intended to exclude nested tables entirely. People seem to be conflating (i) good database design, as suggested by Codd, and (ii) the feature set of a DBMS, also as suggested by Codd.

> The "normal forms" could merely be suggestions for a database designer, not a technical limitation enforced by the software itself.

I think most of the motivation for normal forms is to avoid 'update anomalies', which essentially means: don't represent the same information in two places in your base relation variables (aka tables in SQL). So you can have repeated values or nested relations in queries, and you can have them in base tables which are morally normalized, as long as there's no possibility that these lead to the same information being recorded in two distinct places.

When people talk about 'denormalizing' and it's justified, I think it's breaking this rule about representing information in two or more places in exchange for performance. If you do this, the application programmer has to be careful to keep these multiple locations in sync - a kind of consistency you don't have to think about in a clean database design. I think that database management software in general cannot enforce normalisation - it can only make it easier or more difficult to use it with normalized databases.
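A toy illustration of that kind of update anomaly, in Python. The data and names are invented; the point is only that a fact stored twice can disagree with itself, while a fact stored once cannot.

```python
# Denormalized: the department name is repeated on every employee row.
denormalized = [
    {"emp": "Ada", "dept_id": 10, "dept_name": "eng"},
    {"emp": "Grace", "dept_id": 10, "dept_name": "eng"},
]

# An update that reaches only one of the copies...
denormalized[0]["dept_name"] = "engineering"

# ...leaves department 10 with two conflicting names.
names_for_10 = {r["dept_name"] for r in denormalized if r["dept_id"] == 10}

# Normalized: the name is recorded in exactly one place.
departments = {10: "eng"}
employees = [
    {"emp": "Ada", "dept_id": 10},
    {"emp": "Grace", "dept_id": 10},
]

# A single update is seen consistently by every employee row.
departments[10] = "engineering"
normalized_names = {departments[e["dept_id"]] for e in employees}
```

In the denormalized version it is the application's job to update every copy; in the normalized version there is no second copy to forget.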

In theory, the DBMS itself could directly support 'physical denormalization' and make this performance optimisation easier to implement and transparent to the application code. I think some SQL DBMSs have attempted to do things like this.
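As a very rough sketch of the idea (not how any real DBMS does it), imagine a store that exposes normalized writes but keeps a pre-joined copy in sync internally. The `Store` class and its methods are entirely hypothetical.

```python
class Store:
    """Toy store: the caller writes normalized data, and a denormalized
    (pre-joined) copy is maintained behind the scenes for fast reads."""

    def __init__(self):
        self._departments = {}    # dept_id -> dept_name
        self._employees = {}      # emp_id -> (name, dept_id)
        self._emp_with_dept = {}  # emp_id -> (name, dept_name), the copy

    def put_department(self, dept_id, dept_name):
        self._departments[dept_id] = dept_name
        # Refresh every pre-joined row that depends on this department.
        for emp_id, (name, fk) in self._employees.items():
            if fk == dept_id:
                self._emp_with_dept[emp_id] = (name, dept_name)

    def put_employee(self, emp_id, name, dept_id):
        self._employees[emp_id] = (name, dept_id)
        self._emp_with_dept[emp_id] = (name, self._departments.get(dept_id))

    def employee_with_dept(self, emp_id):
        # Served from the denormalized copy; no join at read time.
        return self._emp_with_dept[emp_id]


store = Store()
store.put_department(10, "eng")
store.put_employee(1, "Ada", 10)
store.put_department(10, "engineering")  # the caller never touches the copy
result = store.employee_with_dept(1)
```

The application only ever sees the logical, normalized operations; keeping the redundant copy consistent is the store's problem rather than the programmer's.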

(Posted under a different account because I'm being slow-posted again by HN)

> In theory, the DBMS itself could directly support 'physical denormalization' and make this performance optimisation easier to implement and transparent to the application code. I think some SQL DBMSs have attempted to do things like this.

Automatically managed, application-transparent, physical denormalisation entirely managed by the database is something I am very, very interested in. Unfortunately I've been able to find pretty well nothing to describe what it would look like and how it would be done. If you can provide any links that would be so incredibly helpful!

It gets mentioned in the Date/Darwen books as being the right way to do things, but no actual information seems to be given.

> Automatically managed, application-transparent, physical denormalisation entirely managed by the database is something I am very, very interested in.

Sounds a bit like Noria: https://github.com/mit-pdos/noria