Sounds really interesting.
Does it support efficient random access?
We are currently exploring a seekable format for Zstandard. What use case do you have in mind -- network or at rest? If you want to file a GitHub PR describing how you'd like it to work, that will help guide our implementation. We have a few in mind based on internal Facebook use cases, but the more different needs we know about, the more general-purpose the result will be.
I have always thought the 7-Zip format is interesting in the way it (or at least the reference implementation of the compressor) groups files by extension. I guess that helps compression: chunks that are common within a filetype are more likely to end up in the dictionary before it fills up with chunks from all filetypes. Do you have any thoughts about this? Have you looked at the 7-Zip format?
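
To make the grouping idea concrete, here is a rough Python sketch of what I mean. It is not 7-Zip's actual code: the function names (order_by_extension, solid_compress) are made up for illustration, and I'm using zlib from the standard library as a stand-in for LZMA, so the window is much smaller than a real 7-Zip dictionary. The only point is that files sharing an extension are fed to one compression stream back to back.

```python
import os
import zlib


def order_by_extension(names):
    """Return names sorted so that files sharing an extension are adjacent.

    Hypothetical helper: sorts by lowercased extension, then by full name.
    """
    return sorted(names, key=lambda n: (os.path.splitext(n)[1].lower(), n))


def solid_compress(files, level=9):
    """Compress a {name: bytes} mapping as a single stream.

    Files are fed in extension order, so similar content sits close together
    inside the compressor's window. zlib is an assumption made for this
    sketch; it is not the codec 7-Zip uses.
    """
    compressor = zlib.compressobj(level)
    chunks = []
    for name in order_by_extension(files):
        chunks.append(compressor.compress(files[name]))
    chunks.append(compressor.flush())
    return b"".join(chunks)
```

Whether this actually wins depends on the data and on how large the window or dictionary is relative to the files, so I would measure it on a real corpus before concluding anything.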