Please pardon my ignorance...
My understanding is that there are thousands of different compression algorithms, each with its own pros and cons depending on the type and characteristics of the file. And yet we still try to pick the "generically best" codec for a given kind of file (e.g., PNG) and then use that everywhere.
Why don't we have context-dependent compression instead?
I'm imagining a system that scans objects before compression, selects the optimal algorithm, and then encodes the file. An identifier for the selected algorithm could be prefixed to the output so the decompressor knows which one to use.
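To make the idea concrete, here is a minimal sketch of such a system using only the compressors in Python's standard library. The one-byte tag values and the `pack`/`unpack` names are made up for illustration, and "optimal" is assumed to mean "smallest output", found by simply trying every codec rather than by scanning the data.

```python
import bz2
import lzma
import zlib

# Hypothetical one-byte tags; a real container format would define its own.
CODECS = {
    0: (lambda data: data, lambda data: data),  # stored, no compression
    1: (zlib.compress, zlib.decompress),
    2: (bz2.compress, bz2.decompress),
    3: (lzma.compress, lzma.decompress),
}


def pack(data: bytes) -> bytes:
    """Try every codec and keep the smallest output, prefixed with its tag."""
    best_tag, best_payload = min(
        ((tag, compress(data)) for tag, (compress, _) in CODECS.items()),
        key=lambda pair: len(pair[1]),
    )
    return bytes([best_tag]) + best_payload


def unpack(blob: bytes) -> bytes:
    """Read the tag byte and dispatch to the matching decompressor."""
    _, decompress = CODECS[blob[0]]
    return decompress(blob[1:])
```

Note the cost of this approach: compression time is the sum of all the candidates' times, which is one reason it is not done routinely.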
Compare an all-black image that's 1x1 to one that's 1000x1000. The PNGs are 128 bytes and 6 KB, respectively. However, run-length encoding would compress the latter to a size comparable to the former.
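A rough sketch of the run-length idea, assuming raw 8-bit grayscale pixels (one byte each, 0 for black) rather than an actual image file format. The function names are illustrative only.

```python
from itertools import groupby


def rle_encode(data: bytes) -> list[tuple[int, int]]:
    """Collapse each run of identical bytes into a (value, run_length) pair."""
    return [(value, sum(1 for _ in run)) for value, run in groupby(data)]


def rle_decode(runs: list[tuple[int, int]]) -> bytes:
    """Expand (value, run_length) pairs back into the original bytes."""
    return b"".join(bytes([value]) * length for value, length in runs)


small = bytes(1 * 1)        # 1x1 black image: a single zero byte
large = bytes(1000 * 1000)  # 1000x1000 black image: one million zero bytes

print(rle_encode(small))  # [(0, 1)]
print(rle_encode(large))  # [(0, 1000000)]
```

Both images collapse to a single pair, so the encoded size barely depends on the dimensions. On a photograph with few repeated neighbors, the same scheme would make the data larger.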
There are a variety of use cases that dictate which algorithm is going to perform best. For example, you might use Zstandard -19 if you are compressing something once and transferring it over a slow network to millions of people. You might use LZ4 if you are generating a unique large piece of data interactively for thousands of concurrent users, because it compresses faster than Zstandard. Basically, if you're constrained by network bandwidth, Zstandard; if you're constrained by CPU, LZ4.
There are then legacy formats that have stuck around long past their sell-by date, like gzip. People are used to using gzip, so you see it everywhere, but it's slower and compresses worse than Zstandard, so there is no reason why you'd ever use it except for compatibility with legacy systems. (bzip2, 7z, xz, snappy, etc. also live in this "no reason to use in 2023" space.)
Take a look at the performance measurements here: https://jolynch.github.io/posts/use_fast_data_algorithms/. For example, gzip can get a compression ratio of 0.41 at 21MiB/s, while Zstandard does 0.38 (better, since a lower ratio means smaller output) at 134MiB/s. (Meanwhile, LZ4 produces outputs nearly twice as large as Zstandard, but compresses almost 3x faster and decompresses 2.5x faster.)
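If you want to take this kind of measurement on your own data, a sketch along these lines would work. Zstandard and LZ4 are not in every Python standard library, so zlib at its fastest and slowest levels stands in here purely to show the speed-versus-size trade-off; the `measure` helper and the sample data are made up for illustration, and the numbers will differ by machine and input.

```python
import time
import zlib


def measure(data: bytes, level: int) -> tuple[float, float]:
    """Return (ratio, MiB/s), where ratio = compressed size / original size."""
    start = time.perf_counter()
    compressed = zlib.compress(data, level)
    elapsed = max(time.perf_counter() - start, 1e-9)  # avoid division by zero
    ratio = len(compressed) / len(data)
    throughput = len(data) / (1024 * 1024) / elapsed
    return ratio, throughput


# Made-up, fairly repetitive sample input; substitute your real data.
sample = b"timestamp=1700000000 level=INFO msg=request served\n" * 20000

for level in (1, 9):
    ratio, speed = measure(sample, level)
    print(f"zlib level {level}: ratio {ratio:.3f} at {speed:.0f} MiB/s")
```

A single timed run is noisy, so repeat the measurement a few times before drawing conclusions.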
Lossy compression is even more complicated because the compression algorithms take advantage of "nobody will notice" in a way that's data dependent; so music, video, and photographs all have their own special algorithms.