One idea I've thought about in this same area is 'compressing' the URLs via Unicode. Suppose you limit the page contents to [a-zA-Z0-9], about 15 special characters, and space/newline. Documents would be rendered without any formatting aside from what you can do with a monospace font (all caps, whitespace alignment, tables, etc.).
This allows about 80 unique characters - roughly 6.3 bits per character. There are tens of thousands of visible Unicode characters, so each one can carry well over 8 bits - closer to 14-16 if you draw from a block of tens of thousands of them. Raw repacking alone therefore gets you around 2-2.5 characters of the real text into one visible Unicode character, and running the text through a modern compressor first can probably push that to 4 or maybe even 5. This would compress the messages so that you can share them over SMS/Twitter (split into a few parts perhaps) or Facebook without posting giant pages of text.
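To make the arithmetic concrete, here's a minimal sketch of the pack/unpack step - assuming Node's zlib for the compression pass and 2^14 code points from the CJK Unified Ideographs block as the output alphabet; the function names, the ASCII length prefix, and the alphabet choice are all just placeholders for illustration, not anyone's actual scheme:

  import { gzipSync, gunzipSync } from "zlib";

  // Pack gzipped bytes into 14-bit units and map each unit onto a visible
  // ideograph. 2^14 = 16384 code points all fit inside U+4E00..U+9FFF.
  const BASE = 0x4e00;
  const BITS = 14;

  function encode(text: string): string {
    const bytes = gzipSync(Buffer.from(text, "utf8"));
    let out = `${bytes.length}:`; // length prefix so decode knows how many bytes to emit
    let acc = 0, nbits = 0;
    for (const b of bytes) {
      acc = ((acc << 8) | b) & 0x3fffff; // keep at most 22 bits buffered
      nbits += 8;
      while (nbits >= BITS) {
        nbits -= BITS;
        out += String.fromCodePoint(BASE + ((acc >> nbits) & 0x3fff));
      }
    }
    // Flush leftover bits, left-aligned into one final character.
    if (nbits > 0) out += String.fromCodePoint(BASE + ((acc << (BITS - nbits)) & 0x3fff));
    return out;
  }

  function decode(packed: string): string {
    const sep = packed.indexOf(":");
    const nbytes = Number(packed.slice(0, sep));
    const bytes: number[] = [];
    let acc = 0, nbits = 0;
    for (const ch of packed.slice(sep + 1)) {
      acc = ((acc << BITS) | (ch.codePointAt(0)! - BASE)) & 0x3fffff;
      nbits += BITS;
      while (nbits >= 8 && bytes.length < nbytes) {
        nbits -= 8;
        bytes.push((acc >> nbits) & 0xff);
      }
    }
    return gunzipSync(Buffer.from(bytes)).toString("utf8");
  }

Each output character carries a flat 14 bits here, so hitting the 4-5x figure depends entirely on how well the compressor does on the plaintext; short SMS-sized inputs also pay gzip's fixed header overhead, so a purpose-built entropy coder would likely do better there.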
The data could start with a 'normal', non-rare Unicode character indicating the language, and the same character at the end of the stream to make it easy to copy/paste:

aa[compressed Unicode payload]aa
Multiple translation tables could exist for different languages. Japanese and Indic languages have ~50 alphabet characters but no capitals, so I think you would be able to get similar compression regardless of input language.

Users would go to a website that does decompression with JS and just paste in the compressed text to view the plaintext, and vice versa. If the compression format is open then anyone can make and host a [de]compressor.
Version codes may be useful to add, so that more efficient compression techniques don't break backwards compatibility. The first release of the English compression algorithm would start and end with 'aa'; a better algorithm released a few years later would use 'ab', etc. Similarly, the version code could be used to indicate a more restricted character set to allow for better compression - [a-z0-9] plus '.', ',', space, and newline is 40 characters, which is only about 5.3 bits per character before any further compression.
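A sketch of how that framing might look - the table entries and function names here are hypothetical, just to show the version code selecting the alphabet and scheme:

  // Hypothetical version table: the two-letter marker at both ends of the
  // message selects the source alphabet and compression scheme.
  const VERSIONS: Record<string, string> = {
    aa: "English v1: [a-zA-Z0-9] + ~15 punctuation chars + space/newline",
    ab: "English v2: [a-z0-9] . , space newline (40 chars, ~5.3 bits raw)",
  };

  function frame(version: string, payload: string): string {
    return version + payload + version; // same marker at both ends eases copy/paste
  }

  function unframe(message: string): { version: string; payload: string } {
    const version = message.slice(0, 2);
    if (!(version in VERSIONS) || !message.endsWith(version)) {
      throw new Error("unknown version code or truncated message");
    }
    return { version, payload: message.slice(2, -2) };
  }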
There are a few other implementations of the same idea out there, too. Here's another one: https://github.com/qntm/base65536 ; since it uses a power-of-two sized alphabet, it avoids some of the work mine has to do around partial output bytes. That link also points to some other ideas & implementations.
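The power-of-two point is easy to see in a sketch - this shows only the shape of the idea, not base65536's actual alphabet or wire format (the real library picks specific blocks of code points chosen to render reliably; the offsets below are placeholders):

  // With a 2^16-sized alphabet, every pair of input bytes maps to exactly one
  // output code point, so no bit buffer carries state between characters; the
  // only edge case is a trailing unpaired byte.
  const PAIR_BLOCK = 0x20000;       // placeholder offset for two-byte values
  const FINAL_BYTE_BLOCK = 0x31000; // placeholder block for a lone final byte

  function encodePow2(bytes: Uint8Array): string {
    let out = "";
    let i = 0;
    for (; i + 1 < bytes.length; i += 2) {
      out += String.fromCodePoint(PAIR_BLOCK + (bytes[i] << 8) + bytes[i + 1]);
    }
    if (i < bytes.length) out += String.fromCodePoint(FINAL_BYTE_BLOCK + bytes[i]);
    return out;
  }

  // Decoding is a direct reverse: each code point yields whole bytes, so no
  // length prefix or padding handling is needed.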
The exact metric I cared about at the time was compressing for screen space, not actual byte-for-byte size. I wrote it after spending some time in a situation where doing a proper scp was unfortunately just a PITA, but copy/pasting into a terminal almost always works. I was also dealing with screen/tmux, which makes scrolling difficult, so I wanted the output to fit on one screen. From that, you can implement a very poor man's scp with tar -cz FILES | base64 ; the base-unicode bit replaces the base64.
Mine doesn't use Huffman coding, since it wants to stream data; it leans on gzip for that.