But as a software developer, it’s always fun to think about edge cases, and squeezing almost 5KB into a 280-“character” tweet is fun.
This makes me wonder if anyone has created a version of base64 that uses the vast, sprawling space of Unicode to take advantage of these glyph-count-based restrictions.
If they have, I hope they called it uuuniencode.
https://github.com/qntm/base2048
It can store 385 bytes per tweet. The link includes a bit more technical explanation of how Twitter counts characters towards the limit. Apparently, using the entire range of Unicode characters does not increase capacity, because of the double weighting of emoji and other characters described in TFA. It links to a base131072 encoding, which can only store 297 bytes per tweet.
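For anyone who wants to check those two numbers, here is a rough back-of-the-envelope sketch. It assumes that each base2048 character carries 11 bits (2^11 = 2048) and counts as 1 toward the 280 limit, and that each base131072 character carries 17 bits (2^17 = 131072) but counts as 2. The function name and the weights are my own illustration, not anything from the linked repo.

```python
def bytes_per_tweet(bits_per_char: int, weight: int, limit: int = 280) -> int:
    """Whole bytes that fit in one tweet for a given encoding.

    bits_per_char: payload bits carried by each encoded character.
    weight: how much each character counts toward the limit
            (assumed 1 for "light" characters, 2 for double-weighted ones).
    limit: the weighted character budget of a tweet.
    """
    chars = limit // weight           # how many characters actually fit
    return (chars * bits_per_char) // 8  # total bits, rounded down to whole bytes


# Assumed weights: base2048 uses single-weighted characters,
# base131072 uses double-weighted ones.
print(bytes_per_tweet(11, 1))  # base2048
print(bytes_per_tweet(17, 2))  # base131072
```

Under those assumptions the arithmetic is 280 × 11 / 8 = 385 bytes versus 140 × 17 / 8 = 297.5, rounded down to 297, which matches the figures quoted above: the extra 6 bits per character do not make up for fitting only half as many characters.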