Disclaimer: This is a fun thought experiment. I'm not looking for actionable results, or advocating for relying on any of this comment for actual security. I'm clearly not a cryptographer; I just think it would be interesting to talk about here, and maybe more educated people could comment on how well these approaches might mitigate the exploits in the article. Play with me in this space.

I'm curious if people have any interesting ideas on how to add some seasoning to MD5 to make it more secure. That is, simple, intuitive things you can do in combination with MD5 such that all the pieces in your scheme are still easily understood and don't amount to a new hash algorithm that can only be understood as a black box. Pretend MD5 is the only hash algorithm that has ever been found. Or that you're the Gilligan's Island Professor and MD5 hashes are your coconuts. What are the most potentially useful things you can build out of the most primitive, dumb components?

For example:

- Output the length of the input (or a hash of the length if you must have a constant-length output)

- Hash the input forwards and backwards and produce two hashes. (Remembering that, though the output is 256 bits now, you still only have coconuts to work with.)

- Include more complicated variations on the input in the hashes. e.g. start in the middle and oscillate forward and backward over the input, or move the second half of the input in front of the first before hashing, or use the input/hash of the input to seed a pseudorandom re-ordering of the input before hashing, etc.

- Format-aware hashing - whatever program will interpret the content of the file can also produce a hash, or some [canonical] interpretation of the content that can be hashed. e.g., for an image format, we could ask the renderer how many iterations of some operation it had to perform to render the output, or in the worst case, hash the bitmap it produced.
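To make the coconut pile concrete, here is a minimal Python sketch of the ideas above, using only `hashlib.md5` as the primitive. The function names (`seasoned_md5`, `format_aware_md5`) and the choice to concatenate the pieces are my own assumptions for illustration, and JSON stands in for "some format" in the format-aware case. Nothing here is claimed to actually resist the known MD5 attacks; it only shows what the combinations look like.

```python
import hashlib
import json


def md5_hex(data: bytes) -> str:
    """The only 'coconut' we allow ourselves: plain MD5, hex-encoded."""
    return hashlib.md5(data).hexdigest()


def seasoned_md5(data: bytes) -> str:
    """Hypothetical combination of the first three bullets above.

    Concatenates four MD5 digests (128 hex characters in total):
      1. the input as-is
      2. the input reversed byte by byte
      3. the input with its second half moved in front of the first
      4. the decimal length of the input
    """
    forward = md5_hex(data)
    backward = md5_hex(data[::-1])
    mid = len(data) // 2
    swapped = md5_hex(data[mid:] + data[:mid])
    length = md5_hex(str(len(data)).encode("ascii"))
    return forward + backward + swapped + length


def format_aware_md5(json_text: str) -> str:
    """Hypothetical format-aware variant: hash a canonical form of the
    parsed content (sorted keys, no extra whitespace), not the raw bytes."""
    canonical = json.dumps(json.loads(json_text), sort_keys=True,
                           separators=(",", ":"))
    return md5_hex(canonical.encode("utf-8"))


if __name__ == "__main__":
    print(seasoned_md5(b"hello coconuts"))
    # Both inputs parse to the same object, so they print the same digest.
    print(format_aware_md5('{"a": 1, "b": 2}'))
    print(format_aware_md5('{ "b":2, "a":1 }'))
```

An attacker would now need one pair of inputs that collides under every transformation at once, though whether that is meaningfully harder than a single MD5 collision is exactly the open question here.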

For SHA-1, people built a system that detects the patterns that lead to a collision and (for example) substitutes a different hash only for the inputs that would be a problem. https://github.com/cr-marcstevens/sha1collisiondetection I think Git does this to eke more life out of SHA-1.

I imagine you could take a similar counter-cryptanalysis approach to MD5. (I am out of my depth here, so there could be reasons this doesn't work for MD5 that I'm unaware of.)