What does HackerNews think of multihash?

Self describing hashes - for future proofing

Language: Shell

#1 in API
> IPFS now uses base58btc exclusively

That's blatantly wrong. IPFS supports 25 different base representations (https://github.com/multiformats/multibase/blob/master/multib...).

In fact, recently, two community members decided to implement a new base encoding with emojis for fun:

https://cid.ipfs.tech/#%F0%9F%9A%80%F0%9F%AA%90%E2%AD%90%F0%...
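
For anyone unfamiliar with how multibase works, here is a minimal sketch, not the reference implementation. It assumes the one-character prefixes `f` (base16, lowercase), `b` (base32, lowercase, unpadded) and `m` (base64, unpadded) from the multibase table; the function names `multibase_encode` and `multibase_decode` are made up for this example.

```python
import base64


def multibase_encode(data: bytes, base: str) -> str:
    """Prefix the encoded text with a character that names the base."""
    if base == "base16":
        return "f" + data.hex()
    if base == "base32":
        text = base64.b32encode(data).decode("ascii")
        return "b" + text.lower().rstrip("=")
    if base == "base64":
        text = base64.b64encode(data).decode("ascii")
        return "m" + text.rstrip("=")
    raise ValueError(f"unsupported base: {base}")


def multibase_decode(text: str) -> bytes:
    """Read the prefix character, then decode the rest accordingly."""
    prefix, body = text[0], text[1:]
    if prefix == "f":
        return bytes.fromhex(body)
    if prefix == "b":
        # Restore the padding that the unpadded variant strips.
        padded = body.upper() + "=" * (-len(body) % 8)
        return base64.b32decode(padded)
    if prefix == "m":
        padded = body + "=" * (-len(body) % 4)
        return base64.b64decode(padded)
    raise ValueError(f"unsupported multibase prefix: {prefix!r}")
```

The same bytes can then travel as `f...`, `b...` or `m...` text, and a reader only needs the first character to know how to decode the rest.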

https://github.com/multiformats/multihash supports at the very least SHA1, SHA2-256, SHA2-512, SHA3/Keccak, Blake2b-256/Blake2b-512/Blake2s-128/Blake2s-256, Blake3, and Strobe. Hashes in IPFS are being standardised through the IETF and W3C: https://www.ietf.org/id/draft-multiformats-multihash-05.html

If you need rhash, you are welcome to submit a PR! We also have a grants program you can use to be rewarded for this.

Also check out multihash from the IPFS folks: https://github.com/multiformats/multihash

It's a more robust, well-specified, interoperable version of this concept.

It's probably overkill, though, if you control both the consumer and the producer side (i.e. you don't need the interoperability) and are just looking to make hash upgrades smoother. In that case, a simple version prefix like Go's approach described above has lower overhead.
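
A rough sketch of what such a version prefix could look like when you own both ends. This is a hypothetical scheme for illustration only, not Go's actual format: the `v1`/`v2` labels, the `version:hexdigest` layout, and the function names are all assumptions.

```python
import hashlib
import hmac

# Hypothetical mapping from a version label to a hashlib algorithm name.
VERSIONS = {"v1": "sha1", "v2": "sha256"}
CURRENT = "v2"


def make_tag(data: bytes) -> str:
    """Hash with the current algorithm and prepend its version label."""
    digest = hashlib.new(VERSIONS[CURRENT], data).hexdigest()
    return f"{CURRENT}:{digest}"


def check_tag(data: bytes, tag: str) -> bool:
    """Verify data against a tag, using whichever version the tag names."""
    version, _, hexdigest = tag.partition(":")
    algorithm = VERSIONS.get(version)
    if algorithm is None:
        raise ValueError(f"unknown hash version: {version!r}")
    expected = hashlib.new(algorithm, data).hexdigest()
    return hmac.compare_digest(expected, hexdigest)


def needs_upgrade(tag: str) -> bool:
    """True when the tag was produced with an older version."""
    return not tag.startswith(CURRENT + ":")
```

Old `v1:` tags keep verifying, and `needs_upgrade` tells you which ones to recompute the next time you have the data in hand.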

They use multi-hash [0] in magnet links, presumably for exactly this reason.

... but for consistency (like their narrowing of valid bencode), they’ve presumably chosen one main hash for now, so that every client and server doesn’t have to handle all of these cases as people provide a million variants of the same torrent.

[0]: https://github.com/multiformats/multihash

I don't think you are, and the IPFS ecosystem has already run into this problem. In IPFS, content is addressed by its hash, but different protocols are better suited to different encodings: browsers speak HTTP, which is case-insensitive.

And so they've worked out self-describing hashes! https://multiformats.io/multihash/

The basic idea is that you include an identifier for the algorithm (plus the digest length) as a prefix of the hash. This allows changing the hash algorithm without breaking backwards compatibility.
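
Roughly, a multihash is `<varint function code><varint digest length><digest bytes>`. Here is a minimal sketch of that layout, not the reference implementation. The codes 0x11, 0x12 and 0x13 for sha1, sha2-256 and sha2-512 come from the multiformats code table; the function names are made up for this example.

```python
import hashlib

# Function codes from the multiformats table for a few hash functions.
CODES = {"sha1": 0x11, "sha2-256": 0x12, "sha2-512": 0x13}
# The matching algorithm names as hashlib spells them.
HASHLIB_NAMES = {"sha1": "sha1", "sha2-256": "sha256", "sha2-512": "sha512"}


def encode_varint(n: int) -> bytes:
    """Encode a non-negative integer as an unsigned LEB128 varint."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def decode_varint(buf: bytes, pos: int = 0) -> tuple[int, int]:
    """Return (value, next_position) for the varint starting at pos."""
    value = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return value, pos
        shift += 7


def multihash(data: bytes, name: str = "sha2-256") -> bytes:
    """Hash data and prepend the function code and digest length."""
    digest = hashlib.new(HASHLIB_NAMES[name], data).digest()
    return encode_varint(CODES[name]) + encode_varint(len(digest)) + digest


def parse(mh: bytes) -> tuple[int, bytes]:
    """Split a multihash into (function code, digest)."""
    code, pos = decode_varint(mh)
    length, pos = decode_varint(mh, pos)
    digest = mh[pos:]
    if len(digest) != length:
        raise ValueError("digest length does not match the length prefix")
    return code, digest
```

A sha2-256 multihash therefore always starts with the bytes 0x12 0x20 (code 0x12, length 32), and a reader can tell which function produced it without any out-of-band information.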

It's a cool standard with implementations[0] for many languages. I don't know if it was considered for git, but it does seem likely that this issue will come up again before the end of time :)

[0]: https://github.com/multiformats/multihash

I wish more people would adopt (and better formalize) multihash and related projects: https://github.com/multiformats/multihash

It's a really simple, good idea that was developed in the IPFS world.

One of the mistakes Git made was that the hashes don't describe what algorithm was used to generate them. That makes backward compatibility and incremental upgrades harder. Multihash is a solution for such issues: https://github.com/multiformats/multihash

Also relevant: Multihash is a format for self-describing hashes that helps with data portability and future-proofing: https://github.com/multiformats/multihash

For anyone needing to decide what hash function to use, I recommend having a look at multihash: https://github.com/multiformats/multihash

Instead of just migrating everything to a new hashing function when something goes bad, why not migrate to something that is future-proof and easy to switch out? Multihash[0] is one solution: each hash carries information about which hash function was used to generate it, so the same input can have multiple hashes, one per function.

[0]: https://github.com/multiformats/multihash
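
To make the "same input, multiple hashes" point concrete, here is a small sketch of a verifier that picks the hash function from the prefix of each stored value. It is a simplification, not the reference implementation: it assumes the function code and the digest length each fit in a single byte (true for the three functions listed, but real multihashes use varints), and the names `make` and `verify` are made up for this example.

```python
import hashlib
import hmac

# Function code -> hashlib algorithm name (codes from the multiformats table).
ALGORITHMS = {0x11: "sha1", 0x12: "sha256", 0x13: "sha512"}


def make(data: bytes, code: int) -> bytes:
    """Build <code><length><digest>, assuming one byte each for code and length."""
    digest = hashlib.new(ALGORITHMS[code], data).digest()
    return bytes([code, len(digest)]) + digest


def verify(data: bytes, mh: bytes) -> bool:
    """Check data against a stored value, whichever known function made it."""
    if len(mh) < 2:
        return False
    code, length, digest = mh[0], mh[1], mh[2:]
    algorithm = ALGORITHMS.get(code)
    if algorithm is None or len(digest) != length:
        return False
    actual = hashlib.new(algorithm, data).digest()
    return hmac.compare_digest(actual, digest)
```

An older value made with `make(data, 0x11)` and a newer one made with `make(data, 0x12)` both pass `verify`, so a store can move to a new function incrementally instead of all at once.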

Take a look at multihash[0]. I don't know the inner workings of the program, but I imagine it would be possible for `multihash` to periodically rehash files (as a cron job?) when a new crypto algorithm gets introduced.

[0]: https://github.com/multiformats/multihash