What does HackerNews think of sha1collisiondetection?
Library and command line tool to detect SHA-1 collision in a file
There is also work to support SHA-256, though that seems to have stalled: https://lwn.net/Articles/898522/
The fundamental problem is that git developers assumed that hash algorithms would never be changed, and that was a ridiculous assumption. It's much wiser to implement crypto agility.
I imagine you could take a similar counter-cryptanalysis approach to MD5. (I am out of my depth here, so there could be reasons this doesn't work for MD5 that I'm unaware of.)
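To make "crypto agility" concrete, here is a minimal sketch, assuming the idea is simply that every stored digest carries the name of the algorithm that produced it. The `ALGORITHMS` registry and the `tagged_digest` and `verify` helpers are hypothetical names for illustration, not anything git does.

```python
import hashlib

# Hypothetical registry of allowed algorithms; adding a new one later
# does not invalidate digests already stored with an older tag.
ALGORITHMS = {"sha1", "sha256"}


def tagged_digest(data: bytes, algo: str = "sha256") -> str:
    """Return '<algo>:<hex digest>' so the algorithm travels with the hash."""
    if algo not in ALGORITHMS:
        raise ValueError(f"unsupported algorithm: {algo}")
    return f"{algo}:{hashlib.new(algo, data).hexdigest()}"


def verify(data: bytes, tagged: str) -> bool:
    """Recompute the digest with whatever algorithm the tag names."""
    algo, _, _ = tagged.partition(":")
    if algo not in ALGORITHMS:
        return False
    return tagged_digest(data, algo) == tagged
```

Because the tag is part of the identifier, old `sha1:` identifiers keep verifying while new data is written with `sha256:`.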
[1] https://github.com/cr-marcstevens/sha1collisiondetection
That is very hard, but not what was quoted above. The length has no part in it. The core part needed for the shattered collision attacks involves basically binary data.
$ curl -s https://shattered.io/static/shattered-1.pdf | hexdump -C > s1
$ curl -s https://shattered.io/static/shattered-2.pdf | hexdump -C > s2
$ diff s1 s2
13,20c13,20
< 000000c0 73 46 dc 91 66 b6 7e 11 8f 02 9a b6 21 b2 56 0f |sF..f.~.....!.V.|
< 000000d0 f9 ca 67 cc a8 c7 f8 5b a8 4c 79 03 0c 2b 3d e2 |..g....[.Ly..+=.|
< 000000e0 18 f8 6d b3 a9 09 01 d5 df 45 c1 4f 26 fe df b3 |..m......E.O&...|
< 000000f0 dc 38 e9 6a c2 2f e7 bd 72 8f 0e 45 bc e0 46 d2 |.8.j./..r..E..F.|
< 00000100 3c 57 0f eb 14 13 98 bb 55 2e f5 a0 a8 2b e3 31 |<W......U....+.1|
[...]
---
> 000000c0 7f 46 dc 93 a6 b6 7e 01 3b 02 9a aa 1d b2 56 0b |.F....~.;.....V.|
> 000000d0 45 ca 67 d6 88 c7 f8 4b 8c 4c 79 1f e0 2b 3d f6 |E.g....K.Ly..+=.|
> 000000e0 14 f8 6d b1 69 09 01 c5 6b 45 c1 53 0a fe df b7 |..m.i...kE.S....|
> 000000f0 60 38 e9 72 72 2f e7 ad 72 8f 0e 49 04 e0 46 c2 |`8.rr/..r..I..F.|
> 00000100 30 57 0f e9 d4 13 98 ab e1 2e f5 bc 94 2b e3 35 |0W...........+.5|
> 00000110 42 a4 80 2d 98 b5 d7 0f 2a 33 2e c3 7f ac 35 14 |B..-....*3....5.|
> 00000120 e7 4d dc 0f 2c c1 a8 74 cd 0c 78 30 5a 21 56 64 |.M..,..t..x0Z!Vd|
> 00000130 61 30 97 89 60 6b d0 bf 3f 98 cd a8 04 46 29 a1 |a0..`k..?....F).|
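The same comparison can be done without hexdump and diff. This is a minimal sketch, assuming both files have already been downloaded; `differing_ranges` is a hypothetical helper name, and it works on any two byte strings.

```python
def differing_ranges(a: bytes, b: bytes) -> list[tuple[int, int]]:
    """Return half-open (start, end) byte-offset ranges where a and b differ.

    Offsets past the end of the shorter input count as differing.
    """
    ranges = []
    start = None
    longest = max(len(a), len(b))
    for i in range(longest):
        same = i < len(a) and i < len(b) and a[i] == b[i]
        if not same and start is None:
            start = i                    # a differing run begins here
        elif same and start is not None:
            ranges.append((start, i))    # the run ended at the previous byte
            start = None
    if start is not None:
        ranges.append((start, longest))  # run reaches the end of the input
    return ranges


# Usage, assuming the two PDFs are saved locally under these names:
# a = open("shattered-1.pdf", "rb").read()
# b = open("shattered-2.pdf", "rb").read()
# for start, end in differing_ranges(a, b):
#     print(f"{start:#x}-{end:#x}")
```

Going by the hexdump diff above, the reported ranges should all fall within the block of lines 13 to 20, with the rest of the two files identical.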
An ASCII-formatted file contains only text data. Also, with the shattered attack you can't choose what the two versions should be, so you have to reference the differing binary data to turn some functionality on or off. So the attack is mostly interesting when you include binary data. With the chosen-prefix attack, you can have two arbitrary components, even textual ones, but they still have to be followed by such a binary component. Also, git now has collision detection code from sha1collisiondetection [1], making attacks even harder.
[1]: https://github.com/cr-marcstevens/sha1collisiondetection
> git doesn't really use SHA-1 anymore, it uses Hardened-SHA-1 (they just so happen to produce the same outputs 99.99999999999...% of the time).[1]
https://stackoverflow.com/questions/10434326/hash-collision-...
There's essentially no chance that the string "foo\n" fell into that tiny probability of difference. The reason there's a difference is because before git hashes something, git will do various processing to it (maybe appending and prepending various things) and those things broke the carefully created collision. But a chosen-prefix attack might mean those various things can be accounted for, and a collision could still be found.
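The "various processing" for a blob is a header of the form `blob <length>\0` prepended to the content before hashing. A minimal sketch of why the two hashes of "foo\n" differ, using plain SHA-1 from `hashlib` (not the hardened variant); `git_blob_sha1` is a hypothetical helper name:

```python
import hashlib


def git_blob_sha1(content: bytes) -> str:
    """Hash content the way git hashes a blob: header first, then the data."""
    header = b"blob %d\x00" % len(content)
    return hashlib.sha1(header + content).hexdigest()


data = b"foo\n"
# Raw SHA-1 of the bytes, as sha1sum would report it.
print(hashlib.sha1(data).hexdigest())  # f1d2d2f924e986ac86fdf7b36c94bcdf32beec15
# Object ID git assigns to a blob with the same content.
print(git_blob_sha1(data))             # 257cc5642cb1a054f08cc83f2d943e56fd3ebe99
```

The header changes the input that the hash function sees, so two files that collide as raw bytes no longer collide once each is wrapped as a blob.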
So we need to directly run hardened SHA1 on the data, which I believe is located at https://github.com/cr-marcstevens/sha1collisiondetection
As seen in https://github.com/git/git/blob/master/sha1dc_git.c
So I tested that one:
$ sha1collisiondetection-master/bin/sha1dcsum bar baz messageA messageB shattered-1.pdf shattered-2.pdf
f1d2d2f924e986ac86fdf7b36c94bcdf32beec15 bar
f1d2d2f924e986ac86fdf7b36c94bcdf32beec15 baz
4f3d9be4a472c4dae83c6314aa6c36a064c1fd14 *coll* messageA
9ed5d77a4f48be1dbf3e9e15650733eb850897f2 *coll* messageB
16e96b70000dd1e7c85b8368ee197754400e58ec *coll* shattered-1.pdf
e1761773e6a35916d99f891b77663e6405313587 *coll* shattered-2.pdf
So it does protect against the new attack. The only thing that prefixing the length makes difficult is using the same prefix multiple times: you basically have to make up your mind about the type and length before mounting the shattered attack. Also, the prefix means you have to do your own shattered attack and can't use the PDFs that Google provided as proof of their project's success. The price tag for that seems to be 11k.
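If you want to script a check like the sha1dcsum run above, the output can be parsed mechanically. This is a minimal sketch, assuming the output format is exactly what the transcript shows (digest, an optional `*coll*` marker, then the file name); `parse_sha1dcsum` is a hypothetical helper, not part of the library.

```python
def parse_sha1dcsum(output: str) -> list[tuple[str, str, bool]]:
    """Parse sha1dcsum-style output into (name, digest, collision) tuples.

    Assumes each line is '<digest> [*coll*] <name>', as in the transcript.
    """
    results = []
    for line in output.splitlines():
        parts = line.split()
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        digest = parts[0]
        collision = parts[1] == "*coll*"
        name = " ".join(parts[2:] if collision else parts[1:])
        results.append((name, digest, collision))
    return results


sample = """\
f1d2d2f924e986ac86fdf7b36c94bcdf32beec15 bar
16e96b70000dd1e7c85b8368ee197754400e58ec *coll* shattered-1.pdf
"""
for name, digest, collision in parse_sha1dcsum(sample):
    print(name, "COLLISION" if collision else "ok")
```

Note that the digests reported for the flagged files differ from each other, so a tool that compares only the digests would also treat the two PDFs as distinct.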
[1]: https://github.com/cr-marcstevens/sha1collisiondetection
Git quickly switched to the sha1collisiondetection library[1] by default after the SHAttered attack was published. It's a SHA-1 library written by the authors of the paper that describes the attack.
Edit: Marc Stevens says that the existing library will mitigate this new attack: https://twitter.com/realhashbreaker/status/11284190295369236...
It's what Git uses by default. Of course, there's no guarantee that new SHA-1 attacks won't be discovered, but it's better than nothing.
There are metrics that will alert GitHub's infrastructure team if a collision is found (to confirm that we aren't seeing any false positives). Those metrics were quietly shipped (without the matching "die") for a week before flipping the final switch.
If you want to know more about the patterns, see the sha1collisiondetection project:
https://github.com/cr-marcstevens/sha1collisiondetection
There's a research paper linked in the README.
My understanding is that it runs SHA-1 and examines the internal state of the digest along the way to see if it matches known vectors that could be manipulated to cause a collision.
They're basically building that into git so that if this specific collision attack is ever used, git will notice and throw a warning/error.
Somebody already submitted a patch series to (optionally) use it in git in place of SHA-1:
and it's super effective: The possibility of false positives can be neglected as the probability is smaller than 2^-90.
It's also interesting that this attack is from the same author who detected that Flame (the nation-state virus) was signed using an unknown collision algorithm on MD5 (cited in the introduction of the shattered paper).