What does HackerNews think of AppleNeuralHash2ONNX?

Convert Apple NeuralHash model for CSAM Detection to ONNX.

Language: Python

Or just make your own fake/questionable hash collisions with a script Some Guy made on GitHub: https://github.com/AsuharietYgvar/AppleNeuralHash2ONNX
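
For the curious, the hashing step in that repo boils down to roughly the following (a minimal sketch paraphrasing the repo's nnhash.py as described in its README; the exact preprocessing, file names, and the seed file's 128-byte header are assumptions that may drift as the repo changes):

    import sys
    import numpy as np
    import onnxruntime
    from PIL import Image

    # Load the converted ONNX model and the 96x128 projection ("seed") matrix.
    session = onnxruntime.InferenceSession("model.onnx")
    seed = np.frombuffer(
        open("neuralhash_128x96_seed1.dat", "rb").read()[128:],  # skip assumed 128-byte header
        dtype=np.float32,
    ).reshape([96, 128])

    # Preprocess: the network expects a 360x360 RGB image scaled to [-1, 1].
    img = Image.open(sys.argv[1]).convert("RGB").resize((360, 360))
    arr = (np.asarray(img).astype(np.float32) / 255.0) * 2.0 - 1.0
    arr = arr.transpose(2, 0, 1).reshape(1, 3, 360, 360)

    # Run the network, project its 128-dim output through the seed matrix,
    # and keep only the sign of each component: a 96-bit perceptual hash.
    out = session.run(None, {session.get_inputs()[0].name: arr})[0]
    bits = "".join("1" if v >= 0 else "0" for v in seed.dot(out.flatten()))
    print("{:0>24x}".format(int(bits, 2)))
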
Well, it’s mostly easy to tell since we still have researchers decompiling and scrubbing through the OS to see what’s in it and what it does[0].

https://github.com/AsuharietYgvar/AppleNeuralHash2ONNX

I'm not big into "conspiracy," but you gotta wonder why this "NeuralHash" file exists on my computer if NeuralHash was supposedly delayed, according to Apple, on an undetermined timeline.

If I did want to cook up a conspiracy theory, it would be easy: Apple wants to distract from the fact that NeuralHash was broken by researchers. This project apparently is able to create CSAM collisions:

https://github.com/AsuharietYgvar/AppleNeuralHash2ONNX

If NeuralHash is really rolled out, and if this Python project can really create collisions, the CSAM system could be DDoS'd by people on their own computers, jamming up Apple's internal censorship review system with false positives. Hence, Apple would be incentivized to sweep this under the rug by "delaying" the rollout indefinitely.
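
To make the DDoS scenario concrete: collision tools work by gradient descent on the image itself, since the network behind a perceptual hash is differentiable end to end. A hedged sketch of the core loop, where model (the NeuralHash network loaded as a PyTorch module) and seed (the 96x128 projection matrix as a tensor) are assumed handles, not anything the repo ships:

    import torch

    def collide(model, seed, start_image, target_bits, steps=1000, lr=0.01):
        """Perturb start_image until sign(seed @ model(x)) matches target_bits.

        target_bits: 0/1 tensor of shape (96,), the hash we want to collide with.
        """
        x = start_image.clone().requires_grad_(True)  # (1, 3, 360, 360), values in [-1, 1]
        target = 2.0 * target_bits.float() - 1.0      # 0/1 bits -> -1/+1 signs
        opt = torch.optim.Adam([x], lr=lr)
        for _ in range(steps):
            opt.zero_grad()
            logits = seed @ model(x).flatten()        # 96 real values, pre-binarization
            # Hinge loss: push every component past its target sign with a margin.
            loss = torch.clamp(1.0 - target * logits, min=0).sum()
            if loss.item() == 0:                      # all 96 bits match, with margin
                break
            loss.backward()
            opt.step()
            with torch.no_grad():
                x.clamp_(-1.0, 1.0)                   # stay in the valid pixel range
        return x.detach()

Since the hash keeps only 96 sign bits out of a 128-float embedding, there is an enormous space of visually unrelated images that land on the same bits.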

Why are exact collisions interesting? They are not intended to be compared exactly.

This algorithm doesn't even give exact matches for the same image on different hardware.

https://github.com/AsuharietYgvar/AppleNeuralHash2ONNX

Note: Neural hash generated here might be a few bits off from one generated on an iOS device. This is expected since different iOS devices generate slightly different hashes anyway. The reason is that neural networks are based on floating-point calculations. The accuracy is highly dependent on the hardware. For smaller networks it won't make any difference. But NeuralHash has 200+ layers, resulting in significant cumulative errors.
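
That "a few bits off" caveat is why matching against the blacklist presumably has to be threshold-based rather than exact. Comparing two 96-bit hashes is just a Hamming distance over their hex encodings; the values below are made-up placeholders, not real NeuralHash outputs:

    def hamming_distance(hash_a: str, hash_b: str) -> int:
        """Count differing bits between two hex-encoded 96-bit hashes."""
        return bin(int(hash_a, 16) ^ int(hash_b, 16)).count("1")

    # The same image hashed on two devices might differ in a couple of low bits.
    print(hamming_distance("59a34eabe31910abfb06f308", "59a34eabe31910abfb06f30c"))  # -> 1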

> There are no news articles that explain how anyone will be falsely accused for having pictures of their own baby.

Umm... the hash collisions that everyone keeps warning about are not enough? After all the discussions so far, I'll just go ahead and assume your comment here is in bad faith.

> The system is even resistant against intentionally created false positives.

Famous last words... Here's one of the top posts on reddit.com/r/apple:

https://old.reddit.com/r/apple/comments/p930wu/i_wont_be_pos...

Here's a really high quality collision: https://github.com/AsuharietYgvar/AppleNeuralHash2ONNX//issu...

Here's 2 totally different images off by a single BIT: https://github.com/AsuharietYgvar/AppleNeuralHash2ONNX//issu...

Here's a dog and a kid colliding: https://github.com/AsuharietYgvar/AppleNeuralHash2ONNX//issu...

It took only a few days after the model was extracted to show how flawed it is... Apple's only 'security' feature here was obscurity...

It's so broken that the person doing the analysis above stopped, since Apple would just change the hash function to include his pictures as training data instead of fixing the whole system.

Are you still convinced?

Having a second 'perceptual' hash doesn't really add much value... I'm not an expert; here's a better explanation of why: https://news.ycombinator.com/item?id=28243031

Also, the funniest bit from that thread on how broken it is:

"Finding a SHA1 collision took 22 years, and there are still no effective preimage attacks against it. Creating the NeuralHash collider took a single week."