What does HackerNews think of dangerzone?

Take potentially dangerous PDFs, office documents, or images and convert them to safe PDFs

Language: Python

start here: https://github.com/freedomofpress/dangerzone

i've never used it, but i've been meaning to check it out. at least it should give you a jumping off point for further investigation.

if that is insufficient, use proofpoint.

for archives that are tickling bugs, you have to use a similar technique. it's not enough to analyze them and send them on as-is. you have to unpack them in a sandbox, process the contents with dangerzone or a dangerzone-like tool, then re-archive the result and let the user see only that new archive. (the sandbox will be detectable, no two ways about it, but the question is whether anyone will expend enough effort to detect it. for your use case, seeing as you're asking the question at all, the answer is no.)
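
The unpack, sanitize, re-archive flow described above could look roughly like the sketch below. It is only an illustration under several assumptions: the archive is a zip, the script is already running inside a sandbox (the code does no sandboxing itself), and `sanitize` is a hypothetical hook that the caller supplies to wrap Dangerzone or a similar tool. The names `safe_member_name` and `rebuild_archive` are made up for this example and are not part of Dangerzone.

```python
import zipfile
from pathlib import PurePosixPath
from typing import Callable, List, Optional, Tuple


def safe_member_name(name: str) -> Optional[str]:
    """Return a normalized relative path, or None if the entry should be dropped."""
    normalized = name.replace("\\", "/")
    if not normalized or normalized.endswith("/"):
        return None  # empty name or directory entry: nothing to sanitize
    path = PurePosixPath(normalized)
    if path.is_absolute() or ".." in path.parts:
        return None  # refuse names that try to escape the archive root
    return str(path)


def rebuild_archive(
    src,
    dst,
    sanitize: Callable[[str, bytes], Tuple[str, bytes]],
) -> List[str]:
    """Copy src (zip) into a brand-new zip at dst, passing every file through sanitize.

    `sanitize` is a hypothetical hook: it receives (member name, raw bytes) and
    returns (new name, sanitized bytes). In practice it would hand the bytes to
    Dangerzone or a comparable converter; that call is deliberately not shown.
    """
    kept = []
    with zipfile.ZipFile(src) as zin, \
            zipfile.ZipFile(dst, "w", zipfile.ZIP_DEFLATED) as zout:
        for info in zin.infolist():
            name = safe_member_name(info.filename)
            if name is None:
                continue
            new_name, new_data = sanitize(name, zin.read(info))
            # Only sanitized output is written; original bytes never reach dst.
            zout.writestr(new_name, new_data)
            kept.append(new_name)
    return kept
```

The user is then given only the archive written to `dst`. Members are read one at a time rather than extracted to disk, so a hostile entry name never touches the filesystem; a real deployment would also want size limits and handling for nested archives, which this sketch leaves out.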

You can use something similar on macOS, Windows or Linux, based on Docker containers; see Dangerzone: https://github.com/freedomofpress/dangerzone

Check out Dangerzone. It converts a PDF (or another supported format) to raw image data, then converts that back to a PDF, optionally adding an OCR'ed text layer, so that any potentially executable code hidden in the original is lost. For further security, all operations run sandboxed.

https://github.com/freedomofpress/dangerzone

Is there a quick command line or GUI tool that strips a PDF of everything except text/formatting?

In other words, can I run `pdf-make-safe` on a file before opening it to make sure it can't execute arbitrary code the moment I do?

I found this:

https://github.com/freedomofpress/dangerzone

Is it any good?