Can someone explain to me how a PDF can execute code?
Exploits in the PDF viewer.
Adobe's tools in particular have been a bountiful source of exploits for decades, but the spec is complicated enough that any viewer has plenty of opportunities for bugs.
I see, much like Unicode exploits. I use Chrome to view PDFs, which I assume is safe.
So, our best effort is to constrain what certain data can do when we process it, in the hope that this prevents surprising negative consequences like a PDF that steals privileged information and sends it elsewhere.
Notice that, in some sense, a PDF which just contains a photograph of your wife tied to a chair and holding today's newspaper, plus human readable text like, "We have your wife Sarah and all three kids Beth, Jim and Amanda. We are watching. Do not try to call for help. Email the privileged information to [email protected] or we will kill your family" is also potentially effective at doing this, but we would not usually consider that an exploit in this context.
One irritation in this space is that programmers love General Purpose Programming Languages. The idea of a general purpose language is that it can do anything. But in this sort of situation we don't want programs which can do anything; in fact, "anything" is our worst case scenario. We actually want Special Purpose Programming Languages. We want to write our PDF processing software in a language that, even if we were trying, can't do the things that should never happen as a result of processing a PDF.
This is the purpose of languages like WUFFS: https://github.com/google/wuffs
You can't write a WUFFS program to, for example, email anything to [email protected] even if you desperately needed to, which means you definitely won't accidentally write a program which can email the privileged information to the crooks when fed a PDF. Of course the PDF mentioned earlier with the kidnap note inside it could still work. And also of course making a PDF renderer out of WUFFS would be a really big ask. WUFFS-the-library today can render PNG, GIF, BMP but notably not yet JPEG. But it's clearly possible for something like PDF rendering to happen under these constraints. Nobody ordinarily viewing a PDF wants it to do arbitrary stuff.
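To make the shape of this concrete, here is a rough sketch in Python rather than WUFFS. The toy run-length format and the name rle_decode are made up for illustration, and Python enforces none of this; the point is only what the decoder is allowed to see: bytes in, bytes out, with a caller-supplied size limit. A language like WUFFS is meant to make that restriction a property of the language rather than a convention you have to trust the author to follow.

```python
def rle_decode(src: bytes, max_out: int) -> bytes:
    """Decode (count, value) byte pairs into at most max_out bytes.

    Hypothetical toy format, not anything from WUFFS or PDF.
    The function takes bytes and returns bytes. It opens no files or
    sockets and reads no globals, so the worst a hostile input can do
    is produce wrong output or raise ValueError.
    """
    if len(src) % 2 != 0:
        raise ValueError("truncated input: expected (count, value) pairs")
    out = bytearray()
    for i in range(0, len(src), 2):
        count, value = src[i], src[i + 1]
        # Refuse to grow past the limit the caller chose up front.
        if len(out) + count > max_out:
            raise ValueError("output would exceed the caller's limit")
        out.extend(bytes([value]) * count)
    return bytes(out)


print(rle_decode(b"\x03A\x02B", max_out=16))  # b'AAABB'
```

In Python nothing stops a later edit from adding "import smtplib" to this function, which is exactly the gap a special purpose language is supposed to close.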