What does HackerNews think of sandsifter?

The x86 processor fuzzer

Language: Python

This reminds me of sandsifter, the x86 processor fuzzer:

https://github.com/xoreaxeaxeax/sandsifter

This is a really clever technique! I was impressed by sandsifter[1] when it originally came out, and this seems an awful lot faster and less prone to false negatives (since it's purely speculative and doesn't require sandsifter's `#PF` hack).

At the risk of unwarranted self-promotion: the other side of this equation is fidelity in software instruction set decoders. x86's massive size and layers of historical complexity make it among the most difficult instruction formats to accurately decode; I've spent a good part of the last two years working on a fuzzer that's discovered thousands of bugs in various popular x86 decoders[2][3].

[1]: https://github.com/xoreaxeaxeax/sandsifter

[2]: https://github.com/trailofbits/mishegos

[3]: https://www.easychair.org/publications/preprint_download/1LHr

Idea:

If any assembler/disassembler author or team out there wants to produce an assembler/disassembler which is authoritative (difficult to do on x86, because there are so many different possible instruction encodings -- as the sandsifter README ( https://github.com/xoreaxeaxeax/sandsifter ) puts it: "Typically, several million undocumented instructions on your processor will be found, but these generally fall into a small number of different groups.") -- then what they'd do is create a third program -- one which "pits" the output of Assembler A against Assembler B, Disassembler A against Disassembler B...

That is, between any two assemblers (for the same CPU architecture/instruction set), or any two disassemblers (again, for the same instruction set), where are the anomalies?

If we think of an assembler as a simple function, y = f(x) -- I give it a string of ASCII text as input (x), and I get a string of 1..n binary bytes as output (y) -- and a disassembler as the inverse function, then how difficult would it be to write a program which imported the assembly functionality of two (or more) assemblers, and then just started comparing the outputs?

Well, there's a slight problem there: you'd have to create a series of strings representing valid assembler instructions first...

But, why not let the disassembler(s) do that!

So our future program for "pitting" assembler vs. assembler, disassembler vs. disassembler looks like this:

1) Start with a single byte, 00.

2) Pass that byte to the disassembler. Is it a valid instruction?

3) If so, pass the returned string to an assembler, and get the result of that.

4) Is the resulting binary byte (or byte string) the same as the one we started with? If so, all OK. If not, log the anomaly!

5) Increment the single byte by 1, and loop over the steps above until we pass 255; then start over with a 2-byte string, and do the same thing (like an odometer). Keep expanding until we've reached the maximum allowable length for an x86 instruction, which I believe is 15 bytes. (Note: that's one BIG number(!) -- could we perform this loop in one lifetime? I don't know... perhaps if it took too long we could intelligently skip some combinations, as Christopher Domas does in sandsifter)...
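The steps above can be sketched in a few lines of Python. Here `disassemble` and `assemble` are hypothetical callables standing in for whichever two tools you want to pit against each other (via their Python bindings, say); only the odometer-style enumeration and the round-trip check are concrete:

```python
from itertools import product

def byte_strings(max_len):
    """Step 1 and step 5: enumerate candidate encodings odometer-style:
    00..FF, then 0000..FFFF, and so on up to max_len bytes."""
    for length in range(1, max_len + 1):
        for combo in product(range(256), repeat=length):
            yield bytes(combo)

def round_trip(encoding, disassemble, assemble):
    """Steps 2-4: disassemble, reassemble, compare.
    Returns None if the bytes are not a valid instruction or if they
    round-trip cleanly; otherwise a tuple describing the anomaly."""
    text = disassemble(encoding)        # step 2: valid instruction?
    if text is None:
        return None
    reencoded = assemble(text)          # step 3: reassemble the text
    if reencoded != encoding:           # step 4: same bytes we started with?
        return (encoding.hex(), text, reencoded.hex())
    return None

# Step 5's caveat, quantified: the full 15-byte x86 space holds
# 256**1 + 256**2 + ... + 256**15 candidate encodings -- roughly 1.3e36,
# so exhaustive enumeration really is out of the question without pruning.
search_space = sum(256 ** n for n in range(1, 16))
```

With real bindings plugged in, every non-None result from `round_trip` is a line for the anomaly log.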

But anyway, that would be the algorithm for "pitting" assembler vs. assembler (or perhaps more specifically assembler vs. disassembler -- you get the general idea of what's being said here!) -- and comparing the results!

Think of it as a 'diff' tool -- but for the output of assemblers/disassemblers -- as opposed to files...

Why?

Well, because x86 is complex, to say the least!

And well, because it's more likely than not that any given assembler/disassembler contains bugs and/or errors, even though they might not be intentional!

Anyway, if no one else does it... I'll do it in the future (too busy with other things right now!)... so "Note to future self" .

But the entire program would amount to a few loops...

Actually, that's another good point... Any assembler/disassembler program worth its salt should provide its libraries with Python (or other easily scriptable language) bindings... many do; some don't; just a random related observation...

Some interesting links, to be sure!

Yes, fpdebug definitely looks like it is the better written code of the two...

I will definitely have to grab myself a copy of its source code and play around with it...

>It has some minor errors in the Rex decoding though, if I remember correctly....

By Rex, you mean Regex, right? Well, it having issues there wouldn't surprise me one bit... If I used it, fpdebug, I probably wouldn't use it with all of the bells and whistles turned on -- I'd probably pare it down to only the most essential source code...

But that being said, I think that having that source code to refer to -- could be highly beneficial...

See, there's this problem in Software Engineering, which looks something like this:

Let's say I want an open source library (and in Pascal, no less!) whose purpose is twofold: on the one hand, to translate raw binary x86 machine code sequences into readable text; on the other, to translate the text of an x86 assembler instruction into binary.

OK, so now, the $64,000 question is:

Who (or what) -- is the absolute source of truth for that library?

See, even if that library were written by Intel themselves, I guarantee that there will be mistakes...

The mistakes are not due to documentor and/or programmer negligence (Intel or otherwise) -- so much as they are due to instruction set complexity...

You see, I'll bet, pound for pound, line of code for line of code, that the assembler/disassembler routines for, say, a RISC instruction set (RISC-V, for example) are much, much shorter (and more elegant to boot!) than those for Intel's "everything but the kitchen sink, we keep expanding it because it gives us a competitive edge via patent and other legal protections" approach of constantly expanding the instruction set...
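As a rough illustration of that bet: pulling the major fields out of a fixed-width 32-bit RISC-V instruction is just a handful of bit-slices, with no prefixes and no variable length to worry about. A minimal sketch (field layout per the R-type format in the RISC-V spec):

```python
def riscv_fields(insn: int) -> dict:
    """Slice the major fields out of one 32-bit RISC-V instruction word.
    Every field sits at a fixed bit position, so decode is trivial."""
    return {
        "opcode": insn & 0x7F,          # bits 0..6
        "rd":     (insn >> 7) & 0x1F,   # bits 7..11
        "funct3": (insn >> 12) & 0x7,   # bits 12..14
        "rs1":    (insn >> 15) & 0x1F,  # bits 15..19
        "rs2":    (insn >> 20) & 0x1F,  # bits 20..24
        "funct7": (insn >> 25) & 0x7F,  # bits 25..31
    }

# "add x1, x2, x3" encodes to 0x003100B3
fields = riscv_fields(0x003100B3)
```

Compare that with an x86 decoder, which has to handle legacy prefixes, REX, escape bytes, ModRM, SIB, and lengths from 1 to 15 bytes before it can even tell which instruction it is looking at.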

So the issue, at least in x86-land, is:

"Who (or what) -- is the absolute source of truth with respect to the x86 instruction set?"

Also, remember that Christopher Domas (Google him, you'll find a whole lot of interesting stuff) -- discovered that x86 processors typically can and do implement all sorts of undocumented instructions:

https://github.com/xoreaxeaxeax/sandsifter

>"Typically, several million undocumented instructions on your processor will be found, but these generally fall into a small number of different groups. After binning the anomalies, the summarize tool attempts to assign each instruction to an issue category:

o Software bug (for example, a bug in your hypervisor or disassembler),

o Hardware bug (a bug in your CPU), or

o Undocumented instruction (an instruction that exists in the processor, but is not acknowledged by the manufacturer)"

Anyway, thanks for the link! (The second one!)

You might be interested in sandsifter [0], which does precisely that!

[0]: https://github.com/xoreaxeaxeax/sandsifter

At a large enough N, 5% to 10% is massive; at a small enough N, 5% to 10% gets you closer to competitive; somewhere in between, though, those extra percentage points are "meh".

For instance, I run my computing on a TR 1950X, and I could probably undervolt or overclock the chip, but, to be honest, it's not worth the perceived "risk". Undervolting could break specific instructions in undetectable ways, and the only way to be sure things haven't broken would be to run something like sandsifter [0] multiple times. Because I, and many others with the technical know-how, write software on these systems, it's not worth the headache. If I had a separate system built just for gaming or some specific task, and I wanted to reduce how much I spent, or the power usage, etc., undervolting would then shift from headache to super worth it.

The last thing I want to have to do is run every piece of code I need through godbolt and pray there are no AVX instructions that "may" be fishy.

I think this is largely why cloud providers don't do things like this as aggressively. They do it to some extent with custom off-roadmap chips but not controlled dynamically in userspace. If we can get something like this original post into the kernel that would be a massive win for everyone.

[0] - https://github.com/xoreaxeaxeax/sandsifter

Very interesting. I'm trying to do something closely related for other reasons.

One thing I'm thinking of is that it might be possible to brute-force the semantics of short snippets of code using genetic algorithms. A similar technique has been demonstrated a few times by the author of [1].
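For flavor, here is a toy version of that idea: a tiny genetic search that evolves a short sequence of operations until it reproduces a target function on sample inputs. Everything here (the op set, the fitness measure, the mutation scheme) is invented for illustration; real instruction-level synthesis works over actual machine code and CPU state, not these toy ops:

```python
import random

# A deliberately tiny "instruction set" of unary ops on an integer.
OPS = [("add1", lambda x: x + 1), ("dbl", lambda x: x * 2), ("neg", lambda x: -x)]

def run(program, x):
    """Execute a program (a list of (name, fn) ops) on input x."""
    for _, fn in program:
        x = fn(x)
    return x

def fitness(program, samples, target):
    """Total error vs. the target semantics; 0 means a perfect match."""
    return sum(abs(run(program, s) - target(s)) for s in samples)

def evolve(target, samples, length=3, pop_size=50, generations=200, seed=0):
    """Evolve a fixed-length op sequence toward the target's behavior
    via elitist selection plus single-point mutation."""
    rng = random.Random(seed)
    pop = [[rng.choice(OPS) for _ in range(length)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda p: fitness(p, samples, target))
        if fitness(pop[0], samples, target) == 0:
            return pop[0]                       # exact semantic match found
        survivors = pop[: pop_size // 2]
        children = []
        for parent in survivors:
            child = list(parent)
            child[rng.randrange(length)] = rng.choice(OPS)  # point mutation
            children.append(child)
        pop = survivors + children
    return pop[0]

# Try to recover a program computing 2x + 2 (e.g. dbl, add1, add1).
best = evolve(target=lambda x: 2 * x + 2, samples=[0, 1, 5, -3])
```

The same loop, swapped over to real instruction encodings and observed register/memory state, is the shape of the brute-forced-semantics experiments mentioned above.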

I want to use this to eventually rapidly search a large number of binaries for insecure behavior. But to do that I need to be able to formulate questions like:

"Find me a function where attacker controlled data is marshaled to a size type and then used to allocate memory, to which a different attacker controlled amount of attacker controlled data is written".

Basically this: https://github.com/Battelle/PaperMachete

[1]: https://github.com/xoreaxeaxeax/sandsifter

Great project and write-up. I'm reminded of a couple other projects.

MC Hammer project for LLVM tests round-trip properties of the ARM assembler. http://llvm.org/devmtg/2012-04-12/Slides/Richard_Barton.pdf

Sandsifter x86 fuzzer. https://github.com/xoreaxeaxeax/sandsifter https://youtu.be/KrksBdWcZgQ

Fuzzing isn't really practical if all you do is just generate a totally random bit stream for input. There are many much more clever and robust strategies to hit as many edge cases as possible. Check AFL[1] for some details on generating smart random input files. You can also combine that with pretty advanced dynamic execution analysis to fuzz against unknown processor instruction sets, like in sandsifter[2].

[1]: http://lcamtuf.coredump.cx/afl/

[2]: https://github.com/xoreaxeaxeax/sandsifter

It seems this is the preferred URL: https://github.com/xoreaxeaxeax/sandsifter - for example the issue tracker is enabled, and has 45 issues, whereas the other URL has the issue tracker disabled.

Can one of the admins fix?

I wonder how easy it would be to port https://github.com/xoreaxeaxeax/sandsifter to the RISC-V instruction set.

Would probably be a decent step in the right direction for validating/verifying the future of trusted computing.

Although... this gives rise to a 2nd thought. If it was _this easy_ to build a RISC-V implementation, is it all that special, technically speaking? I ask as someone naive about processor design. Is implementation relatively straightforward, but design hard?

While it is (a bit disappointingly) a known instruction, it seems that Domas came up with this on his own (via sandsifter [1]). Also, it is really unexpected that there is a documented userland instruction that goes to ring 0 at all, so it is a "documented" "backdoor". In any case, I still think that this study is interesting, mainly because it shows that we can inspect such instructions seemingly out of nowhere.

[1] https://github.com/xoreaxeaxeax/sandsifter

Sorry you are getting down-voted. In fact, Intel and AMD have thousands of undocumented instructions that folks have only started [1] to enumerate. Even then, some of the instructions may not be visible through this method of enumeration. FB would have the option to add their own instructions. I would be curious to see how transparent they might be around the production and documentation of these chips.

[1] https://github.com/xoreaxeaxeax/sandsifter

The main reason is the cost to secure vs. the profits to be made. We technically could have more secure chips, but that would delay the regular release cycle of processors that businesses, investors and consumers have become accustomed to.

Now, if things were secure by default with no option to disable it, we would be in a much better place security-wise right now (I'm not sure how easy the usability would be at first, though), as engineers would have to adapt to programming more securely at the hardware and software levels. Though I did find it strange that, when you read the Intel documentation, not everything is accurate and there is a large number of illegal opcodes [0].

There is a chance that these vulnerabilities were reported internally and externally but never made it past a manager, or were never reported to the vendor, or were seen as not exploitable or merely theoretical by management or a higher-level engineer, until someone else found them and figured out how to exploit them.

This is the same case with encryption algorithms that have been implemented on-chip: in theory they are hard to brute-force, but with time a method is normally found that reduces this time, exploiting artifacts in the algorithm, or symptoms that occur during encryption/decryption, that were not thought of or documented.

[0] https://github.com/xoreaxeaxeax/sandsifter

Ah, this is the talk behind the sandsifter tool that was making its rounds a few weeks back. Nice to get the deeper picture.

github: https://github.com/xoreaxeaxeax/sandsifter

white paper: https://github.com/xoreaxeaxeax/sandsifter/blob/master/refer...