Recently worked on a CPU JIT targeting x86-64 host as well, so had some fun reading the source code. I was delighted to discover it to be pretty much as simple as it can be.
The author is lucky brainfuck is so simple, no need for the hairy generic MOD/RM [0] addressing mode encoding.
Of course instruction encoding is far from the hairiest part of the process of JITting. Data flow simplification (= don't generate unnecessary intermediates, like target architecture flag register bits that never affect execution), register allocation and instruction selection are pretty complicated optimization problems.
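To make the flag example concrete, here is a rough sketch of a backward liveness pass over one block of decoded instructions. Everything in it (the `ir_insn` struct, the flag bit names, `mark_needed_flags`) is made up for illustration and not taken from any particular JIT; it just assumes each instruction knows which flags it reads and which it writes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical guest flag bits, for illustration only. */
enum { FLAG_Z = 1 << 0, FLAG_C = 1 << 1, FLAG_N = 1 << 2, FLAG_V = 1 << 3 };

/* Hypothetical per-instruction record in the JIT's intermediate form. */
typedef struct {
    uint8_t flags_read;     /* flags this instruction consumes */
    uint8_t flags_written;  /* flags this instruction defines */
    uint8_t flags_needed;   /* output: written flags that must really be computed */
} ir_insn;

/* Walk the block backwards, tracking which flags are still live.
 * live_out is the set of flags assumed live after the last instruction. */
void mark_needed_flags(ir_insn *insns, size_t count, uint8_t live_out)
{
    assert(insns != NULL || count == 0);
    uint8_t live = live_out;
    for (size_t i = count; i-- > 0; ) {
        /* Only flags somebody reads later are worth generating code for. */
        insns[i].flags_needed = (uint8_t)(insns[i].flags_written & live);
        /* Standard liveness step: (live - defined) + used. */
        live = (uint8_t)((live & ~insns[i].flags_written) | insns[i].flags_read);
    }
}
```

For a block like "add, sub, branch-if-zero" this marks the add as needing no flags at all and the sub as needing only the zero flag, so the code generator can skip the rest. A real JIT has to pick `live_out` conservatively when it doesn't know what follows the block.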
Oh, and indirect calls and jumps. How you'll hate those when writing a CPU JIT... Especially those that very inconveniently jump in the middle of a previously generated JIT trace.
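One common way to cope with indirect jumps is a lookup table from guest address to generated host code, consulted at run time. Below is a minimal sketch of such a table as a direct-mapped cache. The names (`block_cache`, `cache_insert`, `cache_lookup`), the slot count and the assumption of 4-byte aligned guest instructions are all mine, purely for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_SLOTS 1024u  /* assumed size; must be a power of two */

typedef struct {
    uint32_t guest_pc;   /* guest address where the translated block starts */
    void *host_code;     /* generated host code, NULL if the slot is empty */
} block_entry;

typedef struct {
    block_entry slots[CACHE_SLOTS];
} block_cache;

/* Assumes 4-byte aligned guest instructions, so the low two bits carry no information. */
static size_t slot_for(uint32_t guest_pc)
{
    return (guest_pc >> 2) & (CACHE_SLOTS - 1u);
}

/* Register a translated block; a colliding older entry is simply overwritten. */
void cache_insert(block_cache *cache, uint32_t guest_pc, void *host_code)
{
    assert(cache != NULL && host_code != NULL);
    block_entry *e = &cache->slots[slot_for(guest_pc)];
    e->guest_pc = guest_pc;
    e->host_code = host_code;
}

/* Returns the host code for a block starting exactly at guest_pc, or NULL on a miss. */
void *cache_lookup(const block_cache *cache, uint32_t guest_pc)
{
    assert(cache != NULL);
    const block_entry *e = &cache->slots[slot_for(guest_pc)];
    if (e->host_code != NULL && e->guest_pc == guest_pc)
        return e->host_code;
    return NULL;
}
```

Only block entry points are registered, so a jump into the middle of an existing trace is just a miss. The dispatcher would then translate a fresh block starting at that address, at the cost of translating some guest code twice.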
[0]: MOD/RM has four modes (MOD).
One for simply accessing a register. Three for addressing memory through pointers in various ways. The most complicated case is SIB (scale + index + base) addressing mode. In C, it'd be something like:
addr = (reg1 * scale) + reg2 + value;
Where scale is 1, 2, 4 or 8, value is a signed 8 or 32-bit value. Reg1 can't be SP/ESP/RSP due to encoding limitations. Some x86(-64) register combinations can't be encoded directly, so the encoder will need to play around and possibly "upgrade" the addressing mode to a more complicated SIB version.
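As a rough illustration of that "upgrade" logic, here is a sketch of a helper that emits the ModRM byte, an optional SIB byte and the displacement for a memory operand of the form base + index*scale + displacement. The function name and interface are hypothetical. It deliberately ignores REX prefixes (so only registers 0-7), RIP-relative addressing and base-less operands.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not from any real encoder.
 * reg   : value for the ModRM.reg field, 0-7
 * base  : base register number, 0-7 (4 = RSP, 5 = RBP)
 * index : index register number, 0-7, or -1 for none (4 = RSP is not encodable)
 * scale : 1, 2, 4 or 8
 * Writes at most 6 bytes to out; returns the byte count, or 0 if not encodable. */
size_t encode_mem_operand(uint8_t *out, int reg, int base, int index,
                          int scale, int32_t disp)
{
    assert(out != NULL);
    if (reg < 0 || reg > 7 || base < 0 || base > 7) return 0;
    if (index < -1 || index > 7 || index == 4) return 0;

    int scale_bits;
    switch (scale) {
    case 1: scale_bits = 0; break;
    case 2: scale_bits = 1; break;
    case 4: scale_bits = 2; break;
    case 8: scale_bits = 3; break;
    default: return 0;
    }

    /* MOD: 0 = no displacement, 1 = 8-bit, 2 = 32-bit displacement.
     * MOD=0 with base RBP (5) means something else entirely, so [rbp]
     * has to be upgraded to [rbp + 0] with an explicit 8-bit zero. */
    int mod;
    if (disp == 0 && base != 5) mod = 0;
    else if (disp >= -128 && disp <= 127) mod = 1;
    else mod = 2;

    size_t n = 0;
    /* RM=4 means "SIB byte follows", so a plain [rsp] base also needs a SIB. */
    int need_sib = (index != -1) || (base == 4);
    if (need_sib) {
        int idx = (index == -1) ? 4 : index;  /* index field 4 = no index */
        out[n++] = (uint8_t)((mod << 6) | (reg << 3) | 4);
        out[n++] = (uint8_t)((scale_bits << 6) | (idx << 3) | base);
    } else {
        out[n++] = (uint8_t)((mod << 6) | (reg << 3) | base);
    }

    if (mod == 1) {
        out[n++] = (uint8_t)disp;             /* low byte, two's complement */
    } else if (mod == 2) {
        uint32_t u = (uint32_t)disp;          /* little-endian 32-bit displacement */
        for (int i = 0; i < 4; i++)
            out[n++] = (uint8_t)(u >> (8 * i));
    }
    return n;
}
```

For example, with reg = 0 (RAX), base = 3 (RBX), index = 1 (RCX), scale = 4 and disp = 8 this produces the bytes 44 8B 08, which is the tail of `mov rax, [rbx+rcx*4+8]` after the 48 8B prefix and opcode.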
For the more curious, this page was very informative when I was writing my addressing mode encoder:
https://wiki.osdev.org/X86-64_Instruction_Encoding#ModR.2FM_...
Yes, the project was fun, yet simple and fast to write! I was thinking about doing some more advanced optimizations, although I believe the opportunities for that are pretty limited for Brainfuck. Also, I think the code is pretty nice, clean and generic enough to be partially moved to another possible project, which is the benefit of making things more generic than they strictly needed to be.
You said you were working on another JIT - what was that? Also, do you have any suggestions for another project/language to implement a compiler for, to gain some knowledge about writing optimizations, actual interpretation/compilation logic, etc?
Thanks for the additional resources about the instruction encoding - seems like this is a bit easier to understand than the AMD64 documentation!
> You said you were working on another JIT - what was that?
A JIT for a custom 32-bit RISC soft-CPU (FPGA).
> Also, do you have any suggestions for another project/language to implement a compiler for, to gain some knowledge about writing optimizations, actual interpretation/compilation logic, etc?
You could write another C compiler. Or a JIT for some scripting language. Lua is about as simple as it gets when it comes to JITtable scripting languages. Of course, the amazing LuaJIT (http://luajit.org/) already exists.
Existing CPUs and bytecode formats are also potentially good.
In particular, other people might actually want to use a small WebAssembly (https://webassembly.org/) AOT/JIT compiler. I know I would. :)
Some source code I've personally found educational on this topic:
1) Lots of easier stuff (ARM/x86 encoding, basic optimizations, etc.) can be found in JITting emulators, like Dolphin (https://github.com/dolphin-emu/dolphin)
2) Web browser JavaScript engines, like V8 (https://github.com/v8/v8/tree/master/src). And of course LuaJIT.
3) Compiler codegen, like LLVM (https://github.com/llvm-mirror/llvm)