Cool!

Not sure if this approach would work for CPUs with bigger instruction sets or more complex instructions. For example, the Game Boy CPU's "DAA" instruction [1] is notorious for being tricky to implement when writing a GB emulator.

I guess handling the tedious but easy-to-implement ones with this code generation method would still be helpful, leaving the more complicated ones to be written manually!

[1] https://ehaskins.com/2018-01-30%20Z80%20DAA/
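
For a sense of why DAA resists a table-driven description, here is a minimal Python sketch of one common formulation of the Game Boy variant. It is not taken from the linked article; the function name `daa` and the flag-tuple layout are my own assumptions, and the Z80's DAA is assumed to differ in its flag details, so don't read this as a Z80 implementation.

```python
def daa(a, n, h, c):
    """Adjust A to packed BCD after an add (n=False) or a subtract (n=True).

    a is the 8-bit accumulator; n, h, c are the subtract, half-carry and
    carry flags. Returns (a, z, n, h, c) as assumed for the Game Boy CPU.
    """
    if not n:
        # After an addition: fix the high digit first, then the low digit.
        if c or a > 0x99:
            a += 0x60
            c = True
        if h or (a & 0x0F) > 0x09:
            a += 0x06
    else:
        # After a subtraction: only the recorded borrows decide the fixup.
        if c:
            a -= 0x60
        if h:
            a -= 0x06
    a &= 0xFF  # wrap to 8 bits, as a uint8_t register would
    # Z reflects the adjusted result, H is always cleared, N is unchanged.
    return a, a == 0, n, False, c
```

The result depends on three flags left behind by the previous instruction, which is what makes it awkward to express as pure data: `daa(0x3C, False, False, False)` is the fixup after computing 0x15 + 0x27.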

I'm using a code generation approach for both the 6502 and the Z80, though not as extreme or elegant as what's demonstrated here.

Instead of a pure data description, I have Python scripts which generate C source code. The 6502 is perfect for code generation because its instructions are very uniform, and the "interesting" part of an instruction is the addressing mode, which always runs the same sequence of operations ahead of the actual instruction-specific "payload". The Z80 instruction set has many more special cases, but it can be decoded "algorithmically" as well, see here:

http://www.z80.info/decoding.htm
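
To give a rough idea of what "algorithmic" decoding means, the sketch below splits an unprefixed opcode byte into the x/y/z/p/q bit fields that the page above describes and uses them to name two of the regular blocks. The function names `split_opcode` and `describe` are hypothetical, and only the x=1 and x=2 blocks are covered.

```python
# Register and ALU tables indexed by a 3-bit field, as on the decoding page.
R = ["B", "C", "D", "E", "H", "L", "(HL)", "A"]
ALU = ["ADD A,", "ADC A,", "SUB ", "SBC A,", "AND ", "XOR ", "OR ", "CP "]


def split_opcode(op):
    """Split an opcode byte into the bit fields x, y, z, p, q."""
    x = (op >> 6) & 3   # bits 7-6
    y = (op >> 3) & 7   # bits 5-3
    z = op & 7          # bits 2-0
    p = y >> 1          # bits 5-4
    q = y & 1           # bit 3
    return x, y, z, p, q


def describe(op):
    """Return a mnemonic for the x=1 and x=2 blocks, else None."""
    x, y, z, _p, _q = split_opcode(op)
    if x == 1:
        # LD r[y],r[z], except the slot that would be LD (HL),(HL).
        if y == 6 and z == 6:
            return "HALT"
        return f"LD {R[y]},{R[z]}"
    if x == 2:
        # 8-bit ALU operation on the accumulator with operand r[z].
        return f"{ALU[y]}{R[z]}"
    return None  # x=0 and x=3 need more special cases; not sketched here
```

For example, `describe(0x78)` names the register-to-register load at that opcode, and the single `y == 6 and z == 6` check is the kind of special case the Z80 has many more of.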

My Z80 emulator basically implements this "recipe" in Python and generates a huge "unrolled" switch-case statement with one case branch per instruction (OK, not quite: the CB prefix instruction range is still decoded algorithmically to reduce the resulting binary code size a bit).
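
A toy version of that idea, just to show the shape of it: the Python below walks the 64 opcodes 0x40..0x7F and emits one C case branch for each. This is only a sketch, not the actual generator from my repository; the C-side names (`cpu->...`, `mem_read`, `mem_write`, `cpu->halted`) are made up for illustration.

```python
REGS = ["B", "C", "D", "E", "H", "L", "(HL)", "A"]


def gen_ld_switch():
    """Emit a C switch statement covering opcodes 0x40..0x7F (LD r,r' and HALT)."""
    lines = ["switch (opcode) {"]
    for y in range(8):          # destination field, bits 5-3
        for z in range(8):      # source field, bits 2-0
            opcode = 0x40 | (y << 3) | z
            dst, src = REGS[y].lower(), REGS[z].lower()
            if y == 6 and z == 6:
                comment, body = "HALT", "cpu->halted = true;"
            else:
                comment = f"LD {REGS[y]},{REGS[z]}"
                if y == 6:      # store register to memory at HL
                    body = f"mem_write(cpu->hl, cpu->{src});"
                elif z == 6:    # load register from memory at HL
                    body = f"cpu->{dst} = mem_read(cpu->hl);"
                else:           # plain register-to-register copy
                    body = f"cpu->{dst} = cpu->{src};"
            lines.append(f"    case 0x{opcode:02X}: {body} break; /* {comment} */")
    lines.append("    default: break;")
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(gen_ld_switch())
```

The generator stays short because the loop does the decoding once, at build time, while the emitted C pays no decoding cost beyond the switch itself.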

Complex instruction logic like DAA is still essentially hand-written C functions though; the code generation mainly helps with the "mundane" parts of an instruction, like the opcode fetch and regular memory load/store machine cycles.

One nice side effect of using code generation is that it is very easy to create variations of the emulator. For instance I created a cycle-stepped version of my 6502 emulator (versus the previous instruction-stepped version) with surprisingly few changes to the code-generation script.
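
A heavily simplified sketch of why such variations are cheap: if each addressing mode is a list of per-cycle C snippets and each instruction adds one payload snippet, the same tables can be emitted either as one case per instruction or as one case per cycle. Everything below is hypothetical (the table contents, the C helper names `mem_read`, `set_nz`, `fetch_next`, and the `(opcode << 3) | step` case index); it is not the code from my repository, and it only covers four load opcodes.

```python
# Steps that run before the payload, one memory access per step.
ADDR_MODES = {
    "zp":  ["addr = mem_read(pc++);"],
    "abs": ["addr = mem_read(pc++);", "addr |= mem_read(pc++) << 8;"],
}
# Instruction-specific "payload", run once the address is known.
PAYLOADS = {
    "LDA": "a = mem_read(addr); set_nz(a);",
    "LDX": "x = mem_read(addr); set_nz(x);",
}
OPCODES = {
    0xA5: ("LDA", "zp"), 0xAD: ("LDA", "abs"),
    0xA6: ("LDX", "zp"), 0xAE: ("LDX", "abs"),
}


def steps_for(op):
    """Addressing-mode steps followed by the instruction payload."""
    name, mode = OPCODES[op]
    return ADDR_MODES[mode] + [PAYLOADS[name]]


def gen_instruction_stepped():
    """One case per opcode; all steps run back to back."""
    lines = []
    for op, (name, mode) in sorted(OPCODES.items()):
        body = " ".join(steps_for(op))
        lines.append(f"case 0x{op:02X}: /* {name} {mode} */ {body} break;")
    return "\n".join(lines)


def gen_cycle_stepped():
    """One case per (opcode, step); the last step schedules the next fetch."""
    lines = []
    for op, (name, mode) in sorted(OPCODES.items()):
        steps = steps_for(op)
        for i, step in enumerate(steps):
            tail = " fetch_next();" if i == len(steps) - 1 else ""
            lines.append(
                f"case (0x{op:02X} << 3) | {i}: /* {name} {mode} */ {step}{tail} break;"
            )
    return "\n".join(lines)
```

Only the emit function differs between the two variants; the instruction tables are shared, which is roughly why the switch to cycle stepping needed so few changes in the script.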

PS: all of the code is here:

https://github.com/floooh/chips

PPS: an interesting approach (which I haven't tried) for 6502 emulation would be to use the 6502's decode ROM (aka PLA) as the "base-data" for code generation.