https://www.gnu.org/software/lightning/

>"GNU lightning is a library that generates assembly language code at run-time; it is very fast, making it ideal for Just-In-Time compilers, and it abstracts over the target CPU, as it exposes to the clients a standardized RISC instruction set inspired by the MIPS and SPARC chips."

[...]

>"GNU lightning is usable in complex code generation tasks. The available backends cover the aarch64, alpha, arm, hppa, ia64, mips, powerpc, risc-v, s390, sparc and x86 architectures."

https://www.gnu.org/software/lightning/manual/lightning.html

>"To be portable, GNU lightning abstracts over current architectures’ quirks and unorthogonalities.

The interface that it exposes is that of a standardized RISC architecture loosely based on the SPARC and MIPS chips. There are a few general-purpose registers (six, not including those used to receive and pass parameters between subroutines), and arithmetic operations involve three operands: either three registers or two registers and an arbitrarily sized immediate value.

On one hand, this architecture is general enough that it is possible to generate pretty efficient code even on CISC architectures such as the Intel x86 or the Motorola 68k families.

On the other hand, it matches real architectures closely enough that, most of the time, the compiler’s constant folding pass ends up generating code which assembles machine instructions without further tests."
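
To make the "three registers, or two registers and an immediate" shape concrete, here is a toy Python model. It is only an illustration and not the GNU lightning API: the function names `addr` and `addi` mirror the mnemonics the manual mentions, the register names R0-R2 and V0-V2 are the ones the manual lists in its instruction-set section, and the dictionary-based register file and `run_example` are hypothetical helpers invented for this sketch.

```python
# Toy model only: this is NOT the GNU lightning API, just an illustration
# of the three-operand register/immediate shape the manual describes.

REGISTERS = ("R0", "R1", "R2", "V0", "V1", "V2")  # the six guaranteed GPRs


def new_register_file():
    """Return a register file with every general-purpose register zeroed."""
    return {name: 0 for name in REGISTERS}


def addr(regs, dst, src1, src2):
    """Register form: dst = src1 + src2, all three operands are registers."""
    regs[dst] = regs[src1] + regs[src2]


def addi(regs, dst, src, imm):
    """Immediate form: dst = src + imm, the last operand is a constant."""
    regs[dst] = regs[src] + imm


def run_example():
    """Execute three instructions and return the final register file."""
    regs = new_register_file()
    addi(regs, "R0", "R0", 5)     # R0 = 0 + 5
    addi(regs, "R1", "R0", 37)    # R1 = 5 + 37
    addr(regs, "V0", "R0", "R1")  # V0 = 5 + 42
    return regs


if __name__ == "__main__":
    print(run_example())
```

The destination is always named explicitly, so a backend for a two-operand CISC machine such as x86 would have to insert a move when the destination differs from both sources. The real library handles that internally; this model ignores it.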

[...]

>"3 GNU lightning’s instruction set

GNU lightning’s instruction set was designed by deriving instructions that closely match those of most existing RISC architectures, or that can be easily synthesized if absent. Each instruction is composed of:

     an operation, like sub or mul
     most times, a register/immediate flag (r or i)
     an unsigned modifier (u), a type identifier or two, when applicable.

Examples of legal mnemonics are addr (integer add, with three register operands) and muli (integer multiply, with two register operands and an immediate operand). Each instruction takes two or three operands; in most cases, one of them can be an immediate value instead of a register.

Most GNU lightning integer operations are signed wordsize operations, with the exception of operations that convert types, or load or store values to/from memory. When applicable, the types and C types are as follows:

     _c         signed char
     _uc        unsigned char
     _s         short
     _us        unsigned short
     _i         int
     _ui        unsigned int
     _l         long
     _f         float
     _d         double

Most integer operations do not need a type modifier, and when loading or storing values to memory there is an alias to the proper operation using wordsize operands, that is, if omitted, the type is int on 32-bit architectures and long on 64-bit architectures. Note that lightning also expects sizeof(void*) to match the wordsize.

When an unsigned operation result differs from the equivalent signed operation, there is the _u modifier.

There are at least seven integer registers, of which six are general-purpose, while the last is used to contain the frame pointer (FP). The frame pointer can be used to allocate and access local variables on the stack, using the allocai or allocar instruction.

Of the general-purpose registers, at least three are guaranteed to be preserved across function calls (V0, V1 and V2) and at least three are not (R0, R1 and R2).

Six registers are not very much, but this restriction was forced by the need to target CISC architectures which, like the x86, are poor of registers; anyway, backends can specify the actual number of available registers with the calls JIT_R_NUM (for caller-save registers) and JIT_V_NUM (for callee-save registers).

[PDS: I think that, despite this trade-off, this was a good engineering decision(!); otherwise x86 could not be targeted, and that would destroy a huge swath of functionality!]

There are at least six floating-point registers, named F0 to F5. These are usually caller-save and are separate from the integer registers on the supported architectures; on Intel architectures, in 32-bit mode if SSE2 is not available or use of X87 is forced, the register stack is mapped to a flat register file. As for the integer registers, the macro JIT_F_NUM yields the number of floating-point registers.

The complete instruction set follows; as you can see, most non-memory operations only take integers (either signed or unsigned) as operands; this was done in order to reduce the instruction set, and because most architectures only provide word and long word operations on registers. There are instructions that allow operands to be extended to fit a larger data type, both in a signed and in an unsigned way."

[PDS: Opinion: Any way that the instruction set can be simplified is a good thing!]
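
The mnemonic composition rule quoted above (operation, then an r/i flag, then an optional `_u` modifier or type suffix) can be sketched as a small decoder. This is a hypothetical Python helper written for illustration, not part of GNU lightning. `split_mnemonic` and `TYPE_SUFFIXES` are invented names, the label "wordsize" is just this sketch's stand-in for "no suffix given", and it only handles the simple pattern described in the quote rather than every instruction in the manual.

```python
# Illustrative only: a hypothetical helper, not part of GNU lightning.
# It splits a mnemonic into operation, r/i flag, and optional suffix,
# following the composition rule quoted from the manual.

TYPE_SUFFIXES = {
    "c": "signed char",
    "uc": "unsigned char",
    "s": "short",
    "us": "unsigned short",
    "i": "int",
    "ui": "unsigned int",
    "l": "long",
    "f": "float",
    "d": "double",
}


def split_mnemonic(mnemonic):
    """Return (operation, operand_kind, modifier) for names like 'addr'."""
    base, _, suffix = mnemonic.partition("_")
    if len(base) < 2 or base[-1] not in ("r", "i"):
        raise ValueError(f"no r/i flag in {mnemonic!r}")
    if suffix == "":
        modifier = "wordsize"            # default: signed word-sized operand
    elif suffix == "u":
        modifier = "unsigned"            # the _u modifier
    elif suffix in TYPE_SUFFIXES:
        modifier = TYPE_SUFFIXES[suffix]  # explicit C type
    else:
        raise ValueError(f"unknown suffix in {mnemonic!r}")
    kind = "register" if base[-1] == "r" else "immediate"
    return base[:-1], kind, modifier


if __name__ == "__main__":
    for name in ("addr", "muli", "ldr_uc", "divr_u"):
        print(name, split_mnemonic(name))
```

`addr` and `muli` are the two examples the manual gives. `ldr_uc` and `divr_u` are included here as assumed examples of a typed load and an unsigned operation built from the same scheme.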

[...]

>"8 Acknowledgements

As far as I know, the first general-purpose portable dynamic code generator is DCG, by Dawson R. Engler and T. A. Proebsting. Further work by Dawson R. Engler resulted in the VCODE system; unlike DCG, VCODE used no intermediate representation and directly inspired GNU lightning.

Thanks go to Ian Piumarta, who kindly accepted to release his own program CCG under the GNU General Public License, thereby allowing GNU lightning to use the run-time assemblers he had written for CCG. CCG provides a way to dynamically assemble programs written in the underlying architecture’s assembly language. So it is not portable, yet very interesting.

I also thank Steve Byrne for writing GNU Smalltalk, since GNU lightning was first developed as a tool to be used in GNU Smalltalk’s dynamic translator from bytecodes to native code."

[...]

PDS: Overall, looks like a great idea!

Also, I think that future compiler and VM writers and FPGA soft-CPU authors -- should target this "abstracted instruction set"!

Keywords: Instruction Set Architecture, ISA, x86, RISC, RISC-V, Abstraction, Abstract Instruction Set, Instruction Set Subset, Generic Compatibility Layer

> I think that future compiler and VM writers and FPGA soft-CPU authors -- should target this "abstracted instruction set"!

GNU lightning succeeds in what it sets out to do, which is to offer a simple and minimal JIT code-generator. It offers nothing in the way of optimisation, by design. Most projects looking for a code-generator are looking for something with great optimisation built-in, so they're not wrong to go with LLVM or the JVM rather than GNU lightning (or something similar like MIR [0][1]). I don't think the average compiler would gain much by targeting GNU lightning.

With all that said, GNU Guile, a Scheme interpreter, uses a fork of GNU lightning, insufferably named lightening. [2]

[0] https://github.com/vnmakarov/mir

[1] https://lists.gnu.org/archive/html/lightning/2020-02/msg0001...

[2] https://wingolog.org/archives/2019/05/24/lightening-run-time...