I looked into this project when it was first announced. The “in Rust” part seems more aspiration than reality. For those who may not know, Knuth originally wrote TeX in a language called WEB, which is basically Pascal with some preprocessing to make it usable and documentable. Later extensions to TeX, including eTeX, pdfTeX and XeTeX, have also been written in WEB. The existing TeX distributions (TeX Live, MiKTeX, etc.) at their core first translate this WEB/Pascal into (autogenerated and basically unreadable) C, and then run that through a C compiler.

What this project has done is take the auto-generated C translation of xetex.web and wrap this (unreadable) C code in Rust, which is an odd choice to say the least. It seems (from Reddit comments by the author) that the reason is that the author of this project was at the time unaware of LuaTeX, which (starting with a manual and readable translation into C years ago) is actually written in C.

All these odd choices aside, and barring the somewhat misleading “in Rust” description (misleading for another reason: the TeX/LaTeX ecosystem is mostly TeX macros rather than the core “engine” anyway), there are some good user-experience decisions made by this project. With a regular TeX distribution these would be achieved with something like latexmk/rubber/arara, which too are wrappers around TeX much like this project.

There is still room for someone to do a “real” rewrite of TeX (in Rust or whatever), but as someone on TeX.SE said, it is very easy to start a rewrite of TeX; the challenge is to finish it.

Efforts to ditch any C remnants are being made[1].

[1] https://github.com/crlf0710/tectonic/

Thanks. Is there any code to look at? (Couldn't easily find anything relevant on that repo…) I didn't mention it earlier, but I'm collecting a list of alternative TeX implementations[1] so if there's even (say) a dozen lines of WEB that have been converted to Rust, I'd be very eager to take a look and compare — it would be illuminating.

[1]: https://tex.stackexchange.com/questions/507846

Edit: To answer my own question, the relevant source (corresponding to the main part of xetex.web) is here: https://github.com/crlf0710/tectonic/blob/2580c55/engine/src... I've posted a side-by-side comparison of some example code, first from the official xetex.web listing and then from the “Rust” code, here: https://gist.github.com/shreevatsa/627399d0150e66d211a264bc0...

You can draw your own conclusions from comparing the code samples, but to point out a few obvious differences:

• What has the symbolic name “fi_or_else” in the WEB code has become the magic number “108” in the Rust code. (This is because the author of this project decided to have their Rust code start from the autogenerated C code which has already lost this symbolic information.)

• What is simply “if tracing_ifs>0” in the WEB code is 29 lines of Rust code, involving a magic offset into eqtb.

• The comments from the original are gone.

• Something like “cur_if:=subtype(p);” becomes “cur_if = (*mem.offset(p as isize)).b16.s0 as small_number;”.
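To make the first and last bullets concrete, here is a minimal sketch of the same field access written in both styles. Everything in it is hypothetical and simplified for illustration (the `MemoryWord` and `B16` types, and the helpers `subtype_raw`, `subtype` and `is_fi_or_else` are my own stand-ins, not the project's definitions); only the value 108 and the `.b16.s0` access pattern come from the listing above.

```rust
// Hypothetical sketch: these types and helpers are illustrative stand-ins
// and are not taken from the tectonic repository.

/// Named replacement for the magic number 108 mentioned above.
const FI_OR_ELSE: u16 = 108;

/// Simplified stand-in for the 16-bit fields of a TeX memory word.
#[derive(Clone, Copy, Default)]
struct B16 {
    s0: u16,
}

/// Simplified stand-in for one word of TeX's `mem` array.
#[derive(Clone, Copy, Default)]
struct MemoryWord {
    b16: B16,
}

/// Translated style: raw pointer arithmetic with no bounds check.
///
/// # Safety
/// `mem` must point to at least `p + 1` valid words and `p` must be non-negative.
unsafe fn subtype_raw(mem: *const MemoryWord, p: i32) -> u16 {
    unsafe { (*mem.offset(p as isize)).b16.s0 }
}

/// Cleaned-up style: a slice with a bounds-checked index, named after WEB's `subtype(p)`.
fn subtype(mem: &[MemoryWord], p: usize) -> u16 {
    mem[p].b16.s0
}

/// Compares a command code against the named constant instead of a bare 108.
fn is_fi_or_else(cmd: u16) -> bool {
    cmd == FI_OR_ELSE
}

fn main() {
    let mem = vec![MemoryWord::default(), MemoryWord { b16: B16 { s0: 3 } }];

    // Both styles read the same field; only the second is checked by the compiler and at runtime.
    let raw = unsafe { subtype_raw(mem.as_ptr(), 1) };
    let cur_if = subtype(&mem, 1);
    assert_eq!(raw, cur_if);

    println!("cur_if = {cur_if}, fi_or_else? {}", is_fi_or_else(108));
}
```

The second form is roughly what I would expect a cleanup to aim for: the call site reads like the WEB original again, and the `unsafe` disappears from it.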

I wonder how maintainable such Rust code will be. These problems are not insurmountable, and the code can always be cleaned up later, I guess… my point is simply that, at the moment, it is not idiomatic Rust.

It's the result of automated c2rust[1] conversion. Of course it will be cleaned up. The c2rust project itself provides handy refactoring tools that are scriptable in Lua. A lot of the code will be removed; for example, image handling can be done through the image-rs crate.

[1] https://github.com/immunant/c2rust