I think the MinGW project is very useful, but I simply do not recommend using it.

My experience with it was trying to debug why a friend’s build worked on Windows but not under Wine. I figured it out: MinGW implements something called pseudo relocations, which apply symbol-relative relocations from startup code at the entry point of your executable. They’re needed because, when compiling, you don’t know whether an unresolved symbol will be external or not, so the code is generated as if it’s not. At link time, when you finally know that the symbol is external, it’s too late to change the code to go through an import table indirection.

On ELF platforms, relocations can refer to symbols in other libraries during runtime linking. So external symbols can be treated very similarly to internal ones, just with the linking done at runtime instead of at build time.

On Windows, you have to use `__declspec(dllimport)` or some such to inform the compiler that the symbol will be external. It can then generate code for referring to the IAT.

MinGW wanted to support compiling UNIXy software that didn’t use this attribute, so it does not require it. Instead… the ELF-style pseudo-relocations are used. So at the entrypoint, your MinGW program or DLL loops through a table of pointers inside machine code, looks up their real address in the IAT, and performs fixups. It will even temporarily change the protection of the relevant pages to be writable. Check to see if your MinGW program has unexpected imports to kernel32.VirtualProtect! (IIRC this can also happen if you trigger executable stack by using GCC trampolines, another reason to not use GCC…)
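To make the fixup loop concrete, here is a minimal sketch in C. It is a simulation on a plain byte buffer, not the real mingw-w64 runtime: the `demo_fixup` record, the `demo_apply_fixups` function, and the simulated IAT array are all hypothetical names invented for illustration, and it patches full 64-bit slots for simplicity (narrower fields are where range becomes an issue).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical fixup record; NOT the real mingw-w64 runtime layout. */
typedef struct {
    size_t iat_index;    /* which slot of the simulated IAT holds the real address */
    size_t patch_offset; /* byte offset of the 8-byte field to patch in the image */
} demo_fixup;

/* Walk the table and copy each resolved address into the code image.
 * Returns the number of fixups applied. */
static size_t demo_apply_fixups(uint8_t *image, size_t image_len,
                                const uint64_t *iat, size_t iat_len,
                                const demo_fixup *table, size_t count)
{
    size_t applied = 0;
    for (size_t i = 0; i < count; i++) {
        assert(table[i].iat_index < iat_len);
        assert(table[i].patch_offset + sizeof(uint64_t) <= image_len);
        /* A real runtime patching mapped code would first have to make the
         * page writable (VirtualProtect on Windows) and restore it afterwards.
         * A plain buffer needs neither. */
        memcpy(image + table[i].patch_offset,
               &iat[table[i].iat_index], sizeof(uint64_t));
        applied++;
    }
    return applied;
}
```

Calling `demo_apply_fixups` with a two-entry table copies the two simulated IAT values into the buffer at the recorded offsets and returns 2.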

(edit: Please read the replies below, but according to MinGW developers, this only happens in some edge cases; ordinarily, thunks will be used, which do not have this problem. In addition, MinGW binaries will always link to VirtualProtect, because the pseudo-reloc code exists whether or not you use them. I never noticed it before, so that's interesting.)

There’s only one problem: this is extremely failure-prone. For example, if you are on AMD64, the typical CALL instruction contains a signed 32-bit RIP-relative offset, which only reaches about 2 GiB in either direction. But in a 64-bit address space, two modules can happily be more than 2^31 bytes apart in memory. I assume ELF platforms have a linker that’s aware of this, since the linker knows about these special relocations, and thus has to ensure the binaries are in range… but I don’t know much about ELF linking, so don’t take that as fact.
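The range limit can be checked with a few lines of arithmetic. The sketch below is only an illustration: `demo_rel32_fits` is a hypothetical helper, and any addresses passed to it are made-up examples. It computes the displacement a relative CALL would need, measured from the end of the instruction, and reports whether it fits in a signed 32-bit field.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Returns true and stores the displacement if `target` is reachable from
 * `next_insn` (the address just past the CALL) with a signed 32-bit offset. */
static bool demo_rel32_fits(uint64_t next_insn, uint64_t target, int32_t *out)
{
    assert(out != NULL);
    /* Unsigned subtraction wraps modulo 2^64; reinterpreting the result as
     * signed gives the distance on the two's-complement targets in question. */
    int64_t delta = (int64_t)(target - next_insn);
    if (delta < INT32_MIN || delta > INT32_MAX) {
        return false; /* too far: a 32-bit field would silently truncate */
    }
    *out = (int32_t)delta;
    return true;
}
```

For instance, a call from `0x140001005` to `0x140002000` needs a displacement of `0xFFB` and fits, while a target at `0x7FFB00001000` is far outside the ±2 GiB window and is rejected.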

So why does this break on Wine and not Windows? Well, as far as I can tell, it doesn’t break on Windows because for some reason ASLR never puts any of the modules too far apart. I haven’t figured out why. Wine doesn’t implement ASLR and loads modules at their preferred addresses as long as those are available, and those addresses wind up being too far apart. The pseudo reloc code does not output any errors when the address overflows; that would probably be worth a pull request, but I am still puzzling over why Windows doesn’t ever seem to cause the problem.

Unless I am missing something crucial, it’s possible that a large number of Windows x64 binaries compiled with MinGW work almost entirely by accident, and maybe not even every time.

Most symbols like this are function calls, so a significantly safer approach would be for MinGW to generate a thunk for those, and then just fail at linking when encountering a data import that isn’t decorated. Unfortunately, I don’t make the calls. :P

I don’t mean to disparage anyone in particular. I just generally recommend against MinGW, even though it is virtually the only free software option for compiling Windows binaries.

mstorsjo

In general, this doesn't work by accident. In practice, whenever calling a function from a different DLL, normally a thunk is inserted (if it isn't marked dllimport), and when referencing a variable that might reside in a different DLL (and be autoimported), its address is read indirectly via a full 64 bit pointer.

When linking to a function "func" in another DLL, the import library normally provides two symbols: "func", a thunk that allows it to be called from anywhere, and "__imp_func", the IAT entry (containing a full 64-bit address). If the caller called it via a dllimport-marked declaration, it uses "__imp_func"; otherwise it uses "func" and ends up jumping via the thunk.
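The two paths can be modelled in portable C. This is only a simulation with hypothetical names (`demo_imp_func` stands in for the "__imp_func" IAT slot, `demo_func_thunk` for the "func" thunk); a real import library emits these as linker symbols, not as C code.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for code that would live in another DLL. */
static int demo_remote_func(int x)    { return x + 1; }
static int demo_remote_func_v2(int x) { return x + 100; }

/* Stand-in for the IAT slot "__imp_func": a full-width pointer that the
 * loader would fill in with the function's real address. */
static int (*demo_imp_func)(int) = demo_remote_func;

/* Stand-in for the thunk "func": nearby code that jumps through the slot. */
static int demo_func_thunk(int x)
{
    assert(demo_imp_func != NULL); /* the slot must have been filled in */
    return demo_imp_func(x);
}

/* dllimport-style caller: indirect call straight through the slot. */
static int demo_call_dllimport(int x) { return demo_imp_func(x); }

/* Undecorated caller: direct call to the thunk, which does the indirection. */
static int demo_call_plain(int x) { return demo_func_thunk(x); }
```

Both callers end up in whatever function the slot points at, so repointing `demo_imp_func` to `demo_remote_func_v2` changes the result of both. Neither path depends on how far away the target is, because the slot holds a full pointer.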

In some cases, when mingw-w64 wants to override importing certain functions (by providing statically linked versions of them instead), the functions are marked "DATA" in the def file (which is used for generating the import libraries). Then the thunk is left out and only the "__imp_func" symbol is left. If you didn't end up linking the intended replacement function, then it's possible to end up autoimporting something that only works if it's loaded close enough. (In hindsight, it might have been even better to comment out those functions entirely from the import libraries.)

If you do hit this issue again (or if you remember exactly how to reproduce it), then please do file a bug somewhere so we can look into it.

But in general, mingw-w64 linked binaries, even including ones using the autoimport facility and runtime pseudo relocations, should work even if they are loaded far apart from each other.

ChrisSD

While you're here, do you happen to know if mingw will gain support for native thread local storage? Emulated TLS is still the only option, right?

mstorsjo

Clang (and lld) do support native TLS, and mingw-w64 does have the things that are needed. I think binutils also might have what's needed too, but AFAIK the thing that's missing is support for it in GCC.

Actually, (upstream) Clang defaults to native TLS instead of emulated TLS. In MSYS2, though, Clang is overridden to use emulated TLS by default to interoperate better with GCC-built code and libstdc++.

The toolchain I maintain, https://github.com/mstorsjo/llvm-mingw, defaults to native TLS throughout.