Oh my. Comparison sorts are still slower than counting-based sorts, especially radix sort, on typical architectures. (Radix sort is usually coupled with insertion sort for small sizes.)
Or burstsort if the data size is truly huge.
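For anyone who hasn't seen the radix + insertion sort combination, here is a minimal sketch, assuming 32-bit unsigned keys and one byte per pass. The names `radix_sort`, `insertion_sort` and the cutoff `kSmallCutoff` are made up for illustration, and the cutoff value is a guess that would need measuring on real hardware.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical cutoff: below this size we fall back to insertion sort.
// The value is an assumption; tune it by benchmarking.
constexpr std::size_t kSmallCutoff = 32;

// Plain insertion sort, used only for small inputs.
inline void insertion_sort(std::vector<std::uint32_t>& v) {
    for (std::size_t i = 1; i < v.size(); ++i) {
        std::uint32_t key = v[i];
        std::size_t j = i;
        while (j > 0 && v[j - 1] > key) {
            v[j] = v[j - 1];
            --j;
        }
        v[j] = key;
    }
}

// LSD radix sort: four stable counting passes, one per byte,
// starting from the least significant byte.
inline void radix_sort(std::vector<std::uint32_t>& v) {
    if (v.size() <= kSmallCutoff) {
        insertion_sort(v);
        return;
    }
    std::vector<std::uint32_t> buf(v.size());
    std::vector<std::uint32_t>* src = &v;
    std::vector<std::uint32_t>* dst = &buf;
    for (unsigned shift = 0; shift < 32; shift += 8) {
        // Histogram of the current byte.
        std::array<std::size_t, 256> count{};
        for (std::uint32_t x : *src) ++count[(x >> shift) & 0xFF];
        // Exclusive prefix sum turns counts into starting offsets.
        std::size_t sum = 0;
        for (std::size_t& c : count) {
            std::size_t n = c;
            c = sum;
            sum += n;
        }
        // Stable scatter into the other buffer.
        for (std::uint32_t x : *src) (*dst)[count[(x >> shift) & 0xFF]++] = x;
        std::swap(src, dst);
    }
    // After an even number of passes the sorted data is back in v.
    assert(src == &v);
}
```

Note that no keys are ever compared in the radix path; the work is a fixed number of linear passes, which is where the advantage over comparison sorts comes from for keys like these.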
As usual, Stack Overflow is missing the forest for the trees.
It's not even the fastest comparison-based sorter for the vast majority of inputs. There are much faster sorters out there that exploit the details of modern hardware (branch prediction is a big one, and parallelism is an obvious win if you can make use of it), e.g. https://github.com/SaschaWitt/ips4o/ (sequential and parallel; disclaimer: the authors are my colleagues) or https://github.com/orlp/pdqsort (sequential).
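To give a rough idea of what "exploiting branch prediction" can mean, here is a toy sketch of a partition step written without a data-dependent `if`. This is not code from ips4o or pdqsort; the function name `partition_branchless` is made up, and whether the compiler actually emits branch-free machine code is an assumption you would have to check in the generated assembly.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Lomuto-style partition around `pivot` with no data-dependent branch
// in the loop body: every element is swapped unconditionally, and the
// comparison result is only used as an integer (0 or 1) to advance the
// boundary.
// Returns the number of elements smaller than `pivot`; afterwards those
// elements occupy the front of the vector.
inline std::size_t partition_branchless(std::vector<int>& v, int pivot) {
    std::size_t store = 0;  // v[0..store) holds elements < pivot
    for (std::size_t i = 0; i < v.size(); ++i) {
        int x = v[i];
        v[i] = v[store];
        v[store] = x;
        store += static_cast<std::size_t>(x < pivot);
    }
    assert(store <= v.size());  // sanity check on the split point
    return store;
}
```

With random input, a conventional `if (x < pivot) swap(...)` is mispredicted roughly half the time, so trading the branch for a few extra moves can pay off. The real libraries use considerably more elaborate schemes than this.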