What does HackerNews think of robin-hood-hashing?
Fast & memory efficient hashtable based on robin hood hashing for C++11/14/17/20
Language:
C++
std::unordered_map is notoriously slow, several times slower than a fast hashmap implementation like Google's absl or Martin's robin-hood-hashing [1]. In addition, your implementation lets the hashmap grow from a small size. As you know the total length, you could preallocate the hash table and avoid rehashing. As typical hash table implementations double the capacity during rehashing, preallocating the buckets should be about twice as fast if we don't consider the cache and the complex mechanism behind the STL allocator.
That said, std::sort is not the fastest sort implementation, either. It is hard to say which is faster for a small N of 10000. Your general point is still right in that an O(n*log(n)) algorithm is not necessarily slower than an O(n) algorithm in practice.
For anyone in a situation where an unordered_set or unordered_map is in a hot part of the code, I'd also highly recommend Robin Hood: https://github.com/martinus/robin-hood-hashing
It made a huge difference in one of the programs I was running.
EDIT: I could have sworn they used to also support the ordered versions, but maybe not.
Got a bit better C++ version here which uses a couple libraries instead of std:: stuff - https://gist.github.com/jcelerier/74dfd473bccec8f1bd5d78be5a... ; boost, fmt and https://github.com/martinus/robin-hood-hashing
$ g++ -I robin-hood-hashing/src/include -O2 -flto -std=c++20 -fno-exceptions -fno-unwind-tables -fno-asynchronous-unwind-tables -lfmt
$ time ./a.out < kjvbible_x10.txt > /dev/null
0,19s user 0,01s system 99% cpu 0,197 total
with the same build flags, optimize.cpp gives me 0,22s user 0,01s system 99% cpu 0,233 total