This is amazing. One question, out of curiosity: why C? Why not standard C++?

That project already exists: https://github.com/ggerganov/llama.cpp