What does HackerNews think of minGPT?
A minimal PyTorch re-implementation of the OpenAI GPT (Generative Pretrained Transformer) training
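For context, a minimal sketch of how the library is typically used, following the instantiation and training pattern documented in the minGPT README; the dummy dataset and the specific hyperparameter values here are illustrative assumptions, not part of the repo.

```python
# Sketch of instantiating and training a small GPT with minGPT,
# following the pattern shown in the repo's README.
import torch
from torch.utils.data import Dataset
from mingpt.model import GPT
from mingpt.trainer import Trainer

class RandomTokenDataset(Dataset):
    """Toy dataset of random token sequences, only here to make the sketch runnable."""
    def __init__(self, length=1000, block_size=128, vocab_size=50257):
        self.data = torch.randint(vocab_size, (length, block_size + 1))
    def __len__(self):
        return len(self.data)
    def __getitem__(self, i):
        chunk = self.data[i]
        return chunk[:-1], chunk[1:]  # (input tokens, next-token targets)

model_config = GPT.get_default_config()
model_config.model_type = 'gpt-nano'   # one of the preset model sizes
model_config.vocab_size = 50257        # GPT-2 BPE vocabulary size
model_config.block_size = 128          # context length
model = GPT(model_config)

train_config = Trainer.get_default_config()
train_config.learning_rate = 5e-4
train_config.max_iters = 1000
train_config.batch_size = 32
trainer = Trainer(train_config, model, RandomTokenDataset())
trainer.run()
```

In practice you would swap the toy dataset for your own torch `Dataset` that yields (input, target) token index tensors.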
We ran it on such a dataset and found that directly using https://github.com/karpathy/minGPT consistently gives better results. We also tried using the output of Prophet as an input feature to a neural network, but that did not improve the results in any significant way.
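The commenter doesn't show code, but the "Prophet forecast as an extra input feature" idea might look roughly like the sketch below. The Prophet and PyTorch calls are real APIs; the synthetic series and the tiny MLP are assumptions made purely for illustration.

```python
# Rough sketch: feed Prophet's in-sample forecast, alongside the lagged series,
# into a small neural network as input features.
import numpy as np
import pandas as pd
import torch
import torch.nn as nn
from prophet import Prophet

# Synthetic daily series (placeholder for the commenter's real dataset).
ds = pd.date_range('2020-01-01', periods=365, freq='D')
df = pd.DataFrame({'ds': ds,
                   'y': np.sin(np.arange(365) / 10) + np.random.normal(0, 0.1, 365)})

m = Prophet()
m.fit(df)                              # fit Prophet on the historical series
forecast = m.predict(df[['ds']])       # in-sample Prophet predictions ('yhat')

# Features: Prophet's yhat plus the previous day's value.
features = np.column_stack([forecast['yhat'].values,
                            df['y'].shift(1).fillna(0.0).values])
X = torch.tensor(features, dtype=torch.float32)
y = torch.tensor(df['y'].values, dtype=torch.float32).unsqueeze(1)

mlp = nn.Sequential(nn.Linear(2, 32), nn.ReLU(), nn.Linear(32, 1))
opt = torch.optim.Adam(mlp.parameters(), lr=1e-3)
for _ in range(200):
    opt.zero_grad()
    loss = nn.functional.mse_loss(mlp(X), y)
    loss.backward()
    opt.step()
```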
BTW, Andrej, if you're reading this: it is not just excellent, it is beyond excellent. I've been doing a lot of tinkering with transformers and other models lately, and I base them all on minGPT. My fork is now growing into a kind of monorepo for deep learning experimentation, though lately it has started to look like a repo of Theseus, and the boat is not as simple anymore :)
For a higher-level conceptual view of how Transformers work, you can check out the now-classic "Illustrated Transformer" series [3] and this programmer-oriented explanation (with code in Rust) from someone at Anthropic [4].
[1] https://www.oreilly.com/library/view/natural-language-proces...
[2] https://github.com/karpathy/minGPT
[3] https://jalammar.github.io/illustrated-transformer/
[4] https://blog.nelhage.com/post/transformers-for-software-engi...
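The core operation that [3] and [4] walk through, scaled dot-product self-attention with a causal mask, fits in a few lines of PyTorch. The stripped-down sketch below is only illustrative; real implementations (including minGPT's) add multiple heads, dropout, and learned output projections.

```python
# Minimal causal self-attention: each position attends only to earlier positions.
import math
import torch
import torch.nn.functional as F

def causal_self_attention(x, w_q, w_k, w_v):
    """x: (batch, seq_len, d_model); w_*: (d_model, d_model) projection weights."""
    T = x.size(1)
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    att = (q @ k.transpose(-2, -1)) / math.sqrt(k.size(-1))   # (B, T, T) attention scores
    mask = torch.tril(torch.ones(T, T, dtype=torch.bool, device=x.device))
    att = att.masked_fill(~mask, float('-inf'))                # block attention to future tokens
    return F.softmax(att, dim=-1) @ v                          # weighted sum of value vectors

# Tiny usage example with random weights.
B, T, D = 2, 8, 16
x = torch.randn(B, T, D)
out = causal_self_attention(x, torch.randn(D, D), torch.randn(D, D), torch.randn(D, D))
print(out.shape)  # torch.Size([2, 8, 16])
```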