For those interested, I would also check out Andrej Karpathy's YouTube video on building GPT from scratch:

https://youtu.be/kCc8FmEb1nY

Karpathy has a bunch of great resources on this front! His minGPT writeup at https://github.com/karpathy/minGPT is excellent. His more recent project, nanoGPT, which references this video, is a much more capable but still learning-friendly implementation.