For those interested, I would also check out Andrej Karpathy's YouTube video on building GPT from scratch:
Karpathy has a bunch of great resources on this front! His minGPT writeup (https://github.com/karpathy/minGPT) is excellent. His more recent project, nanoGPT, which references this video, is a much more capable but still learning-friendly implementation.