What does HackerNews think of minGPT?

A minimal PyTorch re-implementation of the OpenAI GPT (Generative Pretrained Transformer) training

Language: Python

Tried it once. Prophet's promise is to take the dataset's seasonal trend into account, which makes sense for Facebook's original use case.

We ran it on such a dataset and found that directly using https://github.com/karpathy/minGPT consistently gave better results. So we ended up using Prophet's output as an input feature to a neural network, but the results did not improve in any significant way.
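For readers curious what that hybrid looks like, here is a minimal sketch, assuming a standard Prophet workflow and a toy MLP on top; the file name, feature choice, and model size are illustrative assumptions, not the commenter's actual setup:

```python
# Hypothetical sketch: feed Prophet's forecast to a small neural network
# as one extra input feature. Column names and model size are assumptions.
import pandas as pd
import torch
import torch.nn as nn
from prophet import Prophet

df = pd.read_csv("series.csv")  # Prophet expects 'ds' (date) and 'y' (value) columns

# Fit Prophet and take its in-sample forecast as a feature.
m = Prophet()
m.fit(df)
forecast = m.predict(df[["ds"]])

# Features: [previous value, Prophet's forecast]; target: current value.
features = torch.tensor(
    pd.concat([df[["y"]].shift(1), forecast[["yhat"]]], axis=1).dropna().values,
    dtype=torch.float32,
)
targets = torch.tensor(df["y"].values[1:], dtype=torch.float32).unsqueeze(1)

# A small MLP regressor on top of the two features.
net = nn.Sequential(nn.Linear(2, 32), nn.ReLU(), nn.Linear(32, 1))
opt = torch.optim.Adam(net.parameters(), lr=1e-3)
for _ in range(200):
    opt.zero_grad()
    loss = nn.functional.mse_loss(net(features), targets)
    loss.backward()
    opt.step()
```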

Karpathy has a bunch of great resources on this front! His minGPT writeup (https://github.com/karpathy/minGPT) is excellent. His more recent project nanoGPT, which references this video, is a much more capable, but still learning-friendly, implementation.
A small, very clearly written, and well-commented implementation: https://github.com/karpathy/minGPT
He contributed some commits to his excellent https://github.com/karpathy/minGPT during that time.

BTW, Andrej, if you're reading this: it is not just excellent, it is beyond excellent. I do a lot of tinkering with transformers and other models lately, and I base them all on minGPT. My fork is now growing into a kind of monorepo for deep learning experimentation, though lately it has started looking like a repo of Theseus, and the boat is not as simple anymore :)

I have been enjoying Natural Language Processing with Transformers [1]. It's largely focused on the Hugging Face library, but Chapter 3 has a very nice walkthrough that builds up the encoder portion of an encoder-decoder Transformer from "scratch" (it still uses some primitives found in PyTorch, like nn.Embedding). The decoder portion is covered in less depth, and they instead refer folks to Karpathy's awesome minGPT [2], which implements a decoder-only (GPT-style) Transformer in ~300 lines of nicely commented Python+PyTorch code.
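To make the decoder-only idea concrete, here is a minimal single-head sketch of the causal self-attention such a model is built around; this is an illustrative simplification under assumed dimensions, not minGPT's actual code:

```python
# Minimal sketch of causal (masked) self-attention, the core of a
# decoder-only Transformer. Illustrative only; not minGPT's actual code.
import math
import torch
import torch.nn as nn
import torch.nn.functional as F

class CausalSelfAttention(nn.Module):
    def __init__(self, n_embd: int = 64, block_size: int = 128):
        super().__init__()
        self.qkv = nn.Linear(n_embd, 3 * n_embd)   # project to queries, keys, values
        self.proj = nn.Linear(n_embd, n_embd)      # output projection
        # Lower-triangular mask: position t may only attend to positions <= t.
        self.register_buffer("mask", torch.tril(torch.ones(block_size, block_size)))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        B, T, C = x.shape
        q, k, v = self.qkv(x).split(C, dim=2)
        att = (q @ k.transpose(1, 2)) / math.sqrt(C)          # (B, T, T) scores
        att = att.masked_fill(self.mask[:T, :T] == 0, float("-inf"))
        att = F.softmax(att, dim=-1)                          # causal attention weights
        return self.proj(att @ v)                             # weighted sum of values

x = torch.randn(2, 16, 64)             # (batch, sequence, embedding)
print(CausalSelfAttention()(x).shape)  # torch.Size([2, 16, 64])
```

minGPT's real implementation adds multiple attention heads, dropout, and the surrounding LayerNorm/MLP blocks, but the causal masking above is what makes the architecture "decoder-only".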

For a higher-level conceptual view of how Transformers work, you can check out the now-classic "Illustrated Transformer" series [3] and this programmer-oriented explanation (with code in Rust) from someone at Anthropic [4].

[1] https://www.oreilly.com/library/view/natural-language-proces...

[2] https://github.com/karpathy/minGPT

[3] https://jalammar.github.io/illustrated-transformer/

[4] https://blog.nelhage.com/post/transformers-for-software-engi...