What does HackerNews think of nanoGPT?

The simplest, fastest repository for training/finetuning medium-sized GPTs.

Language: Python

It would be interesting to know: (1) why a Lamini account is required, and (2) how this compares to https://github.com/karpathy/nanoGPT.
Thanks!

Is there a toy conversational LLM on GitHub or elsewhere?

Something like: https://github.com/karpathy/nanoGPT

I compiled this list - https://gist.github.com/TikkunCreation/5de1df7b24800cc05b482...

In particular, you'll probably want to skip straight to nanoGPT (https://github.com/karpathy/nanoGPT); then, if you're interested in a bit more of the theory, Zero to Hero (https://karpathy.ai/zero-to-hero.html) and his comments in one of the linked threads: https://news.ycombinator.com/item?id=34414716

Fine-tuning may also be a faster and better place to start; this is a good guide to fine-tuning some publicly released LLMs: https://erichartford.com/uncensored-models
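
For the fine-tuning route, here is a minimal sketch assuming the Hugging Face transformers and datasets libraries (not necessarily the tooling that guide uses); the "gpt2" checkpoint and the "train.txt" corpus file are placeholders you would swap for your own choices:

    # Minimal causal-LM fine-tuning sketch (Hugging Face transformers/datasets assumed).
    from transformers import (AutoModelForCausalLM, AutoTokenizer, Trainer,
                              TrainingArguments, DataCollatorForLanguageModeling)
    from datasets import load_dataset

    model_name = "gpt2"  # placeholder; use whichever released model you want to tune
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no pad token by default
    model = AutoModelForCausalLM.from_pretrained(model_name)

    # "train.txt" is a hypothetical plain-text corpus you supply yourself.
    ds = load_dataset("text", data_files={"train": "train.txt"})["train"]
    ds = ds.map(lambda ex: tokenizer(ex["text"], truncation=True, max_length=512),
                batched=True, remove_columns=["text"])

    trainer = Trainer(
        model=model,
        args=TrainingArguments(output_dir="out", num_train_epochs=1,
                               per_device_train_batch_size=4),
        train_dataset=ds,
        # mlm=False means plain next-token (causal) language modeling labels.
        data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
    )
    trainer.train()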

I'm doing an ML apprenticeship [1] at the moment and Karpathy's videos are part of it. We've gone deep into them, and I found them excellent. Every concept he illustrates is crystal clear in his mind (even when the concepts themselves are complicated), and that shows in his explanations.

Also, the way he builds everything up is magnificent: starting from basic Python classes, to derivatives and gradient descent, to micrograd [2], and then from a bigram counting model [3] to makemore [4] and nanoGPT [5] (a minimal bigram-counting sketch follows the links below).

[1]: https://www.foundersandcoders.com/ml

[2]: https://github.com/karpathy/micrograd

[3]: https://github.com/karpathy/randomfun/blob/master/lectures/m...

[4]: https://github.com/karpathy/makemore

[5]: https://github.com/karpathy/nanoGPT
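
As an illustration of the very first step in that progression, here is a minimal sketch of a bigram counting model in plain Python; the "names.txt" corpus (one word per line, as in makemore) is assumed, and this is not Karpathy's exact code:

    # Count bigram character transitions, then sample new words from the counts.
    import random
    from collections import defaultdict

    words = open("names.txt").read().splitlines()  # hypothetical corpus file

    # How often each character follows each other character,
    # with '.' as a start/end-of-word marker.
    counts = defaultdict(lambda: defaultdict(int))
    for w in words:
        chars = ["."] + list(w) + ["."]
        for a, b in zip(chars, chars[1:]):
            counts[a][b] += 1

    def sample_word():
        # Draw the next character in proportion to its count until we hit '.'.
        out, ch = [], "."
        while True:
            nxt = counts[ch]
            ch = random.choices(list(nxt.keys()), weights=list(nxt.values()))[0]
            if ch == ".":
                return "".join(out)
            out.append(ch)

    print([sample_word() for _ in range(5)])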

Just from playing casually with nanoGPT (https://github.com/karpathy/nanoGPT) on a desktop with a 2080 Ti, it's really clear to me that the path to a pre-trained (not yet fine-tuned) LLM is remarkably easy. RLHF is the layer above this, which also appears to be surprisingly easy (if Sam Altman is to be believed). The real value is in making these tools incredibly easy to use.

I think the barrier to entry here is low. OpenAI is ahead now, but I doubt that lead lasts forever.
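
To give a sense of how little code the core of such a model involves, here is a minimal sketch (PyTorch assumed) of the causal self-attention block at the heart of a GPT; it is not nanoGPT's exact implementation:

    # One multi-head causal self-attention block, the core building block of a GPT.
    import math
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class CausalSelfAttention(nn.Module):
        def __init__(self, n_embd=64, n_head=4, block_size=128):
            super().__init__()
            assert n_embd % n_head == 0
            self.n_head = n_head
            self.qkv = nn.Linear(n_embd, 3 * n_embd)   # query, key, value in one projection
            self.proj = nn.Linear(n_embd, n_embd)      # output projection
            # Lower-triangular mask forbids attending to future positions.
            mask = torch.tril(torch.ones(block_size, block_size))
            self.register_buffer("mask", mask.view(1, 1, block_size, block_size))

        def forward(self, x):
            B, T, C = x.shape
            q, k, v = self.qkv(x).split(C, dim=2)
            q = q.view(B, T, self.n_head, C // self.n_head).transpose(1, 2)
            k = k.view(B, T, self.n_head, C // self.n_head).transpose(1, 2)
            v = v.view(B, T, self.n_head, C // self.n_head).transpose(1, 2)
            att = (q @ k.transpose(-2, -1)) / math.sqrt(k.size(-1))
            att = att.masked_fill(self.mask[:, :, :T, :T] == 0, float("-inf"))
            att = F.softmax(att, dim=-1)
            y = (att @ v).transpose(1, 2).contiguous().view(B, T, C)
            return self.proj(y)

    x = torch.randn(2, 16, 64)               # (batch, time, channels)
    print(CausalSelfAttention()(x).shape)    # torch.Size([2, 16, 64])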

It's quite unlikely that you would be able to run it at all. The model is large and probably won't fit in memory, and even if you could get it to run, it would be extremely slow.

Worth checking out: https://github.com/karpathy/nanoGPT. The associated videos go into this more; IIRC, at the end he said the real GPT-2 takes 5-20 seconds to generate a small amount of text on CPU only.
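
If you want to check that kind of CPU latency yourself, here is a minimal sketch using the Hugging Face transformers pipeline (a recent transformers version assumed); "gpt2" is the smallest 124M-parameter GPT-2 checkpoint, and timings will vary with hardware and prompt length:

    # Time text generation with the real GPT-2 checkpoint on CPU.
    import time
    from transformers import pipeline

    generator = pipeline("text-generation", model="gpt2", device=-1)  # device=-1 -> CPU

    start = time.time()
    out = generator("The meaning of life is", max_new_tokens=50, do_sample=True)
    print(out[0]["generated_text"])
    print(f"generated in {time.time() - start:.1f}s on CPU")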