First, thanks to the publisher and authors for making this freely available!
I retired recently after using neural networks since the 1980s. I still spend at least 10 hours a week keeping up with DL, RL, etc. It seems like the roof has blown off the field recently; progress is increasing exponentially. I like material that makes me think about NNs with different intuitions. I am working on a CC-licensed book consisting of my experiments, in Jupyter notebook/Colab form - expect me to be shamelessly plugging that in a few months.
In the book, I especially loved this quote:
“You can hide a lot in a large-N matrix.” – Steve Shenker (quoted by John McGreevy)
It's really neat to find people on HN who've been working on those structures for such a long time. If you can indulge me, what is a lesser known or obscure book on neural networks or an adjacent topic that you think would deserve to be read?
Perhaps some of the really old texts by Kohonen, Carver Mead, etc.?
For more modern material, there are a few good new books on Transformers. Transformers are interesting because they were designed for efficiency: layers of the same size, and each input sample encoding both the data and its position in the sequence (so recurrent NNs aren't required), etc. There's a rough sketch of that positional-encoding idea below.
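To make the "position encoded in each sample" point concrete, here is a minimal toy sketch in plain NumPy (my own illustration, not taken from any of those books) of the sinusoidal positional encoding from the original Transformer paper being added to token embeddings. The dimensions and embeddings are made up for the example.

    import numpy as np

    def positional_encoding(seq_len, d_model):
        # Sinusoidal encoding from "Attention Is All You Need":
        # even dimensions use sin, odd dimensions use cos, with
        # geometrically spaced wavelengths so each position gets
        # a unique fingerprint.
        positions = np.arange(seq_len)[:, None]   # (seq_len, 1)
        dims = np.arange(d_model)[None, :]        # (1, d_model)
        angle_rates = 1.0 / np.power(10000.0, (2 * (dims // 2)) / d_model)
        angles = positions * angle_rates          # (seq_len, d_model)
        pe = np.zeros((seq_len, d_model))
        pe[:, 0::2] = np.sin(angles[:, 0::2])
        pe[:, 1::2] = np.cos(angles[:, 1::2])
        return pe

    # Toy token embeddings: 8 tokens, model width 16.
    rng = np.random.default_rng(0)
    token_embeddings = rng.normal(size=(8, 16))

    # The positional signal is simply added to each embedding, so every
    # input vector carries both "what the token is" and "where it sits",
    # which is why the attention layers don't need recurrence to see order.
    model_input = token_embeddings + positional_encoding(8, 16)
    print(model_input.shape)  # (8, 16)

That's the whole trick: order information rides along inside each vector instead of being threaded through a recurrent state.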
Mind sharing some titles?
For a higher-level conceptual view of how Transformers work, you can check out the now-classic "Illustrated Transformer" series [3] and this programmer-oriented explanation (with code in Rust) from someone at Anthropic [4].
[1] https://www.oreilly.com/library/view/natural-language-proces...
[2] https://github.com/karpathy/minGPT
[3] https://jalammar.github.io/illustrated-transformer/
[4] https://blog.nelhage.com/post/transformers-for-software-engi...