What does HackerNews think of tinygrad?

You like pytorch? You like micrograd? You love tinygrad! ❤️

Language: Python

AMD has made several attempts; their most recent effort is apparently the ROCm [0] software platform. There is an official PyTorch distro for Linux that supports ROCm [1] for acceleration (a quick sketch of what that looks like follows the links below). There are also frameworks like tinygrad [2] that claim support for all sorts of accelerators. That's as far as the claims go, though; I don't know how it handles the real world. If the occasional George Hotz livestream (he's the creator of tinygrad) is anything to go by, AMD has to iron out a lot of driver issues to be any actual competition for team green.

I really hope AMD manages a comeback like they showed a few years ago with their CPUs. Intel joining the market certainly helps, but having three big players competing would be desirable for all sorts of applications that require GPUs. AMD cards like the 7900 XTX already look fairly promising on paper, with large amounts of VRAM; they'd probably be much more cost-effective than NVIDIA cards if software support were anywhere near comparable.

[0]: https://www.amd.com/en/graphics/servers-solutions-rocm

[1]: https://pytorch.org/

[2]: https://github.com/geohot/tinygrad
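For what it's worth, the ROCm build of PyTorch reuses the familiar CUDA-style device API, so a basic sanity check looks the same as on NVIDIA hardware. A minimal sketch, assuming a ROCm build of PyTorch is already installed:

    import torch

    # On a ROCm build, AMD GPUs are exposed through the usual torch.cuda API.
    print(torch.cuda.is_available())      # True if the ROCm runtime sees a GPU
    print(torch.cuda.get_device_name(0))  # e.g. a Radeon RX 7900 XTX

    x = torch.randn(1024, 1024, device="cuda")
    y = x @ x  # matrix multiply running on the AMD GPU via ROCm/HIP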

Might be a silly question, but is GGML a similar/competing library to George Hotz's tinygrad [0]?

[0] https://github.com/geohot/tinygrad

While PyTorch is obviously the future in the short term, it will be interesting to see how this space evolves.

Before TensorFlow, people (myself included) were largely coding all of this stuff pretty manually, or with a zoo of incredibly clunky homemade libs.

TensorFlow and PyTorch made the whole situation far more accessible and sane. You can get a basic neural network working in a few lines of code. Magical.
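"A few lines" is barely an exaggeration. A minimal sketch in PyTorch (the layer sizes and random data here are just illustrative):

    import torch
    import torch.nn as nn

    # a tiny two-layer network and one step of training on random data
    model = nn.Sequential(nn.Linear(4, 16), nn.ReLU(), nn.Linear(16, 1))
    opt = torch.optim.SGD(model.parameters(), lr=0.1)
    x, y = torch.randn(32, 4), torch.randn(32, 1)
    loss = nn.functional.mse_loss(model(x), y)
    loss.backward()
    opt.step()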

But it's still early days. George Hotz, author of tinygrad[0], a PyTorch "competitor", made a really insightful comment -- we will look back on PyTorch & friends like we look back on FORTRAN and COBOL. Yes, they were far better than what came before. But they are really clunky compared to what we have today.

What will we have in 20 years?

[0] https://github.com/geohot/tinygrad, https://tinygrad.org

From the article, it seems he's focusing on the raw performance side of Machine Learning à la tinygrad[0].

[0]: https://github.com/geohot/tinygrad

Since diving into Stable Diffusion, I found the original code not very well organized or factored.

Then, trying out and reading https://github.com/geohot/tinygrad, which can implement SD, I found it really well written: the ideas are well organized, the code is concise, and it works well on multiple platforms.

For an excellent code base to see DRY in action, look at tinygrad: https://github.com/geohot/tinygrad

I believe it has the potential to be a great alternative to PyTorch.

I love watching GeoHot's Twitch streams as he goes to the extreme to simplify the codebase, and the end result is amazing.
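To give a flavor of why it feels like a PyTorch alternative, here is roughly the README's autograd example, only a few lines end to end (a sketch based on the repo's own example; the exact import path may differ between versions):

    from tinygrad.tensor import Tensor

    x = Tensor.eye(3, requires_grad=True)
    y = Tensor([[2.0, 0, -2.0]], requires_grad=True)
    z = y.matmul(x).sum()
    z.backward()

    print(x.grad.numpy())  # dz/dx
    print(y.grad.numpy())  # dz/dy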

geohot reverse engineered the ANE last year for his tinygrad project: https://github.com/geohot/tinygrad He streamed the many-hour effort on YouTube, in case that's your thing: https://m.youtube.com/watch?v=mwmke957ki4

George Hotz got his "for play" tensor library [a] to run on the Apple Neural Engine (ANE). The results were somewhat disappointing, however, and currently it only does ReLU.

[a]: https://github.com/geohot/tinygrad

He also made a short gradient descent library, tinygrad.

https://github.com/geohot/tinygrad
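To make "gradient descent library" concrete, here's a minimal sketch of a hand-rolled descent loop on the tinygrad Tensor API (the manual parameter update is just for illustration; tinygrad also ships a small optimizer module, and exact APIs vary between versions):

    from tinygrad.tensor import Tensor

    # fit a single weight w to minimize (w - 3)^2 with plain gradient descent
    w, lr = Tensor([0.0], requires_grad=True), 0.1
    for _ in range(50):
        loss = ((w - 3.0) * (w - 3.0)).sum()
        loss.backward()
        # rebuild the tensor from numpy each step; a real optimizer would update in place
        w = Tensor(w.numpy() - lr * w.grad.numpy(), requires_grad=True)
    print(w.numpy())  # close to 3.0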

Binary size should have been treated like a limited resource: set a hard limit in the test suite, and let the engineers who commit code in the organization fight it out over how to delete unused code whenever they add new code for a new feature.

I'm following George Hotz's tinygrad, a CPU+GPU deep learning framework in under 1000 lines of code, with great interest: all the contributors are trying to shave lines of code while keeping it readable (it's like a game once you set the rules):

https://github.com/geohot/tinygrad

Here's the GPU ops part:

https://github.com/geohot/tinygrad/blob/master/tinygrad/ops_...
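A sketch of what that kind of hard limit can look like in a test suite (the path and the 1000-line budget here are purely illustrative, not tinygrad's actual test):

    import pathlib

    MAX_LINES = 1000  # illustrative budget, enforced in CI

    def test_line_count():
        # fail the build if the core package grows past the budget
        root = pathlib.Path("tinygrad")
        total = sum(len(p.read_text().splitlines()) for p in root.rglob("*.py"))
        assert total <= MAX_LINES, f"core is {total} lines, budget is {MAX_LINES}"

Anything that pushes the count over the limit forces a conversation about what to delete, which is exactly the game being played.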