What does HackerNews think of mlc-llm?
Enable everyone to develop, optimize and deploy AI models natively on everyone's devices.
Language: Python
Machine Learning Compilation (MLC) now supports compiling LLMs to multiple GPUs.
It runs 4-bit quantized Llama2-70B at:
- 34.5 tok/sec on two NVIDIA RTX 4090 at $3k
- 29.9 tok/sec on two AMD Radeon 7900XTX at $2k
- It also scales well to 8 A10G/A100 GPUs in our experiments.
Details:
- Blog post: https://blog.mlc.ai/2023/10/19/Scalable-Language-Model-Infer...
- Project: https://github.com/mlc-ai/mlc-llm
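(For context, the multi-GPU setup above is decided when the model is compiled; serving the compiled artifact from Python is then the usual mlc_chat flow. A minimal sketch, assuming the ChatModule API of that period; the model-artifact name and the tensor_parallel_shards field are assumptions based on the blog post, not copied from it:)

    # Minimal sketch: serving a 4-bit quantized Llama-2 model with mlc_chat's Python API.
    # Assumes the model was already compiled/sharded by mlc-llm; the artifact name and
    # the tensor_parallel_shards field mentioned below are illustrative, not exact.
    from mlc_chat import ChatModule
    from mlc_chat.callback import StreamToStdout

    # The number of GPU shards is fixed at compile time (e.g. a config field such as
    # tensor_parallel_shards=2 for the two-RTX-4090 case); at runtime you simply load
    # the sharded artifact by name.
    cm = ChatModule(model="Llama-2-70b-chat-hf-q4f16_1")

    # Stream tokens to stdout as they are generated.
    cm.generate(
        prompt="Explain tensor parallelism in one paragraph.",
        progress_callback=StreamToStdout(callback_interval=2),
    )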
For LLM inference, a shoutout to MLC LLM, which runs LLMs on basically any GPU API that's widely available: https://github.com/mlc-ai/mlc-llm
Maybe they're talking about https://github.com/mlc-ai/mlc-llm which is used for web-llm (https://github.com/mlc-ai/web-llm)? Seems to be using TVM.
You already have TVM for the cross-platform stuff,
see https://tvm.apache.org/docs/how_to/deploy/android.html
or https://octoml.ai/blog/using-swift-and-apache-tvm-to-develop...
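(To make the cross-platform TVM point concrete, here is a rough sketch, not taken from the linked guides, of compiling a tiny Relay function for an aarch64 Android target. It assumes an Apache TVM install with Relay and an Android NDK toolchain exposed via the TVM_NDK_CC environment variable:)

    import tvm
    from tvm import relay
    from tvm.contrib import ndk  # NDK-based cross-compiler helper

    # A tiny Relay function, y = relu(x), just enough to show the compile flow.
    x = relay.var("x", shape=(1, 8), dtype="float32")
    mod = tvm.IRModule.from_expr(relay.Function([x], relay.nn.relu(x)))

    # Cross-compile for 64-bit Android; this is the standard aarch64 Android triple.
    target = "llvm -mtriple=aarch64-linux-android"
    lib = relay.build(mod, target=target)

    # Export a shared library that TVM's Android runtime can load on-device.
    # Requires TVM_NDK_CC to point at the NDK clang wrapper.
    lib.export_library("relu_android.so", fcompile=ndk.create_shared)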
Another Engine:
- https://github.com/mlc-ai/mlc-llm
Things are already possible on today's hardware; see https://github.com/mlc-ai/mlc-llm, which allows many models to run on M1/M2 Macs, WASM, iOS and more. The main limiting factor will be getting models that are small enough and high-quality enough for performance to hold up. Ultimately this is hardware-limited, and they will need to improve the Neural Engine / map more computation onto it to make the mobile experience possible.
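(On the Mac side, the flow is the same Python API pointed at a different backend; a minimal sketch, assuming ChatModule accepts a device string such as "metal" and that the model name below is illustrative:)

    # Sketch: the same ChatModule, but targeting Apple-silicon Metal instead of CUDA/Vulkan.
    from mlc_chat import ChatModule

    cm = ChatModule(model="Llama-2-7b-chat-hf-q4f16_1", device="metal")
    print(cm.generate(prompt="Say hello from an M2 MacBook."))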
With any luck, projects like MLC will help close the gap.