Can anyone share how the computer's performance is impacted by running the model locally? And what your specs are?
Are there any LLMs that run on regular (AMD/Intel) CPUs? Or does everything require at least an M1 or a decent GPU?
You can absolutely run LLMs without a GPU, but you need to set your performance expectations accordingly. Some projects to look into:
* llama.cpp - https://github.com/ggerganov/llama.cpp
* KoboldCpp - https://github.com/LostRuins/koboldcpp
* GPT4All - https://gpt4all.io/index.html
llama.cpp will run LLMs that have been converted to the GGUF format. If you have enough RAM, you can even run the big 70-billion-parameter models. If you have a CUDA GPU, you can also offload part of the model onto the GPU and have the CPU do the rest, which gives you a partial performance benefit. The issue is that the big models run too slowly on a CPU to feel interactive. Without a GPU, you'll get much more reasonable performance from a smaller 7-billion-parameter model instead. The responses won't be as good as the larger models', but they may still be good enough to be worthwhile.
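Here's a minimal sketch of what that looks like through the llama-cpp-python bindings, assuming you've already downloaded a GGUF model; the model path, thread count, and layer count below are placeholders you'd adjust for your own hardware:

```python
from llama_cpp import Llama

# Load a quantized 7B GGUF model. n_gpu_layers=0 keeps everything on the CPU;
# with a CUDA-enabled build you can raise it to offload that many layers to the GPU.
llm = Llama(
    model_path="./models/mistral-7b-instruct.Q4_K_M.gguf",  # placeholder path
    n_ctx=4096,        # context window size
    n_threads=8,       # match this to your physical core count
    n_gpu_layers=0,    # 0 = pure CPU inference
)

output = llm(
    "Q: What is GGUF and why would I use it? A:",
    max_tokens=128,
    stop=["Q:"],
)
print(output["choices"][0]["text"])
```

The same layer-offload knob is exposed on the llama.cpp command line (the `-ngl` / `--n-gpu-layers` flag, last I checked), so you can experiment with how much of the model fits on your GPU before the rest spills over to the CPU.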
Also, development in this space is still moving extremely rapidly, especially for specialized models like ones tuned for coding.