>Most new Python projects are still about AI
Are there any that let a user run an LLM with GPT-3-ish capabilities on a single PC with <= 4GB of GPU RAM in a reasonable amount of time?
Where "reasonable" means no more than a few minutes to get the output. A little longer wouldn't be too bad either, since you could script something that submits prompts automatically and lets things run in the background.
Trying to search for such a thing -- if it exists -- is nearly impossible right now with so much being done and talked about. Some content talks about running models on "only" 16GB of GPU RAM (!!!), which is far beyond many discrete cards. A 3080 has up to 16GB, some cards only 8GB (I think 16 is the max?), and fairly decent entry-level+ cards top out at 4GB.
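As a rough back-of-envelope (my own arithmetic, assuming quantization is on the table and ignoring activation/KV-cache overhead): a 7B-parameter model quantized to 4 bits per weight is about 3.5GB, which is right around that 4GB limit, while full fp16 weights are way over it.

```python
# Back-of-envelope: weight memory for a model at different precisions.
# Assumes weights dominate memory use; ignores activations and KV cache.
def model_size_gb(n_params: float, bits_per_weight: int) -> float:
    """Return approximate weight size in GB (decimal, 1e9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9

seven_b = 7e9  # 7B parameters
print(f"fp16:  {model_size_gb(seven_b, 16):.1f} GB")  # 14.0 GB
print(f"8-bit: {model_size_gb(seven_b, 8):.1f} GB")   # 7.0 GB
print(f"4-bit: {model_size_gb(seven_b, 4):.1f} GB")   # 3.5 GB
```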
Any options out there?
The trick for the moment is to skip Python though. llama.cpp and its many variants are the ones I've heard work best.
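To script prompt submission in the background, as the question suggests, a minimal wrapper around a llama.cpp-style command-line binary might look like this. The binary path, model filename, and flags here are assumptions based on typical llama.cpp builds -- check the README for your version:

```python
import subprocess

def run_prompt(prompt: str,
               binary: str = "./main",                 # assumed binary name
               model: str = "ggml-model-q4_0.bin",     # assumed model file
               n_tokens: int = 128) -> str:
    """Run one prompt through a local llama.cpp-style binary, return stdout.

    Flags (-m, -p, -n) follow llama.cpp's CLI at the time of writing;
    adjust them to match whatever build you actually have.
    """
    result = subprocess.run(
        [binary, "-m", model, "-p", prompt, "-n", str(n_tokens)],
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

You can sanity-check the plumbing without a model by passing `binary="echo"`, then loop over a list of prompts and let it grind away overnight.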
I suggest starting with LLaMA 7B or Alpaca. More notes here: https://simonwillison.net/tags/homebrewllms/
This one is the easiest to get working I think: https://github.com/nomic-ai/gpt4all