George Hotz already implemented LLaMA 7B and 13B yesterday on Twitch, running on GPU in the tinygrad llama branch:

https://github.com/geohot/tinygrad/tree/llama

The only problem is that it swaps on a 16GB MacBook, so in practice you need at least 24GB.

There is also a GPU-accelerated fork of the original repo:

https://github.com/remixer-dec/llama-mps