What is the cheapest way to run it? I'm looking to build a product on top of it.

Probably quantize the weights (or use the base weights as they are) and run the model with llama.cpp (https://github.com/ggerganov/llama.cpp) on a CPU machine that supports AVX-512 instructions.
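
If it helps, here is a rough sketch of two checks you could do before renting a machine: whether the CPU reports AVX-512, and roughly how much RAM the weights need at a given precision. It assumes a Linux box (it reads /proc/cpuinfo and looks for the `avx512f` flag). The function names and the 7-billion parameter count are made up for illustration, not taken from the model you are asking about, and real quantized files carry some extra overhead, so treat the sizes as a lower bound.

```python
from pathlib import Path


def has_avx512(cpuinfo_text: str) -> bool:
    """Return True if a 'flags' line lists avx512f (AVX-512 Foundation)."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags" and "avx512f" in value.split():
            return True
    return False


def weight_size_gib(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the weights alone, in GiB (ignores format overhead)."""
    return n_params * bits_per_weight / 8 / 2**30


if __name__ == "__main__":
    cpuinfo = Path("/proc/cpuinfo")  # Linux only; assumption
    if cpuinfo.exists():
        print("AVX-512 reported:", has_avx512(cpuinfo.read_text()))

    n_params = 7e9  # hypothetical parameter count, replace with your model's
    for bits in (16, 8, 4):
        print(f"{bits}-bit weights: ~{weight_size_gib(n_params, bits):.1f} GiB")
```

The size estimate is just parameters times bits per weight, so going from 16-bit to 4-bit cuts the memory needed to about a quarter, which is usually what decides how cheap a machine you can get away with.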