This might be true for the type of business and institutional uses that can operate under the extremely puritanical filters that are bolted onto gpt3.5-turbo. But for most human person uses, the earlier text completion models like gpt3 davinci are incomparably better and more responsive. They're also 10x as pricey. Still, it's worth it compared to the lackluster and recalcitrant non-output of gpt3.5-turbo.

I think over the next couple of months most human people will switch away from gpt3.5-turbo in openai's cloud to self-hosted LLM weights quantized to run on consumer GPUs (and even CPUs), even if they're not quite as smart.
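To put rough numbers on the "consumer GPU" claim, here's a quick back-of-envelope sketch. The 4-bit figure is my assumption, roughly what llama.cpp-style q4 quantization works out to per parameter; the parameter counts are the common LLaMA sizes:

```python
# Approximate memory footprint of LLM weights at different precisions.
# Assumption: 4-bit quantization ~= 0.5 bytes/param (llama.cpp q4-style);
# fp16 = 2 bytes/param. Ignores activations and KV cache.

def weight_footprint_gb(n_params: float, bits_per_param: float) -> float:
    """Size of the weights alone, in GB."""
    return n_params * bits_per_param / 8 / 1e9

for n_params in (7e9, 13e9, 65e9):
    fp16 = weight_footprint_gb(n_params, 16)
    q4 = weight_footprint_gb(n_params, 4)
    print(f"{n_params / 1e9:.0f}B params: fp16 ~ {fp16:.1f} GB, 4-bit ~ {q4:.1f} GB")
```

At 4-bit, a 7B model is ~3.5 GB and a 13B is ~6.5 GB, which is why these fit in ordinary consumer GPU VRAM (or system RAM for CPU inference) while the fp16 versions don't.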

I have a hard time imagining anything that comes even close to ChatGPT being able to run on consumer hardware in the next couple of years.

I can certainly imagine it after seeing https://github.com/ggerganov/llama.cpp

Still a couple of years out, but moving way faster than I would have expected.