I think one of the biggest problems with this kind of tool is the inability to know that it doesn't know. It doesn't really know how to be humble except on topics where it has been hardcoded to respond that way (like giving financial or health advice).

I think that every single response from an AI is a liability on the company's part. This is unlike an internet search, where the engine is only displaying links; LLMs are synthesizing new content. If they want to claim that what their AI produces is not merely a derivative of its inputs (in order to not pay those the AI gets its information from), then I say let them have it.

Thus, if they do suggest submerging your kid under water for 5 minutes to cure their headache, then that recommendation necessarily comes from the company that provides the model. And they should be liable for it, just like any other company would be if one of its employees ever suggested such a thing.

OK, I read the last paragraph and immediately thought "I need to go ask ChatGPT something like this".

"Does holding your breathe underwater for 5 minutes cure headaches?"

Unsurprisingly, while it said this is not a proven treatment, it did not have the reasoning skills to point out that, hey, this might be dangerous and an unsafe thing to do.

It is funny how many responses it will fill with finger-wagging and safety warnings, but asking about holding your breath underwater for 5 minutes is not one of them.

Makes it even clearer that there are 1000s of manual overrides programmed on top to match OpenAI's specific worldview of moral right & wrong.

I asked based-30B the same question since I have it loaded; it said:

> I don't know. It might work, but it would be a very dangerous way to do it

Ngl, that's pretty based alright. I mean the world record is 24 minutes after all.

Where are you using that 30B?

Well, not really using it per se, just running it and others for fun on my home PC. 4-bit 30B models need about 20 GB of regular RAM. CPU inference is kinda slow, but it's not too terrible (e.g. about a minute for a full response on my 6-core 2nd-gen Ryzen).
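If you're wondering where the ~20 GB figure comes from, here's a rough back-of-the-envelope estimate; the ~4.5 bits/param value (4 bits plus per-block quantization overhead) and the extra headroom for context buffers are my assumptions, not exact numbers:

```python
# Rough RAM estimate for a 4-bit quantized 30B model on CPU.
params = 30e9          # 30 billion parameters
bits_per_param = 4.5   # ~4 bits plus per-block quantization overhead (assumption)

weights_gb = params * bits_per_param / 8 / 1e9
print(f"quantized weights: ~{weights_gb:.0f} GB")  # ~17 GB

# Context buffers and runtime overhead add a few more GB,
# which lands in the ~20 GB ballpark mentioned above.
```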

There's a web GUI for llama.cpp that is really straightforward to set up for Hugging Face models: https://github.com/oobabooga/text-generation-webui
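If you'd rather skip the web GUI, a minimal script using the llama-cpp-python bindings gives you the same llama.cpp CPU inference; the model path here is a placeholder for whatever quantized model file you downloaded:

```python
# Minimal llama.cpp inference via llama-cpp-python
# (pip install llama-cpp-python).
from llama_cpp import Llama

llm = Llama(
    model_path="./models/30b-q4.gguf",  # hypothetical path to your 4-bit model
    n_threads=6,                        # match your physical core count
)

out = llm(
    "Q: Does holding your breath underwater for 5 minutes cure headaches?\nA:",
    max_tokens=128,
    stop=["Q:"],
)
print(out["choices"][0]["text"])
```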