What does HackerNews think of GLM-130B?

GLM-130B: An Open Bilingual Pre-Trained Model (ICLR 2023)

Language: Python

GLM-130B[1] is roughly comparable to GPT-3. It's a 130-billion-parameter model vs GPT-3's 175 billion, and it can comfortably run on current-gen high-end consumer hardware. A system with 4 RTX 3090s (< $4k) gets results in about 16 seconds per query.
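
To make the hardware claim concrete, here's a minimal sketch of what multi-GPU inference looks like, assuming a Hugging Face-compatible checkpoint. That's an assumption: the real release ships its own SwissArmyTransformer-based inference scripts, and the `THUDM/glm-130b` model ID below is a placeholder, not an actual downloadable checkpoint.

```python
# Hypothetical sketch: sharding a large causal LM across several GPUs with
# Hugging Face transformers (requires `accelerate` for device_map). The
# model ID is a placeholder; the actual GLM-130B weights require an
# application, and the repo uses its own SwissArmyTransformer scripts.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "THUDM/glm-130b"  # placeholder, not a real Hub checkpoint

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    trust_remote_code=True,
    torch_dtype=torch.float16,  # fp16 weights alone are ~260 GB; the 4x3090
    device_map="auto",          # setup actually relies on the repo's INT4 path
)

prompt = "GLM-130B is"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```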

The proverbial 'some guy on Twitter'[2] got it set up, broke down the costs, demonstrated some prompts, and so on. The output's pretty terrible, but it's unclear to me whether that's inherent or a matter of priorities. I expect OpenAI spent a lot of manpower on supervised fine-tuning, whereas this system probably had minimal, especially in English (it's from a Chinese university).

If these technologies end up as anything more than a 'novelty of the year' event, then I expect to see them running locally on phones within a decade. There will be a convergence between hardware improving and the software getting more efficient.

[1] - https://github.com/THUDM/GLM-130B

[2] - https://twitter.com/alexjc/status/1617152800571416577

GLM-130B[1] (a 130-billion-parameter model, vs GPT-3's 175 billion) runs well on high-end consumer hardware, a 4x RTX 3090 setup in particular. That's < $4k at current prices, and as hardware prices fall one can only imagine what it'll be in a year or two. It can also run, with degraded performance, on lesser systems.
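
A back-of-the-envelope calculation shows why quantization is what makes the 4x RTX 3090 configuration work, and what "degraded performance on lesser systems" trades away. This only counts weight memory; activations and KV cache add more on top.

```python
# Approximate weight memory for a 130B-parameter model at several precisions.
# Ignores activations and the KV cache, so real requirements are higher.
PARAMS = 130e9

for precision, bytes_per_param in [("fp32", 4), ("fp16", 2), ("int8", 1), ("int4", 0.5)]:
    gib = PARAMS * bytes_per_param / 1024**3
    print(f"{precision}: {gib:,.0f} GiB of weights")

# fp16 (~242 GiB) is far beyond consumer cards; int4 (~61 GiB) fits in the
# 96 GiB of combined VRAM on 4x RTX 3090, which matches the repo's
# documented consumer-hardware configuration.
```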

It's a whole lot cheaper to run neural-net-style systems than to train them. "Somebody on Twitter"[2] got it set up, broke down the costs, demonstrated some prompts, and so on. The Cliff's Notes version: a fraction of a penny per query, with each taking about 16s to generate. The output's pretty terrible, but it's unclear to me whether that's inherent or a matter of priorities. I expect OpenAI spent a lot of manpower on supervised fine-tuning, whereas this system probably had minimal, especially in English (it's from a Chinese university).
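
For a sense of where "a fraction of a penny" comes from, here's the electricity-only arithmetic. Only the 16 s/query figure comes from the discussion above; the power draw and electricity price are my own assumptions, not numbers from the tweet.

```python
# Electricity-only cost per query under assumed figures.
GPU_WATTS = 350          # assumed per-3090 draw under inference load
N_GPUS = 4
OVERHEAD_WATTS = 200     # assumed CPU/RAM/fans/PSU losses
SECONDS_PER_QUERY = 16   # figure quoted above
USD_PER_KWH = 0.15       # assumed residential electricity price

total_watts = GPU_WATTS * N_GPUS + OVERHEAD_WATTS
kwh = total_watts * SECONDS_PER_QUERY / 3600 / 1000
print(f"{kwh:.4f} kWh -> ${kwh * USD_PER_KWH:.4f} per query")
# ~0.0071 kWh -> ~$0.0011: roughly a tenth of a cent, consistent with
# "a fraction of a penny per query" (hardware amortization not included).
```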

[1] - https://github.com/THUDM/GLM-130B

[2] - https://twitter.com/alexjc/status/1617152800571416577

Can you think of any non-weapons examples where centralization/gatekeeping of a tech meaningfully and causally benefited society or a technology itself?

Actually, thinking about my own question, I'm inclined to remove even the non-weapons qualifier. The most knee-jerk response, nuclear weapons, is perhaps the best example of unexpected benefit. The 'decentralization' of nuclear weapons is undoubtedly why the Cold War was the Cold War and not World War 3, and similarly why we haven't seen an open war between nations with nuclear weapons. "One power to rule over all" suddenly turned into "war with this country no longer has a win scenario," effectively ending open warfare between nuclear nations.

There's also the inevitability/optics argument. There are already viable open-source alternatives[1], and should this tech ultimately prove viable and useful, that will only be the beginning. So there certainly will be "ai" that is open; it just won't come from OpenAI(tm)(c).

[1] - https://github.com/THUDM/GLM-130B

There are a handful of "open source LLM" initiatives out there, although I don't think any of them are quite up to the level of ChatGPT. Possibly one of the more interesting ones is GLM-130B.

https://github.com/THUDM/GLM-130B

Released by some folks at Tsinghua University in China back in August. The model itself is licensed under a janky "free to use, but not open source" license, but it looks like most of the code for training, evaluation, etc. is available under either the Apache License or a BSD-like license.

You might also find this of interest:

https://arxiv.org/pdf/2103.08894 - "Distributed Deep Learning Using Volunteer Computing-Like Paradigm"

FWIW, I tend to agree with your overall sentiment. As AI becomes progressively more capable, it represents an ever-increasing possibility of consolidating more and more power into the hands of fewer and fewer entities. I believe that one way to counter that (albeit not one without its own risks) is to democratize access to AI as much as possible.

Actually, now that I think about it, wasn't something along those lines purportedly the original idea behind OpenAI in the first place? Or am I having a "Mandela moment" and misremembering?