What does HackerNews think of aitextgen?

A robust Python tool for text-based AI training and generation using GPT-2.

Language: Python

To train small GPT-like models, there's also aitextgen: https://github.com/minimaxir/aitextgen
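
A minimal sketch of that small-model workflow, following aitextgen's documented from-scratch API (the corpus file input.txt is a placeholder):

    from aitextgen import aitextgen
    from aitextgen.tokenizers import train_tokenizer
    from aitextgen.utils import GPT2ConfigCPU

    # Train a new tokenizer on the corpus; writes aitextgen.tokenizer.json.
    train_tokenizer("input.txt")

    # GPT2ConfigCPU() is a deliberately tiny GPT-2 architecture,
    # small enough to train from scratch without a GPU.
    ai = aitextgen(tokenizer_file="aitextgen.tokenizer.json", config=GPT2ConfigCPU())

    # Train on the corpus, printing sample generations periodically.
    ai.train("input.txt", num_steps=5000, generate_every=1000)
    ai.generate(prompt="ROMEO:")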
> I mean this as a person who is actually studying this technology

I literally publish open-source packages on how to use this technology.

https://github.com/minimaxir/gpt-2-simple

https://github.com/minimaxir/aitextgen
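
For example, fine-tuning the 124M GPT-2 with gpt-2-simple takes only a few lines (a sketch following the package's documented workflow; shakespeare.txt is a placeholder corpus):

    import gpt_2_simple as gpt2

    # Download the pretrained 124M GPT-2 checkpoint.
    gpt2.download_gpt2(model_name="124M")

    # Fine-tune on a plain-text corpus, then sample from the tuned model.
    sess = gpt2.start_tf_sess()
    gpt2.finetune(sess, dataset="shakespeare.txt", model_name="124M", steps=1000)
    gpt2.generate(sess)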

Hey HN! I've been lurking for a while now and I've finally created something that I feel is worth sharing.

I've called this project "Tensorpedia." At its core, Tensorpedia takes in a title and uses it as a prompt for GPT-2 to synthesize the introduction of a Wikipedia article. The machine learning side is built with a wonderful library called aitextgen [0], trained on Wikipedia's "Vital Articles" as the dataset [1]. The server is written in Node, and it uses Redis as an article cache. If you want to read my article about it (for some reason), you can check it out here [2].
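
Tensorpedia's own code isn't reproduced here, but the title-as-prompt step might look roughly like this with aitextgen (the model folder and sampling parameters are assumptions):

    from aitextgen import aitextgen

    # Load a GPT-2 model previously fine-tuned on Vital Articles intros;
    # "trained_model" is a placeholder folder name.
    ai = aitextgen(model_folder="trained_model")

    # Use the requested article title as the prompt and return one intro.
    intro = ai.generate_one(prompt="Photosynthesis", max_length=300, temperature=0.9)
    print(intro)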

I created this project to get more experience with server technologies. While I wouldn't say it's a complicated application, I learned quite a lot from it.

Additionally, this project was inspired by all of those this-x-doesn't-exist projects from a while back, so it's mostly for fun. I don't know how much practical use it has, but I've generated some pretty hilarious articles with it.

[0] https://github.com/minimaxir/aitextgen

[1] https://en.wikipedia.org/wiki/Wikipedia:Vital_articles/Level...

[2] https://jonahsussman.net/posts/2022-01-this-wiki-dne/

AI text content generation is indeed a legitimate industry that's still in its nascent stages. That's why I've spent a lot of time working with it and building tools for fully custom text generation models (https://github.com/minimaxir/aitextgen).

However, there are tradeoffs at the moment. In the case of GPT-3, they are cost and the risk of brushing against OpenAI's Content Guidelines.

There's also the surprisingly underdiscussed copyright risk around generated content. OpenAI won't enforce its own copyright, but it's possible for GPT-3 to output existing content verbatim, which is a massive legal liability. (It's half the reason I'm researching custom models trained entirely on copyright-safe content.)
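
One naive way to screen for that liability (an illustration only, not something from the linked projects) is an n-gram overlap check between generated text and a reference corpus:

    def ngrams(text, n=8):
        """All n-word shingles in the text."""
        words = text.split()
        return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}

    def verbatim_overlap(generated, corpus, n=8):
        """n-grams of the generated text that appear verbatim in the corpus."""
        return ngrams(generated, n) & ngrams(corpus, n)

    # Flag any 8-word spans of model output that also appear in a source text;
    # generated.txt and corpus.txt are placeholder file names.
    flagged = verbatim_overlap(open("generated.txt").read(), open("corpus.txt").read())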