I use LLM-based autocomplete in my IDE, and it’s not taking away my job unless/until it improves by multiple orders of magnitude. It’s good at filling in boilerplate, but even for that I have to carefully check its output because it can make little errors even when I feel like what I want should be obvious. The article is absolutely correct in saying you have to be critical of its output.
I would say it improves my productivity by maybe 5%, which is an incredible achievement. I’m already getting to where coding without it feels very tedious.
I find it increases my productivity about 5-10% when working with the technologies I'm the most familiar with and use regularly (Elixir, Phoenix, JavaScript, general web dev.) But when I'm doing something unfamiliar and new, it's more like 90%. It's incredible.
Recently at work, for example, I've been setting up a bunch of stuff with some new technologies and libraries that I'd never really used before. Without ChatGPT I'd have spent hours if not days poring through tedious documentation and outdated tutorials while trying to hack something together in an agonising process of trial and error. But ChatGPT gave me a fantastic proof-of-concept app that has everything I needed to get started. It's been enormously helpful and I'm convinced it saved me days of work. This technology is miraculous.
As for my job security... well, I think I'm safe for now; ChatGPT sped me up in this instance but the generated app still needs a skilled programmer to edit it, test it and deploy it.
On the other hand I am slightly concerned that ChatGPT will destroy my side income from selling programming courses... so if you're a Rails developer who wants to learn Elixir and Phoenix, please check out my course Phoenix on Rails before we're both replaced by robots: PhoenixOnRails.com
(Sorry for the self promotion but the code ELIXIRFORUM will give a $10 discount.)
The problem is the hallucinations. I also wasted a few hours trying to work through solutions with GPT where it just kept making up parameters and random functions.
1) It could use the JSONformer idea [0], where a model of the language determines the valid next tokens: we only ask the LLM to supply a token when the grammar gives us a choice, and when considering possible next tokens, we immediately discard any that are invalid under the model. This could go beyond mere syntax to actually considering the APIs/etc which exist, so if the LLM has already generated the tokens "import java.util.", then it could only generate a completion which is a public class (or subpackage) of "java.util.". Maybe something like language servers could help here.
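The idea can be sketched in a few lines. This is a toy illustration, not JSONformer itself: `llm_scores` stands in for the LLM's logits and `valid_next_tokens` stands in for the grammar/API model (both names are my own invention), and real systems would apply the mask over actual token logits.

```python
def constrained_generate(llm_scores, valid_next_tokens, start, max_steps=10):
    """Grammar-constrained decoding sketch.

    llm_scores(prefix)        -> dict of token -> score (stand-in for LLM logits)
    valid_next_tokens(prefix) -> set of tokens the grammar allows next;
                                 an empty set means the construct is complete.
    """
    prefix = list(start)
    for _ in range(max_steps):
        allowed = valid_next_tokens(prefix)
        if not allowed:
            break  # grammar says we're done
        if len(allowed) == 1:
            # No real choice: take the grammar's only option
            # without consulting the LLM at all.
            prefix.append(next(iter(allowed)))
            continue
        scores = llm_scores(prefix)
        # Mask out everything the grammar forbids, then pick the best
        # of what remains -- hallucinated names score -inf.
        best = max(allowed, key=lambda t: scores.get(t, float("-inf")))
        prefix.append(best)
    return prefix


# Toy grammar for the "import java.util." example from the comment above:
# after the package prefix, only real members of java.util. are allowed.
JAVA_UTIL_MEMBERS = {"List", "Map", "Set", "concurrent"}

def valid_next(prefix):
    if prefix == ["import", "java.util."]:
        return JAVA_UTIL_MEMBERS
    if prefix and prefix[-1] in JAVA_UTIL_MEMBERS:
        return {";"}
    return set()
```

Even if the LLM's top-scoring continuation is a hallucinated class name (say `"ArrayLst"`), it never appears in `allowed`, so the decoder can only emit something that actually exists.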
2) For every output it generates, automatically compile and test it before showing it to the user. If the compile/test fails, give it a chance to fix its mistake. If it gets stuck in a loop, or isn't getting anywhere after several attempts, fall back to the next most likely output, and repeat. If after a while we still aren't getting anywhere, it can show the user its attempts (in case they give the user any ideas).
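That propose/check/retry loop might look roughly like this. All the names here are placeholders: `candidates` stands in for the LLM's outputs in likelihood order, `check` for "compile and run the tests", and `repair` for feeding the error back to the LLM and asking for a fix.

```python
def generate_checked(candidates, check, repair, max_fixes=3):
    """Try each candidate in likelihood order, giving each a few repair
    attempts before falling back to the next most likely output.

    check(code)         -> (ok, error_message)
    repair(code, error) -> revised code (stand-in for an LLM fix request)

    Returns (working_code_or_None, attempts) -- on total failure the
    attempts list is what we'd show the user, per the comment above.
    """
    attempts = []
    for candidate in candidates:
        code = candidate
        for _ in range(max_fixes):
            ok, error = check(code)
            attempts.append(code)
            if ok:
                return code, attempts
            fixed = repair(code, error)
            if fixed == code:
                break  # not getting anywhere; fall back to next candidate
            code = fixed
    return None, attempts
```

The design point is the fallback: rather than letting the model thrash forever on one broken idea, bounded repair attempts plus a ranked candidate list keep the loop terminating, and the attempt log is preserved so a failed run still gives the user something to look at.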