What does HackerNews think of jsonformer?
A Bulletproof Way to Generate Structured JSON from Language Models
- You can fine-tune these models for very specific tasks that GPT-4 might not be as good at.
- Open source models are free. You can use them as much as you want without worrying about a $xx,xxx bill at the end of the month, which makes tinkering with them easier.
- Smaller models like this can run on consumer hardware, even phones, and can run offline.
- Privacy, and not having to abide by a third party's terms. You don't have to deal with "As a large language model...", especially with uncensored models.
- Tools like jsonformer https://github.com/1rgs/jsonformer are not possible with OpenAI's API.
- It's also just really cool, let's be honest.
As for model performance at different context sizes, it seems a bit complicated. From what I understand, even if models are tweaked (for example using the SuperHOT RoPE hack or sparse attention) to be able to use longer contexts, they still have to be fine-tuned on input of this increased length to actually utilize it, but performance seems to degrade regardless as input length increases.
For your question about fine-tuning models to respond with only "yes" or "no", I recommend looking into how the jsonformer library works: https://github.com/1rgs/jsonformer . Essentially, you still let the model score every candidate token for the next position, and only accept the ones that satisfy certain criteria (such as the token for "yes" or the token for "no").
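For intuition, here is a minimal sketch of that idea using Hugging Face transformers. This is not jsonformer's actual code; the model name and the assumption that " yes"/" no" each map to a single token are purely illustrative.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Illustrative stand-in model; any causal LM works the same way.
model_name = "gpt2"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

def yes_or_no(prompt: str) -> str:
    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits[0, -1]  # scores for the next token
    # Only these candidates are considered; every other token is ignored.
    # Assumes " yes" and " no" are single tokens in this tokenizer.
    candidates = {" yes": tokenizer.encode(" yes")[0],
                  " no": tokenizer.encode(" no")[0]}
    best = max(candidates, key=lambda word: logits[candidates[word]].item())
    return best.strip()

print(yes_or_no("Question: Is the sky blue? Answer:"))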
You can do this with the OpenAI API too, using tiktoken https://twitter.com/AAAzzam/status/1669753722828730378?t=d_W... . Be careful though, as results will differ depending on which tokens you select, since "YES", "Yes", "yes", etc. are all different tokens to the best of my knowledge.
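One common way to do this is the logit_bias parameter, looking the token ids up with tiktoken. A rough sketch, assuming the current OpenAI Python client; the model choice, prompt, and bias values are just for illustration.

import tiktoken
from openai import OpenAI

client = OpenAI()
enc = tiktoken.encoding_for_model("gpt-3.5-turbo")  # illustrative model choice
# Note the capitalization: "Yes" and "yes" are different token ids.
yes_id = enc.encode("Yes")[0]
no_id = enc.encode("No")[0]

resp = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Is the sky blue? Answer Yes or No."}],
    max_tokens=1,                                    # one token is enough here
    logit_bias={str(yes_id): 100, str(no_id): 100},  # strongly favor these two ids
)
print(resp.choices[0].message.content)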
1) It could use the JSONformer idea [0] where we have a model of the language which determines what the valid next tokens are; we only ask the LLM to supply a token when the language model gives us a choice, and when considering possible next tokens, we immediately ignore any which are invalid given the model. This could go beyond mere syntax to actually considering the APIs/etc which exist, so if the LLM has already generated the tokens "import java.util.", then it could only generate a completion which was a public class (or subpackage) of "java.util.". Maybe something like language servers could help here.
2) For every output it generates, automatically compile and test it before showing it to the user. If compile/test fails, give it a chance to fix its mistake. If it gets stuck in a loop, or isn't getting anywhere after several attempts, fall back to the next most likely output and repeat. If after a while we still aren't getting anywhere, it can show the user its attempts (in case they give the user any ideas).
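A rough sketch of what the loop in 2) could look like: compiles() only shells out to javac, while the candidates list and ask_model_to_fix are hypothetical stand-ins for the actual LLM calls.

import os
import subprocess
import tempfile

MAX_FIX_ATTEMPTS = 3  # arbitrary cutoff before falling back to the next candidate

def compiles(source: str) -> tuple[bool, str]:
    """Write a Java source string to a temp file and try to compile it."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "Main.java")
        with open(path, "w") as f:
            f.write(source)
        result = subprocess.run(["javac", path], capture_output=True, text=True)
        return result.returncode == 0, result.stderr

def best_working_candidate(candidates, ask_model_to_fix):
    """candidates: LLM outputs ordered from most to least likely."""
    attempts = []
    for source in candidates:
        for _ in range(MAX_FIX_ATTEMPTS):
            ok, errors = compiles(source)
            attempts.append(source)
            if ok:
                return source, attempts
            source = ask_model_to_fix(source, errors)  # hypothetical LLM call
    return None, attempts  # nothing compiled; show the attempts to the user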
Let's say you're halfway through a generation of a json blob with a name field and a job field and have already generated
{
"name": "bob"
At this point, guidance will take over generation control from the model to generate the next text
{
"name": "bob",
"job":
If the model had generated that, you'd be waiting 70 ms per token (informal benchmark on my M2 air). A comma, followed by a newline, followed by "job": is 6 tokens, or 420ms. But since guidance took over, you save all that time. Then guidance passes control back to the model for generating the next field value.
{
"name": "bob",
"job": "programmer"
programmer is 2 tokens and the closing " is 1 token, so this took 210ms to generate. Guidance then takes over again to finish the blob
{
"name": "bob",
"job": "programmer"
}
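To make the interleaving concrete, here is a rough sketch of the control flow. This is not guidance's real API; generate_until is a hypothetical wrapper around whatever model you're running.

def fill_json(generate_until) -> str:
    # Structural text is appended directly: zero model calls, zero latency.
    out = '{\n"name": "'
    out += generate_until(out, stop='"')   # model generates the field value only
    out += '",\n"job": "'                  # another free structural span
    out += generate_until(out, stop='"')
    out += '"\n}'
    return out

At ~70 ms per token, every structural token filled in by the template rather than the model saves that much wall-clock time.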
[1] https://github.com/1rgs/jsonformer
https://github.com/newhouseb/clownfish
Note: guidance is a way more general tool than these.
It can be made to. I think I stumbled upon a core insight that makes simple format coercion reproducible without fine-tuning or logit shenanigans, so yeah, this lets you both reduce false positives and constrain failures to false positives or to task boundaries.
There's also RLHF-derived coercion, which is hilarious. [2]
[0] https://github.com/1rgs/jsonformer