I think the problem is that tokens are not characters. Even if you had access to a JSON parser state that could tell you which characters are valid next, it is not obvious how to translate that into token-level logit biases: you would have to check the parser state against the entire vocabulary at every decoding step to build the list of prohibited or allowable tokens.
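To make the cost concrete, here is a minimal sketch of the brute-force version, assuming a Hugging Face tokenizer; `allowed_chars` is a hypothetical stand-in for what the parser state would report. Note it only checks each token's first character, and a multi-character token can start legally and go invalid mid-token, which is exactly the tokens-vs-characters problem:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")

def build_token_bias(allowed_chars: set) -> dict:
    """Bias out every token whose first character the parser would reject."""
    bias = {}
    for token_id in range(tokenizer.vocab_size):
        text = tokenizer.decode([token_id])
        if text and text[0] not in allowed_chars:
            bias[token_id] = -100.0  # effectively ban this token
    return bias

# Right after an opening '{', a JSON parser allows whitespace,
# a '"' (start of a key), or a closing '}'.
bias = build_token_bias(set(' \t\n"}'))
print(len(bias), "of", tokenizer.vocab_size, "tokens banned")
```

That full vocabulary scan (tens of thousands of decodes) would have to run before every single sampled token.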
But if one could pull this off, it would be super cool, similar to how Microsoft's guidance library uses the logit_bias parameter to force the model to choose between a set of available options.
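The basic trick looks roughly like this with the OpenAI API's logit_bias parameter (a sketch of the idea, not guidance's internals; the model name and prompt are just examples, and it only works cleanly when each option is a single token):

```python
import tiktoken
from openai import OpenAI

client = OpenAI()
enc = tiktoken.encoding_for_model("gpt-3.5-turbo")

options = ["yes", "no"]
# Take the first (and assumed only) token ID of each option
# and push its logit to the maximum allowed bias.
bias = {enc.encode(opt)[0]: 100 for opt in options}

resp = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user",
               "content": "Is JSON valid YAML? Answer yes or no."}],
    logit_bias=bias,
    max_tokens=1,  # force exactly one (biased) token
)
print(resp.choices[0].message.content)
```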
You simply sample tokens starting with the allowed characters and truncate if needed. It's pretty efficient; there's an implementation here: https://github.com/1rgs/jsonformer
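Very roughly, the idea is: build the JSON skeleton yourself, ask the model only for the values, and truncate each completion at the first character that breaks validity. A sketch of that technique (not jsonformer's actual code), using a number field as the example:

```python
import re

def generate_number(sample_text) -> float:
    """Ask the model for text, then keep only the leading numeric part.

    `sample_text` is a hypothetical stand-in for a model call that
    returns a raw completion string.
    """
    raw = sample_text()                      # e.g. "42.5, and also..."
    match = re.match(r"-?\d+(\.\d+)?", raw)  # longest valid number prefix
    if match is None:
        raise ValueError(f"no number at the start of {raw!r}")
    return float(match.group())

print(generate_number(lambda: "42.5, and then some chatter"))  # -> 42.5
```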