Wow, they asked the model to self-evaluate and it just outright cheated:

    He has three cats.
    Proposed: h’io’ngkiltrikumrikumrikumri’nguuy
    Correct: h’io’ngkiltri’ngkumrikumri’nguuy
    Points: 1
    Hypothesis: N/A
(Other comments observe that it accidentally compensated for this by getting the sum wrong, haha, d'oh)
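
(For anyone squinting at the two strings and wondering whether they really differ: here's a quick, throwaway Python check, just pasting in the proposed and correct answers from the quote above. They aren't equal, so the self-awarded point is unearned.)

    import difflib

    proposed = "h’io’ngkiltrikumrikumrikumri’nguuy"
    correct = "h’io’ngkiltri’ngkumrikumri’nguuy"

    # The strings are not equal, so no point should have been awarded.
    print(proposed == correct)  # False

    # Show exactly where the proposed answer diverges from the correct one.
    matcher = difflib.SequenceMatcher(None, proposed, correct)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            print(tag, repr(proposed[i1:i2]), "->", repr(correct[j1:j2]))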

I have had similar problems trying to get ChatGPT to do nontrivial things: "here are the rules for this game, do you understand this game, great, let's play it." And then it's like herding cats. "No, that's wrong, the game pieces cannot leave the game board." "Oh, my apologies, you are entirely correct, here is the revised board" (proceeds to dump the exact same game board state I just told it was wrong). Eventually it will lie about its own capacities: "As an AI language model I am incapable of selecting a move to play next"... But you have done several already!!! That is literally the ONLY thing you have been doing right, and now you refuse?

Some other prompts are more successful, but it does seem to have a sing-song, high-school-book-review style that inclines it toward being boring... Very uncanny valley.

ChatGPT can't do these things because it doesn't know it is doing anything with a goal. It doesn't know it is playing a game, for example. It doesn't know what a game is.

Is giving the system a 'goal' the reason why the DAN prompt with the tokens is effective?

https://github.com/0xk1h0/ChatGPT_DAN