> Memory is quite limited on Windows 3.1 machines, so I tried to reduce the amount of memory needed for WinGPT, especially in sending and receiving the query and response from the OpenAI API. The JSON responses of modern APIs aren't especially optimized for size, and OpenAI's API is no exception. I've asked the model to be brief in an effort to keep responses as small as possible. I've also chosen not to send the text of previous turns in the API calls, even though this means that the bot won't be able to use prior conversation context.

You can save memory at the cost of some CPU time by using SAX-style parsing[1]. A SAX-style parser walks the document in order, firing an event for each node it encounters, without ever materializing the whole tree. The memory required is negligible compared with fully parsing the JSON into a tree, and writing a custom SAX-style parser for JSON shouldn't be too difficult. There are existing SAX-style JSON parsers you can examine as well.[2]

[1]: https://en.wikipedia.org/wiki/Simple_API_for_XML

[2]: https://rapidjson.org/md_doc_sax.html
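The event-driven shape is easy to sketch. Here's a minimal illustration in C — the event names, callback signature, and `sax_parse` scanner are all invented for this comment, and it only handles a flat object of escape-free string values — just enough to show that the parser fires events as it goes and never builds a tree:

```c
#include <assert.h>
#include <string.h>

/* Hypothetical SAX-style interface: one callback per token, so only
   the current token needs to be examined, never a full parse tree. */
typedef enum { EV_OBJ_START, EV_OBJ_END, EV_KEY, EV_STRING } json_event;

typedef void (*json_cb)(json_event ev, const char *s, int len, void *user);

/* Toy scanner for a flat object of string values, e.g.
   {"role":"assistant","content":"hi"} -- no escapes, numbers, or
   nesting; just enough to show the event-driven shape. */
static int sax_parse(const char *js, json_cb cb, void *user) {
  int expect_key = 1;  /* strings before ':' are keys, after are values */
  for (const char *p = js; *p; p++) {
    switch (*p) {
      case '{': cb(EV_OBJ_START, NULL, 0, user); expect_key = 1; break;
      case '}': cb(EV_OBJ_END, NULL, 0, user); break;
      case ':': expect_key = 0; break;
      case ',': expect_key = 1; break;
      case '"': {
        const char *start = ++p;
        while (*p && *p != '"') p++;   /* find the closing quote */
        if (!*p) return -1;            /* unterminated string */
        cb(expect_key ? EV_KEY : EV_STRING, start, (int)(p - start), user);
        break;
      }
      default: break;                  /* skip whitespace */
    }
  }
  return 0;
}

/* Example callback: remember the value that follows a "content" key. */
struct grab { int want_next; char out[64]; };

static void on_event(json_event ev, const char *s, int len, void *user) {
  struct grab *g = user;
  if (ev == EV_KEY) {
    g->want_next = (len == 7 && memcmp(s, "content", 7) == 0);
  } else if (ev == EV_STRING && g->want_next && len < 64) {
    memcpy(g->out, s, len);
    g->out[len] = '\0';
    g->want_next = 0;
  }
}
```

The caller keeps only the bytes it actually cares about (`out` here); everything else is visited and discarded as the scanner streams past it.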

Yep! I'm using JSMN (https://github.com/zserge/jsmn), a streaming parser that visits each token sequentially, so only one copy of each JSON response ever sits in memory. I also avoid allocating intermediate memory wherever possible; for example, to unescape backslashes in the JSON strings, I use a destructive loop that shifts the non-backslash characters toward the front of the string and truncates it by moving the null terminator earlier. Not something I'd imagine doing in most environments today, but as you said, it saves a bit of space at the expense of CPU time :)

  /* Collapse escape sequences in place: each backslash is dropped and
     the character after it is kept verbatim, so \" becomes " and \\
     becomes a single \ (control escapes like \n are left as the bare
     letter). Surviving characters slide toward the front of the
     string; no scratch buffer is needed. */
  void DestructivelyUnescapeStr(LPSTR lpInput) {
    int offset = 0;  /* how far the surviving characters shift left */
    int i = 0;
    while (lpInput[i] != '\0') {
      if (lpInput[i] == '\\' && lpInput[i + 1] != '\0') {
        offset++;    /* drop the backslash... */
        i++;         /* ...and keep the escaped character as-is */
      }
      lpInput[i - offset] = lpInput[i];
      i++;
    }
    lpInput[i - offset] = '\0';  /* pull the terminator forward too */
  }
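For anyone who wants to poke at the trick outside of Windows, here's a standalone sketch of the same routine with plain `char *` in place of the `LPSTR` typedef so it compiles without `<windows.h>`; the sample buffer below is just for illustration:

```c
#include <assert.h>
#include <string.h>

/* In-place unescape: drop each backslash, keep the character after it
   verbatim, and slide everything left to fill the gaps. */
void DestructivelyUnescapeStr(char *lpInput) {
  int offset = 0;               /* how far survivors shift left */
  int i = 0;
  while (lpInput[i] != '\0') {
    if (lpInput[i] == '\\' && lpInput[i + 1] != '\0') {
      offset++;                 /* drop the backslash... */
      i++;                      /* ...and keep the escaped char as-is */
    }
    lpInput[i - offset] = lpInput[i];
    i++;
  }
  lpInput[i - offset] = '\0';   /* pull the terminator forward too */
}
```

A buffer holding `say \"hi\"` becomes `say "hi"` in place — the string just gets one byte shorter for every backslash removed, with the tail past the new terminator left as garbage.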