Is the webpage content passed to ChatGPT, or is this intended more as a way to easily use ChatGPT?
On the first part: I've been trying to build a tool that parses webpages using ChatGPT, but I'm struggling to figure out the best way to pass the website content over. Some options I have tried:
* Raw HTML - expensive, and in a lot of cases it doesn't fit in the prompt
* OCR - works better than I would have expected, but can struggle with certain fonts, and a lot of the webpage structure is lost
Have you already tried this? https://github.com/mozilla/readability
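
Along the same lines, here is a minimal sketch of a middle ground between raw HTML and OCR: strip the markup and keep only the visible text before putting it in the prompt. To be clear, this is not Readability's algorithm (that is a JavaScript library that tries to pick out the main article). It is a much cruder, standard-library-only Python simplification, and `TextExtractor` and `html_to_text` are hypothetical names made up for illustration.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects visible text, ignoring tags and non-visible elements.

    Hypothetical helper for illustration; not part of any library.
    """

    # Elements whose contents are never shown to the reader.
    SKIP = {"script", "style", "noscript", "template"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._skip_depth = 0  # > 0 while inside a skipped element
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            # Collapse runs of whitespace so layout padding costs no tokens.
            text = " ".join(data.split())
            if text:
                self._chunks.append(text)

    def text(self):
        return "\n".join(self._chunks)


def html_to_text(html):
    """Return the visible text of an HTML string, one text node per line."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()


if __name__ == "__main__":
    page = (
        "<html><head><style>p{color:red}</style>"
        "<script>var x = 1;</script></head>"
        "<body><h1>Title</h1><p>Hello   <b>world</b></p></body></html>"
    )
    print(html_to_text(page))
```

This keeps the text in document order and is usually far smaller than the raw HTML, but like OCR it throws away most of the page structure, and it does nothing to remove navigation, footers, or ads. If structure matters, one option is to emit a marker for headings and list items in `handle_starttag` rather than dropping every tag.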