Once upon a time I used pup[0] for this kind of thing; later I switched to cascadia[1], which seemed much more advanced.

Comparing the two repos, it seems pup is dead, but cascadia may not be.

These tools, including htmlq, seem to sell themselves as "jq for HTML", which is far from the truth. jq is closer to awk, in that you can do just about anything with JSON. Cascadia, htmlq, and pup seem closer to grep for HTML: they can essentially only select data from an HTML source.

[0] https://github.com/EricChiang/pup

[1] https://github.com/suntong/cascadia
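To make the grep-versus-awk distinction concrete, here is a rough sketch in Python rather than with the tools themselves. It only illustrates the two kinds of work and is not how pup, cascadia, htmlq, or jq are implemented. The names `LinkSelector`, `select_links`, and `reshape_users`, and the sample field names `name`, `age`, and `active`, are all made up for the example.

```python
import json
from html.parser import HTMLParser


class LinkSelector(HTMLParser):
    """Grep-like: collects href values from <a> tags and changes nothing."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value may be None.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.hrefs.append(value)


def select_links(html_text):
    """Return the href of every <a> tag, in document order."""
    parser = LinkSelector()
    parser.feed(html_text)
    parser.close()
    return parser.hrefs


def reshape_users(json_text):
    """Awk-like: filter records, rewrite a field, and compute an aggregate."""
    users = json.loads(json_text)
    active = [u for u in users if u["active"]]
    return {
        "names": [u["name"].upper() for u in active],
        "total_age": sum(u["age"] for u in active),
    }


if __name__ == "__main__":
    print(select_links('<p><a href="/a">A</a> <a href="/b">B</a></p>'))
    print(reshape_users(
        '[{"name": "ann", "age": 30, "active": true},'
        ' {"name": "bob", "age": 25, "active": false}]'
    ))
```

The first function can only pull pieces out of its input, which is the grep-like role described above. The second builds a new structure from the input, which is the part the HTML selectors do not attempt.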

Well, jq is grep as well as sed and awk, but yeah, htmlq seems to be just grep, for the sake of comparison.

But I don't think HTML needs a sed/awk tool, or at least not as much. JSON output could very well be piped on to the next CLI tool after you've changed it slightly with jq. I don't see that scenario as likely with HTML.

> Well, jq is grep as well as sed and awk, but yeah, htmlq seems to be just grep, for the sake of comparison.

Exactly, and that is what I mean. If you want to compare, compare it with grep, not jq.

Someone else posted xidel[0] in this thread. I haven't used it, but it seems to be the "jq but for HTML".

[0] https://github.com/benibela/xidel