Of those who responded to the 'what is your blog and why should I read it' thread. HN is large, likely larger than many of its visitors realize, because the participants are a relatively small fraction of the readership.
So HN readers are not necessarily contributors. And not all contributors would plug their blogs in a thread asking them to do so.
If you want to get an idea of the HN readership rather than of the HN contributors, you may want to start off by scraping all the profile pages instead. That will give you a much larger sample to work with.
That’s a fair point! I would have to read HN’s terms of use though. Not sure if that’s allowed or not. I felt good scraping the comments section since everyone there “opted in” to share their website with the broader community.