Has anybody crawled Hacker News and built a corpus from it? If so, is it maintained?
https://github.com/ashish01/hn-data-dumps