What does HackerNews think of grobid?

Machine learning software for extracting information from scholarly documents

Language: Java

#33 in Deep learning

Can you elaborate on how you parse the PDF? Are you simply converting it to text with a Python library, or using something more robust like GROBID [1]?

1: https://github.com/kermitt2/grobid

For academic papers: GROBID [0] is a machine learning library for extracting, parsing, and restructuring raw documents such as PDFs into structured XML/TEI-encoded documents, with a particular focus on technical and scientific publications.

[0] https://github.com/kermitt2/grobid/
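
To make the "structured XML/TEI" part concrete, here is a minimal sketch of pulling the title and author names out of a TEI header with Python's standard library. The sample document is hand-written and heavily simplified for illustration, not real GROBID output, and the helper name `extract_header` is hypothetical. It assumes the usual TEI namespace and the `titleStmt` / `author` / `persName` layout.

```python
import xml.etree.ElementTree as ET

# TEI namespace; GROBID's XML output is assumed to use it.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Hand-written, simplified stand-in for a GROBID result (not real output).
SAMPLE_TEI = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <fileDesc>
      <titleStmt>
        <title level="a" type="main">A Sample Paper Title</title>
      </titleStmt>
      <sourceDesc>
        <biblStruct>
          <analytic>
            <author>
              <persName><forename type="first">Ada</forename><surname>Lovelace</surname></persName>
            </author>
            <author>
              <persName><forename type="first">Alan</forename><surname>Turing</surname></persName>
            </author>
          </analytic>
        </biblStruct>
      </sourceDesc>
    </fileDesc>
  </teiHeader>
</TEI>"""


def extract_header(tei_xml):
    """Return the main title and author names found in the TEI header."""
    root = ET.fromstring(tei_xml)
    header = root.find("tei:teiHeader", TEI_NS)
    if header is None:
        return {"title": "", "authors": []}

    title = header.findtext(
        ".//tei:titleStmt/tei:title", default="", namespaces=TEI_NS
    ).strip()

    authors = []
    # Only look inside the header, so cited references are not mixed in.
    for pers in header.iterfind(".//tei:author/tei:persName", TEI_NS):
        parts = [(el.text or "").strip() for el in pers.findall("tei:forename", TEI_NS)]
        parts.append(pers.findtext("tei:surname", default="", namespaces=TEI_NS).strip())
        full_name = " ".join(p for p in parts if p)
        if full_name:
            authors.append(full_name)

    return {"title": title, "authors": authors}
```

Calling `extract_header(SAMPLE_TEI)` returns a small dict with `title` and `authors` keys. Real output carries far more (affiliations, abstract, references), so treat this as a starting point only.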

As far as I have been able to tell, the public state of the art in academic paper metadata parsing is Grobid: https://github.com/kermitt2/grobid

Not quite as simple a command-line interface as you suggest, but not too hard to set up, and pretty impressive. Now if only Google Scholar would open-source whatever they use...
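
For a sense of what "not too hard to set up" can look like in practice, here is a hedged sketch of sending a PDF to a GROBID server over its REST API using only the standard library. It assumes a server is already running locally on port 8070 and that the full-text endpoint accepts the PDF in a multipart field named `input`. The helper names `build_multipart` and `pdf_to_tei` are hypothetical, and the placeholder filename `paper.pdf` is arbitrary.

```python
import urllib.request
import uuid


def build_multipart(field, filename, payload, boundary=None):
    """Encode one file as a multipart/form-data body; returns (body, content_type)."""
    boundary = boundary or uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        "Content-Type: application/pdf\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + payload + tail, f"multipart/form-data; boundary={boundary}"


def pdf_to_tei(pdf_path, base_url="http://localhost:8070"):
    """POST a PDF to a running GROBID server and return the TEI XML as text.

    Assumption: the server is reachable at base_url; nothing here starts it.
    """
    with open(pdf_path, "rb") as fh:
        body, content_type = build_multipart("input", "paper.pdf", fh.read())
    request = urllib.request.Request(
        f"{base_url}/api/processFulltextDocument",
        data=body,
        headers={"Content-Type": content_type, "Accept": "application/xml"},
        method="POST",
    )
    # Full-text processing can be slow on long papers, hence the generous timeout.
    with urllib.request.urlopen(request, timeout=120) as response:
        return response.read().decode("utf-8")
```

With a server up, `pdf_to_tei("some_paper.pdf")` should return the TEI document as a string. Check the endpoint and field names against the GROBID documentation for the version you run.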