What is the current state of open-source speech recognition?

It would be immensely useful to be able to run a WAV file of a monologue or dialogue through a program and get reasonably accurate text out, even if it contains some errors. As far as I know, this is still quite a difficult problem that requires an immense amount of training data and good language models. Still, I wouldn't be surprised if some pre-trained models are available today that can be run with one of the many Python machine learning toolkits. Are there any?