I've been looking for a tool that can passively transcribe audio files to make them easier to search - this looks like it could almost solve that use case - maybe with a scripted tagger.

That is funny. For audio books I'm currently working on an `epub` command for `tone` which will be able to extract text from `epub` files, e.g.:

  tone epub --format="markdown" --extract-sentences --one-file-per-chapter output-path/

You can then feed the generated text / markdown files to https://github.com/readbeyond/aeneas to create a JSON mapping file that looks like this:

  {
   "fragments": [
    {
     "begin": "0.000",
     "children": [],
     "end": "7.920",
     "id": "f000001",
     "language": "eng",
     "lines": [
      "This is the first sentence of the audio book."
     ]
    }
   ]
  }
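
For the search use case, a mapping file in this shape can be queried with a few lines of Python. This is only a rough sketch: the helper name `find_fragments` and the file name in the comment are made up for illustration, and it assumes each fragment carries `begin`, `end` and `lines` as in the sample above.

```python
import json


def find_fragments(mapping, phrase):
    """Return (begin, end, text) for every fragment whose text contains phrase."""
    needle = phrase.casefold()
    hits = []
    for fragment in mapping.get("fragments", []):
        # A fragment may span several lines; join them for matching.
        text = " ".join(fragment.get("lines", []))
        if needle in text.casefold():
            hits.append((float(fragment["begin"]), float(fragment["end"]), text))
    return hits


# Inline sample in the same shape as the mapping above. A real file could be
# loaded with json.load() from e.g. "chapter-01.json" (hypothetical name).
sample = json.loads("""
{"fragments": [{"begin": "0.000", "children": [], "end": "7.920",
  "id": "f000001", "language": "eng",
  "lines": ["This is the first sentence of the audio book."]}]}
""")

for begin, end, text in find_fragments(sample, "first sentence"):
    print(f"{begin:.3f}-{end:.3f}: {text}")
```

The match is a case-insensitive substring test, so it only finds phrases that sit inside a single fragment; anything spanning two sentences would need the neighbouring fragments joined first.
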
Since aeneas is a bit inaccurate, I'm also working on an improvement with silence detection for these mapping files.
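
To give an idea of what such a correction could look like (this is NOT the actual implementation, just one possible approach): take a list of silence intervals from some external detector and move each fragment boundary to the midpoint of the nearest silence, if one is close enough. The function names, the `max_shift` threshold and the silence values below are all assumptions for illustration.

```python
def snap_to_silence(boundary, silences, max_shift=0.5):
    """Move a boundary (seconds) to the midpoint of the nearest silence.

    silences is a list of (start, end) tuples in seconds, assumed to come
    from an external silence detector. The boundary is returned unchanged
    when no silence midpoint lies within max_shift seconds of it.
    """
    best = boundary
    best_distance = max_shift
    for start, end in silences:
        midpoint = (start + end) / 2
        distance = abs(midpoint - boundary)
        if distance <= best_distance:
            best, best_distance = midpoint, distance
    return best


def refine_fragments(fragments, silences, max_shift=0.5):
    """Return copies of the fragments with begin/end snapped to silences."""
    refined = []
    for fragment in fragments:
        updated = dict(fragment)  # leave the input untouched
        for key in ("begin", "end"):
            snapped = snap_to_silence(float(fragment[key]), silences, max_shift)
            updated[key] = f"{snapped:.3f}"  # keep the string format of the mapping
        refined.append(updated)
    return refined


fragments = [{"begin": "0.000", "end": "7.920", "id": "f000001"}]
silences = [(7.5, 8.5), (15.0, 15.4)]  # made-up detector output
print(refine_fragments(fragments, silences))
```

Because the end of one fragment and the begin of the next usually share the same timestamp, both get snapped to the same silence and the fragments stay contiguous.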

If you are looking for something that is "ready to use", you could check out https://github.com/r4victor/syncabook or the library it is built on, https://github.com/r4victor/afaligner

If you have audio files that are NOT audio books, the epub approach will not help you and the other comments are more helpful.