This is awesome!
@danfox, I sent you an email, but I'm commenting here too.
I'm the CTO @ GitHub. Would love to talk to you about this and other things we are building in this area at GitHub.
Feel free to email me directly at jason at github.com
iirc, GitHub uses (used?) my old project from my time at Intel, Hyperscan (https://github.com/intel/hyperscan). It's probably faster than the alternatives, although if you want to support all types of regex you'll need to use Hyperscan as a prefilter for a richer regex engine like PCRE.
This project looks like it pulls literal factors out of the regex that I type in, maybe to look them up in an index, a la that Russ Cox blog post a while back about Code Search. It seems to Not Like things that have very open-ended character classes (e.g. \w) unless there is a decent-length literal involved somewhere.
It seems to have a very rudimentary literal extraction routine, as it gives only a partial result set when fed an alternation between two literals, each of which it handles pretty well on its own.
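To make "literal extraction" concrete, here is a toy sketch in Python. It is not this project's routine, and it is not how Hyperscan does it; the names `branch_literals`, `prefilter_literals` and `search` are made up for illustration, and a substring test stands in for the index while Python's `re` stands in for the richer engine. The point is only that an alternation needs one literal per branch, OR-ed together, and that something like \w+ with no usable literal has to fall back to scanning everything.

```python
import re

SPECIAL = set("\\[](){}|.^$*+?")

def branch_literals(pattern):
    """For each top-level alternative, return the longest literal run that
    every match of that alternative must contain (possibly "").
    Toy parser: inline flags such as (?i) are NOT handled."""
    results = []
    best = cur = ""          # best run so far / run currently being built
    depth = 0                # text inside groups is conservatively ignored
    i, n = 0, len(pattern)
    while i < n:
        c = pattern[i]
        if c not in SPECIAL and depth == 0:
            cur += c
            i += 1
            continue
        # Anything else ends the current literal run.
        if c in "*?{":
            cur = cur[:-1]   # the preceding character may be optional
        best = max(best, cur, key=len)
        cur = ""
        if c == "\\":
            i += 2           # \w, \d, \. ...: treated as non-literal
        elif c == "[":       # skip the whole character class
            j = i + 1
            if j < n and pattern[j] == "^":
                j += 1
            if j < n and pattern[j] == "]":
                j += 1       # a leading ']' is a class member
            while j < n and pattern[j] != "]":
                j += 2 if pattern[j] == "\\" else 1
            i = j + 1
        elif c == "{":       # skip a counted repetition like {2,3}
            close = pattern.find("}", i)
            i = n if close == -1 else close + 1
        elif c == "|" and depth == 0:
            results.append(best)   # one alternative finished
            best = ""
            i += 1
        else:
            depth += (c == "(") - (c == ")")
            i += 1
    results.append(max(best, cur, key=len))
    return results

def prefilter_literals(pattern, min_len=3):
    """Literals to OR together, or None if some branch has no usable literal."""
    lits = branch_literals(pattern)
    return None if any(len(l) < min_len for l in lits) else lits

def search(pattern, docs, min_len=3):
    """Prefilter by substring, then confirm with the full regex engine."""
    lits = prefilter_literals(pattern, min_len)
    rx = re.compile(pattern)
    candidates = docs if lits is None else [
        d for d in docs if any(l in d for l in lits)]
    return [d for d in candidates if rx.search(d)]
```

With this, `prefilter_literals("hyperscan|pcre2")` yields both literals rather than just one, which is the case that seems to come back with partial results here, while `prefilter_literals(r"\w+")` returns `None` and forces a full scan.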