First of all, thanks everyone for the valuable feedback, critiques, suggestions, etc. Truly appreciate that!

And thanks for your patience to everyone who is playing with Nevod right now and running into errors. Interest has been unexpectedly high, and the playground is seeing a very large number of visitors. Please keep in mind that this is a preview of the technology.

Let me clarify a few things. The speed of Nevod is based on two things:

1. Nevod operates on words, not individual characters. From my 30 years of coding experience, I would say that roughly 95% of text processing rules are word-based, so this covers most tasks.

2. Nevod matches MULTIPLE patterns against a document in ONE PASS. Patterns/expressions are indexed by a state machine and filtered efficiently during matching.
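To illustrate the general idea behind point 2 (this is NOT Nevod's actual algorithm, whose internals aren't described here — just a minimal sketch of multi-pattern, single-pass matching using a classic Aho-Corasick-style automaton over word tokens):

```python
# Sketch: compile many word-sequence patterns into ONE automaton,
# then scan the tokenized document in a single pass.
from collections import deque

def build_automaton(patterns):
    """Build a word-level trie with Aho-Corasick failure links."""
    trie = [{"next": {}, "fail": 0, "out": []}]
    for pid, words in enumerate(patterns):
        state = 0
        for w in words:
            if w not in trie[state]["next"]:
                trie.append({"next": {}, "fail": 0, "out": []})
                trie[state]["next"][w] = len(trie) - 1
            state = trie[state]["next"][w]
        trie[state]["out"].append(pid)  # pattern pid ends at this state
    # BFS to set failure links; root's children keep fail = 0.
    queue = deque(trie[0]["next"].values())
    while queue:
        s = queue.popleft()
        for w, t in trie[s]["next"].items():
            queue.append(t)
            f = trie[s]["fail"]
            while f and w not in trie[f]["next"]:
                f = trie[f]["fail"]
            trie[t]["fail"] = trie[f]["next"].get(w, 0)
            trie[t]["out"] += trie[trie[t]["fail"]]["out"]
    return trie

def match(trie, words):
    """Single pass over the document; yields (end_index, pattern_id)."""
    hits, state = [], 0
    for i, w in enumerate(words):
        while state and w not in trie[state]["next"]:
            state = trie[state]["fail"]
        state = trie[state]["next"].get(w, 0)
        hits.extend((i, pid) for pid in trie[state]["out"])
    return hits
```

The point of the sketch: running N separate regexes means N scans of the document, while a single automaton does one scan regardless of how many patterns it encodes — which is why the gap versus one-at-a-time RegExp grows with the size of the rule deck.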

So, with a RULE DECK of 1000 patterns, we got a 300x improvement compared to RegExp. This is the kind of task where you need to, say, highlight syntax, massively verify and route incoming requests in the cloud, perform massive text validations, track certain phrases and relations in massive human communication, etc.

As the number of patterns grows, the difference between Nevod and RegExp keeps increasing and goes far beyond 300x. So I somewhat disagree with the moderators removing the word "hundreds" from the headline. :)

We will publish benchmark results, and we will also release "negrep" (Nevod GREP), a command-line tool for Windows, Linux, and Mac, so everyone will be able to run "negrep", play with it, and verify the benchmarks themselves. Generally, the Nevod technology will be available as a library, as a command-line tool, and as a service.

> 2. Nevod matches MULTIPLE patterns against a document in ONE PASS. Patterns/expressions are indexed by a state machine and filtered efficiently during matching.

This is a good idea. It's such a good idea that we did it in 2006, sold a company to Intel based around it in 2013, and open-sourced the resulting library (https://github.com/intel/hyperscan) in 2015.

All multi-pattern systems get a huge improvement over one-at-a-time regex. It would be embarrassing if they did not. Our rule of thumb for Hyperscan was that it should be about 5-10x faster than libpcre on single patterns, and the gap should get progressively bigger as the pattern count grows.

I don't doubt that there is quite a bit of interesting work going on in this project, entirely outside of regex — the word-based focus looks interesting — but the fact that you couldn't be bothered to Google around for a sensible baseline is not encouraging.

Also, which bit of your system do you think is novel enough to patent? You may find considerable prior art out there, which you don't really seem to be aware of based on how much emphasis you're placing on the not-entirely-novel idea of using a state machine.