
Regular Expression Matching with a Trigram Index (2012)


In 2006, I was lucky enough to be an intern at Google. At the time, Google had an internal tool called gsearch that acted as if it ran grep over all the files in the Google source tree and printed the results.

Because it can call on the standard Go package to parse regular expressions and boil them down to basic operations, the new matcher is under 500 lines of code.

… for the regular expression ab[cd]e. If the implementation chose to apply the simplifying transformations more aggressively, it would use a smaller memory footprint but arrive at …

Despite all their apparent syntactic complexity, regular expressions in the mathematical sense of the term can always be reduced to the few cases (empty string, single character, repetition, concatenation, and alternation) considered above.
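To make the reduction to five cases concrete, here is a minimal sketch in Python. It is not the article's Go implementation: the tuple encoding and the helper names (`char`, `concat`, `alt`, `star`, `exact`, `trigram_query`) are hypothetical, chosen only for illustration. It expands ab[cd]e, with the class [cd] written as the alternation c|d, into the finite set of strings it matches and lists the trigrams each string requires.

```python
from itertools import product

# Hypothetical encoding of the five cases as tagged tuples (illustration only).
EMPTY = ("empty",)

def char(c):
    return ("char", c)

def concat(*parts):
    return ("concat",) + parts

def alt(*parts):
    return ("alt",) + parts

def star(r):
    return ("star", r)

def exact(r):
    """Return the finite set of strings matched by r, or None if not tracked."""
    kind = r[0]
    if kind == "empty":
        return {""}
    if kind == "char":
        return {r[1]}
    if kind == "star":
        # Repetition is treated as unbounded in this sketch.
        return None
    subsets = [exact(p) for p in r[1:]]
    if any(s is None for s in subsets):
        return None
    if kind == "alt":
        return set().union(*subsets)
    # concat: every way of picking one string from each part, joined in order
    return {"".join(combo) for combo in product(*subsets)}

def trigrams(s):
    """All overlapping three-character substrings of s, in order."""
    return [s[i:i + 3] for i in range(len(s) - 2)]

def trigram_query(r):
    """An OR (outer list) of ANDs (inner lists) of trigrams, or None."""
    strings = exact(r)
    if strings is None:
        return None
    return [trigrams(s) for s in sorted(strings)]

# ab[cd]e, with the character class [cd] expressed as the alternation c|d
AB_CD_E = concat(char("a"), char("b"), alt(char("c"), char("d")), char("e"))

print(exact(AB_CD_E))
print(trigram_query(AB_CD_E))
```

Read the result of `trigram_query` as an OR over the outer list and an AND over each inner list: a candidate file must contain every trigram of at least one of the matched strings. A string shorter than three characters yields an empty inner list, which constrains nothing. A real implementation would also have to bound the size of these sets and handle repetition, which this sketch simply gives up on.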

