
Runtime-Extensible SQL Parsers Using PEG


Despite their central role in processing queries, parsers have received no noticeable attention in the data systems space. State-of-the-art systems are content with decades-old parser generators. These generators create monolithic, inflexible and unforgiving parsers that hinder innovation in query languages and frustrate users. Instead, parsers should be rewritten using modern abstractions like Parsing Expression Grammars (PEG), which allow dynamic changes to the accepted query syntax and better error recovery. In this post, we discuss how parsers could be redesigned using PEG, and validate our recommendations with experiments on both effectiveness and efficiency.
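To make "dynamic changes to the accepted query syntax" concrete, here is a minimal sketch in Python. It is not the parser discussed in the post: the tuple encoding of expressions, the helper names (`match`, `accepts`, `add_alternative`, `base_grammar`) and the toy `UNPIVOT` statement are all assumptions made for illustration. The point is only that a PEG can be held as plain data and interpreted, so adding an alternative to a rule at runtime changes what the parser accepts without regenerating any code.

```python
import re

# A grammar maps rule names to parsing expressions encoded as tuples:
#   ("lit", text)        case-insensitive literal
#   ("re", pattern)      regular-expression token
#   ("seq", e1, e2, ...) sequence
#   ("alt", e1, e2, ...) ordered choice (first match wins)
#   ("ref", name)        reference to another rule, looked up at parse time

def match(grammar, expr, text, pos):
    """Return the position after a successful match, or None on failure."""
    kind = expr[0]
    if kind in ("lit", "re"):
        # Skip whitespace before every terminal.
        while pos < len(text) and text[pos].isspace():
            pos += 1
    if kind == "lit":
        end = pos + len(expr[1])
        return end if text[pos:end].upper() == expr[1].upper() else None
    if kind == "re":
        m = re.compile(expr[1]).match(text, pos)
        return m.end() if m else None
    if kind == "seq":
        for sub in expr[1:]:
            pos = match(grammar, sub, text, pos)
            if pos is None:
                return None
        return pos
    if kind == "alt":
        for sub in expr[1:]:
            end = match(grammar, sub, text, pos)
            if end is not None:
                return end
        return None
    if kind == "ref":
        return match(grammar, grammar[expr[1]], text, pos)
    raise ValueError(f"unknown expression kind: {kind}")

def accepts(grammar, text):
    """True if the 'statement' rule consumes the whole input."""
    end = match(grammar, ("ref", "statement"), text, 0)
    return end is not None and text[end:].strip() == ""

def add_alternative(grammar, rule, new_expr):
    """Append a new alternative to an existing ordered-choice rule."""
    grammar[rule] = grammar[rule] + (new_expr,)

def base_grammar():
    """A deliberately tiny grammar: SELECT <ident> FROM <ident>."""
    return {
        "statement": ("alt", ("ref", "select")),
        "select": ("seq", ("lit", "SELECT"), ("ref", "ident"),
                   ("lit", "FROM"), ("ref", "ident")),
        "ident": ("re", r"[A-Za-z_]\w*"),
    }

if __name__ == "__main__":
    g = base_grammar()
    print(accepts(g, "SELECT a FROM t"))   # accepted by the base grammar
    print(accepts(g, "UNPIVOT t"))         # rejected: no such statement yet
    # An "extension" registers a new statement type at runtime.
    g["unpivot"] = ("seq", ("lit", "UNPIVOT"), ("ref", "ident"))
    add_alternative(g, "statement", ("ref", "unpivot"))
    print(accepts(g, "UNPIVOT t"))         # now accepted
```

Because rule references are resolved at parse time, the extension takes effect immediately. A parser produced by a classic generator would instead have to be regenerated and recompiled.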

While this is also true in other ecosystems like Python, the design of SQL, with its heavy focus on syntax rather than function calls, makes extensions second-class citizens that have to work around the restrictions imposed by the original parser, e.g., by embedding custom expressions in strings. Some systems go to great lengths to provide a meaningful error message, e.g., "this column does not exist, did you mean ...", but this is typically limited to resolving identifiers after the actual parsing. Here, we use the %recover construct to match a misplaced GROUP BY clause, re-using the original definition, and then trigger a custom error message that advises the user on how to fix their query.
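The recovery idea can be imitated in a few lines of Python. This is only a sketch of the concept and does not reproduce the %recover syntax or the grammar from the post. The toy clause shapes, the function names (`token`, `where`, `group_by`, `parse_select`), the `ParseError` class and the wording of the message are assumptions. The sketch re-uses the regular `group_by` rule at the position where a WHERE clause would normally start. If GROUP BY matches there and a WHERE clause follows it, the parser reports a targeted hint instead of a generic syntax error.

```python
import re

class ParseError(Exception):
    """Raised when the input is not accepted by the toy grammar."""

def token(pattern, text, pos):
    """Match a case-insensitive pattern after optional whitespace.

    Returns the end position of the match, or None if it does not match.
    """
    m = re.compile(r"\s*(?:" + pattern + r")", re.IGNORECASE).match(text, pos)
    return m.end() if m else None

def where(text, pos):
    # Where <- 'WHERE' Identifier '>' Number   (deliberately tiny)
    return token(r"WHERE\s+[A-Za-z_]\w*\s*>\s*\d+", text, pos)

def group_by(text, pos):
    # GroupBy <- 'GROUP' 'BY' Identifier
    return token(r"GROUP\s+BY\s+[A-Za-z_]\w*", text, pos)

def parse_select(text):
    """Accept SELECT x FROM t [WHERE ...] [GROUP BY ...], else raise ParseError."""
    pos = token(r"SELECT\s+[A-Za-z_]\w*\s+FROM\s+[A-Za-z_]\w*", text, 0)
    if pos is None:
        raise ParseError("syntax error: expected SELECT ... FROM ...")

    # Recovery alternative: the ordinary group_by rule is tried where WHERE
    # belongs. A GROUP BY followed by WHERE means the clauses are swapped.
    misplaced_end = group_by(text, pos)
    if misplaced_end is not None and where(text, misplaced_end) is not None:
        raise ParseError(
            "GROUP BY must come after WHERE: move the GROUP BY clause "
            "to the end of the query"
        )

    # Regular path: both clauses are optional, in the standard order.
    for clause in (where, group_by):
        end = clause(text, pos)
        if end is not None:
            pos = end
    if text[pos:].strip():
        raise ParseError(f"syntax error at position {pos}")
    return True

if __name__ == "__main__":
    print(parse_select("SELECT a FROM t WHERE a > 1 GROUP BY a"))
    try:
        parse_select("SELECT a FROM t GROUP BY a WHERE a > 1")
    except ParseError as err:
        print(err)
```

The design choice worth noting is that the recovery path calls the same `group_by` function as the regular path, mirroring the "re-using the original definition" point above, so the hint stays correct if the clause definition changes.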

