re2c

Re2c is a free and open-source lexer generator for C and C++. The main goal of the project is to generate very fast lexers that match or exceed the speed of carefully optimized hand-written code. Instead of using a traditional table-driven approach, re2c encodes the underlying finite state automata directly in the form of conditional jumps and applies numerous optimizations to the generated code. The resulting programs are faster and often smaller than their table-driven counterparts, and they are much easier to debug and understand. Re2c has an unusually flexible user interface: instead of assuming a fixed program template, it leaves the definition of the interface code to the user and allows them to configure almost every aspect of the generated code. This gives users a lot of freedom in the way they bind the lexer to their particular environment and lets them decide on the optimal input model. Re2c supports fast and lightweight submatch extraction that does not require the overhead of full parsing, a feature that is rarely found in the wild. Re2c is used by many other projects (such as php, ninja, yasm, spamassassin, BRL-CAD and wake) and aims at being fully backward compatible. On the other hand, it is a research project and a playground for the development of new algorithms in the field of formal grammars and automata.
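To make the difference between direct-coded and table-driven automata concrete, here is a small hand-written sketch in C. It is not re2c output, and the function names (match_digits_direct, match_digits_table) are made up for illustration. Both functions recognize the regular expression [0-9]+ at the start of a NUL-terminated string and return the length of the match, or 0 if there is none. The first encodes each automaton state as a label and each transition as a conditional jump; the second walks a transition table.

```c
#include <assert.h>
#include <stddef.h>

/* Direct-coded recognizer for [0-9]+ (illustrative, not re2c output).
   Each automaton state is a position in the code; each transition
   is a conditional jump. */
size_t match_digits_direct(const char *s)
{
    const char *p = s;
    assert(s != NULL);

    /* State 0 (initial): at least one digit is required. */
    if (*p < '0' || *p > '9') goto fail;
    ++p;
state1:
    /* State 1 (accepting): keep consuming digits. */
    if (*p >= '0' && *p <= '9') { ++p; goto state1; }
    return (size_t)(p - s);
fail:
    return 0;
}

/* Table-driven recognizer for the same expression, for comparison. */
enum { S_START, S_DIGITS, S_DEAD, N_STATES };

size_t match_digits_table(const char *s)
{
    /* next[state][is_digit] gives the following state. */
    static const unsigned char next[N_STATES][2] = {
        /* S_START  */ { S_DEAD, S_DIGITS },
        /* S_DIGITS */ { S_DEAD, S_DIGITS },
        /* S_DEAD   */ { S_DEAD, S_DEAD   },
    };
    const char *p = s;
    int state = S_START;
    assert(s != NULL);

    for (;;) {
        int is_digit = (*p >= '0' && *p <= '9');
        int n = next[state][is_digit];
        if (n == S_DEAD) break;   /* no transition: stop scanning */
        state = n;
        ++p;
    }
    /* Only S_DIGITS is an accepting state. */
    return state == S_DIGITS ? (size_t)(p - s) : 0;
}
```

For the input "123abc" both functions return 3, and for "abc" both return 0. In the direct-coded version the current state is implicit in the program counter, so there is no table lookup or state variable on the hot path, and the control flow can be followed step by step in a debugger.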

Subscribe to receive the latest news and updates. See the user manual for a complete overview with examples.

Download

You can get the latest release on GitHub, as well as the older releases (make sure you download the latest minor version in each series). Many Linux distributions and other systems provide their own packages. Re2c source code is hosted on both GitHub (https://github.com/skvadrik/re2c) and SourceForge (https://sourceforge.net/p/re2c). GitHub serves as the main repository, bug tracker and tarball hosting. SourceForge is used as a backup repository and for email hosting.

Bugs & patches

Please send bug reports, patches and other feedback to the GitHub issue tracker, or email them to the re2c-devel@lists.sourceforge.net and re2c-general@lists.sourceforge.net mailing lists. Re2c has an IRC channel #re2c on freenode. Re2c developers are happy to answer questions and provide help. Contributions are always welcome!

Papers

Authors

Re2c was originally written by Peter Bumbulis (peter@csg.uwaterloo.ca) in 1993. Since then it has been maintained and developed by multiple volunteers, most notably, Brian Young (bayoung@acm.org), Markus Boerger (helly@users.sourceforge.net), Dan Nuffer (nuffer@users.sourceforge.net) and Ulya Trofimovich (skvadrik@gmail.com). Other re2c contributors are Derick Rethans, Emmanuel Mogenet, Hartmut Kaiser, jcfp, joscherl, Mike Gilbert, Nerd, nuno-lopes, Oleksii Taran, Peter Bumbulis, Petr Skocik, Paulo Custodio, Ross Burton, Ryan Mast, Serghei Iakovlev, Sergei Trofimovich and Tim Kelly (apologies if someone is missing).

License

Re2c is in the public domain. The data structures and algorithms used in re2c are all either taken from documents available to the general public or are inventions of the author. Programs generated by re2c may be distributed freely. Re2c itself may be distributed freely, in source or binary, unchanged or modified. Distributors may charge whatever fees they can obtain for re2c. If you do make use of re2c, or incorporate it into a larger project, an acknowledgement somewhere (documentation, research report, etc.) would be appreciated. Re2c is distributed with no warranty whatsoever. The code is certain to contain errors. Neither the author nor any contributor takes responsibility for any consequences of its use.

Version

This website describes re2c version 1.2.