re2c
re2c is a free and open-source lexer generator for C/C++, Go and Rust with a focus on generating fast code. It compiles regular expression specifications to deterministic finite automata and encodes them as conditional jumps in the target language. This approach is generally faster than table-based lexers, and the generated code is easier to debug and understand. A flexible user interface lets you adapt the generated lexer to a particular environment and input model, avoiding the overhead of unnecessary checks and buffers. re2c is based on the lookahead TDFA algorithm, which allows it to perform fast and lightweight submatch extraction. It is used in other open-source projects such as php, ninja, yasm, spamassassin, BRL-CAD and wake.
Subscribe to receive the latest news and updates. See the user manuals (C/C++, Go, Rust) for a complete overview with examples.
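To make "encodes them in the form of conditional jumps" concrete, here is a minimal hand-written sketch in C of a direct-coded automaton. It is not actual re2c output: the function `lex`, the `enum token` names and the two rules (`[0-9]+` and `[a-z]+`) are hypothetical and chosen only for illustration. It also assumes ASCII input terminated by a NUL byte, which serves as a sentinel so that no separate bounds check is needed.

```c
#include <stddef.h>

/* Token kinds for this illustrative lexer (hypothetical names). */
enum token { TOK_ERROR, TOK_NUMBER, TOK_IDENT };

/*
 * Hand-written direct-coded DFA for two rules:
 *   [0-9]+  -> TOK_NUMBER
 *   [a-z]+  -> TOK_IDENT
 * Each DFA state is a label and each transition is a conditional jump;
 * there is no transition table. On return, *len holds the length of the
 * matched prefix (0 on error). The input must be NUL-terminated: the NUL
 * byte matches no rule, so it stops the automaton without a bounds check.
 */
enum token lex(const char *s, size_t *len)
{
    const char *p = s;
    char c;

    /* Initial state: dispatch on the first character. */
    c = *p;
    if (c >= '0' && c <= '9') { ++p; goto number; }
    if (c >= 'a' && c <= 'z') { ++p; goto ident; }
    *len = 0;
    return TOK_ERROR;

number:
    /* Accepting state for [0-9]+: loop while digits follow. */
    c = *p;
    if (c >= '0' && c <= '9') { ++p; goto number; }
    *len = (size_t)(p - s);
    return TOK_NUMBER;

ident:
    /* Accepting state for [a-z]+: loop while lowercase letters follow. */
    c = *p;
    if (c >= 'a' && c <= 'z') { ++p; goto ident; }
    *len = (size_t)(p - s);
    return TOK_IDENT;
}
```

Calling `lex("123abc", &n)` returns `TOK_NUMBER` and sets `n` to 3. A table-based lexer would instead index a state-by-character array on every input byte; here the control flow itself is the automaton, which is also why such code can be stepped through in a debugger state by state.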
Download
You can get the latest release on GitHub, as well as the older releases. Many Linux distributions and other systems provide their own packages. The source code is hosted on both GitHub (https://github.com/skvadrik/re2c) and SourceForge (https://sourceforge.net/p/re2c). GitHub serves as the main repository, bugtracker and tarball hosting. SourceForge is used as a backup repository and email hosting.
Bugs & patches
Please send bug reports, patches and other feedback to the GitHub issue tracker or email them to the re2c-devel@lists.sourceforge.net and re2c-general@lists.sourceforge.net mailing lists. There is an IRC channel #re2c on irc.libera.chat and irc.oftc.net. Questions and contributions are welcome!
Papers
2022 A closer look at TDFA by Angelo Borsotti and Ulya Trofimovich. arXiv:2206.01398 [pdf 2022]
2020 RE2C: A lexer generator based on lookahead-TDFA by Ulya Trofimovich. Software Impacts 6 (2020) 100027 [pdf 2021]
2019 Efficient POSIX submatch extraction on NFA by Angelo Borsotti and Ulya Trofimovich. Software: Practice and Experience 51, 2, pp. 159–192 [pdf 2019]
2017 Tagged Deterministic Finite Automata with Lookahead by Ulya Trofimovich. arXiv:1907.08837 [pdf 2017]
1994 RE2C: a more versatile scanner generator by Peter Bumbulis and Donald D. Cowan. ACM Letters on Programming Languages and Systems (LOPLAS) [ps 1994]
License
re2c is in the public domain. The data structures and algorithms used in re2c are all either taken from documents available to the general public or are inventions of the author. Programs generated by re2c may be distributed freely. re2c itself may be distributed freely, in source or binary, unchanged or modified. Distributors may charge whatever fees they can obtain for re2c. If you do make use of re2c, or incorporate it into a larger project an acknowledgment somewhere (documentation, research report, etc.) would be appreciated. re2c is distributed with no warranty whatsoever. The code is certain to contain errors. Neither the author nor any contributor takes responsibility for any consequences of its use.
Version
This website describes re2c version 3.0.