Changelog

4.0x

4.0.2 (2024-12-11)

  • CMake build system: fixed bug #515 where language-specific binaries erroneously defaulted to generating code for C.

  • Playground: the address bar now reflects navigation between examples; editors use a higher-contrast CSS theme.

4.0.1 (2024-11-25)

  • Added missing doc sources to the distribution tarball (#503).

  • Reworked C/C++ examples to avoid using new configuration aliases until the world has updated to re2c 4.0, made them compatible with C.

  • A few build system changes: increased CMake minimum required version to 3.15, added missing dependencies on doc sources in Makefile.am.

  • Fixed typos in docs.

4.0 (2024-11-19)

  • Added a generic technique for describing language backends based on the idea of syntax files (#450).

  • Added support for new languages: D, Haskell, Java, JavaScript, OCaml, Python, V, Zig.

  • Added new record API for all languages (enabled with --api record, re2c:api = record) and made it the default API for Haskell and OCaml.

  • Renamed former “default API” to “simple API”, implemented it for all backends except Haskell and OCaml, and made it the default API for C, D, Java, JavaScript, Python, V and Zig.

  • Added a new code generation model: recursive functions (enabled with --recursive-functions), primarily intended for functional languages.

  • Added options:

    • --syntax <file>

    • --api simple

    • --api generic

    • --api record

    • --goto-label

    • --recursive-functions

    • --lang none

    • --lang d

    • --lang haskell

    • --lang java

    • --lang js

    • --lang ocaml

    • --lang python

    • --lang v

    • --lang zig

    • --leftmost-captvars

    • --posix-captvars

    • --captvars (alias for --leftmost-captvars)

    • --captures (alias for --leftmost-captures)

  • Added configurations:

    • re2c:api = simple (alias for re2c:api = default)

    • re2c:api = generic (alias for re2c:api = custom)

    • re2c:api = record

    • re2c:computed-gotos (alias for re2c:cgoto)

    • re2c:cond:abort

    • re2c:tags:negative

    • re2c:leftmost-captvars

    • re2c:posix-captvars

    • re2c:captvars (alias for re2c:leftmost-captvars)

    • re2c:captures (alias for re2c:leftmost-captures)

    • re2c:monadic

    • re2c:fn:sep

    • re2c:[define:]YYFN

    • re2c:[define:]YYINPUT

    • re2c:[define:]YYGETACCEPT

    • re2c:[define:]YYSETACCEPT

    • re2c:[define:]YYCOPYSTAG

    • re2c:[define:]YYCOPYMTAG

    • re2c:[define:]YYGETCOND (alias for re2c:[define:]YYGETCONDITION)

    • re2c:[define:]YYSETCOND (alias for re2c:[define:]YYSETCONDITION)

    • re2c:[variable:]yyfill

    • re2c:[variable:]yynmatch

    • re2c:[variable:]yypmatch

    • re2c:[variable:]yych:literals

  • All configurations that have a define: or variable: part in their name now have an alias without this part.

  • Added new block types:

    • /*!svars:re2c ... */

    • /*!mvars:re2c ... */

  • Flex-style opening/closing braces %{ and %} for block start/end markers now work for all block types.

  • Added syntax file feature lists:

    • supported_apis with values from the list: simple, generic, record

    • supported_api_styles with values from the list: free-form, functions

    • supported_code_models with values from the list: goto-label, loop-switch, recursive-functions

    • supported_targets with values from the list: code, dot, skeleton

    • supported_features with values from the list: nested-ifs, bitmaps, computed-gotos, case-ranges, tags, captures, captvars, monadic, unsafe

  • Added syntax file language-specific options:

    • semicolons

    • backtick_quoted_strings

    • single_quoted_strings

    • indentation_sensitive

    • wrap_blocks_in_braces

  • Added syntax file code templates:

    • code:var_local

    • code:var_global

    • code:const_local

    • code:const_global

    • code:array_local

    • code:array_global

    • code:array_elem

    • code:enum

    • code:enum_elem

    • code:assign

    • code:type_int

    • code:type_uint

    • code:type_cond_enum

    • code:type_yybm

    • code:type_yytarget

    • code:cmp_eq

    • code:cmp_ne

    • code:cmp_lt

    • code:cmp_gt

    • code:cmp_le

    • code:cmp_ge

    • code:if_then_else

    • code:if_then_else_oneline

    • code:switch

    • code:switch_cases

    • code:switch_cases_oneline

    • code:switch_case_range

    • code:switch_case_default

    • code:loop

    • code:continue

    • code:goto

    • code:fndecl

    • code:fndef

    • code:fncall

    • code:tailcall

    • code:recursive_functions

    • code:fingerprint

    • code:line_info

    • code:abort

    • code:yydebug

    • code:yypeek

    • code:yyskip

    • code:yybackup

    • code:yybackupctx

    • code:yyskip_yypeek

    • code:yypeek_yyskip

    • code:yyskip_yybackup

    • code:yybackup_yyskip

    • code:yybackup_yypeek

    • code:yyskip_yybackup_yypeek

    • code:yybackup_yypeek_yyskip

    • code:yyrestore

    • code:yyrestorectx

    • code:yyrestoretag

    • code:yyshift

    • code:yyshiftstag

    • code:yyshiftmtag

    • code:yystagp

    • code:yymtagp

    • code:yystagn

    • code:yymtagn

    • code:yycopystag

    • code:yycopymtag

    • code:yygetaccept

    • code:yysetaccept

    • code:yygetcond

    • code:yysetcond

    • code:yygetstate

    • code:yysetstate

    • code:yylessthan

    • code:yybm_filter

    • code:yybm_match

  • Added global variables in syntax files:

    • nl

    • indent

    • dedent

    • topindent

  • Added global conditionals in syntax files:

    • .api.simple

    • .api.generic

    • .api.record

    • .api_style.functions

    • .api_style.freeform

    • .case_ranges

    • .code_model.goto_label

    • .code_model.loop_switch

    • .code_model.recursive_functions

    • .date

    • .loop_label

    • .monadic

    • .start_conditions

    • .storable_state

    • .unsafe

    • .version

  • Added warning -Wundefined-syntax-config.

  • Warnings that indicate serious issues are now turned on by default (and can be disabled with -Wno-<warning> options).

  • Added configure options:

    • --enable-syntax (Autoconf)

    • RE2C_REBUILD_SYNTAX (CMake)

  • Dropped support for function-like API style for Rust (it was hard to use, if at all possible).

  • Added online playground that allows one to run re2c in a web browser: https://re2c.org/playground.

  • Infra work on GitHub Actions CI.
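
The code generation models named in the 4.0 notes above (goto-label, loop-switch, recursive-functions) differ in how the states of the automaton are laid out in the generated code. The sketch below is hand-written Python, not re2c output, and every name in it (is_digit, match_loop_switch, match_recursive, the state numbering) is hypothetical. It recognizes [0-9]+ twice: once as a loop that dispatches on a state variable, and once with one function per state.

```python
# Hand-written illustration of two code models; not generated by re2c.
# Both functions return the length of the longest prefix of `s` that
# matches [0-9]+, or -1 if there is no match.

def is_digit(ch):
    # "" stands for end of input and is never a digit.
    return len(ch) == 1 and "0" <= ch <= "9"


def match_loop_switch(s):
    """Loop-switch style: a single loop that dispatches on a state variable."""
    state, pos = 0, 0
    while True:
        ch = s[pos] if pos < len(s) else ""
        if state == 0:            # initial state: at least one digit is required
            if not is_digit(ch):
                return -1
            state, pos = 1, pos + 1
        else:                     # state 1: accepting, consume further digits
            if not is_digit(ch):
                return pos
            pos += 1


def match_recursive(s):
    """Recursive-functions style: each state is a function that calls the next."""
    def state0(pos):
        if pos < len(s) and is_digit(s[pos]):
            return state1(pos + 1)
        return -1

    def state1(pos):
        if pos < len(s) and is_digit(s[pos]):
            return state1(pos + 1)
        return pos

    return state0(0)
```

Python does not eliminate tail calls, so the recursive variant is bounded by the interpreter's recursion limit; in languages that do eliminate them, the same shape runs in constant stack space.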

3.0x

3.1 (2023-07-19)

  • Added capturing groups with leftmost greedy semantics:

    • Enabled with --leftmost-captures option or re2c:leftmost-captures configuration (55de79d8, 3a98b543).

  • Added non-capturing groups:

    • Added new syntax (! ...) for non-capturing groups (1edd25d3, b813c9b4, 338806b9).

    • Added the ability to flip defaults: make (...) non-capturing and (! ...) capturing with the --invert-captures option or the re2c:invert-captures configuration (20030ff1, ce756195).

  • Regenerated Unicode include header to support a newer standard (e3ec2597).

  • Published TDFA paper: https://arxiv.org/abs/2206.01398, co-authored with Angelo Borsotti (fa94d9c7).

  • Removed experimental algorithms that are superseded by TDFA(1) and generally less efficient:

    • Removed staDFA algorithm and deprecated --stadfa option (ac5c06cc).

    • Removed TDFA(0) algorithm and deprecated --no-lookahead option (dc8f264a).

    • (libre2c) Removed backward-matching algorithm (27256be1).

    • (libre2c) Removed Kuklewicz POSIX disambiguation algorithm (aa97b014).

    • (libre2c) Removed GTOP shortest path finding algorithm (511a030c).

  • Bug fixes:

    • Fixed parsing of raw UTF-8 characters in Flex compatibility mode (d87f86ed).

    • Added header file to the dependencies generated with --depfile option (f807f763 and 2dda36aa).

    • Fixed stack overflow on large regular expressions by rewriting recursive functions in iterative form (46a9b4c4, aaf68292, 02e5d797, 5fffb187) and limited stack to 256K on GitHub Actions CI (111ee5da).

  • Build system:

    • Added minimal Bazel integration (http://bazel.build) (3205c867).

    • Added configure option --enable-parsers that regenerates bison parsers (9e0dbd3c).

    • Added CMake option RE2C_REBUILD_PARSERS (6e91c22d).

    • With CMake, fixed documentation generation on Windows.

  • Codebase improvements:

    • Moved the entire codebase to C++11.

    • Added uniform error handling (return codes are now properly checked and returned to the caller).

    • Reorganized codegen subsystem into four well-defined phases (analyze, generate, fixup, render) and separated codegen from parsing phase.

    • Improved memory allocation by using slab allocators instead of global free lists.

    • Moved to pure API for bison parsers.

    • Unified code style.

  • Testing:

    • Added --verbose flag to run_tests.py and suppressed verbose output by default.

    • Multiple improvements of continuous testing with GitHub Actions.
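
The stack overflow fix listed under the 3.1 bug fixes above replaced recursive functions with iterative ones. As a rough illustration of that general technique (this is not re2c's code; the tuple-based expression tree and both function names are hypothetical), the sketch below computes the depth of a nested expression tree with an explicit stack in place of the call stack.

```python
# Hypothetical expression tree: ("sym", char) is a leaf,
# ("cat", left, right) and ("star", child) are inner nodes.

def depth_recursive(node):
    """Straightforward recursion: fails on very deep trees."""
    children = [c for c in node[1:] if isinstance(c, tuple)]
    return 1 + max((depth_recursive(c) for c in children), default=0)


def depth_iterative(root):
    """Same result, but pending work lives on an explicit heap-allocated stack."""
    max_depth = 0
    stack = [(root, 1)]                 # pairs of (node, depth of that node)
    while stack:
        node, depth = stack.pop()
        max_depth = max(max_depth, depth)
        for child in node[1:]:
            if isinstance(child, tuple):
                stack.append((child, depth + 1))
    return max_depth
```

The iterative version is limited by available heap memory rather than by the size of the call stack, which is what makes it robust on very large inputs.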

3.0 (2022-01-27)

  • Added code generation backend for Rust:

    • Enabled with --lang rust option.

    • A new re2rust binary (built by default, or configured with --enable-rust Autoconf option and RE2C_BUILD_RE2RUST CMake option).

  • Added options:

    • --loop-switch

    • --no-unsafe

  • Added configurations:

    • re2c:label:yyloop

    • re2c:unsafe

  • Renamed options to use common naming scheme. The old names are supported as aliases, so the change does not break existing code. Documentation has been updated to use new names.

    • --api is a new alias for --input

    • --ebcdic is a new alias for --ecb

    • --ucs2 is a new alias for --wide-chars

    • --utf32 is a new alias for --unicode

    • --utf16 is a new alias for --utf-16

    • --utf8 is a new alias for --utf-8

    • --header is a new alias for --type-header

  • Renamed configurations to use common naming scheme and support proper scoping under subcategories such as :define, :label, :variable, etc. The old names are supported as aliases, so the change does not break existing code. Documentation has been updated to use new names.

    • re2c:api is a new alias for re2c:flags:input

    • re2c:bit-vectors is a new alias for re2c:flags:bit-vectors

    • re2c:case-insensitive is a new alias for re2c:flags:case-insensitive

    • re2c:case-inverted is a new alias for re2c:flags:case-inverted

    • re2c:case-ranges is a new alias for re2c:flags:case-ranges

    • re2c:cond:prefix is a new alias for re2c:condprefix

    • re2c:cond:enumprefix is a new alias for re2c:condenumprefix

    • re2c:computed-gotos is a new alias for re2c:flags:computed-gotos

    • re2c:computed-gotos:threshold is a new alias for re2c:cgoto:threshold

    • re2c:debug-output is a new alias for re2c:flags:debug-output

    • re2c:encoding:ebcdic is a new alias for re2c:flags:ecb

    • re2c:encoding:utf32 is a new alias for re2c:flags:unicode

    • re2c:encoding:ucs2 is a new alias for re2c:flags:wide-chars

    • re2c:encoding:utf16 is a new alias for re2c:flags:utf-16

    • re2c:encoding:utf8 is a new alias for re2c:flags:utf-8

    • re2c:encoding-policy is a new alias for re2c:flags:encoding-policy

    • re2c:empty-class is a new alias for re2c:flags:empty-class

    • re2c:header is a new alias for re2c:flags:type-header

    • re2c:label:prefix is a new alias for re2c:labelprefix

    • re2c:label:yyfill is a new alias for re2c:label:yyFillLabel

    • re2c:label:start is a new alias for re2c:startlabel

    • re2c:nested-ifs is a new alias for re2c:flags:nested-ifs

    • re2c:posix-captures is a new alias for re2c:flags:posix-captures

    • re2c:tags is a new alias for re2c:flags:tags

    • re2c:variable:yych:conversion is a new alias for re2c:yych:conversion

    • re2c:variable:yych:emit is a new alias for re2c:yych:emit

    • re2c:variable:yybm:hex is a new alias for re2c:yybm:hex

    • re2c:unsafe is a new alias for re2c:flags:unsafe

  • Added directive alias conditions:re2c for types:re2c.

  • Multiple small changes in code generation, including some formatting changes that result in large diffs in the generated code:

    • Do not allocate indices for unused state labels (this results in a change in state enumeration), commits 919570c4 and 82b704f6.

    • Do not generate redundant YYPEEK statements, commit cca31d22.

    • Do not generate YYDEBUG statements for unused state labels, commit a46f01e6.

    • C backend: change formatting of switch statements, commit ed88e12e.

    • Go backend: render continuous character ranges in compact form, commit 09161b14.

    • Mark start and end of included .re files with line directives, commit 48e83fca.

  • A fix to limit maximum allowed NFA and DFA size (to avoid out of memory crashes and stack overflows), commit a3473fd7.

  • A fix to correctly compute fixed tags in trailing context, commit 68e1ab71.

  • A fix to generate non-overlapping names for s-tag and m-tag variables, commit 7c6b5c95.

  • Infrastructural: added support for CMake presets.

  • Updated documentation.

  • Backwards-incompatible changes that are unlikely to affect any users:

    • Restrict lexical contexts where %{ is recognized as a block start, commit dba7d055.

    • Emit an error when repetition lower bound exceeds upper bound, commit 039c1894.

2.2x

2.2 (2021-08-01)

  • Added named blocks and block lists in directives.

  • Added local blocks /*!local:re2c ... */.

  • Added in-block !include directive.

  • Added in-block !use directive.

  • Allowed reusable blocks without -r --reusable option.

  • Allowed customizing the generated code with configurations for directives max:re2c, maxnmatch:re2c, stags:re2c, mtags:re2c and types:re2c (see directive descriptions for details).

  • Forbade arbitrary text at the end of the max:re2c directive. This may break backwards compatibility, although it is unlikely that this was used by anyone. The change was necessary in order to allow customization of the generated code with configurations.

  • Deprecated configurations flags:i, flags:no-debug-info in favour of the global options -i, --no-debug-info.

  • Reimplemented re2c test runner in Python (thanks to Serghei Iakovlev). Improved integration with GitHub Actions.

  • Changes in the experimental libre2c library: added new algorithms that construct t-string or extract submatch on all repetitions; added TDFA benchmark written in Java by Angelo Borsotti.

  • Updated documentation.

2.1x

2.1.1 (2021-03-27)

  • Added missing CMakeLists.txt to release tarballs (#346).

2.1 (2021-03-26)

  • Added GitHub Actions CI for Linux, macOS and Windows and fixed numerous build issues on those platforms (thanks to Serghei Iakovlev).

  • Added benchmarks for submatch extraction in lexer generators (ragel vs. kleenex vs. re2c with TDFA(0), TDFA(1) or sta-DFA algorithms).

    • New Autotools (configure) options: --enable-benchmarks, --enable-benchmarks-regenerate

    • New CMake options: -DRE2C_BUILD_BENCHMARKS, -DRE2C_REGEN_BENCHMARKS

    • New json2pgfplot.py script that converts benchmark results in JSON to a PDF with bar charts

  • Added option --depfile <filename> to generate build dependency files (allows tracking /*!include:re2c*/ dependencies in the build system).

  • Added option --fixed-tags <none | all | toplevel> and improved fixed-tag optimization to work with nested tags.

  • Added lzip to the distribution tarballs.

  • Added registerless-TDFA algorithm in the experimental libre2c library.

  • Explicitly disallowed invalid configuration when -f, --storable-state option is used, but YYFILL is disabled (#306).

  • Fixed a bug in UTF-8 decoding of 4-byte runes (#307, thanks to Satoshi Yasushima).

  • Fixed bugs in rare cases of the end-of-input rule $ usage (277f0295, 68611a57 and a9d582f9).

  • Optimized --skeleton generation time.

  • Renamed internal option --dfa to --nested-negative-tags.

  • Updated documentation for end of input handling and submatch extraction.
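
For context on the 4-byte UTF-8 fix in the 2.1 notes above (#307): the sketch below shows what decoding a 4-byte sequence involves. It is a minimal illustration under the standard UTF-8 bit layout, not the re2c code that was fixed, and the function name decode_utf8_4byte is hypothetical.

```python
def decode_utf8_4byte(data):
    """Decode exactly one 4-byte UTF-8 sequence into a code point."""
    if len(data) != 4:
        raise ValueError("expected exactly 4 bytes")
    b0, b1, b2, b3 = data
    # Leading byte has the form 11110xxx.
    if (b0 & 0xF8) != 0xF0:
        raise ValueError("bad leading byte")
    # Each continuation byte has the form 10xxxxxx.
    if any((b & 0xC0) != 0x80 for b in (b1, b2, b3)):
        raise ValueError("bad continuation byte")
    # 3 payload bits from the leading byte, 6 from each continuation byte.
    cp = ((b0 & 0x07) << 18) | ((b1 & 0x3F) << 12) | ((b2 & 0x3F) << 6) | (b3 & 0x3F)
    # Reject overlong encodings and values beyond the Unicode range.
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("overlong encoding or code point out of range")
    return cp
```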

2.0x

2.0.3 (2020-08-22)

2.0.2 (2020-08-08)

  • Enable re2go building by default.

  • Package CMake files into release tarball.

2.0.1 (2020-07-29)

  • Updated version for CMake build system (forgotten in release 2.0).

  • Added a short article about re2c for the Software Impacts journal.

2.0 (2020-07-20)

  • Added new code generation backend for Go and a new re2go program (#272: Go support). Added option --lang <c | go>.

  • Added CMake build system as an alternative to Autotools (#275: Add a CMake build system (thanks to ligfx), #244: Switching to CMake).

  • Changes in generic API:

    • Removed primitives YYSTAGPD and YYMTAGPD.

    • Added primitives YYSHIFT, YYSHIFTSTAG, YYSHIFTMTAG that allow expressing fixed tags in terms of generic API.

    • Added configurations re2c:api:style and re2c:api:sigil.

    • Added named placeholders in interpolated configuration strings.

  • Changes in reuse mode (-r, --reuse option):

    • Do not reset API-related configurations in each use:re2c block (#291: Defines in rules block are not propagated to use blocks).

    • Use block-local options instead of last block options.

    • Do not accumulate options from rules/reuse blocks in whole-program options.

    • Generate non-overlapping YYFILL labels for reuse blocks.

    • Generate start label for each reuse block in storable state mode.

  • Changes in start-conditions mode (-c, --start-conditions option):

    • Allow using normal (non-conditional) blocks in -c mode (#263: allow mixing conditional and non-conditional blocks with -c, #296: Conditions required for all lexers when using ‘-c’ option).

    • Generate condition switch in every re2c block (#295: Condition switch generated for only one lexer per file).

  • Changes in the generated labels:

    • Use yyeof label prefix instead of yyeofrule.

    • Use yyfill label prefix instead of yyFillLabel.

    • Decouple start label and initial label (affects label numbering).

  • Removed undocumented configuration re2c:flags:o, re2c:flags:output.

  • Changes in re2c:flags:t, re2c:flags:type-header configuration: filename is now relative to the output file directory.

  • Added option --case-ranges and configuration re2c:flags:case-ranges.

  • Extended fixed tags optimization for the case of fixed-counter repetition.

  • Fixed bugs related to EOF rule:

    • #276: Example 01_fill.re in docs is broken

    • #280: EOF rules with multiple blocks

    • #284: mismatched YYBACKUP and YYRESTORE (Add missing fallback states with EOF rule)

  • Fixed miscellaneous bugs:

    • #286: Incorrect submatch values with fixed-length trailing context.

    • #297: configure error on ubuntu 18.04 / cmake 3.10

  • Changed bootstrap process (require explicit configuration flags and a path to re2c executable to regenerate the lexers).

  • Added internal option --posix-prectable <naive | complex>.

  • Added debug option --dump-dfa-tree.

  • Major revision of the paper “Efficient POSIX submatch extraction on NFA”.

1.3x

1.3 (2019-12-14)

  • Added option: --stadfa.

  • Added warning: -Wsentinel-in-midrule.

  • Added generic API primitives:

    • YYSTAGPD

    • YYMTAGPD

  • Added configurations:

    • re2c:sentinel = 0;

    • re2c:define:YYSTAGPD = "YYSTAGPD";

    • re2c:define:YYMTAGPD = "YYMTAGPD";

  • Worked on reproducible builds (#258: Make the build reproducible).

1.2x

1.2.1 (2019-08-11)

  • Fixed bug #253: re2c should install unicode_categories.re somewhere.

  • Fixed bug #254: Turn off re2c:eof = 0.

1.2 (2019-08-02)

  • Added EOF rule $ and configuration re2c:eof.

  • Added /*!include:re2c ... */ directive and -I option.

  • Added /*!header:re2c:on*/ and /*!header:re2c:off*/ directives.

  • Added --input-encoding <ascii | utf8> option.

    • #237: Handle non-ASCII encoded characters in regular expressions

    • #250: UTF8 enoding

  • Added include file with a list of definitions for Unicode character classes.

    • #235: Unicode character classes

  • Added --location-format <gnu | msvc> option.

    • #195: Please consider using Gnu format for error messages

  • Added --verbose option that prints “success” message if re2c exits without errors.

  • Added configurations for options:

    • -o --output (specify output file)

    • -t --type-header (specify header file)

  • Removed configurations for internal/debug options.

  • Extended -r option: allow mixing multiple /*!rules:re2c*/, /*!use:re2c*/ and /*!re2c*/ blocks.

    • #55: allow standard re2c blocks in reuse mode

  • Fixed -F --flex-support option: parsing and operator precedence.

    • #229: re2c option -F (flex syntax) broken

    • #242: Operator precedence with --flex-syntax is broken

  • Changed difference operator \ to apply before encoding expansion of operands.

    • #236: Support range difference with variable-length encodings

  • Changed generation of the output file to be atomic.

    • #245: re2c output is not atomic

  • Authored research paper “Efficient POSIX Submatch Extraction on NFA” together with Dr Angelo Borsotti.

  • Added experimental libre2c library (--enable-libs configure option) with the following algorithms:

    • TDFA with leftmost-greedy disambiguation

    • TDFA with POSIX disambiguation (Okui-Suzuki algorithm)

    • TNFA with leftmost-greedy disambiguation

    • TNFA with POSIX disambiguation (Okui-Suzuki algorithm)

    • TNFA with lazy POSIX disambiguation (Okui-Suzuki algorithm)

    • TNFA with POSIX disambiguation (Kuklewicz algorithm)

    • TNFA with POSIX disambiguation (Cox algorithm)

  • Added debug subsystem (--enable-debug configure option) and new debug options:

    • --dump-cfg (dump control flow graph of tag variables)

    • --dump-interf (dump interference table of tag variables)

    • --dump-closure-stats (dump epsilon-closure statistics)

  • Added internal options:

    • --posix-closure <gor1 | gtop> (switch between shortest-path algorithms used for the construction of POSIX closure)

  • Fixed a number of crashes found by American Fuzzy Lop fuzzer.

  • Fixed handling of newlines:

    • correctly parse multi-character newlines CR LF in #line directives

    • consistently convert all newlines in the generated file to Unix-style LF

  • Changed default tarball format from .gz to .xz.

    • #221: big source tarball

  • Fixed a number of other bugs and resolved issues:

    • #2: abort

    • #6: segfault

    • #10: lessons/002_upn_calculator/calc_002 doesn’t produce a useful example program

    • #44: Access violation when translating the attached file

    • #49: wildcard state 000 rules makes lexer behave weard

    • #98: Transparent handling of #line directives in input files

    • #104: Improve const-correctness

    • #105: Conversion of pointer parameters into references

    • #114: Possibility of fixing bug 2535084

    • #120: condition consisting of default rule only is ignored

    • #167: Add word boundary support

    • #168: Wikipedia’s article on re2c

    • #180: Comment syntax?

    • #182: yych being set by YYPEEK () and then not used

    • #196: Implicit type conversion warnings

    • #198: no match for ‘operator!=’ in ‘i != std::vector<_Tp, _Alloc>::rend() [with _Tp = re2c::bitmap_t, _Alloc = std::allocator<re2c::bitmap_t>]()’

    • #210: How to build re2c in windows?

    • #215: A memory read overrun issue in s_to_n32_unsafe.cc

    • #220: src/dfa/dfa.h: simplify constructor to avoid g++-3.4 bug

    • #223: Fix typo

    • #224: src/dfa/closure_posix.cc: pack() tweaks

    • #225: Documentation link is broken in libre2c/README

    • #230: Changes for upcoming Travis’ infra migration

    • #239: Push model example has wrong re2c invocation, breaks guide

    • #241: Guidance on how to use re2c for full-duplex command & response protocol

    • #243: A code generated for period (.) requires 4 bytes

    • #246: Please add a license to this repo

    • #247: Build failure on current Cygwin, probably caused by force-fed c++98 mode

    • #248: distcheck still looks for README

    • #251: Including what you use is find, but not without inclusion guards

  • Updated documentation and website.
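
The 1.2 notes above mention that the output file is now written atomically (#245). A common way to get that property is to write to a temporary file in the same directory and then rename it over the target. The sketch below shows that pattern; it is an assumption about the general approach, not a description of re2c's C++ implementation, and the name write_atomically is hypothetical.

```python
import contextlib
import os
import tempfile


def write_atomically(path, text):
    """Write `text` to `path` so readers see either the old or the new file."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file must live on the same filesystem as the target,
    # otherwise the final rename cannot be atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            f.write(text)
            f.flush()
            os.fsync(f.fileno())      # push the data to disk before renaming
        os.replace(tmp_path, path)    # atomically replaces an existing file
    except BaseException:
        # On any failure, remove the partial temporary file and re-raise.
        with contextlib.suppress(FileNotFoundError):
            os.unlink(tmp_path)
        raise
```

With this pattern an interrupted run leaves the previous output untouched, so a build system never picks up a half-written file.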

1.1x

1.1.1 (2018-08-30)

  • Fixed bug #211: re2c -V throws std::out_of_range (version to vernum conversion).

1.1 (2018-08-27)

  • Replaced Kuklewicz POSIX disambiguation algorithm with Okui algorithm.

  • Optimized GOR1 algorithm (computation of tagged epsilon-closure).

  • Added option --conditions (an alias for -c --start-conditions).

  • Fixed bug #201: Bugs with option: re2c:flags:no-debug-info.

  • Reworked first part of TDFA paper.

1.0x

1.0.3 (2017-11-08)

  • Fixed bug #198: build error on MacOS with GCC-4.2.1

1.0.2 (2017-08-26)

  • Fixed bug #194: Build with --enable-docs

  • Updated documentation.

1.0.1 (2017-08-11)

  • Fixed bug #193: 1.0 build failure on macOS: error: calling a private constructor of class ‘re2c::Rule’

  • Added paper “Tagged Deterministic Finite Automata with Lookahead” to the distribution files.

1.0 (2017-08-11)

  • Added options:

    • -P --posix-captures (POSIX-compliant capturing groups)

    • -T --tags (standalone tags with leftmost greedy disambiguation)

    • --no-lookahead

    • --no-optimize-tags

    • --eager-skip

    • --dump-nfa

    • --dump-dfa-raw

    • --dump-dfa-det

    • --dump-dfa-tagopt

    • --dump-dfa-min

    • --dump-adfa

  • Added new syntax:

    • @<stag>

    • #<mtag>

  • Added new directives:

    • /*!stags:re2c ... */

    • /*!mtags:re2c ... */

    • /*!maxnmatch:re2c ... */

  • Added new API:

    • YYSTAGN (t)

    • YYSTAGP (t)

    • YYMTAGN (t)

    • YYMTAGP (t)

    • YYRESTORETAG (t)

    • YYMAXNMATCH

    • yynmatch

    • yypmatch

  • Added inplace configurations:

    • re2c:define:YYSTAGN

    • re2c:define:YYSTAGP

    • re2c:define:YYMTAGN

    • re2c:define:YYMTAGP

    • re2c:define:YYRESTORETAG

    • re2c:flags:8 or re2c:flags:utf-8

    • re2c:flags:b or re2c:flags:bit-vectors

    • re2c:flags:case-insensitive

    • re2c:flags:case-inverted

    • re2c:flags:d or re2c:flags:debug-output

    • re2c:flags:dfa-minimization

    • re2c:flags:eager-skip

    • re2c:flags:e or re2c:flags:ecb

    • re2c:flags:empty-class

    • re2c:flags:encoding-policy

    • re2c:flags:g or re2c:flags:computed-gotos

    • re2c:flags:i or re2c:flags:no-debug-info

    • re2c:flags:input

    • re2c:flags:lookahead

    • re2c:flags:optimize-tags

    • re2c:flags:P or re2c:flags:posix-captures

    • re2c:flags:s or re2c:flags:nested-ifs

    • re2c:flags:T or re2c:flags:tags

    • re2c:flags:u or re2c:flags:unicode

    • re2c:flags:w or re2c:flags:wide-chars

    • re2c:flags:x or re2c:flags:utf-16

    • re2c:tags:expression

    • re2c:tags:prefix

  • Added warning -Wnondeterministic-tags.

  • Added fuzz-testing scripts.

  • Added paper “Tagged Deterministic Finite Automata with Lookahead”.

  • Fixed bugs:

    • #121: trailing contexts are fundamentally broken

    • #135: In installation make check give syntax error

    • #137: run_tests.sh fail when running configure script with absolute path

    • #138: website improvement

    • #141: Tests under Windows

    • #142: segvault with null terminated input

    • #145: Values for enum YYCONDTYPE are not generated when default rules with conditions are used

    • #147: Please add symbol name to “can’t find symbol” error message

    • #152: Line number in #line directive after enum YYCONDTYPE is 0-based

    • #156: Build with Visual Studio 14 2015: symbol name conflict

    • #158: Inconsistent forward declaration of struct/class vs definition

    • #160: Open text files with “wb” causes issues on Windows

    • #162: Reading files with “rb” causes issues in Windows

    • #165: Trailing context consumed if initial expression matches it

    • #176: re2c help message is too wide for most terminals

    • #184: Small documentation issue

    • #186: Difference operator sometimes doesn’t work with utf-8

  • Merged pull requests:

    • #131: Use bash-specific [[ builtin

    • #136: Added basic support for travis-ci.org integration

    • #171: Typo fix

    • #172: Grammar fixes in the docs

    • #173: Grammar fixes in the manpage

    • #174: more documentation fixes

    • #175: more manpage fixes

    • #177: sync --help output w/ manpage

    • #178: Moves rts used in the manpage to master

    • #179: compose manpage out of rsts from gh-pages-gen

    • #189: Typo fix and small grammatical change

    • #191: Makefile.am: create target directory before writing into it

0.16x

0.16 (2016-01-21)

  • Fixed bug #127: code generation error with wide chars and bitmaps (omitted goto statement)

  • Added DFA minimization and option --dfa-minimization <table | moore>

  • Fixed bug #128: very slow DFA construction (resulting in a very large DFA)

  • Fixed bug #132: test failure on big endian archs with 0.15.3
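
The moore value of --dfa-minimization in the 0.16 notes above presumably refers to Moore's partition-refinement algorithm. The sketch below is a minimal version of that algorithm for a complete DFA, offered as an illustration and not as re2c's implementation; the function name and the data layout (a delta dictionary keyed by state and symbol) are hypothetical. It only separates accepting from non-accepting states, whereas a lexer generator would also have to keep states that accept different rules apart.

```python
def minimize_moore(n_states, alphabet, delta, accepting):
    """Return a list mapping each state to the id of its equivalence class.

    `delta[(state, symbol)]` must be defined for every state and symbol.
    """
    # Initial partition: accepting versus non-accepting states.
    block = [1 if s in accepting else 0 for s in range(n_states)]
    while True:
        signatures = {}
        new_block = []
        for s in range(n_states):
            # Two states stay together only if they are in the same block
            # and all their successors are in the same blocks.
            sig = (block[s],) + tuple(block[delta[(s, a)]] for a in alphabet)
            new_block.append(signatures.setdefault(sig, len(signatures)))
        # Each round refines the previous partition, so an unchanged
        # number of blocks means the partition is stable.
        if len(signatures) == len(set(block)):
            return new_block
        block = new_block
```

States that end up with the same class id can be merged into one state of the minimized automaton.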

0.15x

0.15.3 (2015-12-02)

  • Fixed bugs and applied patches:

    • #122: clang does not compile re2c 0.15.x

    • #124: Get rid of UINT32_MAX and friends

    • #125: [OS X] git reports changes not staged for commit in newly cloned repository

  • Added option --no-version that allows omitting version information.

  • Reduced memory and time consumed with -Wundefined-control-flow.

  • Improved coverage of input data generated with -S --skeleton.

0.15.2 (2015-11-23)

0.15.1 (2015-11-22)

  • Fixed test failures caused by locale-sensitive ‘sort’.

0.15 (2015-11-22)

  • Updated website http://re2c.org:

    • added examples

    • updated docs

    • added news

    • added web feed (Atom 1.0)

  • Added options:

    • -S, --skeleton

    • --empty-class <match-empty | match-none | error>

  • Added warnings:

    • -W

    • -Werror

    • -W<warning>

    • -Wno-<warning>

    • -Werror-<warning>

    • -Wno-error-<warning>

  • Added specific warnings:

    • -Wundefined-control-flow

    • -Wunreachable-rules

    • -Wcondition-order

    • -Wuseless-escape

    • -Wempty-character-class

    • -Wswapped-range

    • -Wmatch-empty-string

  • Fixed options:

    • -- (interpret remaining arguments as non-options)

  • Deprecated options:

    • -1 --single-pass (single pass is the default now)

  • Reduced size of the generated .dot files.

  • Fixed bugs:

    • #27: re2c crashes reading files containing %{ %} (patch by Rui)

    • #51: default rule doesn’t work in reuse mode

    • #52: eliminate multiple passes

    • #59: bogus yyaccept in -c mode

    • #60: redundant use of YYMARKER

    • #61: empty character class [] matches empty string

    • #115: flex-style named definitions cause ambiguity in re2c grammar

    • #119: -f with -b/-g generates incorrect dispatch on fill labels

    • #116: empty string with non-empty trailing context consumes code units

  • Added test options:

    • -j, -j <N> (run tests in N threads, defaults to the number of CPUs)

    • --wine (test windows builds using wine)

    • --skeleton (generate skeleton programs, compile and execute them)

    • --keep-tmp-files (don’t delete intermediate files for successful tests)

  • Updated build system:

    • support out of source builds

    • support `make distcheck`

    • added `make bootstrap` (rebuild re2c after building with precompiled .re files)

    • added `make tests` (run tests with -j)

    • added `make vtests` (run tests with --valgrind -j)

    • added `make wtests` (run tests with --wine -j 1)

    • added Autoconf tests for CXXFLAGS (by default, try the following options: -W -Wall -Wextra -Weffc++ -pedantic -Wformat=2 -Wredundant-decls -Wsuggest-attribute=format -Wconversion -Wsign-conversion -O2 -Weverything); respect user-defined CXXFLAGS

    • support Mingw builds: `configure -host i686-w64-mingw32`

    • structured source files

    • removed old MSVC files

  • Moved development to GitHub (https://github.com/skvadrik/re2c), kept a mirror on SourceForge.

0.14x

0.14.3 (2015-05-20)

  • applied patch #27: re2c crashes reading files containing %{ %}

  • dropped distfiles for MSVC (they are broken anyway)

0.14.2 (2015-03-25)

  • fixed #57: Wrong result only if another rule is present

0.14.1 (2015-02-27)

  • fixed #55: re2c-0.14: re2c -V outputs null byte

0.14 (2015-02-23)

  • Added generic input API

    • #21: “Support to configure how re2c code interfaced with the symbol buffer?”

  • fixed #46: re2c generates an infinite loop, depends on existence of previous parser

  • fixed #47: Dot output label escaped characters

0.13x

0.13.7.5 (2014-08-22)

0.13.7.4 (2014-07-29)

  • Enabled make docs only if configured with --enable-docs

  • Disallowed using yacc/byacc instead of bison to build the parser

  • Removed non-portable sed feature in script that runs tests

0.13.7.3 (2014-07-27)

  • Fixed CXX warning

  • Got rid of asciidoc build-time dependency

0.13.7.2 (2014-07-27)

  • Included man page in dist, respect user’s CXXFLAGS.

0.13.7.1 (2014-07-26)

  • Added missing files to tarball

0.13.7 (2014-07-25)

  • Added UTF-8 support

  • Added UTF-16 support

  • Added default rule

  • Added option to control ill-formed Unicode

0.13.6 (2013-07-04)

  • Fixed #2535084 uint problem with Sun C 5.8

  • #3308400: allow Yacc-style %{ code brackets }%

  • #2506253: allow C++ // comments

  • Fixed inplace configuration in -e mode.

  • Applied #2482572 Typos in error messages.

  • Applied #2482561 Error in manual section on -r mode.

  • Fixed #2478216 Wrong start_label in -c mode.

  • Fixed #2186718 Unescaped backslash in file name of #line directive.

  • Fixed #2102138 Duplicate case labels on EBCDIC.

  • Fixed #2088583 Compile problem on AIX.

  • Fixed #2038610 Ebcdic problem.

  • Improved dot support: emit char intervals (e.g. [A-Z]) instead of one edge per char

0.13.5 (2008-05-25)

  • Fixed #1952896 Segfault in re2c::Scanner::scan.

  • Fixed #1952842 Regression.

0.13.4 (2008-04-05)

  • Added transparent handling of #line directives in input files.

  • Added re2c:yyfill:check inplace configuration.

  • Added re2c:define:YYSETSTATE:naked inplace configuration.

  • Added re2c:flags:w and re2c:flags:u inplace configurations.

  • Added the ability to add rules in use:re2c blocks.

  • Changed -r flag to accept only rules:re2c and use:re2c blocks.

0.13.3 (2008-03-14)

  • Added -r flag to allow reuse of scanner definitions.

  • Added -F flag to support flex syntax in rules.

  • Fixed SEGV in scanner that occurs with very large blocks.

  • Fixed issue with unused yybm.

  • Partial support for flex syntax.

  • Changed to allow /* comments with -c switch.

  • Added flag -D/--emit-dot.

0.13.2 (2008-02-14)

  • Added flag --case-inverted.

  • Added flag --case-insensitive.

  • Added support for <!...> to enable rule setup.

  • Added support for => style rules.

  • Added support for := style rules.

  • Added support for :=> style rules.

  • Added re2c:cond:divider and re2c:cond:goto inplace configuration.

  • Fixed code generation to emit space after if.

0.13.1 (2007-08-24)

  • Added custom build rules for Visual Studio 2005 (re2c.rules). (William Swanson)

  • Fixed issue with some compilers.

  • Fixed #1776177 Build on AIX.

  • Fixed #1743180 fwrite with 0 length crashes on OS X.

0.13.0 (2007-06-24)

  • Added -c and -t to generate scanners with (f)lex-like condition support.

  • Fixed issue with short form of switches and parameter if not first switch.

  • Fixed #1708378 segfault in actions.cc.

0.12x

0.12.3 (2007-08-24)

  • Fixed issue with some compilers.

  • Fixed #1776177 Build on AIX.

  • Fixed #1743180 fwrite with 0 length crashes on OS X.

0.12.2 (2007-06-26)

  • Fixed #1743180 fwrite with 0 length crashes on OS X.

0.12.1 (2007-05-23)

  • Fixed #1711240 problem with " and 7F on EBCDIC platforms.

0.12.0 (2007-05-01)

  • Re-release of 0.11.3 as new stable branch.

  • Fixed issue with short form of switches and parameter if not first switch.

  • Fixed #1708378 segfault in actions.cc.

  • re2c 0.12.0 has been tested with the following compilers:

    • gcc version 4.1.2 (Gentoo 4.1.2)

    • gcc version 4.1.2 20070302 (prerelease) (4.1.2-1mdv2007.1)

    • gcc version 4.1.2 20061115 (prerelease) (Debian 4.1.1-21)

    • gcc version 4.1.1 20070105 (Red Hat 4.1.1-51)

    • gcc version 4.1.0 (SUSE Linux 10)

    • gcc version 4.0.3 (4.0.3-0.20060215.2mdk for Mandriva Linux release 2006.1)

    • gcc version 4.0.2 20050901 (prerelease) (SUSE Linux) (32 + 64 bit)

    • MacPPC, gcc version 4.0.1 (Apple Computer, Inc. build 5367)

    • MacIntel, gcc version 4.0.1 (Apple Computer, Inc. build 5250)

    • gcc version 3.4.4 [FreeBSD] 20050518 (32 + 64 bit)

    • gcc version 3.4.4 (cygming special) (gdc 0.12, using dmd 0.125)

    • gcc version 3.4.2 [FreeBSD]

    • gcc version 3.3.5 20050117 (prerelease) (SUSE Linux)

    • gcc version 3.3.3 (PPC, 32 + 64 bit)

    • Microsoft (R) C/C++ Optimizing Compiler Version 14.00.50727.762 for x64 (64 bit)

    • Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.42 for 80x86 (Microsoft Visual C++ 2005)

    • Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 13.10.3077 for 80x86 (Microsoft Visual C++ 2003)

    • Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 13.00.9466 for 80x86 (Microsoft Visual C++ 2002)

    • Intel(R) C++ Compiler for 32-bit applications, Version 9.1 Build 20070322Z Package ID: W_CC_C_9.1.037

    • Intel(R) C++ Compiler for Intel(R) EM64T-based applications, Version 9.1 (64 bit)

    • icpcbin (ICC) 9.1 20070215

    • CC: Sun C++ 5.8 2005/10/13 (CXXFLAGS='-library=stlport4')

    • MIPSpro Compilers: Version 7.4.4m (32 + 64 bit)

    • aCC: HP C/aC++ B3910B A.06.15 [Mar 28 2007] (HP-UX IA64)

0.11x

0.11.3 (2007-04-01)

  • Added support for underscores in named definitions.

  • Added new option --no-generation-date.

  • Fixed issue with long form of switches.

0.11.2 (2007-03-01)

  • Added inplace configuration re2c:yyfill:parameter.

  • Added inplace configuration re2c:yych:conversion.

  • Fixed -u switch code generation.

  • Added ability to avoid defines and overwrite generated variable names.

0.11.1 (2007-02-20)

  • Applied #1647875 Add const to yybm vector.

0.11.0 (2007-01-01)

  • Added -u switch to support unicode.

0.10x

0.10.8 (2007-04-01)

  • Fixed issue with long form of switches.

0.10.7 (2007-02-20)

  • Applied #1647875 Add const to yybm vector.

0.10.6 (2006-08-05)

  • Fixed #1529351 Segv bug on unterminated code blocks.

  • Fixed #1528269 Invalid code generation.

0.10.5 (2006-06-11)

  • Fixed long form of -1 switch to --single-pass as noted in man page and help.

  • Added MSVC 2003 project files and renamed old 2002 ones.

0.10.4 (2006-06-01)

  • Fix whitespace in generated code.

0.10.3 (2006-05-14)

  • Fixed issue with -wb and -ws.

  • Added -g switch to support gcc’s computed gotos.

  • Changed to use nested ifs instead of switch(yyaccept) in -s mode.

0.10.2 (2006-05-01)

  • Changed to generate YYMARKER only when needed or in single pass mode.

  • Added -1 switch to force single pass generation and make two pass the default.

  • Fixed -i switch.

  • Added configuration yyfill:enable to allow suppression of YYFILL() blocks.

  • Added tutorial-like lessons to re2c.

  • Added /*!ignore:re2c */ to support documenting of re2c source.

  • Fixed issue with multiline re2c comments (/*!max:re2c ... */ and alike).

  • Fixed generation of YYDEBUG() when using -d switch.

  • Added /*!getstate:re2c */ which triggers generation of the YYGETSTATE() block.

  • Added configuration state:abort.

  • Changed to not generate yyNext unless configuration state:nextlabel is used.

  • Changed to not generate yyaccept code unless needed.

  • Changed to use if instead of switch expression when yyaccept has only one case.

  • Added docs, examples and tests to .src.zip package (0.10.1 zip was repackaged).

  • Fixed #1479044 incorrect code generated when using -b.

  • Fixed #1472770 re2c creates an infinite loop.

  • Fixed #1454253 Piece of code saving a backtracking point not generated.

  • Fixed #1463639 Missing forward declaration.

  • Implemented #1187127 savable state support for multiple re2c blocks.

  • re2c 0.10.2 has been tested with the following compilers:

    • gcc (GCC) 4.1.0 (Gentoo 4.1.0)

    • gcc version 4.0.3 (4.0.3-0.20060215.2mdk for Mandriva Linux release 2006.1)

    • gcc version 4.0.2 20050901 (prerelease) (SUSE Linux)

    • gcc (GCC) 3.4.5 (Gentoo 3.4.5, ssp-3.4.5-1.0, pie-8.7.9)

    • gcc version 3.4.4 [FreeBSD] 20050518

    • gcc version 3.4.4 (cygming special) (gdc 0.12, using dmd 0.125)

    • gcc version 3.4.2 20041017 (Red Hat 3.4.2-6.fc3)

    • gcc-Version 3.3.5 (Debian 1:3.3.5-13)

    • gcc-Version 3.3.0 (mips-sgi-irix6.5/3.3.0/specs)

    • MIPSpro Compilers: Version 7.4.4m

    • Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.42 for 80x86 (Microsoft Visual C++ 2005)

    • Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 13.10.3077 for 80x86 (Microsoft Visual C++ 2003)

    • Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 13.00.9466 for 80x86 (Microsoft Visual C++ 2002)

    • Intel(R) C++ Compiler for Intel(R) EM64T-based applications, Version 9.0 Build 20050430 Package ID: l_cc_p_9.0.021

    • CC: Sun C++ 5.8 2005/10/13 (CXXFLAGS='-library=stlport4')

    • bison 2.1, 1.875d, 1.875b, 1.875

0.10.1 (2006-02-28)

  • Added support for Solaris and native SUN compiler.

  • Applied #1438160 expose YYCTXMARKER.

  • re2c 0.10.1 has been tested with the following compilers:

    • gcc version 4.0.3 (4.0.3-0.20060215.2mdk for Mandriva Linux release 2006.1)

    • gcc version 4.0.2 (4.0.2-1mdk for Mandriva Linux release 2006.1)

    • gcc version 4.0.2 20050901 (prerelease) (SUSE Linux)

    • gcc version 3.4.4 (cygming special) (gdc 0.12, using dmd 0.125)

    • gcc-Version 3.3.5 (Debian 1:3.3.5-13)

    • gcc-Version 3.3.0 (mips-sgi-irix6.5/3.3.0/specs)

    • MIPSpro Compilers: Version 7.4.4m

    • Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.42 for 80x86 (Microsoft Visual C++ 2005)

    • Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 13.00.9466 for 80x86 (Microsoft Visual C++ 2002)

    • Intel(R) C++ Compiler for 32-bit applications, Version 9.0 Build 20051130Z Package ID: W_CC_C_9.0.028

    • CC: Sun C++ 5.8 2005/10/13 (CXXFLAGS='-compat5 -library=stlport4')

    • bison 2.1, 1.875d, 1.875b, 1.875

0.10.0 (2006-02-18)

  • Added make target zip to create windows source packages as zip files.

  • Added re2c:startlabel configuration.

  • Fixed code generation to not generate unreachable code for initial state.

  • Added support for c/c++ compatible \u and \U unicode notation.

  • Added ability to control indentation.

  • Made scanner error out in case an ambiguous /* is found.

  • Fixed indentation of generated code.

  • Added support for DOS line endings.

  • Added experimental unicode support.

  • Added config_w32.h to build out of the box on windows (using msvc 2002+).

  • Added Microsoft Visual C .NET 2005 build files.

  • Applied #1411087 variable length trailing context.

  • Applied #1408326 do not generate goto next state.

  • Applied #1408282 CharSet initialization fix.

  • Applied #1408278 readsome with MSVC.

  • Applied #1307467 Unicode patch for 0.9.7.

0.9x

0.9.12 (2005-12-28)

  • Fixed bug #1390174 re2c cannot accept {0,}.

0.9.11 (2005-12-18)

  • Fixed #1313083 -e (EBCDIC cross compile) broken.

  • Fixed #1297658 underestimation of n in YYFILL(n).

  • Applied #1339483 Avoid rebuilds of re2c when running subtargets.

  • Implemented #1335305 symbol table reimplementation, just slightly modified.

0.9.10 (2005-09-04)

  • Add -i switch to avoid generating #line information.

  • Fixed bug #1251653 re2c generate some invalid #line on WIN32.

0.9.9 (2005-07-21)

  • Implemented #1232777 negated char classes [^...] and the dot operator (.).

  • Added hexadecimal character definitions.

  • Added consistency check for octal character definitions.

0.9.8 (2005-06-26)

  • Fixed code generation for -b switch.

  • Added Microsoft Visual C .NET build files.

0.9.7 (2005-04-30)

  • Applied #1181535 storable state patch.

  • Added -d flag which outputs a debuggable parser.

  • Fixed generation of #line directives (according to ISO-C99).

  • Fixed bug #1187785 Re2c fails to generate valid code.

  • Fixed bug #1187452 unused variable yyaccept.

0.9.6 (2005-04-14)

  • Fix build with gcc >= 3.4.

0.9.5 (2005-04-08)

  • Added /*!max:re2c */ which emits a #define YYMAXFILL <max> line. This allows defining buffers of the minimum required length. The occurrence must follow /*!re2c */ and cannot precede it.

  • Changed re2c to two pass generation to output warning free code.

  • Fixed bug #1163046 re2c hangs when processing valid re-file.

  • Fixed bug #1022799 re2c scanner has buffering bug.
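
One way the YYMAXFILL value from /*!max:re2c */ might be used is to size an input buffer with trailing padding. The following is only a minimal sketch: the fallback value 4 stands in for whatever re2c would emit for a real lexer, and make_padded_buffer is a hypothetical helper name, not part of re2c.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Assumption: stands in for the "#define YYMAXFILL <max>" line that
 * re2c would emit for a real lexer; 4 is an arbitrary example value. */
#ifndef YYMAXFILL
#define YYMAXFILL 4
#endif

/* Hypothetical helper: copy `len` bytes of input into a fresh buffer and
 * append YYMAXFILL zero bytes of padding, so that a lexer asking for up to
 * YYMAXFILL characters at the end of input stays inside the buffer.
 * Returns NULL on allocation failure; the caller frees the result. */
static char *make_padded_buffer(const char *input, size_t len, size_t *size_out)
{
    assert(input != NULL && size_out != NULL);

    size_t size = len + YYMAXFILL;
    char *buf = malloc(size);
    if (buf == NULL) {
        return NULL;
    }
    memcpy(buf, input, len);          /* the real input */
    memset(buf + len, 0, YYMAXFILL);  /* the padding    */
    *size_out = size;
    return buf;
}
```

With this layout the lexer's limit would point at the end of the padded buffer, and the buffer is no larger than the input plus the padding the lexer can actually request.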

0.9.4 (2005-03-12)

  • Added --vernum support.

  • Fixed bug #1054496 incorrect code generated with -b option.

  • Fixed bug #1012748 re2c does not emit last line if \n missing.

  • Fixed bug #999104 --output=output option does not work as documented.

  • Fixed bug #999103 Invalid options prefixed with two dashes cause program crash.

0.9.3 (2004-05-26)

  • Fixed a small possible bug in the generated output: ych instead of yych was output in certain circumstances.

0.9.2 (2004-05-26)

  • Added -o option to specify the output file which also will set the #line directives to something useful.

  • Print version to cout instead of cerr.

  • Added -h and -- style options.

  • Moved development to http://sourceforge.net/projects/re2c

  • Fixed bug #960144 minor cosmetic problem.

  • Fixed bug #953181 cannot compile with.

  • Fixed bug #939277 Windows support.

  • Fixed bug #914462 automake build patch

  • Fixed bug #891940 braced quantifiers: {\d+(,|,\d+)?} style.

  • Fixed bug #869298 Add case insensitive string literals.

  • Fixed bug #869297 Input buffer overrun.

0.9.1 (2003-12-13)

  • Removed rcs comments in source files.

re2c adopted (2003-12-09)

  • Version 0.9.1 README:

    Originally written by Peter Bumbulis (peter@csg.uwaterloo.ca)
    Currently maintained by Brian Young (bayoung@acm.org)
    
    The re2c distribution can be found at:
    http://www.tildeslash.org/re2c/index.html
    
    The source distribution is available from:
    http://www.tildeslash.org/re2c/re2c-0.9.1.tar.gz
    
    This distribution is a cleaned up version of the 0.5 release
    maintained by me (Brian Young). Several bugs were fixed as well
    as code cleanup for warning free compilation. It has been
    developed and tested with egcs 1.0.2 and gcc 2.7.2.3 on Linux x86.
    Peter Bumbulis' original release can be found at:
    ftp://csg.uwaterloo.ca/pub/peter/re2c.0.5.tar.gz
    
    re2c is a great tool for writing fast and flexible lexers.
    It has served many people well for many years and it deserves
    to be maintained more actively. re2c is on the order of 2-3
    times faster than a flex based scanner, and its input model
    is much more flexible.
    
    Patches and requests for features will be entertained. Areas
    of particular interest to me are porting (a Solaris and an NT
    version will be forthcoming) and wide character support. Note
    that the code is already quite portable and should be buildable
    on any platform with minor makefile changes.
    
  • Version 0.5 Peter’s original ANNOUNCE and README:

    re2c is a tool for generating C-based recognizers from regular
    expressions. re2c-based scanners are efficient: for programming
    languages, given similar specifications, an re2c-based scanner
    is typically almost twice as fast as a flex-based scanner with
    little or no increase in size (possibly a decrease on cisc
    architectures). Indeed, re2c-based scanners are quite competitive
    with hand-crafted ones.
    
    Unlike flex, re2c does not generate complete scanners: the user
    must supply some interface code. While this code is not bulky
    (about 50-100 lines for a flex-like scanner; see the man page
    and examples in the distribution) careful coding is required for
    efficiency (and correctness). One advantage of this arrangement
    is that the generated code is not tied to any particular input
    model. For example, re2c generated code can be used to scan
    data from a null-byte terminated buffer as illustrated below.
    
    Given the following source:
    
        #define NULL        ((char*) 0)
        char *scan(char *p) {
        char *q;
        #define YYCTYPE     char
        #define YYCURSOR    p
        #define YYLIMIT     p
        #define YYMARKER    q
        #define YYFILL(n)
        /*!re2c
            [0-9]+      {return YYCURSOR;}
            [\000-\377] {return NULL;}
        */
        }
    
    re2c will generate:
    
        /* Generated by re2c on Sat Apr 16 11:40:58 1994 */
        #line 1 "simple.re"
        #define NULL        ((char*) 0)
        char *scan(char *p) {
        char *q;
        #define YYCTYPE     char
        #define YYCURSOR    p
        #define YYLIMIT     p
        #define YYMARKER    q
        #define YYFILL(n)
        {
                YYCTYPE yych;
                unsigned int yyaccept;
                goto yy0;
        yy1:    ++YYCURSOR;
        yy0:
                if((YYLIMIT - YYCURSOR) < 2) YYFILL(2);
                yych = *YYCURSOR;
                if(yych <= '/') goto yy4;
                if(yych >= ':') goto yy4;
        yy2:    yych = *++YYCURSOR;
                goto yy7;
        yy3:
        #line 10
                {return YYCURSOR;}
        yy4:    yych = *++YYCURSOR;
        yy5:
        #line 11
                {return NULL;}
        yy6:    ++YYCURSOR;
                if(YYLIMIT == YYCURSOR) YYFILL(1);
                yych = *YYCURSOR;
        yy7:    if(yych <= '/') goto yy3;
                if(yych <= '9') goto yy6;
                goto yy3;
        }
        #line 12
    
        }
    
    Note that most compilers will perform dead-code elimination to
    remove all YYCURSOR, YYLIMIT comparisons.
    
    re2c was developed for a particular project (constructing a fast
    REXX scanner of all things!) and so while it has some rough edges,
    it should be quite usable. More information about re2c can be
    found in the (admittedly skimpy) man page; the algorithms and
    heuristics used are described in an upcoming LOPLAS article
    (included in the distribution). Probably the best way to find out
    more about re2c is to try the supplied examples. re2c is written in
    C++, and is currently being developed under Linux using gcc 2.5.8.
    
    Peter
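
As a rough illustration of what the example scanner in the 0.5 README reduces to once the YYCURSOR/YYLIMIT comparisons are eliminated, here is a hand-written equivalent. It is a sketch, not re2c output: the name scan_digits is hypothetical, and unlike the generated code it does not read the byte after a non-digit before returning NULL.

```c
#include <assert.h>
#include <stddef.h>

/* Hand-written equivalent of the README's scan() (not generated by re2c).
 * Returns a pointer just past a leading run of digits, or NULL if the
 * buffer does not start with a digit. The terminating null byte is not a
 * digit, so it stops the loop and no limit check is needed. */
static const char *scan_digits(const char *p)
{
    assert(p != NULL);

    if (*p < '0' || *p > '9') {
        return NULL;                      /* rule: [\000-\377] */
    }
    do {
        ++p;
    } while (*p >= '0' && *p <= '9');     /* rule: [0-9]+ */
    return p;
}
```

Called on "123abc" it returns a pointer to the 'a'; called on "abc" or on an empty string it returns NULL.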