-Wcondition-order

Some older re2c programs that use the -c, --conditions option rely on a fixed condition order instead of using the /*!types:re2c*/ directive or the -t, --type-header option. This is incorrect and dangerous, as demonstrated by the following example [fixorder.re]. In this example the lexer has two conditions: a and b. It starts in condition a, which expects a sequence of letters a followed by a comma. The comma causes a transition to condition b, which expects a sequence of letters b followed by an exclamation mark. Any other input is an error. Nothing special, except that the condition numbers are hardcoded manually (the mapping of conditions to numbers is toggled by the REVERSED_CONDITION_ORDER define).

#include <stdio.h>

#ifdef REVERSED_CONDITION_ORDER
#    define yyca 1
#    define yycb 0
#else
#    define yyca 0
#    define yycb 1
#endif

int main()
{
    const char * YYCURSOR = "aaaa,bbb!";
    int c = yyca;
    for (;;) {
    /*!re2c
        re2c:define:YYCTYPE = char;
        re2c:yyfill:enable = 0;
        re2c:define:YYSETCONDITION = "c = @@;";
        re2c:define:YYSETCONDITION:naked = 1;
        re2c:define:YYGETCONDITION = c;
        re2c:define:YYGETCONDITION:naked = 1;

        <*> * { printf ("error\n"); break; }

        <a> "a"      { printf ("a"); continue; }
        <a> "," => b { printf (","); continue; }

        <b> "!" { printf ("!\n"); break; }
        <b> "b" { printf ("b"); continue; }
    */
    }
    return 0;
}

Let’s compile and run it. Everything works fine: we get aaaa,bbb! in both cases.

$ re2c -c -o fixorder.c -Wcondition-order fixorder.re
$
$ c++ -o fixorder fixorder.c && ./fixorder
aaaa,bbb!
$
$ c++ -o fixorder fixorder.c -DREVERSED_CONDITION_ORDER && ./fixorder
aaaa,bbb!

However, if we use the -s re2c option, the lexer becomes sensitive to condition order:

$ re2c -cs -o fixorder.c -Wcondition-order fixorder.re
fixorder.re:31:6: warning: looks like you use hardcoded numbers instead of autogenerated condition names:
better add '/*!types:re2c*/' directive or '-t, --type-header' option and don't rely on fixed condition order. [-Wcondition-order]
$
$ c++ -o fixorder fixorder.c && ./fixorder
aaaa,bbb!
$
$ c++ -o fixorder fixorder.c -DREVERSED_CONDITION_ORDER && ./fixorder
error

This time re2c emits a warning, and the reversed build prints error. The behavior is the same with the -g or -b option. Why is that? A look at the generated code explains everything. By default, the initial dispatch on conditions is a switch statement:

switch (c) {
case yyca: goto yyc_a;
case yycb: goto yyc_b;
}

Dispatch uses explicit condition names and works no matter what numbers are assigned to them. However, with the -s option, re2c generates an if statement instead of a switch:

if (c < 1) {
        goto yyc_a;
} else {
        goto yyc_b;
}

And with the -g option, it uses a jump table (computed goto):

static void *yyctable[2] = {
        &&yyc_a,
        &&yyc_b,
};
goto *yyctable[c];

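The last two cases are sensitive to condition order: both assume that condition a is numbered 0 and condition b is numbered 1, so with the reversed mapping the lexer starts in the wrong condition. The fix is easy: as the warning suggests, use the /*!types:re2c*/ directive or the -t, --type-header option.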