-Wunreachable-rules
Sometimes the input grammar contains rules that will never match. This can
happen for two reasons. First, some rules may be shadowed by other rules that
match the same input, but have higher priority. Second, the rule itself may be
infinitely greedy: it may consume as many input characters as it can get and
never stop, and as a result never match. Both cases indicate a problem with
the grammar, and -Wunreachable-rules detects and reports such rules.
Let’s see an example of the first kind: shadowed rules (shadowed.re).
/*!re2c
"" { return ""; }
* { return "*"; }
"a" | "b" { return "a | b"; }
"a" { return "a"; }
[\x00-\xFF] { return "[0 - 0xFF]"; }
[^] { return "[^]"; }
*/
In this example the empty rule "" never matches, because any single code unit
is matched by other rules, which take precedence under the longest-match rule.
Rule "a" is shadowed by rule "a" | "b", which also matches a, but takes
precedence because it comes first. Similarly, rule [^] is shadowed by rule
[\x00-\xFF] (and, for a and b, by rule "a" | "b"). The default rule * is also
shadowed, but it is an exception that is not reported (the default case should
always be handled). Shadowed rules normally do not appear in the generated
code: re2c removes them during its dead code elimination pass.
$ re2c -Wunreachable-rules shadowed.re -o shadowed.c
shadowed.re:2:16: warning: unreachable rule (shadowed by rules at lines 4, 6) [-Wunreachable-rules]
shadowed.re:5:16: warning: unreachable rule (shadowed by rule at line 4) [-Wunreachable-rules]
shadowed.re:7:16: warning: unreachable rule (shadowed by rules at lines 4, 6) [-Wunreachable-rules]
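The precedence reasoning behind these warnings can be checked with a small model in C. This is only an illustrative sketch, not how re2c implements the check: the rule names, match_len, winner and find_reachable are hypothetical. It assumes 8-bit code units and a lexer that always has at least one code unit to look at, and it leaves out the default rule *, which is never reported.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the rules in shadowed.re, in source order.
   The default rule * is left out because it is never reported. */
enum { RULE_EMPTY, RULE_A_OR_B, RULE_A, RULE_ANY_BYTE, RULE_ANY, RULE_COUNT };

/* Length of the match of `rule` at the start of s, or -1 if it does not match. */
static int match_len(int rule, const unsigned char *s, size_t n)
{
    switch (rule) {
    case RULE_EMPTY:    return 0;                                     /* ""          */
    case RULE_A_OR_B:   return (n > 0 && (s[0] == 'a' || s[0] == 'b')) ? 1 : -1;
    case RULE_A:        return (n > 0 && s[0] == 'a') ? 1 : -1;       /* "a"         */
    case RULE_ANY_BYTE: return n > 0 ? 1 : -1;                        /* [\x00-\xFF] */
    case RULE_ANY:      return n > 0 ? 1 : -1;                        /* [^]         */
    default:            return -1;
    }
}

/* The longest match wins; on a tie the rule that comes first wins. */
static int winner(const unsigned char *s, size_t n)
{
    int best = -1, best_len = -1;
    for (int r = 0; r < RULE_COUNT; ++r) {
        int len = match_len(r, s, n);
        if (len > best_len) {   /* strict comparison keeps the earlier rule on ties */
            best = r;
            best_len = len;
        }
    }
    return best;
}

/* Marks every rule that wins for at least one single-byte input. */
void find_reachable(bool reachable[RULE_COUNT])
{
    for (int r = 0; r < RULE_COUNT; ++r)
        reachable[r] = false;
    for (int c = 0; c < 256; ++c) {
        unsigned char s[1] = { (unsigned char)c };
        reachable[winner(s, 1)] = true;
    }
}
```

In this model only "a" | "b" and [\x00-\xFF] ever win, so the rules left unmarked are "", "a" and [^], the same three rules that the warnings point at (lines 2, 5 and 7 of shadowed.re).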
Now let’s see an example of the second kind: an infinitely greedy rule (greedy.re).
/*!re2c
[^]* { return "greeedy"; }
*/
This rule will continue eating input characters until YYFILL fails, or until
it reads past the end of the buffer and causes a memory access violation.
$ re2c -Wunreachable-rules greedy.re -o greedy.c
greedy.re:2:9: warning: unreachable rule [-Wunreachable-rules]
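The difference between a rule that can stop and one that cannot may be easier to see in a hand-written model of the scanning loop. The sketch below is an assumption-laden illustration, not code generated by re2c: scan_any and scan_until_nul are hypothetical names, reaching lim stands in for a failing YYFILL, and the contrasting rule [^\x00]* (stop at a NUL terminator) is not part of the original example.

```c
#include <stddef.h>

typedef enum { MATCHED, FILL_FAILED } scan_result;

/* Models [^]*: every code unit continues the match, so the loop has no
   state in which it can stop and run the semantic action. It ends only
   when the input runs out, modelled here as reaching lim. */
scan_result scan_any(const char *cur, const char *lim, size_t *consumed)
{
    const char *start = cur;
    for (;;) {
        if (cur == lim) {               /* stand-in for a failing YYFILL */
            *consumed = (size_t)(cur - start);
            return FILL_FAILED;         /* the rule's action is never reached */
        }
        ++cur;                          /* any code unit is accepted */
    }
}

/* Models the hypothetical rule [^\x00]*: a NUL code unit ends the match,
   so the semantic action can run. */
scan_result scan_until_nul(const char *cur, const char *lim, size_t *consumed)
{
    const char *start = cur;
    for (;;) {
        if (cur == lim) {
            *consumed = (size_t)(cur - start);
            return FILL_FAILED;
        }
        if (*cur == '\0') {             /* terminator: stop and match */
            *consumed = (size_t)(cur - start);
            return MATCHED;
        }
        ++cur;
    }
}
```

On the four-byte buffer "abc" (including its NUL terminator), scan_any consumes all four bytes and reports failure, while scan_until_nul stops after three bytes with a match. Without the bounds check in scan_any, the loop would read past the end of the buffer, which is the memory access violation described above.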