The lexer definitions section is declared using %ldefs or %ld.
When this section is present in the input file, the file is assumed to contain a lexer definition.
By default, all lexer rules are always active. It is sometimes useful to selectively disable some rules; this can be achieved using modes.
The user can declare multiple modes and assign lexer rules to them. At any moment exactly one mode is 'current', and only the rules assigned to that mode are enabled. A single rule can be assigned to multiple modes. Modes are switched programmatically while the lexer runs.
The default lexer mode is called INITIAL. When the lexer starts, this default mode is current. Unless specified otherwise, all lexer rules are assigned to the INITIAL mode.
The user can declare two kinds of modes: exclusive and shared (inclusive).
Modes are also known as 'start conditions'.
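The mode mechanism can be illustrated with a small Python model. This is only a sketch of the idea, not the generator's implementation: the class ModalLexer, the COMMENT mode, the rule tuple layout and the first-match rule order are all assumptions made for the example.

```python
import re

INITIAL = "INITIAL"   # default mode name, as described in the text
COMMENT = "COMMENT"   # hypothetical user-declared mode


class ModalLexer:
    """Toy lexer: a rule is tried only if it is assigned to the current mode."""

    def __init__(self, rules):
        # Each rule: (modes, token name or None, pattern, mode to switch to or None)
        self.rules = [(frozenset(modes), name, re.compile(pattern), next_mode)
                      for modes, name, pattern, next_mode in rules]
        self.mode = INITIAL

    def tokenize(self, text):
        self.mode = INITIAL          # the default mode is current at start
        pos, tokens = 0, []
        while pos < len(text):
            for modes, name, regex, next_mode in self.rules:
                if self.mode not in modes:
                    continue         # rule is disabled in the current mode
                m = regex.match(text, pos)
                if m and m.end() > pos:
                    if name is not None:
                        tokens.append((name, m.group()))
                    if next_mode is not None:
                        self.mode = next_mode   # programmatic mode switch
                    pos = m.end()
                    break            # first matching rule wins (simplification)
            else:
                raise ValueError(f"no rule matches at {pos} in mode {self.mode}")
        return tokens


rules = [
    ((INITIAL, COMMENT), None, r"\s+", None),     # one rule assigned to two modes
    ((INITIAL,), None, r"/\*", COMMENT),          # enter the comment mode
    ((COMMENT,), None, r"\*/", INITIAL),          # leave the comment mode
    ((COMMENT,), None, r"[^*]+|\*", None),        # skip comment body
    ((INITIAL,), "WORD", r"[A-Za-z]+", None),
]

lexer = ModalLexer(rules)
print(lexer.tokenize("foo /* bar */ baz"))
```

Because the WORD rule is assigned only to INITIAL, the text inside the comment is never reported as a token, while the whitespace rule stays enabled in both modes.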
Sometimes multiple regular expressions contain the same element or subpart. Such an element can be declared once, as a named regular expression, and then referenced from other places by writing its name in curly braces.
A named regular expression is defined by placing its name (an identifier) at the beginning of a line, followed by at least one space and the regular expression.
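The effect of named regular expressions can be modelled as textual substitution. The Python sketch below is an illustration under assumptions, not the generator's own algorithm: the helpers parse_definition and expand_named and the names digit and integer are hypothetical, and wrapping each expansion in a non-capturing group is a design choice of this example.

```python
import re


def parse_definition(line):
    """Split 'name   regex' into (name, regex); the name starts the line."""
    name, regex = line.split(None, 1)
    return name, regex.strip()


def expand_named(pattern, definitions):
    """Replace every {name} reference with its definition.

    Expansion is recursive, so a definition may reference another one.
    Cyclic definitions are not detected in this sketch.
    """
    def substitute(match):
        name = match.group(1)
        if name not in definitions:
            raise KeyError(f"undefined named regular expression: {name}")
        # Group the expansion so a following operator applies to all of it.
        return "(?:" + expand_named(definitions[name], definitions) + ")"

    # Names are identifiers, so quantifiers such as {2,3} are left alone.
    return re.sub(r"\{([A-Za-z_][A-Za-z0-9_]*)\}", substitute, pattern)


definitions = dict(parse_definition(line) for line in [
    "digit    [0-9]",
    "integer  {digit}+",
])

number = re.compile(expand_named(r"{integer}(\.{digit}+)?", definitions))
print(bool(number.fullmatch("3.14")), bool(number.fullmatch("abc")))
```

Grouping each expansion keeps operators such as + or ? bound to the whole named element rather than to its last character.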
Declares an exclusive lexer mode
Declares a shared (inclusive) lexer mode
Declares a return code that can be assigned to multiple lexer rules
where:
The %retcode command declares a return code that can be referenced from lexer rules. Return codes are declared separately for each value type. For a single value type there can be only one unnamed return code. If necessary, additional return codes with custom names can be defined.
Return codes are referenced from lexer rules using the %return command. The %return command specifies the value type and an optional name, which together are used to look up the return code. In addition, the %return command specifies the identifier of the token reported by the particular rule. The placeholder $$ in the enclosed code is replaced with the returned token.
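The lookup and substitution described above can be modelled as follows. This Python sketch shows only the concept: the class ReturnCodes, its methods, and the code strings and token names used in the example are hypothetical and do not reflect the generator's actual syntax or internals.

```python
class ReturnCodes:
    """Toy registry of return codes keyed by (value type, optional name)."""

    def __init__(self):
        self._codes = {}

    def declare(self, value_type, code, name=None):
        # Models %retcode: one unnamed code per value type, plus named ones.
        key = (value_type, name)
        if key in self._codes:
            raise ValueError(f"return code already declared for {key}")
        self._codes[key] = code

    def resolve(self, value_type, token, name=None):
        # Models %return: look up by value type and optional name,
        # then replace the $$ placeholder with the token identifier.
        return self._codes[(value_type, name)].replace("$$", token)


codes = ReturnCodes()
codes.declare("int", "return make_int($$, text);")              # unnamed
codes.declare("int", "return make_hex($$, text);", name="hex")  # named

print(codes.resolve("int", "NUMBER"))
print(codes.resolve("int", "HEX_NUMBER", name="hex"))
```

Several rules can resolve the same declared code with different token identifiers, which is what lets one return code serve multiple lexer rules.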