alpag.net manual

Common definitions

The definitions listed below are common to lexers and parsers. They can be placed in the top section of the file or in a %defs section.

lexer

Indicates that the file contains a lexer and nothing else.

%lexer

This command avoids ambiguity with files in the old format. It is not required in newly created files. It cannot be used in files containing both a lexer and a parser.

parser

Indicates that the file contains a parser and nothing else.

%parser

This command avoids ambiguity with files in the old format. It is not required in newly created files. It cannot be used in files containing both a lexer and a parser.

option

Sets the value of an option.

%option name1.subname2.subname3 optionvalue
%option name1.subname2.subname4 "string"
%option name1.subname2.subname5 123
%option name1.subname2.subname6 0xAA

Alpag options are organized hierarchically. The elements of a hierarchical name are separated with dots. Enumerated values are case-insensitive. String values must be enclosed in double quotes. Numeric values can be decimal or hexadecimal (prefixed with 0x).

Individual options are discussed in the chapters describing particular lexer and parser mechanisms.

The full list of available options can be found in the Options section.

The syntax highlighting extension for Visual Studio provides contextual help for all options.
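
As a sketch of the value rules above, the two lines in each pair below would set an option to the same value. The option names are the placeholders from the syntax summary, not real Alpag options, and a real file would keep only one line of each pair.

```
%option name1.subname2.subname3 optionvalue
%option name1.subname2.subname3 OptionValue

%option name1.subname2.subname6 170
%option name1.subname2.subname6 0xAA
```

The first pair relies on enumerated values being case-insensitive. The second pair writes the same number in decimal and in hexadecimal (0xAA is 170).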

language

Declares the output programming language.

%lang langIdentifier
%language langIdentifier

warningoff

Globally disables warnings with specific codes.

%warningoff warningCode1 [ warningCode2...]

Multiple codes can be listed; each can be decimal or hexadecimal.

Example

%warningoff 123 45 0xA5

code

Declares a section of custom code that is placed inside the generated lexer or parser.

%code namedCode { userCode }
%code { userCode }

Below is a categorized list of the available code blocks. Each block is described in the chapter on the mechanism it belongs to.

Lexer code embedded in class

lexer_fields – code placed inside lexer class in head part of the class

lexer_body – code placed inside lexer class in tail part of the class

lexer_methods – same as lexer_body

lexer_top – code placed in head part of lexer file, outside of lexer class

lexer_bottom – code placed in tail part of lexer file, outside of lexer class
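
The class-level blocks above might be used as in the following sketch. It assumes a C#-style output language; the field and method names are hypothetical and are not part of Alpag.

```
%code lexer_fields {
    // Hypothetical field: nesting depth of comments seen so far.
    private int commentDepth = 0;
}

%code lexer_methods {
    // Hypothetical helper that rule actions could call.
    private bool InsideComment()
    {
        return commentDepth > 0;
    }
}
```

Keeping fields in lexer_fields and helpers in lexer_methods simply follows the head/tail placement described above; both end up inside the generated lexer class.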

Lexer code reading input stream

read_input_bytes – code for reading input in byte mode (available macros: BUFFER, OFFSET, COUNT)

read_input_chars – code for reading input in character mode (available macros: BUFFER, OFFSET, COUNT)

Lexer code handling specific situations

continue_on_eof – code executed when EOF was reached. Returning true tells lexer to keep reading input

next_token_begin – code executed when lexer is invoked to return next match

match_before – code executed before match (or no match) is reported, common for all rules

match_after
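
As an illustration of continue_on_eof, the sketch below lets the host application decide whether the lexer keeps reading after EOF. It assumes a C#-style output language, and the field moreInputExpected is hypothetical.

```
%code lexer_fields {
    // Hypothetical flag: set by the host application when more input may arrive.
    public bool moreInputExpected = false;
}

%code continue_on_eof {
    // Returning true tells the lexer to keep reading input; false lets it stop at EOF.
    return moreInputExpected;
}
```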

Lexer tracing code

trace_next_symbol – code placed in body of procedure used for tracing subsequent input characters read

trace_input_matched – code placed in body of procedure tracing matches

trace_input_not_matched – code placed in body of procedure tracing no-match events

Parser code embedded in class

parser_fields – code placed inside parser class in head part of the class

parser_body – code placed inside parser class in tail part of the class

parser_methods – same as parser_body

parser_top – code placed in head part of parser file, outside of parser class

parser_bottom – code placed in tail part of parser file, outside of parser class

Parser code for fetching input

parser_next_token – code placed inside the NextToken() parser procedure. It should return the next input symbol and optionally provide a value for it. The macros DATA, VALUE and SYMBOL/TOKEN can be used to reference the returned result.

Parser code for value allocation and deallocation

value_data – custom code to place in parser's structure storing value data

value_clear – code executed when value data is cleared (macro DATA available inside)

value_clear_new – code executed when pre-clearing freshly allocated value data (macro DATA available inside)

value_clear_reused – code executed when clearing reused value data (macro DATA available inside)

value_clear_deleted – code executed before deallocating value data (macro DATA available inside)

value_clear_consumed – code executed on source value data immediately after it was shallow-copied to another value data and just before it is discarded (macro DATA available inside)
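
A minimal sketch of how value_data and value_clear might work together is shown below. It assumes a C#-style output language and that the DATA macro refers to the value data instance being cleared, so that its members can be reached as DATA.member; the member name text is hypothetical.

```
%code value_data {
    // Hypothetical member stored with every value.
    public string text;
}

%code value_clear {
    // Assumption: DATA refers to the value data instance being cleared.
    DATA.text = null;
}
```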

Parser code handling errors

error – code executed when the parser has encountered an error, before it attempts to recover from that error

error_begin_recovery – code executed after the parser has found an error symbol on the stack that matches the current input stream position and begins error recovery (macro ERROR_ID available inside)

error_skip_input – code executed for each input symbol skipped in recovery mode

error_recovered – code executed when parser recovered from error
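
The error blocks above could, for example, be used to keep simple statistics. The sketch assumes a C#-style output language; both counters are hypothetical and are not part of Alpag.

```
%code parser_fields {
    // Hypothetical counters kept inside the parser class.
    private int errorCount = 0;
    private int recoveredCount = 0;
}

%code error {
    // Runs when an error is encountered, before recovery is attempted.
    errorCount++;
}

%code error_recovered {
    // Runs once the parser has recovered from the error.
    recoveredCount++;
}
```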

Parser tracing code

trace_parse_error – code placed in body of procedure tracing subsequent phases of error recovery

trace_reduce – code placed in body of procedure tracing reductions

trace_shift – code placed in body of procedure tracing shifts

Parser code handling specific situations

reduce_before – code executed before reduction (common for all productions)

reduce_after – code executed after reduction (common for all productions)
