alpag.net manual

Parser grammar

To start writing a parser grammar, create an empty text file with the extension .alpag or .alp. If the file contains only a parser definition, the extension .alpp can be used to emphasize this. Alpag accepts input files with any extension, but using a standard one is a good idea. For backward compatibility, the extension .yacc is also recognized (for files that contain a parser and nothing else).

In this example we assume the file name myParser.alp.

Inside the file you can place C++ style comments using the /*..*/ or // syntax.

/* myParser – my first parser */

The contents of the file are divided into sections. A section starts with the %% double-percent token followed by the name of the section. The types of sections used determine whether the file contains a lexer, a parser, or both.

Place two sections, %%pdefs and %%prules, in the newly created file. Section declarations must appear at the beginning of a line:

/* myParser – my first parser */

%%pdefs
// parser token definitions go here

%%prules
// parser rules go here

The %%pdefs section contains declarations of the terminal symbols (tokens) that can appear in the input. Terminal symbols are normally returned by the lexer, so they should be declared in accordance with the lexer grammar. This example does not consider any particular lexer grammar, so an abstract set of terminals is assumed.

The following grammar will be used:

/* myParser – my first parser */

%%pdefs
%token KEYWORD
%token NUMBER

%%prules
FILE: COMMANDS;
COMMANDS: CMD | COMMANDS CMD;
CMD: KEYWORD;
CMD: KEYWORD NUMBER;

The grammar declares two terminal symbols (tokens): KEYWORD and NUMBER. These symbols must be recognized by the lexer and returned to the parser.

FILE is the main nonterminal of the grammar. It is added pro forma here, since its only purpose is to wrap the COMMANDS nonterminal.

The nonterminal COMMANDS has two productions: one that expands to a single CMD, and a left-recursive one that expands to COMMANDS followed by CMD. Together these two productions accept a sequence of one or more occurrences of CMD.
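To see how the left-recursive production builds up a sequence, the following Python sketch lists the derivation steps from COMMANDS to a chosen number of CMD symbols. It only illustrates the two productions described above and says nothing about how Alpag itself processes the grammar; the function name derive_commands is made up for this example.

```python
def derive_commands(count):
    """Return the sentential forms that derive `count` CMD symbols from COMMANDS.

    Uses only the two productions from the example grammar:
        COMMANDS: COMMANDS CMD   (applied count - 1 times)
        COMMANDS: CMD            (applied once, to end the recursion)
    """
    if count < 1:
        raise ValueError("COMMANDS cannot derive an empty sequence")
    form = ["COMMANDS"]
    steps = [" ".join(form)]
    # Each use of the left-recursive production adds one CMD on the right.
    for _ in range(count - 1):
        form = ["COMMANDS", "CMD"] + form[1:]
        steps.append(" ".join(form))
    # The non-recursive production replaces the remaining COMMANDS with CMD.
    form = ["CMD"] + form[1:]
    steps.append(" ".join(form))
    return steps


if __name__ == "__main__":
    for step in derive_commands(3):
        print(step)
```

For three commands, the steps run from COMMANDS through COMMANDS CMD and COMMANDS CMD CMD to CMD CMD CMD, which shows why the sequence can never be empty.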

Finally, CMD expands to either KEYWORD or KEYWORD followed by NUMBER.

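Notice the use of the pipe '|' to separate productions and the mandatory semicolon ';' after the last production for a given nonterminal. The two CMD productions are instead written as separate rules, each ending with its own semicolon.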
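To make the accepted input concrete, the language described by this grammar can be mimicked with a small hand-written recognizer. The Python sketch below is an assumption-laden illustration: it is unrelated to the code Alpag generates, it takes token names as plain strings instead of real lexer output, and the function name accepts is hypothetical.

```python
def accepts(tokens):
    """Return True if the token-name sequence matches FILE in the example grammar.

    FILE     -> COMMANDS
    COMMANDS -> one or more CMD
    CMD      -> KEYWORD | KEYWORD NUMBER
    """
    i = 0
    commands = 0
    while i < len(tokens):
        # Every CMD must start with KEYWORD.
        if tokens[i] != "KEYWORD":
            return False
        i += 1
        # An optional NUMBER may follow the KEYWORD.
        if i < len(tokens) and tokens[i] == "NUMBER":
            i += 1
        commands += 1
    # COMMANDS requires at least one CMD, so empty input is rejected.
    return commands >= 1
```

Under these assumptions, a stream such as KEYWORD NUMBER KEYWORD is accepted, while an empty stream, a stream starting with NUMBER, or one with two NUMBER tokens in a row is rejected.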

Save the myParser.alp file and run Alpag from the command line:

alpag myParser.alp

If there are no errors, you should see the generated myParserParser.cs file in the same directory as the input file.

By default, a single code file containing all parser components is generated. Depending on the selected options, Alpag can also generate multiple files, each containing individual parser components. By default these files are named after the input file (here myParser) with an additional suffix describing the contents of the particular file (here Parser).
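The naming rule can be written down as a short Python sketch. It reflects only the single case shown in this section (input myParser.alp, suffix Parser, output myParserParser.cs); the helper name default_output_path is hypothetical, and the suffixes used for other generated files are not covered here.

```python
from pathlib import Path


def default_output_path(input_file, suffix="Parser", extension=".cs"):
    """Build the expected output path: input base name + suffix + extension.

    The result stays in the same directory as the input file.
    The default suffix and extension are taken from the myParser example only.
    """
    source = Path(input_file)
    return source.with_name(source.stem + suffix + extension)


if __name__ == "__main__":
    print(default_output_path("myParser.alp"))
```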
