Below is an example of lexer and parser for a simple calculator application.
Input file
Input file for alpag with definition of both lexer and parser.
// Scroll down to:
// %%ldefs section for lexer grammar
// %%pdefs / %%prules sections for parser grammar
// These are some options controlling code generation
%option Code.Namespace "Calc"
%option Code.Base.OwnFile true
%option Parser.Code.Parser.ClassName "CalcParser"
%option Parser.Value.FieldType { double }
%option Parser.Debug.RecentStepsReport true
%option Parser.Debug.RecentStepsValueData true
%option Lexer.In.Format Chars
%option Lexer.Code.Lexer.ClassName "CalcLexer"
%option Lexer.Code.NextTokenFuncArgs Custom
%option Lexer.Code.NextTokenFuncArgsCustom { ref double resValue }
%option Lexer.Debug.RecentStepsReport true
%option Lexer.Debug.Comments true
%option Lexer.Debug.Infos true
%option Lexer.Debug.Methods true
%code parser_fields {
public CalcLexer lexer;
public double result;
}
%code lexer_fields {
protected string InputText = string.Empty;
protected int InputOff;
public void SetInputText( string value )
{
InputText = value;
InputOff = 0;
}
}
%code read_input_chars {
if( InputOff < InputText.Length )
{
buffer[offset++] = InputText[InputOff++];
return 1;
}
return 0;
}
%code parser_next_token {
valueData.Value = 0;
symbol = lexer.NextToken( ref valueData.Value );
}
// Definitions of tokens used by parser, yacc-like
%%pdefs
%token NUMBER
%token BR_OPEN "("
%token BR_CLOSE ")"
%left PLUS "+" MINUS "-"
%left MUL "*" DIV "/"
// parser grammar
%%prules
RESULT: EXPR { result = $1; };
EXPR:
NUMBER { $$ = $1; } |
"(" EXPR ")" { $$ = $2; } |
EXPR "+" EXPR { $$ = $1 + $3; } |
EXPR "-" EXPR { $$ = $1 - $3; } |
EXPR "*" EXPR { $$ = $1 * $3; } |
EXPR "/" EXPR {
double div = $3;
if( div == 0 )
$$ = double.NaN;
else
$$ = $1 / $3;
};
// lexer grammar, lex format
%%lrules
[0-9]+(\.[0-9]+)? {
resValue = Double.Parse( TokenValueGetString(), System.Globalization.CultureInfo.InvariantCulture );
return CalcParser.NUMBER;
}
\+ { return CalcParser.PLUS; }
\- { return CalcParser.MINUS; }
\* { return CalcParser.MUL; }
\/ { return CalcParser.DIV; }
\( { return CalcParser.BR_OPEN; }
\) { return CalcParser.BR_CLOSE; }
[\ \t]+
Generating output
Once input grammar is defined it is time to launch alpag.
Following types of output can be obtained from alpag:
-
Soure code files for lexer, parser, or combined lexer-parser.
Usually multiple files are generated (depends on configuration).
Single alpag input file (like one above) can be used to generate in one run
just the lexer just the parser or both.
-
XML reports with summary of input grammar and generated automatons.
These can be easily transformed e.g. into HTML.
It is also possible to take lexer/parser tables generated by alpag
and use them to build own custom lexers and parsers.
-
Textual reports with summary of both input grammar(s) as well as generated automaton(s).
These reports are very convenient for textual search or performing diff between similar automatons.
Lexer - browsable page
XML report can be used to generate HTML summary of the lexer.
Here is a HTML page with lexer summary (opens in a new window)
XSL template for generating HTML pages from XML is included with distribution.
Lexer - textual report
Below is textual report for the lexer.
There are lots of options for controlling its format.
File shown below was configured to lists both input grammar (with parsed and exploded regular expressions)
as well as output automaton.
rules
&1
modes: INITIAL
regex: [0-9]+(\.([0-9]+))?
repeat: +- [0-9]+
range: | +- [0-9]
rangeItems: | +- [0-9]
subRange: | +- 0-9
repeat: +- (\.([0-9]+))?
block: +- (\.([0-9]+))
...: +- \.
repeat: +- [0-9]+
range: +- [0-9]
rangeItems: +- [0-9]
subRange: +- 0-9
&2
modes: INITIAL
regex: \+
&3
modes: INITIAL
regex: -
&4
modes: INITIAL
regex: \*
&5
modes: INITIAL
regex: /
&6
modes: INITIAL
regex: \(
&7
modes: INITIAL
regex: \)
&8
modes: INITIAL
regex: [ \t]+
repeat: +- [ \t]+
range: +- [ \t]
rangeItems: +- [ \t]
char: +-
char: +- \t
modes
mode INITIAL ruleCount: 8 initial type: shared startState: @1
rules
&1 [1]
regex: [0-9]+(\.([0-9]+))?
repeat: +- [0-9]+
range: | +- [0-9]
rangeItems: | +- [0-9]
subRange: | +- 0-9
repeat: +- (\.([0-9]+))?
block: +- (\.([0-9]+))
...: +- \.
repeat: +- [0-9]+
range: +- [0-9]
rangeItems: +- [0-9]
subRange: +- 0-9
&2 [2]
regex: \+
&3 [3]
regex: -
&4 [4]
regex: \*
&5 [5]
regex: /
&6 [6]
regex: \(
&7 [7]
regex: \)
&8 [8]
regex: [ \t]+
repeat: +- [ \t]+
range: +- [ \t]
rangeItems: +- [ \t]
char: +-
char: +- \t
symbolRanges
explicitSymbolRange : 9..57 length: 49
symbolRange : 9..57 length: 49
lexerRange : 0..1114111 length: 1114112
inputEncodingMaxSizeBytes : 4
symbolEOL : %1
symbols
%1 symEOL
%2 elements: 0..9 charCount: 10
%3 elements: \.
%4 elements: 0x2B
%5 elements: \-
%6 elements: \*
%7 elements: \/
%8 elements: \(
%9 elements: \)
%10 elements: 0x20, 0x09 charCount: 2
effectiveRules
mode 0 INITIAL
&1
regex: [0-9][0-9]*([.]([0-9][0-9]*))?
sequence: +- [0-9][0-9]*
range: | +- [0-9]
chars: | | +- 0-9
repeat: | +- [0-9]*
range: | +- [0-9]
chars: | +- 0-9
repeat: +- ([.]([0-9][0-9]*))?
block: | +- ([.]([0-9][0-9]*))
range: | +- [.]
chars: | | +- .
sequence: | +- [0-9][0-9]*
range: | +- [0-9]
chars: | | +- 0-9
repeat: | +- [0-9]*
range: | +- [0-9]
chars: | +- 0-9
&2
regex: [+]
range: +- [+]
chars: | +- +
&3
regex: [\-]
range: +- [\-]
chars: | +- \-
&4
regex: [*]
range: +- [*]
chars: | +- *
&5
regex: [/]
range: +- [/]
chars: | +- /
&6
regex: [(]
range: +- [(]
chars: | +- (
&7
regex: [)]
range: +- [)]
chars: | +- )
&8
regex: [ \t][ \t]*
sequence: +- [ \t][ \t]*
range: | +- [ \t]
chars: | | +- \t
repeat: | +- [ \t]*
range: | +- [ \t]
chars: | +- \t
automaton
states
@1
goto @2 under %2 0..9
goto @3 under %4 0x2B
goto @4 under %5 \-
goto @5 under %6 \*
goto @6 under %7 \/
goto @7 under %8 \(
goto @8 under %9 \)
goto @9 under %10 0x20, 0x09
@2
accept &1
goto @2 under %2 0..9
goto @10 under %3 \.
@3
accept &2
@4
accept &3
@5
accept &4
@6
accept &5
@7
accept &6
@8
accept &7
@9
accept &8
goto @9 under %10 0x20, 0x09
@10
goto @11 under %2 0..9
@11
accept &1
goto @11 under %2 0..9
Parser - browsable page
XML report can be used to generate HTML summary of the parser.
Here is a HTML page with parser summary (opens in a new window)
The page contains a simple emulator of LR(1) parser. It enables testing its behaviour for arbitary sequences of input tokens.
To use emulator:
- click 'emu' in right-upper corner of the page to open emulator pane.
- use buttons on emulator toolbar to edit contents of parser stack
and contents of input buffer.
- you can enable option to make all terminals and nonterminals in the grammar clickable.
This is alternative method of filling the stack
- switch visibility of actions available in current state (all or just valid under next input symbol)
- Click 'forward' button to make one step.
- Click one of actions available in current state to execute it (works also for actions not legal for current input sybol)
- Use undo/redo to step back and forth
- Click 'edit stack' button' to see and modify textual summary of stack contents.
Note: you can paste here stack captured when debugging real code.
XSL template for generating HTML pages from XML is included with distribution.
Parser - textual report
Below is textual report for the parser.
There are lots of options for controlling its format.
File shown below was configured to list input grammar and generated automaton.
Notice conflicts section at the end.
grammar
terminals: 7
nonTerminals: 3
productions: 8
terminals
1 NUMBER (56,8)
2 BR_OPEN "(" (57,8)
3 BR_CLOSE ")" (58,8)
4 PLUS "+" assoc: left precedence: 1 (59,7)
5 MINUS "-" assoc: left precedence: 1 (59,16)
6 MUL "*" assoc: left precedence: 2 (60,7)
7 DIV "/" assoc: left precedence: 2 (60,15)
nonterminals
1 $accept prodCount: 1 start
2 RESULT prodCount: 1
3 EXPR prodCount: 6 left-right-recursive (64,9)
productions
@1 RESULT: EXPR
@2 EXPR: NUMBER
@3 EXPR: BR_OPEN EXPR BR_CLOSE
@4 EXPR: EXPR PLUS EXPR
@5 EXPR: EXPR MINUS EXPR
@6 EXPR: EXPR MUL EXPR
@7 EXPR: EXPR DIV EXPR
@8 $accept: RESULT
automaton
summary
stateCount: 15
map
shiftCount: 25
redCount: 34
gotoCount: 7
rawSize: 180
rawBusy: 66
packedSize: 208
rawBusyToSize: 0.367
packedSizeToRawSize: 1.156
packedSizeToRawBusy: 3.152
conflicts
confCount: 0
warnResCount: 0
totalCount: 16
states
state #1
itemset
@8[0] $accept: . RESULT
actions
shift to #4 under NUMBER
shift to #5 under BR_OPEN
goto #2 under RESULT
goto #3 under EXPR
state #2
itemset
@8[1] $accept: RESULT .
actions
reduce by @8 to $accept under $end
state #3
itemset
@1[1] RESULT: EXPR .
@4[1] EXPR: EXPR . PLUS EXPR
@5[1] EXPR: EXPR . MINUS EXPR
@6[1] EXPR: EXPR . MUL EXPR
@7[1] EXPR: EXPR . DIV EXPR
actions
shift to #8 under PLUS
shift to #9 under MINUS
shift to #10 under MUL
shift to #11 under DIV
reduce by @1 to RESULT under $end
state #4
itemset
@2[1] EXPR: NUMBER .
actions
reduce by @2 to EXPR under $end PLUS MINUS MUL DIV BR_CLOSE
state #5
itemset
@3[1] EXPR: BR_OPEN . EXPR BR_CLOSE
actions
shift to #4 under NUMBER
shift to #5 under BR_OPEN
goto #6 under EXPR
state #6
itemset
@3[2] EXPR: BR_OPEN EXPR . BR_CLOSE
@4[1] EXPR: EXPR . PLUS EXPR
@5[1] EXPR: EXPR . MINUS EXPR
@6[1] EXPR: EXPR . MUL EXPR
@7[1] EXPR: EXPR . DIV EXPR
actions
shift to #8 under PLUS
shift to #9 under MINUS
shift to #10 under MUL
shift to #11 under DIV
shift to #7 under BR_CLOSE
state #7
itemset
@3[3] EXPR: BR_OPEN EXPR BR_CLOSE .
actions
reduce by @3 to EXPR under BR_CLOSE PLUS MINUS MUL DIV $end
state #8
itemset
@4[2] EXPR: EXPR PLUS . EXPR
actions
shift to #4 under NUMBER
shift to #5 under BR_OPEN
goto #15 under EXPR
state #9
itemset
@5[2] EXPR: EXPR MINUS . EXPR
actions
shift to #4 under NUMBER
shift to #5 under BR_OPEN
goto #14 under EXPR
state #10
itemset
@6[2] EXPR: EXPR MUL . EXPR
actions
shift to #4 under NUMBER
shift to #5 under BR_OPEN
goto #13 under EXPR
state #11
itemset
@7[2] EXPR: EXPR DIV . EXPR
actions
shift to #4 under NUMBER
shift to #5 under BR_OPEN
goto #12 under EXPR
state #12
itemset
@7[3] EXPR: EXPR DIV EXPR .
@4[1] EXPR: EXPR . PLUS EXPR
@5[1] EXPR: EXPR . MINUS EXPR
@6[1] EXPR: EXPR . MUL EXPR
@7[1] EXPR: EXPR . DIV EXPR
actions
reduce by @7 to EXPR under BR_CLOSE PLUS MINUS MUL DIV $end
conflicts
resolved
s-r under PLUS shift to #8 or reduce by @7 to EXPR resolved as reduce using precedence
s-r under MINUS shift to #9 or reduce by @7 to EXPR resolved as reduce using precedence
s-r under MUL shift to #10 or reduce by @7 to EXPR resolved as reduce using assoc
s-r under DIV shift to #11 or reduce by @7 to EXPR resolved as reduce using assoc
state #13
itemset
@6[3] EXPR: EXPR MUL EXPR .
@4[1] EXPR: EXPR . PLUS EXPR
@5[1] EXPR: EXPR . MINUS EXPR
@6[1] EXPR: EXPR . MUL EXPR
@7[1] EXPR: EXPR . DIV EXPR
actions
reduce by @6 to EXPR under BR_CLOSE PLUS MINUS MUL DIV $end
conflicts
resolved
s-r under PLUS shift to #8 or reduce by @6 to EXPR resolved as reduce using precedence
s-r under MINUS shift to #9 or reduce by @6 to EXPR resolved as reduce using precedence
s-r under MUL shift to #10 or reduce by @6 to EXPR resolved as reduce using assoc
s-r under DIV shift to #11 or reduce by @6 to EXPR resolved as reduce using assoc
state #14
itemset
@5[3] EXPR: EXPR MINUS EXPR .
@4[1] EXPR: EXPR . PLUS EXPR
@5[1] EXPR: EXPR . MINUS EXPR
@6[1] EXPR: EXPR . MUL EXPR
@7[1] EXPR: EXPR . DIV EXPR
actions
shift to #10 under MUL
shift to #11 under DIV
reduce by @5 to EXPR under BR_CLOSE PLUS MINUS $end
conflicts
resolved
s-r under PLUS shift to #8 or reduce by @5 to EXPR resolved as reduce using assoc
s-r under MINUS shift to #9 or reduce by @5 to EXPR resolved as reduce using assoc
s-r under MUL shift to #10 or reduce by @5 to EXPR resolved as shift using precedence
s-r under DIV shift to #11 or reduce by @5 to EXPR resolved as shift using precedence
state #15
itemset
@4[3] EXPR: EXPR PLUS EXPR .
@4[1] EXPR: EXPR . PLUS EXPR
@5[1] EXPR: EXPR . MINUS EXPR
@6[1] EXPR: EXPR . MUL EXPR
@7[1] EXPR: EXPR . DIV EXPR
actions
shift to #10 under MUL
shift to #11 under DIV
reduce by @4 to EXPR under BR_CLOSE PLUS MINUS $end
conflicts
resolved
s-r under PLUS shift to #8 or reduce by @4 to EXPR resolved as reduce using assoc
s-r under MINUS shift to #9 or reduce by @4 to EXPR resolved as reduce using assoc
s-r under MUL shift to #10 or reduce by @4 to EXPR resolved as shift using precedence
s-r under DIV shift to #11 or reduce by @4 to EXPR resolved as shift using precedence
conflicts
totalCount: 16
conflictCount: 0
warnResCount: 0
resCount: 16
s-r in state #12 under PLUS shift to #8 or reduce by @7 to EXPR resolved as reduce using precedence
s-r in state #12 under MINUS shift to #9 or reduce by @7 to EXPR resolved as reduce using precedence
s-r in state #12 under MUL shift to #10 or reduce by @7 to EXPR resolved as reduce using assoc
s-r in state #12 under DIV shift to #11 or reduce by @7 to EXPR resolved as reduce using assoc
s-r in state #13 under PLUS shift to #8 or reduce by @6 to EXPR resolved as reduce using precedence
s-r in state #13 under MINUS shift to #9 or reduce by @6 to EXPR resolved as reduce using precedence
s-r in state #13 under MUL shift to #10 or reduce by @6 to EXPR resolved as reduce using assoc
s-r in state #13 under DIV shift to #11 or reduce by @6 to EXPR resolved as reduce using assoc
s-r in state #14 under PLUS shift to #8 or reduce by @5 to EXPR resolved as reduce using assoc
s-r in state #14 under MINUS shift to #9 or reduce by @5 to EXPR resolved as reduce using assoc
s-r in state #14 under MUL shift to #10 or reduce by @5 to EXPR resolved as shift using precedence
s-r in state #14 under DIV shift to #11 or reduce by @5 to EXPR resolved as shift using precedence
s-r in state #15 under PLUS shift to #8 or reduce by @4 to EXPR resolved as reduce using assoc
s-r in state #15 under MINUS shift to #9 or reduce by @4 to EXPR resolved as reduce using assoc
s-r in state #15 under MUL shift to #10 or reduce by @4 to EXPR resolved as shift using precedence
s-r in state #15 under DIV shift to #11 or reduce by @4 to EXPR resolved as shift using precedence