alpag.net manual

Debugging

Lexers generated by Alpag can be equipped with additional features that support debugging. These features are described in the sections below.

Browsing grammar information

Although the generated lexer is based on a grammar, the grammar itself is not present in the lexer. During debugging, the lexer state often has to be compared against the original lexer grammar. To keep everything in one place, Alpag can embed summary information about the grammar in the generated lexer.

When Lexer.Debug.Infos is set to true, the lexer contains static arrays with rule and symbol information.

Methods for accessing these structures are:

public static LexerRuleInfo GetLexerRuleInfoS( int ruleId )
public LexerRuleInfo GetLexerRuleInfo( int ruleId )
public static LexerSymbolInfo GetLexerSymbolInfoS( int symbolId )
public LexerSymbolInfo GetLexerSymbolInfo( int symbolId )

The rule and symbol info arrays are static and can be accessed using the static methods with the 'S' suffix. Instance methods (without 'S') are provided for convenience.

Structures returned by these methods contain informative strings describing rules and symbols that can be used in debugging.

The structure of the automaton is present in the lexer but, since it is encoded and packed, it cannot be browsed directly. Alpag can provide a suite of methods for exploring the automaton structure at runtime.

Setting Lexer.Debug.Methods to true gives access to additional methods:

static LexerTransitionInfo[] GetLexerStateTransitionsS( int stateId )
LexerTransitionInfo[] GetLexerStateTransitions( int stateId )

Provide information about the automaton transitions for a given state.

static string GetLexerStateTransitionsStringS( int stateId )
string GetLexerStateTransitionsString( int stateId )

Provide a summary string with information about the automaton transitions for a given state.

Since the lexer automaton structure is static, it is not tied to any particular lexer instance. Methods with the 'S' suffix access this structure directly. Instance methods (without 'S') do the same and are provided for convenience.

Recent Steps

When an error condition occurs, it is usually easy to check the current lexer state but hard to understand how the lexer got to that state. The lexer can save a history of its most recent actions, which can be inspected at any moment. This history is saved in a fixed-size structure, which introduces little processing overhead.

When Lexer.Debug.RecentStepsReport is set to true, the lexer saves its most recent activities in a circular buffer. The depth of the history is controlled by the Lexer.Debug.RecentStepsCount option (default value: 100 steps). Each step is saved in a LexerRecentStep structure.

The following methods and fields provide access to this information:

int LexerRecentStepsCount

Current depth of history

LexerRecentStep GetLexerRecentStep( int off )

Returns the step at offset off, which must be in the range 0..LexerRecentStepsCount-1.

LexerRecentStep[] GetLexerRecentSteps( [int maxCount] )

Returns the entire history.

string GetLexerRecentStepsString( int maxCount = lexRecentSteps_MAX )

Returns the entire history as a single multiline string. If Lexer.Debug.Infos is enabled, this method uses the embedded rule and symbol information.
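The circular buffer idea behind the recent steps history can be illustrated with a minimal sketch. This is not the code Alpag generates: the RecentStep structure, its fields, the RecentStepsBuffer class and the convention that offset 0 means the newest step are all assumptions made for illustration only.

```csharp
using System;

// Hypothetical stand-in for a recorded step. The real LexerRecentStep
// structure generated by Alpag may hold different fields.
public struct RecentStep
{
    public int StateId;
    public int SymbolId;

    public RecentStep(int stateId, int symbolId)
    {
        StateId = stateId;
        SymbolId = symbolId;
    }
}

// Fixed-capacity circular buffer: the array is allocated once and,
// when the buffer is full, each new step overwrites the oldest one.
public class RecentStepsBuffer
{
    private readonly RecentStep[] steps;
    private int next;   // index at which the next step will be written
    private int count;  // number of valid steps, never exceeds capacity

    public RecentStepsBuffer(int capacity)
    {
        if (capacity <= 0) throw new ArgumentOutOfRangeException(nameof(capacity));
        steps = new RecentStep[capacity];
    }

    public int Count { get { return count; } }

    public void Add(RecentStep step)
    {
        steps[next] = step;
        next = (next + 1) % steps.Length;
        if (count < steps.Length) count++;
    }

    // Assumption of this sketch: off = 0 is the most recent step.
    public RecentStep Get(int off)
    {
        if (off < 0 || off >= count) throw new ArgumentOutOfRangeException(nameof(off));
        int index = (next - 1 - off + steps.Length) % steps.Length;
        return steps[index];
    }
}
```

Because the array never grows, adding a step costs one write and one index update, which is consistent with the low overhead mentioned above.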

Tracing

The user can define custom code for tracing lexer operation. With tracing activated, the lexer calls dedicated callback methods each time a symbol is read and interpreted.

The exact order and types of arguments of the methods described in this section are not guaranteed to be preserved between Alpag versions.

NextSymbol

When Lexer.Trace.NextSymbol is set to true, the lexer reports each symbol read and the action taken for that symbol to the method:

void TraceNextSymbol( int symbolId, int charCode, int stateId, int nextStateId, int stepIndex )

Implementation of the method is controlled by the Lexer.Trace.NextSymbolImpl option. When the option is set to UserCode, custom code from the trace_next_symbol section is inserted as the body of the method. When the option is set to Virtual, the method is declared virtual and the user must override it in a subclass.
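A minimal sketch of the Virtual variant is shown below. Since a real generated lexer is not available here, GeneratedLexerStub is a hypothetical stand-in for it: the class name, the protected access modifier and the SimulateStep helper are assumptions, and only the argument list of TraceNextSymbol is taken from the signature above (which may itself change between Alpag versions).

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a lexer generated with
// Lexer.Trace.NextSymbol = true and Lexer.Trace.NextSymbolImpl = Virtual.
// A real generated lexer would call TraceNextSymbol from its own scan loop.
public class GeneratedLexerStub
{
    // Assumed to be protected; the manual shows the signature without a modifier.
    protected virtual void TraceNextSymbol(int symbolId, int charCode,
        int stateId, int nextStateId, int stepIndex)
    {
    }

    // Helper for this sketch only: pretends the lexer processed one symbol.
    public void SimulateStep(int symbolId, int charCode,
        int stateId, int nextStateId, int stepIndex)
    {
        TraceNextSymbol(symbolId, charCode, stateId, nextStateId, stepIndex);
    }
}

// User subclass that overrides the trace callback and records one line per symbol.
public class TracingLexer : GeneratedLexerStub
{
    public readonly List<string> Log = new List<string>();

    protected override void TraceNextSymbol(int symbolId, int charCode,
        int stateId, int nextStateId, int stepIndex)
    {
        Log.Add(string.Format("step {0}: symbol {1} (char {2}), state {3} -> {4}",
            stepIndex, symbolId, charCode, stateId, nextStateId));
    }
}
```

Collecting the trace in a list rather than writing it to the console keeps the callback cheap and lets the caller decide when, and whether, to print it.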

The arguments of the TraceNextSymbol() method are symbolId, charCode, stateId, nextStateId and stepIndex.

InputMatched

When Lexer.Trace.InputMatched is set to true, the lexer reports each matched rule to the method:

void TraceInputMatched( int matchId )

Implementation is controlled by the Lexer.Trace.InputMatchedImpl option. When set to UserCode, the body of the method is replaced by code from the trace_input_matched section. When set to Virtual, the method is declared virtual and must be overridden. When set to UserCodeInline, no method is declared and the user code is inserted directly into the lexer method.

InputNotMatched

When Lexer.Trace.InputNotMatched is set to true, the lexer reports each case in which the input was not matched against any rule to the method:

void TraceInputNotMatched( int symbolId, int charCode, int stateId )

Implementation is controlled by the Lexer.Trace.InputNotMatchedImpl option. When set to UserCode, the body of the method is replaced by code from the trace_input_not_matched section. When set to Virtual, the method is declared virtual and must be overridden.

Input character

The lexer automaton is defined in terms of symbols. A single symbol may span multiple input characters. Characters read by the lexer are immediately converted to symbols and do not literally appear inside the lexer. Sometimes it is useful to know which original input character was converted to a symbol.

When the Lexer.Debug.LastInputChar option is set to true, the lexer contains an additional variable, LastInputChar, holding the last character read and converted to a symbol. Note that the value of this variable may not be valid if the last symbol was generated by means other than a single input character.
