alpag.net manual

Regular grammar

A lexer is defined by specifying a regular grammar. This type of grammar is not very expressive and is not powerful enough to specify the input format of entire files in complex languages. It is, however, sufficient to describe simple patterns of characters that match syntactic elements such as words or numbers.

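Regular grammars allow the following generic types of operators: concatenation (ab, meaning a followed by b), alternative (a|b, meaning either a or b) and repetition (a*, meaning zero or more occurrences of a).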

Where necessary, elements can be grouped using parentheses, as in the example:

ab|cd(ef)*

which specifies either ab alone, or cd followed by any number of repetitions of ef, that is, any of:

ab, cd, cdef, cdefef, cdefefef, and so on
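The grouping example can be checked with a short script. The sketch below uses Python's re module purely as a stand-in for illustration; it is not Alpag's engine, and the helper name matches is an assumption made for this example. It relies only on the operators described above (concatenation, alternative, repetition and grouping), which Python writes the same way.

```python
import re

# The pattern from the text. Python's re module is only a stand-in here;
# it is not the engine Alpag uses.
PATTERN = re.compile(r"ab|cd(ef)*")


def matches(text: str) -> bool:
    """Return True when the whole of text matches the pattern."""
    return PATTERN.fullmatch(text) is not None


if __name__ == "__main__":
    candidates = ["ab", "cd", "cdef", "cdefef", "abef", "cde", "abcd"]
    for candidate in candidates:
        verdict = "accepted" if matches(candidate) else "rejected"
        print(candidate, verdict)
```

Strings such as abef are rejected because the repetition applies only to the cd branch: the alternative operator binds more loosely than concatenation, so the expression reads as (ab)|(cd(ef)*).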

Alpag regular expressions support several other convenience operators. A complete description of the regular expression format can be found in the Regular expression syntax chapter.

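To understand the examples given in the following chapters, two more operators must be explained: character sets written in square brackets, such as [abc], which match any single one of the listed characters, and ranges written with a hyphen inside the brackets, such as [a-z], which match any single character between the two endpoints.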

Combined, these operators enable expressions like:

[A-Z][0-9A-Z]*

which stands for a single letter in the range A to Z, followed by zero or more letters or digits.
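The same expression can be tried out as follows. This is again a sketch in Python's re module, used only as an illustration and not as Alpag's implementation; the names IDENTIFIER and is_identifier are assumptions introduced for the example.

```python
import re

# One uppercase letter, then zero or more uppercase letters or digits.
# Python's re module stands in for Alpag's own matcher here.
IDENTIFIER = re.compile(r"[A-Z][0-9A-Z]*")


def is_identifier(text: str) -> bool:
    """Return True when the whole of text matches [A-Z][0-9A-Z]*."""
    return IDENTIFIER.fullmatch(text) is not None


if __name__ == "__main__":
    for candidate in ["A", "X1", "ABC123", "1A", "a1", "A_B"]:
        verdict = "accepted" if is_identifier(candidate) else "rejected"
        print(candidate, verdict)
```

A string starting with a digit, such as 1A, is rejected because the first bracket allows letters only, and lowercase input is rejected because the ranges cover uppercase letters alone.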

Regular expressions are sufficient to describe sequences of characters making up words or numbers. Lexers do not need to recognize more complicated patterns.
