alpag.net manual

Programming interface

This chapter covers the following aspects of embedding the lexer in an application:

Output settings

As with input, the user is free to configure the format and encoding of the output buffer. The options for configuring the output buffer are analogous to those for the input buffer.

The Lexer.Out.Format option sets the output buffer format (Bytes, Chars, or BytesChars). The Lexer.Out.EncodingSwitchable setting determines whether the user can switch the encoding at runtime. The Lexer.Out.Encoding option determines the default output encoding (when switching is enabled) or the only encoding (when switching is off). The Lexer.Out.Encodings option can be used to specify explicitly which encodings are supported; when it is not set, all encodings are supported.

The Lexer.Out.Endiannes option determines the endianness of wide encodings when the output buffer is byte-oriented.
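To illustrate what endianness means for a byte-oriented buffer, the short Python sketch below encodes one character in UTF-16 with both byte orders. It uses only Python's standard codecs and does not call any Alpag API; it merely shows why the byte order has to be fixed when a wide encoding is written as bytes.

```python
# Illustration only: one character in a wide encoding, in both byte orders.
text = "A"  # U+0041

little = text.encode("utf-16-le")  # least significant byte first
big = text.encode("utf-16-be")     # most significant byte first

print(little.hex())  # 4100
print(big.hex())     # 0041
```

A consumer that reads these bytes with the wrong byte order would see U+4100 instead of U+0041.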

Note that there are no output options for encoding autodetection. The user must specify the desired encoding explicitly.

If the input and output encodings differ (or are switchable and thus potentially different), the lexer acts as a transcoding lexer, which transforms input sequences to the output format as it runs. This transformation introduces a slight overhead, but it is unavoidable when the input and output formats differ.

Note that if the input and output format and encoding are non-switchable and set to the same configuration, the lexer does not perform transcoding. This improves performance and eliminates the need for a separate output buffer.
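The rule described above can be summarized as a small decision function. The following Python sketch is only a model of that rule: the BufferConfig class and the needs_transcoding function are hypothetical names introduced for illustration and are not part of Alpag or its generated code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BufferConfig:
    """Hypothetical model of one side's settings; not an Alpag type."""
    fmt: str          # "Bytes", "Chars" or "BytesChars"
    encoding: str     # default (or only) encoding
    switchable: bool  # may the encoding change at runtime?


def needs_transcoding(inp: BufferConfig, out: BufferConfig) -> bool:
    """Return True when the lexer would have to act as a transcoding lexer."""
    # A switchable side may diverge from the other at runtime,
    # so transcoding support has to be present.
    if inp.switchable or out.switchable:
        return True
    # Both sides fixed: transcoding is needed only if they differ.
    return (inp.fmt, inp.encoding) != (out.fmt, out.encoding)


same = BufferConfig("Bytes", "UTF-8", switchable=False)
wide = BufferConfig("Bytes", "UTF-16", switchable=False)
print(needs_transcoding(same, same))  # False
print(needs_transcoding(same, wide))  # True
```

Only the first case, where both sides are fixed and identical, lets the lexer skip the separate output buffer.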

There is also one additional option, Lexer.Out.OutFormatEqualIn. Setting this option to true informs the lexer that the user does not want any transcoding and is willing to accept at the output any format and encoding that may appear on input. With this option enabled, output is simply fed from the input buffer, so the user must be able to handle the current encoding of the input buffer. Using this option improves performance and is useful for tunneling data from input to output. It applies only to the output buffer contents: in this mode the lexer still performs full analysis of input characters according to the specified input encoding, as usual.
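The idea behind Lexer.Out.OutFormatEqualIn can be sketched in Python as follows. This is a conceptual model under stated assumptions, not Alpag's implementation: the function split_words_passthrough is a hypothetical name, the "grammar" is just whitespace-separated words, and the encoding is assumed to be a BOM-less Python codec such as "utf-8" or "utf-16-le". Characters are decoded to find token boundaries, but each token is returned as an untouched slice of the input bytes.

```python
from typing import List


def split_words_passthrough(raw: bytes, encoding: str) -> List[bytes]:
    """Find whitespace-separated words by decoding, but return raw input slices."""
    words: List[bytes] = []
    start = None  # byte offset where the current word began, if any
    pos = 0       # byte offset of the character being examined
    for ch in raw.decode(encoding):
        width = len(ch.encode(encoding))  # bytes this character occupies
        if ch.isspace():
            if start is not None:
                words.append(raw[start:pos])  # no transcoding, just a slice
                start = None
        elif start is None:
            start = pos
        pos += width
    if start is not None:
        words.append(raw[start:pos])
    return words


data = "ab cd".encode("utf-16-le")
for word in split_words_passthrough(data, "utf-16-le"):
    print(word.hex())
```

The caller receives bytes in whatever encoding the input used, which is why the user must be prepared to handle the input buffer's current encoding.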
