Module Genlex
A generic lexical analyzer.
This module implements a simple "standard" lexical analyzer, presented as a function from character streams to token streams. It roughly follows the lexical conventions of Caml, but is parameterized by the set of keywords of your language. Example: a lexer suitable for a desk calculator is obtained by
    let lexer = make_lexer ["+";"-";"*";"/";"let";"="; "("; ")"]
The associated parser would be a function from
type token =
The type of tokens. The lexical classes are: Int and Float
for integer and floating-point numbers; String for
string literals, enclosed in double quotes; Char for
character literals, enclosed in single quotes; Ident for
identifiers (either sequences of letters, digits, underscores
and quotes, or sequences of "operator characters" such as
+, *, etc.); and Kwd for keywords (either identifiers or
single "special characters" such as (, }, etc.).
val make_lexer :
Construct the lexer function. The first argument is the list of
keywords. An identifier s is returned as Kwd s if s
belongs to this list, and as Ident s otherwise.
A special character s is returned as Kwd s if s
belongs to this list, and causes a lexical error (exception
Parse_error) otherwise. Blanks and newlines are skipped.
Comments delimited by (* and *) are skipped as well,
and can be nested.
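The keyword rule described above can be illustrated with a small stand-alone sketch. It does not call Genlex itself: the token type below is a local stand-in with only the two constructors the rule mentions, and the names classify_ident, classify_special and Lex_error are hypothetical, chosen for illustration (Lex_error plays the role of the Parse_error exception).

```ocaml
(* Local stand-in for the two token classes involved in the keyword rule.
   This is an illustrative type, not the module's own definition. *)
type token =
  | Kwd of string
  | Ident of string

(* Hypothetical exception standing in for the lexical error. *)
exception Lex_error of string

(* An identifier is a keyword if it appears in the list,
   and an ordinary identifier otherwise. *)
let classify_ident (keywords : string list) (s : string) : token =
  if List.mem s keywords then Kwd s else Ident s

(* A special character is a keyword if it appears in the list;
   otherwise it is a lexical error. *)
let classify_special (keywords : string list) (s : string) : token =
  if List.mem s keywords then Kwd s else raise (Lex_error s)

let () =
  let keywords = ["+"; "-"; "*"; "/"; "let"; "="; "("; ")"] in
  assert (classify_ident keywords "let" = Kwd "let");
  assert (classify_ident keywords "x" = Ident "x");
  assert (classify_special keywords "(" = Kwd "(");
  (* "}" is not in the keyword list, so it is rejected. *)
  assert (
    match classify_special keywords "}" with
    | _ -> false
    | exception Lex_error _ -> true)
```

With the desk-calculator keyword list, "let" and "(" come back as keywords, "x" stays an identifier, and "}" is rejected because it was never declared.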