The Typical type checker generator.

This package implements Typical, a language and compiler for specification and generation of type checkers. Typical is a domain specific language based on ML, and currently supports Java as a target language. The compiler's main class is {@link xtc.typical.Typical}.

The rest of this document covers the following topics:


ML Ancestry
Typical is a domain specific language built on the functional core of the O'Caml dialect of ML. Notably; Typical has the following syntactic differences from O'Caml:
Features
Typical also provides the following built-in constructs, types, and features specifically for expressing type checker specifications.
Library
Typical provides a library of primitive operations. The Java implementations of these operations can be found in Primitives.java. The operations, their signatures and descriptions are listed below:
Map operations
get : 'a -> 'b                                  
put : 'a -> 'b                        
Prelude operations:
abs_float : float -> float                      Affirm a float
abs_int : int -> int                            Affirm an int
and_bits : int -> int -> int                    Perform logical and on two integers
ancestor : 'a -> node                           Get the ancestor of the top node of the traversal path if it matches a specified pattern
annotate : node -> string -> 'a -> 'a           Annotate a node with a named value
annotate_list node list -> string -> 'a -> 'a   Annotate a list of nodes
ftoi : float -> int                             Convert a float to an int
get_annotation : node -> string -> 'a           Get the named annotation from a node
has_annotation : node -> string -> bool         Test if a node has an annotation with the specified name
is_bottom : 'a -> bool                          Test a value for bottom
is_empty  : 'a list -> bool                     Test a list for emptiness
is_not_bottom : 'a -> bool                      Test if a value is not bottom 
negate_bits : int -> int                        Compute the bitwise negation of an integer
negate_float : float -> float                   Negate a float value
negate_int : int -> int                         Negate an int value
node_name : node -> string                      Get the name of a node 
nth : 'a list -> int -> 'a                      Get a list's nth element
or_bits : int -> int                            Compute the bitwise or of an int
parent : 'a -> node                             Get the parent of the top node on the traversal path if it matches a specified pattern
shift_left : int -> int -> int                  Left shift and int by the specified number of bits 
shift_right : int -> int -> int                 Right shift and int by the specified number of bits
trace : 'a -> 'a                                Debugging function to print a value to the standard output
trace2 : 'a -> string -> 'a                     Debugging function to print a prefied value to the standard output
xor_bits : int -> int -> int                    Compute the exclusive or of two integers 
List operations
append : 'a list -> 'a list -> 'a list          Append a list to another
cons : 'a -> 'a list                            Cons a new list from an value and an existing list
contains : 'a -> 'a list -> bool                Test if a list contains an element
exists : (fn : 'a -> bool) -> 'a list           Test if a list element satisfies a predicate
foldl : (fn: 'a -> 'a -> 'a) -> 'a list         Fold a list   
head : 'a list -> 'a                            Get the head of a list
intersection : 'a list -> 'a list 'a list       Compute the intersection of two lists
iter : (fn: 'a -> 'b) -> 'a list                Iterate a function over a list
length : 'a list -> int                         Compute the length of a list
subtraction : 'a list -> 'a list -> 'a list     Get the set subtraction of two lists
tail : 'a list -> 'a list                       Get the tail of a list
union : 'a list -> 'a list -> 'a list           Compute the set union of two lists
String operations
concat : string -> string -> string             Concatenate two strings
ends_with : string -> string -> bool            Test if a string ends with a substring
ends_withi : string -> string -> bool           Test if a string ends with a substring - case insensitive
join_strings : string list -> string            Combine a list of strings into a single string
ssize : string -> int                           Compute the size of a string
starts_with : string -> string -> bool          Test if a string begins with a specified substring
starts_withi : string -> string -> bool         Test if a string begins with a specified substring - case insensitive
stof : string -> float                          Convert a string to a float
stoi : string -> int                            Convert a string to an integer
substring : string -> int                       Get the substring starting at the specified index
substring2 : string -> int -> int               Get the substring starting at the specified range
Usage and Example
The Typical system encompasses three different programming languages and their compilers. The source language, the meta-language, and the bootstrap language. The source language is the language whose type-checker is being implemented. The meta-language is the language in which the type checker specification is written; i.e. Typical. The Typical compiler will generate code in the bootstrap language, which in the current implementation is Java.
As an introduction to Typical the following example presents the implementation of a complete type checker using the Simply Typed Lambda Calculus as the source language. In this example, the calculus is treated not as a formal system, but as a programming language whose front end we wish to implement.

Below we have the calculus' syntax. The corresponding grammar (found in lang/TypedLambda.rats) is written for the Rats! parser generator and largely follows the syntactic specification shown.
Expression      <- Application EOF
Application     <- Application BasicExpression / BasicExpression
Abstraction     <- LAMBDA Identifier COLON FunctionType DOT Application 
BasicExpression <- Abstraction / Identifier / IntegerConstant / StringConstant  / OPEN Application CLOSE
Identifier      <- [a-zA-Z]+ 
IntegerConstant <- [1-9] [0-9]* 
StringConstant  <- ["] (!["] _)* ["]"
FunctionType    <- BasicType ARROW FunctionType / BasicType
BasicType       <- IntegerType / StringType / OPEN FunctionType CLOSE
IntegerType     <- "int"
StringType      <- "string"
LAMBDA          <- "\\"
COLON           <- ":"
DOT             <- "."
ARROW           <- "->"
OPEN            <- "("
CLOSE           <- ")"
Figure 2.

Next, we use the Typical compiler and the complete TypedLambda.tpcl specification below in Figure 3 to create the type checker.
1.  (**
2.  * Typing rules for the simply typed lambda calculus.
3.  *
4.  * @author Robert Grimm
5.  * @version $Revision: 1.14 $
6.  *)
7.  module xtc.lang.TypedLambda;
8. 
9.  mltype node =
10.   | Application of node * node
11.   | Abstraction of node * node * node
12.   | Identifier of string
13.   | IntegerConstant of string
14.   | StringConstant of string
15.   | FunctionType of node * node
16.   | IntegerType
17.   | StringType
18.   ;
19. 
20. mltype raw_type = IntegerT | StringT | FunctionT of type * type ;
21. 
22. scope Abstraction (id, _, body) -> Scope(Anonymous("lambda"), [id, body]) ;
23. 
24. namespace default : type = Identifier (id) -> SimpleName(id) ;
25. 
26. mlvalue analyze = function
27.   | Application (lambda, expr) ->
28.       let tl = analyze lambda
29.       and tr = analyze expr in 
30.         require (predicate FunctionT _) tl.type
31.           error "application of non-function" in begin
32.             match tl.type, tr with
33.               | FunctionT (param, res), param -> res
34.               | _ -> error "argument/parameter type mismatch"
35.           end
36.   | Abstraction (id, type, body) ->
37.       let param = analyze type in
38.       let _     = define id param in
39.       let res   = analyze body in
40.         { type = FunctionT (param, res) }
41.   | Identifier _ as id ->
42.       let t = lookup id in
43.          require t != bottom error (get_name id) ^ " undefined" in t
44.   | IntegerConstant _ ->
45.       { type = IntegerT }
46.   | StringConstant _ ->
47.       { type = StringT }
48.   | FunctionType (parameter, result) ->
49.       { type = FunctionT (analyze parameter, analyze result) }
50.   | IntegerType ->
51.       { type = IntegerT }
52.   | StringType ->
53.       { type = StringT }
54.   ;
55. 
56. mlvalue get_name = function
57.   | Identifier (name) -> name
58.   | _ -> bottom
59.   ;
60.
Figure 3. Type checker specification for the Simply Typed Lambda Calculus

The specification begins with the module declaration on line 7.

Lines 9-17 declare the variant type node to represent the STLC's abstract syntax tree structure. The variants of the node type indicate the names and the type of the children of all the nodes that may appear in the STLC AST to be type checked. Note that this specification need not be written by hand, but can automatically generated from a Rats! grammar specification using the -ast option.

Line 20 declares the variant type raw_type to represent the type structure of our STLC. In this case we have three types: an integer type, a string type, and a function type.

Line 22 uses Typical's scope construct to declaratively specify scope management for the STLC type checker. The declaration can be interpreted as follows: If an Abstraction node is encountered while traversing the AST the type checker will enter a new Anonymous scope with a name derived from "lambda", and the first and third children of the Abstraction node will belong to the new scope.

Line 24 uses Typical's scope construct to declaratively specify the STLC's namespace management. The declaration can be interpreted as follows: All STLC names belong to a single default namespace and have value 'type'. Symbol table names are obtained by extracting the string value from an Identifier node, and using this string to create a new SimpleName.

Lines 26-55 define the type rules of our STLC as pattern matches from nodes to type. Lines 27-34 typecheck an Application node by first typechecking each child of the application, then using the require construct to enforce that the first child has function type, and the error construct to report failure. Finally the error construct is used to report an error on type/argument mismatch. Lines 36-40 process an abstraction node by analysing the second and third children, and adding an symbol table entry using the define construct. Since the define construct has built in error management, an error will be reported if the identifier was previously defined. Successful processing of the Abstraction node returns the appropriate FunctionType. Line 41 types an Identifier via a symbol table lookup on the node. Lines 44-54 are straightforward mapping of the IntegerConstant, StringConstant, FunctionType, IntegerType, and StringType nodes to their corresponding types.

Finally the specification can be compiled using the command java xtc.typical.Typical TypedLambda.tpcl. This generates three files: TypedLambdaTypes.java which contains definitions for all the types used by the STCL type checker, TypedLambdaSupport.java which contains general supporting infrastructure, and TypedLambdaAnalyzer.java which is the STLC typechecker itself. TypedLambdaAnalyzer can be incorporated into a Compiler or Driver (for example, TypedLambda.java) for processing STLC programs.