Implements a simple, dynamic type system for API generation.
author: Anthony Scopatz <scopatz@gmail.com>
This module provides a suite of tools for denoting, describing, and converting between various data types and the types coming from various systems. This is achieved by providing canonical abstractions of various kinds of types: base types, refinement types, and dependent (template) types.
All types are known by their name (a string identifier) and may be aliased with other names. However, the string id of a type is not sufficient to fully describe most types. The system here implements a canonical form for all kinds of types. This canonical form is itself hashable, being composed only of strings, ints, and tuples.
First, let us examine the base types and the forms that they may take. Base types are fiducial. The type system itself may not make any changes (refinements, template filling) to types of this kind. They are basically a collection of bits. (The job of ascribing meaning to these bits falls on someone else.) Thus base types may be referred to simply by their string identifier. For example:
'str'
'int32'
'float64'
'MyClass'
Aliases to these – or any – type names are given in the type_aliases dictionary:
type_aliases = {
'i': 'int32',
'i4': 'int32',
'int': 'int32',
'complex': 'complex128',
'b': 'bool',
}
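As a rough illustration of how such an alias table can be consulted, the sketch below follows alias links until it reaches a name that is not itself an alias. The helper name resolve_alias and the cycle check are assumptions made for this example; they are not taken from xdress.

```python
# Toy alias table copied from the example above; not the full xdress table.
type_aliases = {
    'i': 'int32',
    'i4': 'int32',
    'int': 'int32',
    'complex': 'complex128',
    'b': 'bool',
}


def resolve_alias(name, aliases=type_aliases):
    """Follow alias links until a non-aliased name is reached.

    Hypothetical helper for illustration only.
    """
    seen = set()
    while name in aliases:
        if name in seen:
            # Guard against tables such as {'a': 'b', 'b': 'a'}.
            raise ValueError("alias cycle at %r" % (name,))
        seen.add(name)
        name = aliases[name]
    return name
```

Names that are not aliases, such as 'float64', pass through unchanged.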
Furthermore, length-2 tuples are used to denote a type and the name or flag of its predicate. A predicate is a function or transformation that may be applied to verify, validate, cast, coerce, or extend a variable of the given type. A common usage is to declare a pointer or reference of the underlying type. This is done with the string flags ‘*’ and ‘&’:
('char', '*')
('float64', '&')
If the predicate is a positive integer, then this is interpreted as a homogeneous array of the underlying type with the given length. If this length is zero, then the tuple is often interpreted as a scalar of this type, equivalent to the type itself. The length-0 scalar interpretation depends on context. Here are some examples of array types:
('char', 42) # length-42 character array
('bool', 1) # length-1 boolean array
('f8', 0) # scalar 64-bit float
Note
length-1 tuples are converted to length-2 tuples with a 0 predicate, i.e. ('char',) will become ('char', 0).
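The predicate rules above can be sketched as a small reader for (name, predicate) pairs. Both function names are hypothetical, the type name is assumed to be a plain string, and a zero predicate is always read as a scalar here even though the text notes that this reading depends on context.

```python
def expand_scalar(t):
    """Turn a length-1 tuple into a (name, 0) pair; leave other inputs alone."""
    if isinstance(t, tuple) and len(t) == 1:
        return (t[0], 0)
    return t


def describe(t):
    """Give a human-readable reading of a (name, predicate) pair."""
    name, pred = expand_scalar(t)
    if pred == '*':
        return 'pointer to ' + name
    if pred == '&':
        return 'reference to ' + name
    if isinstance(pred, int) and pred > 0:
        return 'length-%d array of %s' % (pred, name)
    if pred == 0:
        return 'scalar ' + name
    # Any other predicate (for example a refinement name) is reported as-is.
    return '%s with predicate %r' % (name, pred)
```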
The next kind of type is the refinement type, or refined type. A refined type is a sub-type of another type but restricts in some way what constitutes a valid element. For example, if we first take all integers, the set of all positive integers is a refinement of the original. Similarly, starting with all possible strings, the set of all strings starting with ‘A’ is a refinement.
In the system here, refined types are given their own unique names (e.g. ‘posint’ and ‘astr’). The type system has a mapping (refined_types) from all refinement type names to the names of their super-type. The user may refer to refinement types simply by their string name. However, the canonical form of a refinement type uses the refinement as the predicate of the super-type in a length-2 tuple, as above:
('int32', 'posint') # refinement of integers to positive ints
('str', 'astr') # refinement of strings to str starting with 'A'
It is these refinement types that give the second index in the tuple its ‘predicate’ name. Additionally, the predicate is used to look up the converter and validation functions when doing code generation or type verification.
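A minimal sketch of this idea, assuming a toy refined_types table and a hypothetical validators table keyed by predicate name. Neither the validators table nor the two helper functions are part of xdress; they only show how the predicate slot can drive a lookup.

```python
# Toy tables for illustration; the real mappings live in the type system.
refined_types = {'posint': 'int32', 'astr': 'str'}

# Hypothetical validation functions, looked up by predicate name.
validators = {
    'posint': lambda x: x > 0,
    'astr': lambda s: s.startswith('A'),
}


def canon_refined(name):
    """Return (super-type, refinement) for a refinement name, else the name."""
    if name in refined_types:
        return (refined_types[name], name)
    return name


def is_valid(value, name):
    """Check a value against the validator selected by the predicate."""
    t = canon_refined(name)
    if isinstance(t, tuple):
        supertype, predicate = t
        return validators[predicate](value)
    return True  # plain base types carry no extra restriction in this sketch
```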
The last kind of types are known as dependent types or template types, similar in concept to C++ template classes. These are meta-types whose instantiation requires one or more parameters to be filled in by further values or types. Dependent types may nest with themselves or other dependent types. Fully qualifying a template type requires the resolution of all dependencies.
Classic examples of dependent types include the C++ template classes. These take other types as their dependencies. Other cases may require only values as their dependencies. For example, suppose we want to restrict integers to various ranges. Rather than creating a refinement type for every combination of integer bounds, we can use a single ‘intrange’ type that defines high and low dependencies.
The template_types mapping takes the dependent type names (e.g. ‘map’) to a tuple of their dependency names (‘key’, ‘value’). The refined_types mapping also accepts keys that are tuples of the following form:
('<type name>', '<dep0-name>', ('dep1-name', 'dep1-type'), ...)
Note that template names may be reused as types of other template parameters:
('name', 'dep0-name', ('dep1-name', 'dep0-name'))
As we have seen, dependent types may either be base types (when based off of template classes), refined types, or both. Their canonical form thus follows the rules above with some additional syntax. The first element of the tuple is still the type name and the last element is still the predicate (default 0). However the type tuples now have a length equal to 2 plus the number of dependencies. These dependencies are placed between the name and the predicate: ('<name>', <dep0>, ..., <predicate>). These dependencies, of course, may be other type names or tuples! Let’s see some examples.
In the simplest case, take analogies to C++ template classes:
('set', 'complex128', 0)
('map', 'int32', 'float64', 0)
('map', ('int32', 'posint'), 'float64', 0)
('map', ('int32', 'posint'), ('set', 'complex128', 0), 0)
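The length rule (two plus the number of dependencies) can be sketched as follows. The table below is a toy subset, apply_template is a hypothetical helper, and the sketch only appends the default predicate; it does not canonicalize the dependencies themselves.

```python
# Toy subset of a template_types table: type name -> dependency names.
template_types = {
    'map': ('key_type', 'value_type'),
    'set': ('value_type',),
}


def apply_template(t):
    """Append the default predicate 0 when a template tuple omits it."""
    name = t[0]
    ndeps = len(template_types[name])
    if len(t) == ndeps + 1:
        # Name plus dependencies only: add the scalar predicate.
        return t + (0,)
    if len(t) == ndeps + 2:
        # Already has the form (name, dep0, ..., predicate).
        return t
    raise ValueError("%r expects %d dependencies" % (name, ndeps))
```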
Now consider the intrange type from above. This has the following definition and canonical form:
refined_types = {('intrange', ('low', 'int32'), ('high', 'int32')): 'int32'}
# range from 1 -> 2
('int32', ('intrange', ('low', 'int32', 1), ('high', 'int32', 2)))
# range from -42 -> 42
('int32', ('intrange', ('low', 'int32', -42), ('high', 'int32', 42)))
Note that the low and high dependencies here are length three tuples of the form ('<dep-name>', <dep-type>, <dep-value>). How the dependency values end up being used is solely at the discretion of the implementation. These values may be anything, though they are most useful when they are easily convertible into strings in the target language.
Warning
Do not confuse length-3 dependency tuples with length-3 type tuples! The last element here is a value, not a predicate.
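To make the shape of these dependency tuples concrete, the sketch below fills in the values of a value-dependent refinement from a shorthand tuple. The helper expand_refined is hypothetical and handles only dependencies declared as ('<dep-name>', '<dep-type>') pairs, as in the intrange definition above.

```python
# The intrange definition from above.
refined_types = {('intrange', ('low', 'int32'), ('high', 'int32')): 'int32'}


def expand_refined(t):
    """Expand (name, value0, value1, ...) into the canonical refined form."""
    name, values = t[0], t[1:]
    for key, supertype in refined_types.items():
        if isinstance(key, tuple) and key[0] == name:
            deps = key[1:]
            if len(deps) != len(values):
                raise ValueError("%r expects %d values" % (name, len(deps)))
            # Each dependency becomes (dep-name, dep-type, dep-value);
            # the last element is a value, not a predicate.
            filled = tuple((dname, dtype, value)
                           for (dname, dtype), value in zip(deps, values))
            return (supertype, (name,) + filled)
    raise KeyError(name)
```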
Next, consider a ‘range’ type which behaves similarly to ‘intrange’ except that it also accepts the type as dependency. This has the following definition and canonical form:
refined_types = {('range', 'vtype', ('low', 'vtype'), ('high', 'vtype')): 'vtype'}
# integer range from 1 -> 2
('int32', ('range', 'int32', ('low', 'int32', 1), ('high', 'int32', 2)))
# positive integer range from 42 -> 65
(('int32', 'posint'), ('range', ('int32', 'posint'),
('low', ('int32', 'posint'), 42),
('high', ('int32', 'posint'), 65)))
The canonical forms for types contain all the information needed to fully describe different kinds of types. However, as human-facing code, they can be exceedingly verbose. Therefore there are a number of shorthand techniques that may also be used to denote the various types. Converting from these shorthands to the fully expanded version may be done via the canon(t) function. This function takes a single type and returns the canonical form of that type. The following are operations that canon() performs:
Base types are returned as their name:
canon('str') == 'str'
Aliases are resolved:
canon('f4') == 'float32'
Expands length-1 tuples to scalar predicates:
t = ('int32',)
canon(t) == ('int32', 0)
Determines the super-type of refinements:
canon('posint') == ('int32', 'posint')
Applies templates:
t = ('set', 'float')
canon(t) == ('set', 'float64', 0)
Applies dependencies:
t = ('intrange', 1, 2)
canon(t) == ('int32', ('intrange', ('low', 'int32', 1), ('high', 'int32', 2)))
t = ('range', 'int32', 1, 2)
canon(t) == ('int32', ('range', 'int32', ('low', 'int32', 1), ('high', 'int32', 2)))
Performs all of the above recursively:
t = (('map', 'posint', ('set', ('intrange', 1, 2))),)
canon(t) == (('map',
('int32', 'posint'),
('set', ('int32',
('intrange', ('low', 'int32', 1), ('high', 'int32', 2))), 0)), 0)
These shorthands are thus far more useful and intuitive than the canonical form described above. It is therefore recommended that users and developers write code that uses the shorter versions. Note that canon() is guaranteed to return only strings, tuples, and integers, which makes the output of this function hashable.
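Because a canonical form is built only from strings, integers, and tuples, it can be used directly as a dictionary key or set member. The sketch below checks that property and uses two canonical forms, copied from the examples above, as keys. The helper name and the converter names stored as values are made up for illustration.

```python
def only_hashable_parts(t):
    """Return True if t is built solely from str, int, and nested tuples."""
    if isinstance(t, (str, int)):
        return True
    if isinstance(t, tuple):
        return all(only_hashable_parts(x) for x in t)
    return False


# Canonical forms taken from the examples above.
posint = ('int32', 'posint')
nested = ('map', ('int32', 'posint'), ('set', 'complex128', 0), 0)

# Hypothetical registry keyed by canonical form.
registry = {posint: 'convert_posint', nested: 'convert_map_posint_set'}

# An equal tuple built elsewhere finds the same entry.
found = registry[('map', ('int32', 'posint'), ('set', 'complex128', 0), 0)]
```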
Template type definitions that come stock with xdress:
template_types = {
'map': ('key_type', 'value_type'),
'dict': ('key_type', 'value_type'),
'pair': ('key_type', 'value_type'),
'set': ('value_type',),
'list': ('value_type',),
'tuple': ('value_type',),
'vector': ('value_type',),
}
Refined type definitions that come stock with xdress:
refined_types = {
'nucid': 'int32',
'nucname': 'str',
('enum', ('name', 'str'), ('aliases', ('dict', 'str', 'int32', 0))): 'int32',
('function', ('arguments', ('list', ('pair', 'str', 'type'))), ('returns', 'type')): 'void',
('function_pointer', ('arguments', ('list', ('pair', 'str', 'type'))), ('returns', 'type')): ('void', '*'),
}
Holistically, the following classes are important to the type system:
A class that is used for checking whether a type matches a given pattern.
Parameters:
pattern : nested tuples, str, int
A class representing a type system.
Parameters:
base_types : set of str, optional
template_types : dict, optional
refined_types : dict, optional
humannames : dict, optional
extra_types : str, optional
dtypes : str, optional
stlcontainers : str, optional
argument_kinds : dict, optional
variable_namespace : dict, optional
type_aliases : dict, optional
cpp_types : dict, optional
numpy_types : dict, optional
from_pytypes : dict, optional
cython_ctypes : dict, optional
cython_cytypes : dict, optional
cython_pytypes : dict, optional
cython_cimports : dict, optional
cython_cyimports : dict, optional
cython_pyimports : dict, optional
cython_functionnames : dict, optional
cython_classnames : dict, optional
cython_c2py_conv : dict, optional
cython_py2c_conv : dict, optional
typestring : typestr or None, optional
Turns the type into its canonical form. See module docs for more information.
This returns a name for a function based on its name, rather than its type. The name may be either a string or a tuple of the form (‘name’, template_arg1, template_arg2, ...). The argkinds argument here refers only to the template arguments, not the function signature default arguments. This is not meant to replace cpp_type(), but complement it.
Given a variable name and type, returns cython code (declaration, body, and return statements) to convert the variable from C/C++ to Python.
Helps find the appropriate c2py value for a given concrete type key.
Returns the cimport lines associated with a type or a set of seen tuples.
Given a type t, and possibly previously seen cimport tuples (set), return the set of all seen cimport tuples. These tuples have four possible interpretations based on the length and values:
Given a type t, returns the corresponding Cython C/C++ type declaration.
This returns a name for a function based on its name, rather than its type. The name may be either a string or a tuple of the form (‘name’, template_arg1, template_arg2, ...). The argkinds argument here refers only to the template arguments, not the function signature default arguments. This is not meant to replace cython_functionname(), but complement it.
Computes variable or function names for cython types.
Returns the import lines associated with a type or a set of seen tuples.
Given a type t, and possibly previously seen import tuples (set), return the set of all seen import tuples. These tuples have four possible interpretations based on the length and values:
Any of these may be used.
Given a type t, returns the corresponding numpy type. If depth is greater than 0, then this returns a list of numpy types for all internal template types, i.e. the float in (‘vector’, ‘float’, 0).
Given a variable name and type, returns cython code (declaration, body, and return statement) to convert the variable from Python to C/C++.
Computes variable or function names for cython types.
Deletes a single key from a method on this type system instance.
Removes a type and its argument kind tuple from the type system.
This function will remove a previously registered class from the type system.
This function will remove a previously registered refinement from the type system.
This function will remove previously registered template specialization.
Saves a type system out to disk.
Parameters:
filename : str
format : str, optional
mode : str, optional
Loads a type system from disk into a new type system instance. This is a class method.
Parameters:
filename : str
format : str, optional
mode : str, optional
A context manager for making sure the given classes are local.
Registers an argument kind tuple into the type system for a template type.
Classes are user specified types. This function will add a class to the type system so that it may be used normally with the rest of the type system.
Registers a class with the type system from only its name, and relevant header file information.
Parameters:
classname : str or tuple
package : str
pxd_base : str
cpppxd_base : str
cpp_classname : str or tuple, optional
make_dtypes : bool, optional
This function will add a type to the system as numpy dtype that lives in the dtypes module.
This function will add a refinement to the type system so that it may be used normally with the rest of the type system.
This function will add a template specialization so that it may be used normally with the rest of the type system.
Registers a variable and its namespace in the type system.
A context manager for temporarily swapping out the dtypes value with a new value and replacing the original value before exiting.
A context manager for temporarily swapping out the stlcontainer value with a new value and replacing the original value before exiting.
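The swap-and-restore behavior described for these context managers follows a common Python pattern, sketched below with contextlib. The class ToySystem and its swap_dtypes method are hypothetical stand-ins, not the xdress API; the point is only that the original value is restored even if the body raises.

```python
from contextlib import contextmanager


class ToySystem(object):
    """Hypothetical stand-in holding a single swappable attribute."""

    def __init__(self, dtypes='dtypes'):
        self.dtypes = dtypes

    @contextmanager
    def swap_dtypes(self, new):
        """Temporarily replace self.dtypes, restoring it on exit."""
        old = self.dtypes
        self.dtypes = new
        try:
            yield self
        finally:
            # Runs on normal exit and when the body raises.
            self.dtypes = old


ts = ToySystem()
with ts.swap_dtypes('my_dtypes'):
    inside = ts.dtypes   # the temporary value is visible here
after = ts.dtypes        # the original value is back
```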
Updates the type system in-place. Only updates the data attributes named in ‘datafields’. This may be called with any of the following signatures:
ts.update(<TypeSystem>)
ts.update(<dict-like>)
ts.update(key1=value1, key2=value2, ...)
Valid keyword arguments are the same here as for the type system constructor. See this documentation for more detail.
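One way to support all three call signatures is sketched below. The class ToyTypeSystem is a hypothetical stand-in with only two data fields, and the choice to merge new values into the existing ones (rather than replace them) is an assumption of this example, not a statement about xdress.

```python
class ToyTypeSystem(object):
    """Hypothetical stand-in for a type system; not the xdress class."""

    # Names of the data attributes that update() is allowed to touch.
    datafields = ('base_types', 'type_aliases')

    def __init__(self, base_types=None, type_aliases=None):
        self.base_types = set(base_types or ())
        self.type_aliases = dict(type_aliases or {})

    def update(self, *args, **kwargs):
        """Accept another instance, a dict-like object, or keyword arguments."""
        if len(args) > 1:
            raise TypeError("update() takes at most one positional argument")
        new = {}
        if args:
            other = args[0]
            if isinstance(other, ToyTypeSystem):
                new = {f: getattr(other, f) for f in self.datafields}
            else:
                new = dict(other)
        new.update(kwargs)
        for field in self.datafields:
            if field in new:
                # set.update and dict.update both merge in place.
                getattr(self, field).update(new[field])


ts = ToyTypeSystem(base_types={'int32'})
ts.update(ToyTypeSystem(base_types={'str'}))
ts.update({'type_aliases': {'i': 'int32'}})
ts.update(base_types={'float64'}, ignored_field=1)  # unknown names are skipped
```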
This is a class whose attributes are properties that expose various string representations of a type. This is useful for the Python string formatting mini-language where attributes of an object may be accessed. For example:
"This is the Cython C/C++ type: {t.cython_ctype}".format(t=typestr(t, ts))
This mechanism is used for accessing type information in conversion strings.
Parameters:
t : str or tuple
ts : TypeSystem
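The attribute-access mechanism can be tried out without xdress by using a small stand-in class. ToyTypeStr and its lookup table are hypothetical; the real typestr takes a TypeSystem and computes its strings from it, whereas this sketch reads them from a plain dict.

```python
class ToyTypeStr(object):
    """Hypothetical stand-in exposing a string form of a type as a property."""

    def __init__(self, t, ctypes):
        self.t = t
        self._ctypes = ctypes  # assumed lookup table: type name -> C type

    @property
    def cython_ctype(self):
        # Evaluated lazily, when str.format() reaches {t.cython_ctype}.
        return self._ctypes[self.t]


template = "This is the Cython C/C++ type: {t.cython_ctype}"
msg = template.format(t=ToyTypeStr('int32', {'int32': 'int'}))
```

Since the property is only evaluated when the format string references it, a conversion string pays only for the representations it actually uses.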
The Cython C/C++ representation of the NumPy type without predicates.
The Cython C/C++ representation of the NumPy types without predicates.
The Cython Cython representation of the NumPy type without predicates.
The Cython Cython representation of the NumPy types without predicates.
The Cython Python representation of the NumPy type without predicates.