Automatic Descriptions

This module creates descriptions of C/C++ classes, functions, and variables from source code, by using external parsers (GCC-XML, Clang AST) and the type system.

This module is available as an xdress plugin by the name xdress.autodescribe.
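
As a rough illustration, a run control file might enable this plugin by name. This is only a sketch: it assumes the run control file is a Python file that exposes a `plugins` sequence of module names, and the package name `mypack` is a placeholder, not something taken from this module.

```python
# Hypothetical run control fragment; 'mypack' is a placeholder package name.
package = 'mypack'

# Assumption: plugins are listed by their importable module names.
plugins = ('xdress.autodescribe',)
```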

author:Anthony Scopatz <scopatz@gmail.com>

Descriptions

A key component of API wrapper generation is having a top-level, abstract representation of the software that is being wrapped. In C++ there are three basic constructs which may be wrapped: variables, functions, and classes.

The abstract representation of a C++ class is known as a description (abbr. desc). This description is simply a Python dictionary with a specific structure. This structure makes heavy use of the type system to declare the types of all needed parameters.

The Name Key

The name key is a dictionary that represents the API name of the element being described. It contains exactly the same keys as the fields of the utils.apiname() type. While apiname is used for user input and validation, the values here must truly describe the API element. The following keys, and only the following keys, are allowed in the name dictionary.

srcname:str or tuple, the element’s API name in the original source code, e.g. MyClass.
srcfiles:tuple of str, this is a sequence of unique strings which represents the file paths where the API element may be defined. For example, (‘myfile.c’, ‘myfile.h’). If the element is defined outside of these files, then the automatic discovery or description may fail. Since these files are parsed they must actually exist on the filesystem.
tarbase:str, the base portion of all automatically generated (target) files. This does not include the directory or the file extension. For example, if you wanted cythongen to create a file named ‘mynewfile.pyx’ then the value here would be simply ‘mynewfile’.
tarname:str or tuple, the element’s API name in the automatically generated (target) files, e.g. MyNewClass.
incfiles:tuple of str, this is a sequence of all files which must be #include’d to access the srcname at compile time. This should be as small a set as possible, preferably only one file. For example, ‘hdf5.h’.
sidecars:tuple of str, this is a sequence of all sidecar files to use for this API element. Like srcfiles, these files must exist for xdress to run. For example, ‘myfile.py’.
language:str, flag for the language that the srcfiles are implemented in. Valid options are: ‘c’, ‘c++’, ‘f’, ‘fortran’, ‘f77’, ‘f90’, ‘python’, and ‘cython’.
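
The key list above can be sketched as a plain dictionary plus a small check. The helper `check_name` and the example values are hypothetical and only illustrate the "only these keys" rule; this is not how xdress itself validates names.

```python
# Keys allowed in a name dictionary, as listed above.
NAME_KEYS = frozenset([
    'srcname', 'srcfiles', 'tarbase', 'tarname',
    'incfiles', 'sidecars', 'language',
])


def check_name(name):
    """Hypothetical helper: reject any key that is not in NAME_KEYS."""
    unknown = set(name) - NAME_KEYS
    if unknown:
        raise ValueError('unknown name keys: %s' % sorted(unknown))
    return name


# Illustrative values only; the files need not exist for this check.
name = check_name({
    'srcname': 'MyClass',
    'srcfiles': ('myfile.cpp', 'myfile.h'),
    'tarbase': 'mynewfile',
    'tarname': 'MyNewClass',
    'incfiles': ('myfile.h',),
    'sidecars': ('myfile.py',),
    'language': 'c++',
})
```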

Variable Description Top-Level Keys

The following are valid top-level keys in a variable description dictionary: name, namespace, type, docstring, and extra.

name:dict, the variable name, see above
namespace:str or None, the namespace or module the variable lives in.
type:str or tuple, the type of the variable
docstring:str, optional, this is a documentation string for the variable.
extra:dict, optional, this stores arbitrary metadata that may be used with different backends. It is not added by any auto-describe routine but may be inserted later if needed. One example use case is that the Cython generation looks for the pyx, pxd, and cpppxd keys for strings of supplemental Cython code to insert directly into the wrapper.
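
A variable description following the keys above might look like the sketch below. The variable `default_rate`, its namespace, and its file names are made up for illustration.

```python
# Top-level keys allowed in a variable description, as listed above.
VAR_KEYS = ('name', 'namespace', 'type', 'docstring', 'extra')

# Hypothetical description of a C++ global `double default_rate`.
var_desc = {
    'name': {
        'srcname': 'default_rate',
        'srcfiles': ('src/toaster.h', 'src/toaster.cpp'),
        'tarbase': 'toaster',
        'tarname': 'default_rate',
        'incfiles': ('toaster.h',),
        'sidecars': (),
        'language': 'c++',
        },
    'namespace': 'bright',
    'type': 'float64',
    'docstring': 'Default toast rate shared by all toasters.',
    }

# Any key outside VAR_KEYS would show up here.
unknown = sorted(set(var_desc) - set(VAR_KEYS))
```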

Function Description Top-Level Keys

The following are valid top-level keys in a function description dictionary: name, namespace, signatures, docstring, and extra.

name:

dict, the function name, see above

namespace:

str or None, the namespace or module the function lives in.

signatures:

dict or dict-like, the keys of this dictionary are function call signatures and the values are dicts of non-signature information. The signatures themselves are tuples. The first element of each tuple is the function name. The remaining elements (if any) are the function arguments. Arguments are themselves length-2 tuples whose first element is the argument name and whose second element is the argument type. The values are themselves dicts with the following keys:

return:the return type of this function. Unlike class constructors and destructors, the return type may not be None (only ‘void’ values are allowed).
defaults:a length-N tuple of length-2 tuples of the default argument kinds and values. N must be the number of arguments in the signature. In the length-2 tuples, the first element must be a member of the utils.Arg enum and the second element is the associated default value. If no default argument exists use utils.Args.NONE as the kind and by convention set the value to None, though this should be ignored in all cases.
docstring:

str, optional, this is a documentation string for the function.

extra:

dict, optional, this stores arbitrary metadata that may be used with different backends. It is not added by any auto-describe routine but may be inserted later if needed. One example use case is that the Cython generation looks for the pyx, pxd, and cpppxd keys for strings of supplemental Cython code to insert directly into the wrapper.
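
The signature rules above can be made concrete with a small sketch. The `Args` class here is a local stand-in for the default-argument kind enum, defined only so the example runs on its own; the function `toast_time` and the helper `check_signatures` are likewise hypothetical.

```python
from enum import Enum


class Args(Enum):
    """Stand-in for the default-argument kind enum; not the xdress object."""
    NONE = 0
    LIT = 1


func_desc = {
    'name': {
        'srcname': 'toast_time',
        'srcfiles': ('src/toaster.h', 'src/toaster.cpp'),
        'tarbase': 'toaster',
        'tarname': 'toast_time',
        'incfiles': ('toaster.h',),
        'sidecars': (),
        'language': 'c++',
        },
    'namespace': 'bright',
    'signatures': {
        # (function name, (argument name, argument type), ...)
        ('toast_time', ('n_slices', 'int32'), ('rate', 'float64')): {
            'return': 'float64',
            # one (kind, value) pair per argument
            'defaults': ((Args.NONE, None), (Args.LIT, 1.0)),
            },
        },
    'docstring': 'Estimates how long the toast will take.',
    }


def check_signatures(desc):
    """Hypothetical check of the two rules stated above."""
    for sig, info in desc['signatures'].items():
        nargs = len(sig) - 1  # the first element is the function name
        if info['return'] is None:
            raise ValueError('%s: return type may not be None' % sig[0])
        if len(info['defaults']) != nargs:
            raise ValueError('%s: expected %d defaults' % (sig[0], nargs))
    return True


ok = check_signatures(func_desc)
```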

Class Description Top-Level Keys

The following are valid top-level keys in a class description dictionary: name, parents, namespace, attrs, methods, construct, docstrings, and extra.

name:

dict, the class name, see above

parents:

possibly empty list of strings, the immediate parents of the class (not grandparents).

namespace:

str or None, the namespace or module the class lives in.

attrs:

dict or dict-like, the names of the attributes (member variables) of the class mapped to their types, given in the format of the type system.

methods:

dict or dict-like, similar to the attrs except that the keys are now function signatures and the values are dicts of non-signature information. The signatures themselves are tuples. The first element of each tuple is the method name. The remaining elements (if any) are the function arguments. Arguments are themselves length-2 tuples whose first element is the argument name and whose second element is the argument type. The values are themselves dicts with the following keys:

return:the return type of this function. If the return type is None (as opposed to ‘void’), then this method is assumed to be a constructor or destructor.
defaults:a length-N tuple of length-2 tuples of the default argument kinds and values. N must be the number of arguments in the signature. In the length-2 tuples, the first element must be a member of the utils.Arg enum and the second element is the associated default value. If no default argument exists use utils.Args.NONE as the kind and by convention set the value to None, though this should be ignored in all cases.
construct:

str, optional, this is a flag for how the class is implemented. Accepted values are ‘class’ and ‘struct’. If this is not present, then ‘class’ is assumed. This is most useful for wrapping C structs as Python classes.

docstrings:

dict, optional, this dictionary is meant for storing documentation strings. All values are thus either strings or dictionaries of strings. Valid keys include: class, attrs, and methods. The attrs and methods keys are dictionaries which may include keys that mirror the top-level keys of the same name.

extra:

dict, optional, this stores arbitrary metadata that may be used with different backends. It is not added by any auto-describe routine but may be inserted later if needed. One example use case is that the Cython generation looks for the pyx, pxd, and cpppxd keys for strings of supplemental Cython code to insert directly into the wrapper.

Toaster Example

Suppose we have a C++ class called Toaster that takes bread and makes delicious toast. A valid description dictionary for this class would be as follows:

class_desc = {
    'name': {
        'language': 'c++',
        'incfiles': ('toaster.h',),
        'srcfiles': ('src/toaster.h', 'src/toaster.cpp'),
        'srcname': 'Toaster',
        'sidecars': ('src/toaster.py',),
        'tarbase': 'toaster',
        'tarname': 'Toaster',
        },
    'parents': ['FCComp'],
    'namespace': 'bright',
    'construct': 'class',
    'attrs': {
        'n_slices': 'int32',
        'rate': 'float64',
        'toastiness': 'str',
        },
    'methods': {
        ('Toaster',): {'return': None, 'defaults': ()},
        ('Toaster', ('name', 'str')): {'return': None,
            'defaults': ((Args.LIT, ""),)},
        ('Toaster', ('paramtrack', ('set', 'str')), ('name', 'str', '""')): {
            'return': None,
            'defaults': ((Args.NONE, None), (Args.LIT, ""))},
        ('~Toaster',): {'return': None, 'defaults': ()},
        ('tostring',): {'return': 'str', 'defaults': ()},
        ('calc',): {'return': 'Material', 'defaults': ()},
        ('calc', ('incomp', ('map', 'int32', 'float64'))): {
            'return': 'Material',
            'defaults': ((Args.NONE, None),)},
        ('calc', ('mat', 'Material')): {
            'return': 'Material',
            'defaults': ((Args.NONE, None),)},
        ('write', ('filename', 'str')): {
            'return': 'void',
            'defaults': ((Args.LIT, "toaster.txt"),)},
        ('write', ('filename', ('char', '*'), '"toaster.txt"')): {
            'return': 'void',
            'defaults': ((Args.LIT, "toaster.txt"),)},
        },
    'docstrings': {
        'class': "I am a toaster!",
        'attrs': {
            'n_slices': 'the number of slices',
            'rate': 'the toast rate',
            'toastiness': 'the toastiness level',
            },
        'methods': {
            'Toaster': "Make me a toaster!",
            '~Toaster': "Noooooo",
            'tostring': "string representation of the toaster",
            'calc': "actually makes the toast.",
            'write': "persists the toaster state."
            },
        },
    'extra': {
        'pyx': 'toaster = Toaster()  # make toaster singleton'
        },
    }
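
To show how a consumer might read the methods key, the sketch below groups overloads by method name and picks out constructors and destructors using the rule that their return type is None. The `methods` dictionary is a trimmed copy of the Toaster one that keeps only the 'return' key for brevity, and `summarize` is a hypothetical helper, not part of xdress.

```python
from collections import defaultdict

# Trimmed-down methods dict; only the 'return' key is kept here.
methods = {
    ('Toaster',): {'return': None},
    ('~Toaster',): {'return': None},
    ('tostring',): {'return': 'str'},
    ('calc',): {'return': 'Material'},
    ('calc', ('mat', 'Material')): {'return': 'Material'},
}


def summarize(methods):
    """Return (overloads, special) for a methods dictionary.

    overloads maps each method name to a list of argument-type tuples;
    special is the set of names whose return type is None.
    """
    overloads = defaultdict(list)
    special = set()
    for sig, info in methods.items():
        mname, args = sig[0], sig[1:]
        # arg[1] is the argument type; arg[0] is the argument name
        overloads[mname].append(tuple(arg[1] for arg in args))
        if info['return'] is None:
            special.add(mname)
    return dict(overloads), special


overloads, special = summarize(methods)
```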

Automatic Description Generation

The purpose of this module is to create description dictionaries like those above by automatically parsing C++ classes. In theory this parsing step may be handled by visiting any syntax tree of C++ code. Two options were pursued here: GCC-XML and the Python bindings to the Clang AST. Unfortunately, the Clang AST bindings lack exposure for template argument types. These are needed to use any standard library containers. Thus while the Clang method was pursued to a mostly working state, the GCC-XML version is the only fully functional automatic describer for the moment.

Automatic Descriptions API

class xdress.autodescribe.GccxmlBaseDescriber(name, root=None, onlyin=None, ts=None, verbose=False)[source]

Base class used to generate descriptions via GCC-XML output. Sub-classes need only implement a visit() method and optionally a constructor. The default visitor methods are valid for classes.

Parameters :

name : str

The name to describe.

root : element tree node, optional

The root element node.

onlyin : str, optional

Filename the class or struct described must live in. Prevents finding elements of the same name coming from other libraries.

ts : TypeSystem, optional

A type system instance.

verbose : bool, optional

Flag to display extra information while visiting.

context(id)[source]

Resolves the context from its id and information in the element tree.

type(id)[source]

Resolves the type from its id and information in the root element tree.

visit_argument(node)[source]

visits a constructor, destructor, or method argument.

visit_arraytype(node)[source]

visits an array type and maps it to a ‘*’ refinement type.

visit_base(node)[source]

visits a base class.

visit_class(node)[source]

visits a class or struct.

visit_constructor(node)[source]

visits a class constructor.

visit_cvqualifiedtype(node)[source]

visits constant, volatile, and restricted types and maps them to ‘const’, ‘volatile’, and ‘restrict’ refinement types.

visit_destructor(node)[source]

visits a class destructor.

visit_field(node)[source]

visits a member variable.

visit_function(node)[source]

visits a non-member function.

visit_functiontype(node)[source]

visits a function type and returns a ‘function’ dependent refinement type.

visit_fundamentaltype(node)[source]

visits a base C++ type, mapping it to the appropriate type in the type system.

visit_method(node)[source]

visits a member function.

visit_namespace(node)[source]

visits the namespace that a node is defined in.

visit_pointertype(node)[source]

visits a pointer and maps it to a ‘*’ refinement type.

visit_referencetype(node)[source]

visits a reference and maps it to a ‘&’ refinement type.

visit_struct(node)

visits a class or struct.

visit_typedef(node)[source]

visits a type definition anywhere.
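
The visitor layout above can be illustrated with a minimal, self-contained sketch. It assumes that nodes are dispatched to a `visit_<tag>` method by lower-cased tag name and that types are resolved by looking up ids in the element tree; the XML is a hand-written, heavily simplified stand-in for GCC-XML output, and `MiniDescriber` and its type mapping are hypothetical, not the xdress implementation.

```python
import xml.etree.ElementTree as ET

# Hand-written, simplified stand-in for GCC-XML output.
XML = """
<GCC_XML>
  <Class id="_1" name="Toaster" members="_2 _3"/>
  <Field id="_2" name="n_slices" type="_4"/>
  <Field id="_3" name="rate" type="_5"/>
  <FundamentalType id="_4" name="int"/>
  <FundamentalType id="_5" name="double"/>
</GCC_XML>
"""


class MiniDescriber(object):
    # Assumed mapping from C++ names to type-system names.
    BASE_TYPES = {'int': 'int32', 'double': 'float64'}

    def __init__(self, name, root):
        self.name = name
        self.root = root
        self.attrs = {}
        # Index every top-level node by its id for later lookups.
        self._byid = dict((node.get('id'), node) for node in root)

    def visit(self, node=None):
        """Dispatch to visit_<tag>; start at the named class if node is None."""
        if node is None:
            node = self.root.find("Class[@name='%s']" % self.name)
            if node is None:
                raise ValueError('class %r not found' % self.name)
        visitor = getattr(self, 'visit_' + node.tag.lower(), None)
        return None if visitor is None else visitor(node)

    def type(self, id):
        """Resolve a type from its id by visiting the referenced node."""
        return self.visit(self._byid[id])

    def visit_class(self, node):
        for member_id in node.get('members', '').split():
            self.visit(self._byid[member_id])

    def visit_field(self, node):
        self.attrs[node.get('name')] = self.type(node.get('type'))

    def visit_fundamentaltype(self, node):
        return self.BASE_TYPES[node.get('name')]


describer = MiniDescriber('Toaster', ET.fromstring(XML.strip()))
describer.visit()
print(describer.attrs)
```

Keeping one method per node tag means that supporting a new kind of node only requires adding another `visit_*` method, which matches the note that sub-classes mainly need to supply visit().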

class xdress.autodescribe.GccxmlClassDescriber(name, root=None, onlyin=None, ts=None, verbose=False)[source]

Class used to generate class descriptions via GCC-XML output.

Parameters :

name : str

The class name. This may not be None.

root : element tree node, optional

The root element node of the class or struct to describe.

onlyin : str, optional

Filename the class or struct described must live in. Prevents finding classes of the same name coming from other libraries.

ts : TypeSystem, optional

A type system instance.

verbose : bool, optional

Flag to display extra information while visiting the class.

visit(node=None)[source]

Visits the class node and all sub-nodes, generating the description dictionary as it goes.

Parameters :

node : element tree node, optional

The element tree node to start from. If this is None, then the top-level class node is found and visited.

class xdress.autodescribe.GccxmlFuncDescriber(name, root=None, onlyin=None, ts=None, verbose=False)[source]

Class used to generate function descriptions via GCC-XML output.

Parameters :

name : str

The function name. This may not be None.

root : element tree node, optional

The root element node of the function to describe.

onlyin : str, optional

Filename the function described must live in. Prevents finding functions of the same name coming from other libraries.

ts : TypeSystem, optional

A type system instance.

verbose : bool, optional

Flag to display extra information while visiting the function.

visit(node=None)[source]

Visits the function node and all sub-nodes, generating the description dictionary as it goes.

Parameters :

node : element tree node, optional

The element tree node to start from. If this is None, then the top-level function node is found and visited.

class xdress.autodescribe.GccxmlVarDescriber(name, root=None, onlyin=None, ts=None, verbose=False)[source]

Class used to generate variable descriptions via GCC-XML output.

Parameters :

name : str

The variable name. This may not be None.

root : element tree node, optional

The root element node of the variable to describe.

onlyin : str, optional

Filename the variable described must live in. Prevents finding variables of the same name coming from other libraries.

ts : TypeSystem, optional

A type system instance.

verbose : bool, optional

Flag to display extra information while visiting the variable.

visit(node=None)[source]

Visits the variable node and all sub-nodes, generating the description dictionary as it goes.

Parameters :

node : element tree node, optional

The element tree node to start from. If this is None, then the top-level variable node is found and visited.

class xdress.autodescribe.XDressPlugin[source]

This plugin creates automatic description dictionaries of all source and target files.

adddesc2env(desc, env, name)[source]

Adds a description to environment.

compute_classes(rc)[source]

Computes class descriptions and loads them into the environment.

compute_desc(name, kind, rc)[source]

Returns a description dictionary for a class, function, or variable implemented in a source file and bound into a target file.

Parameters :

name : apiname

API element name to describe.

kind : str

The kind of type to describe, valid flags are ‘class’, ‘func’, and ‘var’.

rc : xdress.utils.RunControl

Run controller for this xdress execution.

Returns :

desc : dict

Description dictionary.

compute_functions(rc)[source]

Computes function descriptions and loads them into the environment.

compute_variables(rc)[source]

Computes variable descriptions and loads them into the environment.

defaultrc()[source]

This plugin adds the env dictionary to the rc.

load_pysrcmod(sidecar, rc)[source]

Loads a module dictionary from a sidecar file into the pysrcenv cache.

load_sidecars(rc)[source]

Loads all sidecar files.

rcdocs()[source]

This plugin adds the env dictionary to the rc.

register_classes(rc)[source]

Registers classes with the type system. This can and should be done before trying to describe the class.

setup(rc)[source]

Expands variables, functions, and classes in the rc based on copying src filenames to tar filename.

xdress.autodescribe.clang_describe(filename, name, kind, includes=(), defines=('XDRESS', ), undefines=(), extra_parser_args=(), ts=None, verbose=False, debug=False, builddir=None, onlyin=None, language='c++', clang_includes=())[source]

Use Clang to describe the class.

Parameters :

filename : str

The path to the file.

name : str

The name to describe.

kind : str

The kind of type to describe, valid flags are ‘class’, ‘func’, and ‘var’.

includes : list of str, optional

The list of extra include directories to search for header files.

defines : list of str, optional

The list of extra macro definitions to apply.

undefines : list of str, optional

The list of extra macro undefinitions to apply.

extra_parser_args : list of str, optional

Further command line arguments to pass to the parser.

ts : TypeSystem, optional

A type system instance.

verbose : bool, optional

Flag to display extra information while describing the class.

debug : bool, optional

Flag to enable/disable debug mode. Currently ignored.

builddir : str, optional

Ignored. Exists only for compatibility with gccxml_describe.

onlyin : set of str, optional

The paths to the files that the definition is allowed to exist in.

language : str

Valid language flag.

Returns :

desc : dict

A dictionary describing the class which may be used to generate API bindings.

xdress.autodescribe.clang_describe_class(cls)[source]

Describe the class at the given clang AST node

xdress.autodescribe.clang_describe_function(func)[source]

Describe the function at the given clang AST node.

xdress.autodescribe.clang_describe_functions(funcs)[source]

Describe the function at the given clang AST nodes. If more than one node is given, we verify that they match and find argument names where we can.

xdress.autodescribe.clang_describe_template_arg(arg, loc)[source]

Describe a template argument

xdress.autodescribe.clang_describe_template_args(node)[source]

TODO: Broken version handling defaults automatically:

_, defaults = clang_template_arg_info(node.specialized_template)
args = [clang_describe_template_arg(a) for a in node.get_template_args()]
for i in xrange(len(defaults)):
    if defaults[-1-i] == args[-1]:
        args.pop()
return tuple(args)

TODO: Needs a better docstring.

xdress.autodescribe.clang_describe_type(typ, loc)[source]

Describe the type reference at the given cursor

xdress.autodescribe.clang_describe_var(var)[source]

Describe the var at the given clang AST node

xdress.autodescribe.clang_expand_template_args(node, args)[source]

TODO: Broken version handling defaults automatically:

count,defaults = clang_template_arg_info(node)
print('SP %s, COUNT %s, %s, %s'%(node.spelling,count,defaults,args))
if len(args) < count:
    return tuple(args) + defaults[(count-len(args)):]
return tuple(args)+defaults[count-len(args)]

TODO: Needs a better docstring.

xdress.autodescribe.clang_find_class(tu, name, ts, namespace=None, filename=None, onlyin=None)[source]

Find the node for a given class in the given translation unit.

xdress.autodescribe.clang_find_decls(tu, name, kinds, onlyin, namespace=None)[source]

Find all declarations of the given name and kind in the given scopes.

xdress.autodescribe.clang_find_function(tu, name, ts, namespace=None, filename=None, onlyin=None)[source]

Find all nodes corresponding to a given function. If there is a separate declaration and definition, they will be returned as separate nodes, in the order given in the file.

xdress.autodescribe.clang_find_scopes(tu, onlyin, namespace=None)[source]

Find all ‘toplevel’ scopes, optionally restricting to a given namespace

xdress.autodescribe.clang_find_var(tu, name, ts, namespace=None, filename=None, onlyin=None)[source]

Find the node for a given var.

xdress.autodescribe.clang_fix_onlyin(onlyin)[source]

Make sure onlyin is a set and add ./path versions for each relative path
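
The one-line description above could be realized roughly as follows. This is a hypothetical re-implementation written from the docstring alone, assuming the input is either a single path string or an iterable of path strings; the real function may treat other inputs differently.

```python
import os


def fix_onlyin(onlyin):
    """Sketch: return a set of paths plus './path' forms of relative ones."""
    if isinstance(onlyin, str):
        onlyin = [onlyin]  # a lone string is treated as a single path
    paths = set(onlyin)
    dot_prefix = '.' + os.sep
    dotted = set(
        os.path.join('.', p) for p in paths
        if not os.path.isabs(p) and not p.startswith(dot_prefix)
    )
    return paths | dotted


result = fix_onlyin(['src/toaster.h', os.path.abspath('toaster.h')])
```

Adding the ./path variants lets a later membership test succeed whether a parser reports a relative file name with or without the leading dot.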

xdress.autodescribe.clang_range_str(source_range)[source]

Get the text present on a source range.

xdress.autodescribe.clang_template_param_kinds(node)[source]

Find the Arg kind of each template argument of node

xdress.autodescribe.clearmemo()[source]

Clears all function memoizations for autodescribers.

xdress.autodescribe.describe(filename, name=None, kind='class', includes=(), defines=('XDRESS', ), undefines=(), extra_parser_args=(), parsers='gccxml', ts=None, verbose=False, debug=False, builddir='build', language='c++', clang_includes=())[source]

Automatically describes an API element in a file. This is the main entry point.

Parameters :

filename : str or container of strs

The path to the file or a list of file paths. If this is a list of many files, a temporary file will be created that #includes all of the files in this list in order. This temporary file is the one which will be parsed.

name : str

The name to describe.

kind : str, optional

The kind of type to describe, valid flags are ‘class’, ‘func’, and ‘var’.

includes : list of str, optional

The list of extra include directories to search for header files.

defines : list of str, optional

The list of extra macro definitions to apply.

undefines : list of str, optional

The list of extra macro undefinitions to apply.

extra_parser_args : list of str, optional

Further command line arguments to pass to the parser.

parsers : str, list, or dict, optional

The parser / AST to use for the file. Currently ‘clang’, ‘gccxml’, and ‘pycparser’ are supported, though others may be implemented in the future. If this is a string, then this parser is used. If this is a list, this specifies the parser order to use based on availability. If this is a dictionary, it specifies the parser order to use based on language, e.g. {'c': ['pycparser', 'gccxml'], 'c++': ['gccxml', 'pycparser']}.

ts : TypeSystem, optional

A type system instance.

verbose : bool, optional

Flag to display extra information while describing the class.

debug : bool, optional

Flag to enable/disable debug mode.

builddir : str, optional

Location of – often temporary – build files.

language : str

Valid language flag.

clang_includes : list of str, optional

clang-specific include paths.

Returns :

desc : dict

A dictionary describing the class which may be used to generate API bindings.
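
The three accepted forms of the parsers argument can be sketched as a small selection routine. This is an illustration of the rule as described in the parameter list, not the code xdress uses; `pick_parser` and the `available` set are hypothetical names.

```python
def pick_parser(parsers, language, available):
    """Sketch: choose a parser name from a str, list, or dict spec.

    available is the set of parser names assumed to be usable on
    this machine.
    """
    if isinstance(parsers, dict):
        # A dict maps each language to its own parser spec.
        parsers = parsers[language]
    if isinstance(parsers, str):
        # A string names the parser directly.
        return parsers
    for parser in parsers:
        # A list gives the preference order; take the first usable one.
        if parser in available:
            return parser
    raise RuntimeError('no parser available for %r' % language)


by_language = {'c': ['pycparser', 'gccxml'],
               'c++': ['gccxml', 'pycparser']}
chosen = pick_parser(by_language, 'c', available={'gccxml'})
```

In this run pycparser is preferred for C but is not in the available set, so the sketch falls through to gccxml.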

xdress.autodescribe.gccxml_describe(filename, name, kind, includes=(), defines=('XDRESS', ), undefines=(), extra_parser_args=(), ts=None, verbose=False, debug=False, builddir='build', onlyin=None, language='c++', clang_includes=())[source]

Use GCC-XML to describe the class.

Parameters :

filename : str

The path to the file.

name : str

The name to describe.

kind : str

The kind of type to describe, valid flags are ‘class’, ‘func’, and ‘var’.

includes : list of str, optional

The list of extra include directories to search for header files.

defines : list of str, optional

The list of extra macro definitions to apply.

undefines : list of str, optional

The list of extra macro undefinitions to apply.

extra_parser_args : list of str, optional

Further command line arguments to pass to the parser.

ts : TypeSystem, optional

A type system instance.

verbose : bool, optional

Flag to display extra information while describing the class.

debug : bool, optional

Flag to enable/disable debug mode.

builddir : str, optional

Location of – often temporary – build files.

onlyin : set of str

The paths to the files that the definition is allowed to exist in.

language : str

Valid language flag.

clang_includes : ignored

Returns :

desc : dict

A dictionary describing the class which may be used to generate API bindings.

xdress.autodescribe.pycparser_describe(filename, name, kind, includes=(), defines=('XDRESS', ), undefines=(), extra_parser_args=(), ts=None, verbose=False, debug=False, builddir='build', onlyin=None, language='c', clang_includes=())[source]

Use pycparser to describe the function or struct (class).

Parameters :

filename : str

The path to the file.

name : str

The name to describe.

kind : str

The kind of type to describe, valid flags are ‘class’, ‘func’, and ‘var’.

includes : list of str, optional

The list of extra include directories to search for header files.

defines : list of str, optional

The list of extra macro definitions to apply.

undefines : list of str, optional

The list of extra macro undefinitions to apply.

extra_parser_args : list of str, optional

Further command line arguments to pass to the parser.

ts : TypeSystem, optional

A type system instance.

verbose : bool, optional

Flag to display extra information while describing the class.

debug : bool, optional

Flag to enable/disable debug mode.

builddir : str, optional

Location of – often temporary – build files.

onlyin : set of str

The paths to the files that the definition is allowed to exist in.

language : str

Must be ‘c’.

clang_includes : ignored

Returns :

desc : dict

A dictionary describing the class which may be used to generate API bindings.