Note
This historical document was written prior to PyXB development in order to bound the scope of the project and record the critical use cases. It has not been updated in response to lessons learned during development.
PyXB is intended to support automated generation of Python language classes that conform in structure to data types defined with the W3C XML Schema.
PyXB was developed to support interaction with arbitrary web services from Python.
Service-oriented architecture, specifically in the form of web services, has gained much traction in recent years. While languages like Java and C++ have a variety of commercial, free, and open source utilities to help bridge the gap between web service standards and software, Python's support has been much less developed. Built-in support is limited to processing of XML documents.
Some external packages are available to support SOAP and WSDL, but they were found to be unsatisfactory for a variety of reasons, including an inability to process complex schemas like those for KML or SAML and a Python interface that was not natural.
Reliance on a SAX processor, and use of a code generation scheme that was difficult to modify, made the code difficult to work with. It seemed infeasible to extend the implementation to support Pythonic schema bindings that used different styles.
This system supports generation of bindings that allow web services to be invoked, but XML schema support is secondary to SOAP and WSDL.
PyXB must be able to process the schema used to describe standard web services documents, including but not limited to the following namespaces:
Standards conformance is a high priority. In addition to official standards such as those from W3C and OASIS, industry prevalence should guide selection among conflicting standards. This applies not only to standards that are supported by PyXB, but also to those used in its development.
Python data structures bound to data types in these namespaces should be as close as possible to those that would have been hand-written by a user. As an example, elements that may only appear once should correspond to fields with an object value; those that may appear multiple times should correspond to fields with an iterator value. Values recognized through a model group definition should be broken down into direct references to the underlying element (not passed through some intermediate structure).
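The shape described above can be illustrated with a hand-written class. This is only a sketch of what a user might write by hand; the schema, the class names `Address` and `PurchaseOrder`, and the field names are hypothetical and do not come from PyXB.

```python
# Hypothetical hand-written binding; none of these names come from PyXB.
class Address(object):
    def __init__(self, street, city):
        self.street = street
        self.city = city


class PurchaseOrder(object):
    def __init__(self, ship_to, items=None):
        # Element that may appear only once: the field holds the object itself.
        self.ship_to = ship_to
        # Element that may appear multiple times: the field holds an iterable.
        self.items = list(items or [])


order = PurchaseOrder(Address("1 Main St", "Springfield"), ["widget", "gadget"])

# Direct access to the single-occurrence element, with no intermediate wrapper.
city = order.ship_to.city

# Ordinary iteration over the repeated element.
item_names = [name.upper() for name in order.items]
```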
Speed of the resulting bindings when used within Python code is a secondary concern to usability of the bindings. Speed of conversion between bound objects and XML is a tertiary concern.
The bindings must be able to generate Python objects corresponding to the data held within an XML document that validates against the schema from which the bindings were derived.
The bindings must be able to generate XML that conforms to the schema from which they were derived.
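The two requirements above amount to a round trip between XML and Python objects. A minimal sketch, assuming a hypothetical `<note>` element with `<to>` and `<body>` children and a hand-written `Note` class (not a PyXB API), using only the standard library:

```python
import xml.etree.ElementTree as ET


class Note(object):
    """Hypothetical binding for a <note> element; not generated by PyXB."""

    def __init__(self, to, body):
        self.to = to
        self.body = body

    @classmethod
    def from_xml(cls, text):
        # Build a Python object from the data held in the XML document.
        root = ET.fromstring(text)
        return cls(root.findtext("to"), root.findtext("body"))

    def to_xml(self):
        # Produce XML with the same structure the object was read from.
        root = ET.Element("note")
        ET.SubElement(root, "to").text = self.to
        ET.SubElement(root, "body").text = self.body
        return ET.tostring(root, encoding="unicode")


document = "<note><to>Ann</to><body>Hello</body></note>"
note = Note.from_xml(document)
round_tripped = note.to_xml()
```

The sketch performs no schema validation in either direction; it only shows the direction of each conversion.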
A facility must be provided through which custom behavior can be attached to the Python objects derived from PyXB.
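One plausible form for such a facility is subclassing. In the sketch below, `GeneratedPoint` stands in for a generated binding class and `Point` for the user's extension; both names are hypothetical, and the document does not state that PyXB uses this mechanism.

```python
class GeneratedPoint(object):
    """Stand-in for a generated binding class (hypothetical)."""

    def __init__(self, x, y):
        self.x = x
        self.y = y


class Point(GeneratedPoint):
    """User extension that adds behavior without editing generated code."""

    def distance_from_origin(self):
        return (self.x ** 2 + self.y ** 2) ** 0.5


p = Point(3, 4)
distance = p.distance_from_origin()
```

Keeping custom code in a subclass means the generated class can be regenerated from the schema without losing the added behavior.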
PyXB should support generating a module on a per-namespace basis. The module should include the necessary information to process schemas that depend on the namespace, as well as bindings for the namespace. Modules supporting the namespaces listed above should be provided along with PyXB. It should be possible for users to dynamically add available namespaces.
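Dynamically adding namespaces could be handled by a registry that maps a namespace URI to its binding module. The following is a sketch under that assumption; `register_namespace`, `bindings_for`, and the `urn:example:orders` URI are hypothetical names, not part of PyXB.

```python
import types

# Hypothetical registry: namespace URI -> module holding its bindings.
_namespace_modules = {}


def register_namespace(uri, module):
    """Make the bindings in `module` available for namespace `uri`."""
    _namespace_modules[uri] = module


def bindings_for(uri):
    """Return the module registered for `uri`, or raise LookupError."""
    try:
        return _namespace_modules[uri]
    except KeyError:
        raise LookupError("no bindings registered for namespace %r" % (uri,))


# An empty module object stands in for a generated per-namespace module.
orders = types.ModuleType("orders_bindings")
register_namespace("urn:example:orders", orders)
found = bindings_for("urn:example:orders")
```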
It should be possible to customize the generated binding code. It is a goal to allow PyXB to generate code for other languages, though the capabilities of that code may be inherently limited or require additional tools.
PyXB and its generated bindings must work with a relatively current Python 2.x installation. Bindings generated by PyXB must require no packages or external programs except those that come as part of a standard Python distribution. At a minimum, bindings must be supported by Python distributions 2.3 and later.
While the generated bindings should not require additional packages, they should allow use of such packages when available. For example, use in a system that provided a high-speed DOM implementation that would improve performance should not be excluded.
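A common way to meet this requirement is a guarded import that prefers an optional package and falls back to the standard library. In the sketch below, the third-party `lxml` package stands in for the faster implementation; it exposes an ElementTree-compatible API rather than a DOM, and its use here is an illustrative assumption, not something the document prescribes.

```python
try:
    # Optional third-party package, used only if it happens to be installed.
    from lxml import etree as ET
except ImportError:
    # Standard-library fallback, always available.
    import xml.etree.ElementTree as ET

# The calls below exist in both implementations, so the rest of the code
# does not need to know which one was imported.
root = ET.fromstring("<items><item>a</item><item>b</item></items>")
values = [child.text for child in root.findall("item")]
```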
PyXB itself should require no additional Python packages except those that come as part of a standard Python distribution. There may be additional functionality that is supported when certain packages are available. At a minimum, PyXB SHOULD run under Python version 2.4 and later.
It is acceptable to require that the schemas processed by PyXB validate against the W3C specification. PyXB is not itself required to perform any validation. Given an invalid schema, PyXB SHOULD provide a diagnostic indicating any fault that prevents processing of the schema.
It is not required that generated bindings support validation of instance documents. This function may be added through a customized binding generator.
It should be possible, at least conceptually, to implement PyXB in terms of the bindings it generates from the XMLSchema namespace.
The licensing of PyXB and code it generates should be consistent with the Open Source Definition. An exception is acceptable for material produced by a custom generator that does not incorporate PyXB material in its output.