Generating Binding Classes

The following sections reference example schema and programs that are available in the examples/manual subdirectory of the PyXB distribution.

Note

PyXB is developed and tested assuming a POSIX file system, as is used on Linux. While PyXB works perfectly well on Windows, some accommodation is required. In particular, when providing file URIs as command-line arguments to pyxbgen it may be necessary to explicitly note that the parameter is a URI by using the Windows file URI form of the file path.

For example, something like this will generally not work:

pyxbgen -m x -u "c:\\Windows\My Documents\x.xsd" # DO NOT USE

This should be expressed as:

pyxbgen -m x -u file:///c:/Windows/My%20Documents/x.xsd
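When many paths have to be converted, the URI form can be produced programmatically. The following is a minimal sketch using only the Python standard library. The helper name windows_path_to_file_uri is hypothetical and is not part of PyXB. It assumes a drive-letter path and produces the three-slash file URI form (empty host, then the drive letter).

```python
from urllib.parse import quote

def windows_path_to_file_uri(path):
    """Convert a drive-letter Windows path to a file URI (illustrative helper)."""
    # Switch to forward slashes, then percent-encode everything except '/' and ':'.
    forward = path.replace("\\", "/")
    return "file:///" + quote(forward, safe="/:")

print(windows_path_to_file_uri(r"c:\Windows\My Documents\x.xsd"))
```

The space in "My Documents" is encoded as %20, which is what lets the value pass through the shell and through pyxbgen as a single URI.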

Self-contained schema

The following schema po1.xsd is a condensed version of the purchase order schema in the XMLSchema Primer:

<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="purchaseOrder" type="PurchaseOrderType"/>
  <xsd:element name="comment" type="xsd:string"/>
  <xsd:complexType name="PurchaseOrderType">
    <xsd:sequence>
      <xsd:element name="shipTo" type="USAddress"/>
      <xsd:element name="billTo" type="USAddress"/>
      <xsd:element ref="comment" minOccurs="0"/>
      <xsd:element name="items"  type="Items"/>
    </xsd:sequence>
    <xsd:attribute name="orderDate" type="xsd:date"/>
  </xsd:complexType>
  <xsd:complexType name="USAddress">
    <xsd:sequence>
      <xsd:element name="name"   type="xsd:string"/>
      <xsd:element name="street" type="xsd:string"/>
      <xsd:element name="city"   type="xsd:string"/>
      <xsd:element name="state"  type="xsd:string"/>
      <xsd:element name="zip"    type="xsd:decimal"/>
    </xsd:sequence>
    <xsd:attribute name="country" type="xsd:NMTOKEN" fixed="US"/>
  </xsd:complexType>
  <xsd:complexType name="Items">
    <xsd:sequence>
      <xsd:element name="item" minOccurs="0" maxOccurs="unbounded">
        <xsd:complexType>
          <xsd:sequence>
            <xsd:element name="productName" type="xsd:string"/>
            <xsd:element name="quantity">
              <xsd:simpleType>
                <xsd:restriction base="xsd:positiveInteger">
                  <xsd:maxExclusive value="100"/>
                </xsd:restriction>
              </xsd:simpleType>
            </xsd:element>
            <xsd:element name="USPrice"  type="xsd:decimal"/>
            <xsd:element ref="comment"   minOccurs="0"/>
            <xsd:element name="shipDate" type="xsd:date" minOccurs="0"/>
          </xsd:sequence>
          <xsd:attribute name="partNum" type="SKU" use="required"/>
        </xsd:complexType>
      </xsd:element>
    </xsd:sequence>
  </xsd:complexType>
  <!-- Stock Keeping Unit, a code for identifying products -->
  <xsd:simpleType name="SKU">
    <xsd:restriction base="xsd:string">
      <xsd:pattern value="\d{3}-[A-Z]{2}"/>
    </xsd:restriction>
  </xsd:simpleType>

</xsd:schema>

Translate this into Python with the following command:

pyxbgen \
  -u po1.xsd  -m po1

The -u parameter identifies a schema document describing contents of a namespace. The parameter may be a path to a file on the local system, or a URL to a network-accessible location like http://www.weather.gov/forecasts/xml/DWMLgen/schema/DWML.xsd. The -m parameter specifies the name to be used by the Python module holding the bindings generated for the namespace in the preceding schema. After running this, the Python bindings will be in a file named po1.py.

With the bindings available, this program (demo1.py):

from __future__ import print_function
import po1

xml = open('po1.xml').read()
order = po1.CreateFromDocument(xml)

print('%s is sending %s %d thing(s):' % (order.billTo.name, order.shipTo.name, len(order.items.item)))
for item in order.items.item:
    print('  Quantity %d of %s at $%s' % (item.quantity, item.productName, item.USPrice))

processing this document:

<?xml version="1.0"?>
<purchaseOrder orderDate="1999-10-20">
  <shipTo country="US">
    <name>Alice Smith</name>
    <street>123 Maple Street</street>
    <city>Anytown</city><state>AK</state><zip>12341</zip>
  </shipTo>
  <billTo country="US">
    <name>Robert Smith</name>
    <street>8 Oak Avenue</street>
    <city>Anytown</city><state>AK</state><zip>12341</zip>
  </billTo>
  <items>
    <item partNum="833-AA">
      <productName>Lapis necklace</productName>
      <quantity>1</quantity>
      <USPrice>99.95</USPrice>
      <comment>Want this for the holidays!</comment>
      <shipDate>1999-12-05</shipDate>
    </item>
    <item partNum="833-AB">
      <productName>Plastic necklace</productName>
      <quantity>4</quantity>
      <USPrice>3.95</USPrice>
      <shipDate>1999-12-24</shipDate>
    </item>
  </items>
</purchaseOrder>

produces the following output:

Robert Smith is sending Alice Smith 2 thing(s):
  Quantity 1 of Lapis necklace at $99.95
  Quantity 4 of Plastic necklace at $3.95
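The simple types in po1.xsd carry constraints: the SKU type restricts strings with a pattern, and quantity is a positive integer below 100. To make the meaning of those facets concrete, here is a rough plain-Python rendering of the two rules. It does not use PyXB at all, and the function names is_valid_sku and is_valid_quantity are made up for this illustration. It shows only what the schema declares, not how the generated bindings check it.

```python
import re

# XSD patterns must match the whole value, so fullmatch mirrors that rule.
SKU_PATTERN = re.compile(r"\d{3}-[A-Z]{2}")

def is_valid_sku(value):
    """True when value satisfies the SKU pattern facet from po1.xsd."""
    return SKU_PATTERN.fullmatch(value) is not None

def is_valid_quantity(value):
    """True when value is a positiveInteger with maxExclusive of 100."""
    # positiveInteger means >= 1; maxExclusive="100" means strictly below 100.
    return isinstance(value, int) and 1 <= value < 100

for part_num in ("833-AA", "833-aa", "8333-AA"):
    print(part_num, is_valid_sku(part_num))
for quantity in (1, 99, 100, 0):
    print(quantity, is_valid_quantity(quantity))
```

Both part numbers in po1.xml (833-AA and 833-AB) and both quantities (1 and 4) satisfy these rules.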

Multi-document schema

Complex schemas are easier to manage when they are separated into multiple documents, each of which contains a cohesive set of types. In the example above, the USAddress type can be abstracted to handle a variety of addresses, and maintained as its own document address.xsd:

<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:complexType name="Address">
    <xsd:sequence>
      <xsd:element name="name"   type="xsd:string"/>
      <xsd:element name="street" type="xsd:string"/>
      <xsd:element name="city"   type="xsd:string"/>
    </xsd:sequence>
  </xsd:complexType>

  <xsd:complexType name="USAddress">
    <xsd:complexContent>
      <xsd:extension base="Address">
        <xsd:sequence>
          <xsd:element name="state" type="USState"/>
          <xsd:element name="zip"   type="xsd:positiveInteger"/>
        </xsd:sequence>
        <xsd:attribute name="country" type="xsd:NMTOKEN" fixed="US"/>
      </xsd:extension>
    </xsd:complexContent>
  </xsd:complexType>

  <xsd:complexType name="UKAddress">
    <xsd:complexContent>
      <xsd:extension base="Address">
        <xsd:sequence>
          <xsd:element name="postcode" type="UKPostcode"/>
        </xsd:sequence>
        <xsd:attribute name="exportCode" type="xsd:positiveInteger" fixed="1"/>
      </xsd:extension>
    </xsd:complexContent>
  </xsd:complexType>

  <!-- other Address derivations for more countries -->

  <xsd:simpleType name="USState">
    <xsd:restriction base="xsd:string">
      <xsd:enumeration value="AK"/>
      <xsd:enumeration value="AL"/>
      <xsd:enumeration value="AR"/>
      <xsd:enumeration value="AZ"/>
      <!-- and so on ... -->
    </xsd:restriction>
  </xsd:simpleType>

  <!-- simple type definition for UKPostcode -->
  <!-- *** pyxb mod: provide missing STD *** -->
  <xsd:simpleType name="UKPostcode">
    <xsd:restriction base="xsd:string">
      <xsd:pattern value="[A-Z]{2}\d\s\d[A-Z]{2}"/>
    </xsd:restriction>
  </xsd:simpleType>

</xsd:schema>

The XMLSchema include directive can be used to incorporate this document into po2.xsd:

<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:include schemaLocation="file:address.xsd"/>
  <xsd:element name="purchaseOrder" type="PurchaseOrderType"/>
  <xsd:element name="comment" type="xsd:string"/>
  <xsd:complexType name="PurchaseOrderType">
    <xsd:sequence>
      <xsd:element name="shipTo" type="USAddress"/>
      <xsd:element name="billTo" type="USAddress"/>
      <xsd:element ref="comment" minOccurs="0"/>
      <xsd:element name="items"  type="Items"/>
    </xsd:sequence>
    <xsd:attribute name="orderDate" type="xsd:date"/>
  </xsd:complexType>
  <xsd:complexType name="Items">
    <xsd:sequence>
      <xsd:element name="item" minOccurs="0" maxOccurs="unbounded">
        <xsd:complexType>
          <xsd:sequence>
            <xsd:element name="productName" type="xsd:string"/>
            <xsd:element name="quantity">
              <xsd:simpleType>
                <xsd:restriction base="xsd:positiveInteger">
                  <xsd:maxExclusive value="100"/>
                </xsd:restriction>
              </xsd:simpleType>
            </xsd:element>
            <xsd:element name="USPrice"  type="xsd:decimal"/>
            <xsd:element ref="comment"   minOccurs="0"/>
            <xsd:element name="shipDate" type="xsd:date" minOccurs="0"/>
          </xsd:sequence>
          <xsd:attribute name="partNum" type="SKU" use="required"/>
        </xsd:complexType>
      </xsd:element>
    </xsd:sequence>
  </xsd:complexType>
  <!-- Stock Keeping Unit, a code for identifying products -->
  <xsd:simpleType name="SKU">
    <xsd:restriction base="xsd:string">
      <xsd:pattern value="\d{3}-[A-Z]{2}"/>
    </xsd:restriction>
  </xsd:simpleType>

</xsd:schema>

Translation of this document and execution of the test program is just as it was in the previous section:

pyxbgen \
  -u po2.xsd -m po2

Note that you do not need to explicitly list the address.xsd file. PyXB detects the include directive and reads the second schema by resolving its schemaLocation relative to the base URI of the containing document. Because the contents of the two schema files belong to the same namespace, their combined bindings are placed into the po2.py module.
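Resolving a schemaLocation against the base URI of the containing document follows the usual rules for relative URI references. The sketch below illustrates those rules with the standard library function urljoin. It is not PyXB's own resolution code, and the example.com and /home/user locations are placeholders chosen for the illustration.

```python
from urllib.parse import urljoin

def resolve_schema_location(base_uri, schema_location):
    """Resolve a relative schemaLocation against the including document's URI."""
    return urljoin(base_uri, schema_location)

# A sibling document next to a schema fetched over HTTP.
print(resolve_schema_location("http://example.com/schemas/po2.xsd", "address.xsd"))
# The same relative reference next to a local file.
print(resolve_schema_location("file:///home/user/schemas/po2.xsd", "address.xsd"))
# A reference that climbs out of the containing directory.
print(resolve_schema_location("http://example.com/schemas/po2.xsd", "../common/address.xsd"))
```

The practical consequence is that an included document is looked up next to the document that includes it, regardless of the directory from which pyxbgen was started.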

Working with Namespaces

Documents of significant complexity are likely to require references to multiple namespaces. Notice that the schemas we've looked at so far have neither a target namespace nor a default namespace. The following schema nsaddress.xsd places the types that are in address.xsd into the namespace URN:address by defining a target namespace and then including the namespace-less schema:

<xsd:schema
   targetNamespace="URN:address"
   xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:include schemaLocation="address.xsd"/>
</xsd:schema>

Note that this technique takes advantage of the chameleon schema pattern.
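In a chameleon include, the included document declares no target namespace, so its components take on the target namespace of the including schema. The two facts that matter are therefore the targetNamespace of the wrapper and the schemaLocation it includes. As an illustration, the following sketch reads both from the nsaddress.xsd text using the standard library XML parser. The function name describe_schema is hypothetical, and this is not how pyxbgen itself is implemented.

```python
import xml.etree.ElementTree as ET

XSD_NS = "http://www.w3.org/2001/XMLSchema"

NSADDRESS_XSD = """\
<xsd:schema
   targetNamespace="URN:address"
   xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:include schemaLocation="address.xsd"/>
</xsd:schema>
"""

def describe_schema(xsd_text):
    """Return (target namespace or None, list of included schemaLocation values)."""
    root = ET.fromstring(xsd_text)
    target = root.get("targetNamespace")  # None when the schema has no target namespace
    includes = [inc.get("schemaLocation")
                for inc in root.findall("{%s}include" % XSD_NS)]
    return target, includes

target, includes = describe_schema(NSADDRESS_XSD)
print("target namespace:", target)
print("included documents:", includes)
```

Running the same function over address.xsd would report no target namespace, which is what allows it to be absorbed into URN:address here.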

There are several ways you can prepare to process documents with multiple namespaces. If you have no expectation of using the imported namespace directly, you can process the importing schema just as before:

pyxbgen \
  -u po3.xsd -m po3

PyXB will detect the import statement, read the corresponding schema, and create bindings for its types. However, since the pyxbgen invocation did not mention the URN:address namespace, the bindings are written into a private binding file. The generated module file _address.py is created with a prefixed underscore indicating that it is not expected to be referenced directly. The public module po3.py will locally import module _address so that the required classes are available, but will not expose them to code that imports only module po3. The demonstration program demo3.py shows that things work as expected without the new namespace being made explicit.

from __future__ import print_function
import po3

order = po3.CreateFromDocument(open('po3.xml').read())

print('%s is sending %s %d thing(s):' % (order.billTo.name, order.shipTo.name, len(order.items.item)))
for item in order.items.item:
    print('  Quantity %d of %s at $%s' % (item.quantity, item.productName, item.USPrice))

More often, you will want to be able to import the module defining bindings from the additional namespaces. To do this, explicitly reference the additional schema and provide it with a module name:

pyxbgen \
  -u po3.xsd -m po3 \
  -u nsaddress.xsd -m address

Here each namespace is represented in its own module (address for URN:address, and po3 for the module with an absent namespace). In this case, the demonstration program is unchanged; see Creating Instances in Python Code for additional examples.

Sharing Namespace Bindings

Most often, if you have a common utility namespace like URN:address, you will want to generate its bindings once, and reference them in other schema without regenerating them. To do this, PyXB must be provided with an archive containing the schema components that were defined in that namespace, so they can be referenced in independent generation activities.

To generate the archive, you add the --archive-to-file flag to the binding generation command:

pyxbgen \
  -u nsaddress.xsd -m address \
  --archive-to-file address.wxs

In addition to generating the address Python module, this causes an archive of the schema contents to be saved in the corresponding file, which by convention ends with the extension .wxs. Any anonymous names that were generated with the bindings are also recorded in this archive, so that cross-namespace extension works correctly.

You can then generate bindings for importing namespaces by providing PyXB with the information necessary to locate this archive:

pyxbgen \
  -u po3.xsd -m po3 \
  --archive-path .:+

The --archive-path directive indicates that the current directory (.) should be searched for files that end in .wxs, and that any namespaces found in such files are implicitly made available for reference when they are encountered in an import instruction. (The second path component + causes the standard search path to be used after searching the current directory.)

In this case, when the import instruction is encountered, PyXB detects that it has an archive address.wxs that defines the contents of the imported namespace. Instead of reading and processing the schema, it generates references to the existing binding modules. Again, the demonstration program is unchanged.
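The shape of the archive path value (colon-separated directories, with + standing for the standard search path) can be pictured with a small sketch. Everything in it is an assumption made for illustration: the function names expand_archive_path and find_archives are hypothetical, the default path is passed in by the caller because its real value is not stated here, and PyXB's actual lookup logic may differ.

```python
import os

def expand_archive_path(spec, default_path):
    """Split a colon-separated path, replacing '+' with the entries of default_path."""
    directories = []
    for component in spec.split(":"):
        if component == "+":
            # '+' pulls in the standard search path at this position.
            directories.extend(c for c in default_path.split(":") if c)
        elif component:
            directories.append(component)
    return directories

def find_archives(directories):
    """List files ending in .wxs in each existing directory, in search order."""
    found = []
    for directory in directories:
        if os.path.isdir(directory):
            for name in sorted(os.listdir(directory)):
                if name.endswith(".wxs"):
                    found.append(os.path.join(directory, name))
    return found

# "/opt/pyxb/archives" is a placeholder for the standard search path.
search_order = expand_archive_path(".:+", "/opt/pyxb/archives")
print(search_order)
print(find_archives(search_order))
```

Because the current directory comes first in the example, an address.wxs generated locally would be found before anything on the standard path.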

Advanced Topics

Schemas Defined in WSDL Documents

It is a common, if regrettable, practice that web services define the structure of their documents using XML schema elements encoded directly into a types element of a WSDL specification, rather than having that element import a complete standalone schema. To accommodate this, pyxbgen supports the --wsdl-location argument as an alternative to --schema-location. For example, the following will generate a module ndfd containing bindings required to communicate with the National Digital Forecast Database:

pyxbgen \
 --wsdl-location=http://www.weather.gov/forecasts/xml/DWMLgen/wsdl/ndfdXML.wsdl --module=ndfd \
 --archive-path=${PYXB_ROOT}/pyxb/bundles/wssplat//:+

Note that it will be necessary to have the WS-* bindings available, as provided by the --archive-path option above.
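To picture what "schema elements encoded directly into a types element" means, the following sketch locates the embedded schema elements in a tiny WSDL 1.1 document using the standard library XML parser. The sample document, its urn:example namespaces, and the function name embedded_schemas are invented for this illustration. It is not the code pyxbgen uses.

```python
import xml.etree.ElementTree as ET

WSDL_NS = "http://schemas.xmlsoap.org/wsdl/"
XSD_NS = "http://www.w3.org/2001/XMLSchema"

# A made-up WSDL fragment with one schema embedded in its types element.
SAMPLE_WSDL = """\
<definitions xmlns="http://schemas.xmlsoap.org/wsdl/"
             xmlns:xsd="http://www.w3.org/2001/XMLSchema"
             targetNamespace="urn:example:service">
  <types>
    <xsd:schema targetNamespace="urn:example:types">
      <xsd:element name="comment" type="xsd:string"/>
    </xsd:schema>
  </types>
</definitions>
"""

def embedded_schemas(wsdl_text):
    """Return the xsd:schema elements found under the WSDL types element."""
    root = ET.fromstring(wsdl_text)
    return root.findall("{%s}types/{%s}schema" % (WSDL_NS, XSD_NS))

for schema in embedded_schemas(SAMPLE_WSDL):
    names = [el.get("name") for el in schema.findall("{%s}element" % XSD_NS)]
    print(schema.get("targetNamespace"), names)
```

Each embedded schema has its own target namespace, which is why a single WSDL document can contribute bindings for several namespaces.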

Customizing Binding Classes

PyXB permits you to customize the bindings that it generates by creating a module that imports the generated classes and instances, then extends them with subclasses with additional behavior. As long as you do not make major changes to the structure and names used in your namespaces, you can fine-tune the schema without changing the custom code.

The --write-for-customization option causes PyXB to generate all the Python modules in a subdirectory raw. Then you write a module that imports the generated bindings and extends them.

Until this documentation is enhanced significantly, users interested in generating custom bindings are referred to the extensions for WSDL 1.1 that are provided in the WS-* support bundle as pyxb.bundles.wssplat.wsdl11.py. An excerpt of the sort of thing done there is:

from pyxb.bundles.wssplat.raw.wsdl11 import *
import pyxb.bundles.wssplat.raw.wsdl11 as raw_wsdl11

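class tParam (raw_wsdl11.tParam):
    # Accessor backing the read-only messageReference property
    def __getMessageReference (self):
        return self.__messageReference
    # Called once the message attribute has been resolved to a tMessage
    def _setMessageReference (self, message_reference):
        self.__messageReference = message_reference
    # Class-level default until a reference is set on the instance
    __messageReference = None
    messageReference = property(__getMessageReference)
# Have PyXB create the customized class in place of the generated one
raw_wsdl11.tParam._SetSupersedingClass(tParam)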

The first line brings in all the public identifiers from the generated binding. The second makes them available in a qualified form that ensures we use the generated value rather than the customized value.

The class definition shows how to extend the generated bindings for the tParam complex type so that it has a field that can hold the instance of tMessage that was identified by the message attribute in an operation element. Following the class is a directive that tells PyXB to create instances of the customized class when automatically generating tParam instances from XML documents.

To customize bindings, you will need to be familiar with the _DynamicCreate_mixin class.

Be aware that _SetSupersedingClass only affects the behavior of Factory, and does not change the Python inheritance tree. This means that the superseding class is only invoked when the content model requires an instance of the original type. When an instance of a subclass of a superseded class (that is not itself superseded) is needed by the content model, this infrastructure is bypassed, the normal Python inheritance mechanism takes control, and the instance will not be an instance of the superseding class. This will happen both when instances are created in Python directly and when they are created due to presence in the binding model.

This is probably not what you will want, and to avoid it you must customize all subclasses of a customized class. A detailed example customization is in the examples/customization subdirectory of the distribution. In particular, it shows how to introspect the binding model extracted from the generated Python module and programmatically create custom binding classes without manually reproducing the content hierarchy, making the customizing module more compact and stable.
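The interaction between superseding and inheritance is easier to see in a toy model. The sketch below is not PyXB code: the classes Binding, RawParam, RawFaultParam, Param and FaultParam, and the methods set_superseding_class and factory, are all invented stand-ins that only imitate the behavior described above, namely that a superseding class is recorded per class and consulted by the factory, without altering the inheritance tree.

```python
class Binding:
    """Toy stand-in for a generated binding base class (not PyXB code)."""

    @classmethod
    def set_superseding_class(cls, replacement):
        # Recorded on this exact class only.
        cls._superseded_by = replacement

    @classmethod
    def factory(cls):
        # Look only at this class's own entry; subclasses do not inherit it.
        target = cls.__dict__.get("_superseded_by", cls)
        return target()

class RawParam(Binding):          # stands for a generated type
    pass

class RawFaultParam(RawParam):    # stands for a generated subtype
    pass

class Param(RawParam):            # customization of RawParam
    pass

RawParam.set_superseding_class(Param)

print(type(RawParam.factory()).__name__)
print(isinstance(RawFaultParam.factory(), Param))

# Customizing the subclass as well restores the expected behavior.
class FaultParam(RawFaultParam, Param):
    pass

RawFaultParam.set_superseding_class(FaultParam)
print(isinstance(RawFaultParam.factory(), Param))
```

In this model the second line printed is False, because the subtype was created without going through the customization, and the last is True once the subtype has its own superseding class.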

Fine-Grained Namespace Control

In certain cases, schema developers will presume that it is within their purview to re-declare or extend the contents of namespaces that belong to others. Supporting this while preserving or re-using the original namespace contents requires finesse.

For example, when generating the bindings for the OpenGIS Sensor Observation Service, you would find that this service extends the http://www.opengis.net/ogc namespace, normally defined in the OpenGIS Filter Encoding, with temporal operators that are defined in a local schema ogc4sos.xsd.

Because http://www.opengis.net/ogc is defined in a namespace archive, PyXB would normally assume that any import commands related to that namespace are redundant with the contents of that archive, and would ignore the import directive. In this case, that assumption is mistaken, and the ogc4sos.xsd schema must be read to define the additional elements and types. The required build command is:

pyxbgen \
  --schema-location=${SCHEMA_DIR}/sos/1.0.0/sosAll.xsd --module sos_1_0 \
  --archive-path=${ARCHIVE_DIR} \
  --import-augmentable-namespace=http://www.opengis.net/ogc

The --import-augmentable-namespace directive causes PyXB to allow import directives within the schema to add material to the content already loaded from an archive. Consequently, when a reference to the ogc4sos.xsd schema is encountered, PyXB detects that, although it already has definitions for components in that namespace, this particular schema has not yet been read. PyXB reads the additional components, and generates bindings for the additional material into a private module _ogc which is then imported into the sos_1_0 module.