xml — XML utilities and interfaces for handling XMPP XML streams

This module provides a few classes and functions which are useful when generating and parsing XML streams for XMPP.

Generating XML streams

The most useful class here is the XMPPXMLGenerator:

class aioxmpp.xml.XMPPXMLGenerator(out, short_empty_elements=True, sorted_attributes=False)

XMPPXMLGenerator works similarly to xml.sax.saxutils.XMLGenerator, but has a few key differences:

  • It supports only namespace-conforming XML documents
  • It automatically chooses namespace prefixes if a namespace has not been declared
  • It is in general stricter on (explicit) namespace declarations, to avoid ambiguities
  • It always uses utf-8 ☺
  • It allows explicit flushing

out must be a file-like object supporting both file.write() and file.flush(). encoding specifies the encoding which is used and must be utf-8 for XMPP.

If short_empty_elements is true, empty elements are rendered as <foo/> instead of <foo></foo>, unless a flush occurs before the call to endElementNS(), in which case the opening is finished before flushing, thus the long form is generated.

If sorted_attributes is True, attributes are emitted in the lexical order of their qualified names (except for namespace declarations, which are always sorted and always come before the normal attributes). The default is not to do this, for performance. During testing, however, it is useful to have a consistent order on the attributes.
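The two renderings of an empty element can be compared using the standard library's xml.sax.saxutils.XMLGenerator, which accepts a flag of the same name. This is only a sketch of the short and long forms: the helper render_empty is a hypothetical name, and the flush-dependent behaviour described above is specific to XMPPXMLGenerator and is not shown here.

```python
import io
from xml.sax.saxutils import XMLGenerator


def render_empty(short_empty_elements):
    """Render one empty, unnamespaced <foo> element and return the text."""
    out = io.StringIO()
    # Standard-library generator used as a stand-in for XMPPXMLGenerator.
    gen = XMLGenerator(out, encoding="utf-8",
                       short_empty_elements=short_empty_elements)
    gen.startElementNS((None, "foo"), None, {})
    gen.endElementNS((None, "foo"), None)
    return out.getvalue()


print(render_empty(True))   # <foo/>
print(render_empty(False))  # <foo></foo>
```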

Implementation of the SAX content handler interface (see xml.sax.handler.ContentHandler):

startDocument()

Start the document. This method must be called before any other content handler method.

startPrefixMapping(prefix, uri)

Start a prefix mapping which maps the given prefix to the given uri.

Note that prefix mappings are handled transactionally. All announcements of prefix mappings are collected until the next call to startElementNS(). At that point, the collected mappings take effect and override the previously declared mappings until the corresponding endElementNS() call.

Also note that calling startPrefixMapping() is not mandatory; you can use any namespace you like at any time. If you use a namespace whose URI has not been associated with a prefix yet, a free prefix will automatically be chosen. To avoid unnecessary performance penalties, do not use prefixes of the form "{:d}".format(n), for any non-negative integer n.

It is, however, required to call endPrefixMapping() after an endElementNS() call for all namespaces which have been announced directly before the startElementNS() call (except for those which have been chosen automatically). Not doing so will result in a RuntimeError at the next startElementNS() or endElementNS() call.

During a transaction, it is not allowed to declare the same prefix multiple times.
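The required call order (announce the prefix, start the element, end the element, end the prefix mapping) is the usual namespace-aware SAX sequence. The sketch below shows it with the standard library's XMLGenerator as a stand-in, so it does not exercise the automatic prefix selection or the RuntimeError checks described above; render_stream_element is a hypothetical helper name.

```python
import io
from xml.sax.saxutils import XMLGenerator

STREAM_NS = "http://etherx.jabber.org/streams"


def render_stream_element():
    """Emit one namespaced element with an explicitly declared prefix."""
    out = io.StringIO()
    gen = XMLGenerator(out, encoding="utf-8")
    # 1. Announce the mapping; it applies to the next element started.
    gen.startPrefixMapping("stream", STREAM_NS)
    # 2. Start and end the element; qname is passed as None here.
    gen.startElementNS((STREAM_NS, "stream"), None, {})
    gen.endElementNS((STREAM_NS, "stream"), None)
    # 3. End the mapping after the element has been closed.
    gen.endPrefixMapping("stream")
    return out.getvalue()


print(render_stream_element())
```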

startElementNS(name, qname, attributes=None)

Start a sub-element. name must be a tuple of (namespace_uri, localname) and qname is ignored. attributes must be a dictionary mapping attribute tag tuples ((namespace_uri, attribute_name)) to string values. To use unnamespaced attributes, namespace_uri can be false (e.g. None or the empty string).

To use unnamespaced elements, namespace_uri in name must be false and no default namespace (a namespace declared without a prefix) may be active. If such a namespace is active and namespace_uri in name is false, ValueError is raised.

Attribute values are escaped automatically.

characters(chars)

Put character data in the currently open element. Special characters (such as <, > and &) are escaped.

If chars contains any ASCII control character, ValueError is raised.
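The shape of the attributes dictionary and the escaping of attribute values and character data can be tried out with the standard library's XMLGenerator, which uses the same (namespace_uri, localname) tuples. This is a sketch under that assumption; render_message is a hypothetical helper, and the ValueError on control characters is specific to XMPPXMLGenerator and is not reproduced here.

```python
import io
from xml.sax.saxutils import XMLGenerator

CLIENT_NS = "jabber:client"


def render_message():
    """Emit an element with one unnamespaced attribute and some text."""
    out = io.StringIO()
    gen = XMLGenerator(out, encoding="utf-8")
    # Declare jabber:client as the default (prefix-less) namespace.
    gen.startPrefixMapping(None, CLIENT_NS)
    # Attribute keys are (namespace_uri, attribute_name) tuples;
    # a false namespace_uri means the attribute is unnamespaced.
    attributes = {(None, "id"): "a&b"}
    gen.startElementNS((CLIENT_NS, "message"), None, attributes)
    gen.characters("1 < 2 & 3")
    gen.endElementNS((CLIENT_NS, "message"), None)
    gen.endPrefixMapping(None)
    return out.getvalue()


print(render_message())
```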

endElementNS(name, qname)

End a previously started element. name must be a (namespace_uri, localname) tuple and qname is ignored.

endPrefixMapping(prefix)

End a prefix mapping declared with startPrefixMapping(). See there for more details.

endDocument()

This must be called at the end of the document. Note that this does not call flush().

The following SAX content handler methods have deliberately not been implemented:

setDocumentLocator(locator)

Not supported; there is no use case. Raises NotImplementedError.

skippedEntity(name)

Not supported; there is no use case. Raises NotImplementedError.

ignorableWhitespace(whitespace)

Not supported; could be mapped to characters().

startElement(name, attributes=None)

Not supported; only elements with proper namespacing are supported by this generator.

endElement(name)

Not supported; only elements with proper namespacing are supported by this generator.

These methods produce content which is invalid in XMPP XML streams and thus always raise ValueError:

processingInstruction(target, data)

Not supported; explicitly forbidden in XMPP. Raises ValueError.

In addition to the SAX content handler interface, the following methods are provided:

flush()

Call flush() on the object passed to the out argument of the constructor. In addition, any unfinished opening tags are finished, which can lead to expansion of the generated XML code (see the note on the short_empty_elements argument in the class documentation).

The following generator function can be used to send several XSO instances along an XMPP stream without bothering with any cleanup.

aioxmpp.xml.write_xmlstream(f, to, from_=None, version=(1, 0), nsmap={}, sorted_attributes=False)

Return a generator which writes an XMPP XML stream to the file-like object f.

First, the generator writes the stream header and declares all namespaces given in nsmap plus the xmlstream namespace, then the output is flushed and the generator yields.

to must be a JID which refers to the peer. from_ may be the JID identifying the local side, but see RFC 6120 for considerations. version is the tuple of integers representing the locally supported XMPP version.

sorted_attributes is passed to the XMPPXMLGenerator which is used by this function.

Now, user code can send XSO objects to the generator using its send() method. These objects get serialized to the XML stream. Any exception raised during that is re-raised and the stream is closed.

Using the throw() method to throw an AbortStream exception will immediately stop the generator without closing the stream properly, but with a last flush call to the writer. This can be used to reset the stream.
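The send()/throw() protocol described above can be illustrated with a toy generator. This is a sketch of the control flow only, not aioxmpp's implementation: write_stream_sketch and AbortStreamSketch are hypothetical names, the header and footer are placeholder strings, and plain strings stand in for XSO objects.

```python
import io


class AbortStreamSketch(Exception):
    """Hypothetical stand-in for aioxmpp.xml.AbortStream."""


def write_stream_sketch(f):
    """Toy analogue of the generator protocol; not aioxmpp's code."""
    f.write("<stream>")          # placeholder for the stream header
    f.flush()
    aborted = False
    try:
        while True:
            obj = yield          # wait for the next object from send()
            f.write(obj)
    except AbortStreamSketch:
        aborted = True           # skip the footer, but still flush below
    finally:
        if not aborted:
            f.write("</stream>")  # placeholder for the stream footer
        f.flush()


# Normal use: prime the generator, send objects, then close it.
buf = io.StringIO()
gen = write_stream_sketch(buf)
next(gen)
gen.send("<a/>")
gen.close()
print(buf.getvalue())            # <stream><a/></stream>

# Abort: the footer is omitted. In this sketch the generator finishes
# after handling the exception, so throw() raises StopIteration.
buf_aborted = io.StringIO()
gen = write_stream_sketch(buf_aborted)
next(gen)
gen.send("<a/>")
try:
    gen.throw(AbortStreamSketch())
except StopIteration:
    pass
print(buf_aborted.getvalue())    # <stream><a/>
```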

aioxmpp.xml.write_objects(writer, *, autoflush=False)

Return a generator. All xso.XSO objects sent into the generator (using its send() method) are written to the given writer. writer must be an object supporting the namespace-aware SAX interface.

If autoflush is true, flush() is called on writer after each object. Note that not all writers support flush(), as it is not part of the official SAX specification.

class aioxmpp.xml.AbortStream

This is a signal exception which causes write_xmlstream() to stop immediately without closing the stream.

Processing XML streams

To convert streams of SAX events to XSO instances, the following classes and functions can be used:

class aioxmpp.xml.XMPPXMLProcessor

This class is an xml.sax.handler.ContentHandler. It can be used to parse an XMPP XML stream.

When used with an xml.sax.xmlreader.XMLReader, it gradually processes the incoming XML stream. If any restricted XML is encountered, an appropriate StreamError is raised.

Warning

To achieve compliance with XMPP, it is recommended to use XMPPLexicalHandler as lexical handler, using xml.sax.xmlreader.XMLReader.setProperty():

parser.setProperty(xml.sax.handler.property_lexical_handler,
                   XMPPLexicalHandler)

Otherwise, invalid XMPP XML such as comments, entity references and DTD declarations will not be caught.

Exception handling: When an exception occurs while parsing a stream-level element, such as a stanza, the exception is stored internally and exception handling is invoked. During exception handling, all SAX events are dropped until the parser has completely processed the stream-level element. Then, if on_exception is set, it is called with the stored exception as the only argument. If on_exception is false (e.g. None), the exception is re-raised from the endElementNS() handler, which will most likely destroy the SAX parser's internal state.

on_exception

May be a callable or None. If not false, the value will get called when exception handling has finished, with the exception as the only argument.

on_stream_footer

May be a callable or None. If not false, the value will get called whenever a stream footer is processed.

on_stream_header

May be a callable or None. If not false, the value will get called whenever a stream header is processed.

stanza_parser

An XSOParser object (or compatible) which will receive the sax-ish events used in xso. It is driven using an instance of SAXDriver.

This object can only be set before startDocument() has been called (or after endDocument() has been called).

class aioxmpp.xml.XMPPLexicalHandler

A lexical handler which rejects certain contents which are invalid in an XMPP XML stream:

  • comments,
  • DTD declarations,
  • non-predefined entities.

The class can be used as lexical handler directly; all methods are stateless and can be used both on the class and on objects of the class.
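How a lexical handler of this kind rejects content can be sketched with the standard library alone. RejectingLexicalHandler and is_accepted below are hypothetical names and not part of aioxmpp; the sketch covers only comments and DTD declarations, not entity references, and raises a plain ValueError rather than whatever XMPPLexicalHandler raises.

```python
import xml.sax
import xml.sax.handler


class RejectingLexicalHandler:
    """Hypothetical stand-in; raises on comments and DTD declarations."""

    def comment(self, content):
        raise ValueError("comments are not allowed")

    def startDTD(self, name, public_id, system_id):
        raise ValueError("DTD declarations are not allowed")

    def endDTD(self):
        pass

    def startCDATA(self):
        pass

    def endCDATA(self):
        pass


def is_accepted(document):
    """Return True if the document parses without a rejected construct."""
    parser = xml.sax.make_parser()
    parser.setContentHandler(xml.sax.handler.ContentHandler())
    # Install the lexical handler via the standard SAX property.
    parser.setProperty(xml.sax.handler.property_lexical_handler,
                       RejectingLexicalHandler())
    try:
        parser.feed(document)
        parser.close()
    except ValueError:
        return False
    return True


print(is_accepted("<a><b/></a>"))         # True
print(is_accepted("<a><!-- hi --></a>"))  # False
print(is_accepted("<!DOCTYPE a><a/>"))    # False
```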

aioxmpp.xml.make_parser()

Create a parser which is suitably configured for parsing an XMPP XML stream. It comes equipped with XMPPLexicalHandler.

Utility functions

aioxmpp.xml.serialize_single_xso(x)

Serialize a single XSO x to a string. This is potentially very slow and should only be used for debugging purposes. It is generally more efficient to use an XMPPXMLGenerator to stream elements.