xml — XML utilities and interfaces for handling XMPP XML streams

This module provides a few classes and functions which are useful when generating and parsing XML streams for XMPP.

Generating XML streams

The most useful class here is the XMPPXMLGenerator:

class aioxmpp.xml.XMPPXMLGenerator(out, short_empty_elements=True, sorted_attributes=False, additional_escapes=[])[source]

Class to generate XMPP-conforming XML bytes.

Parameters
  • out – File-like object to which the bytes are written.

  • short_empty_elements (bool) – Write empty elements as <foo/> instead of <foo></foo>.

  • sorted_attributes (bool) – Sort the attributes in the output. Note: this comes with a performance penalty. See below.

  • additional_escapes (iterable of 1-codepoint str objects) – Characters to escape in character data and attribute values, in addition to the characters which are always escaped.

XMPPXMLGenerator works similarly to xml.sax.saxutils.XMLGenerator, but has a few key differences:

  • It supports only namespace-conforming XML documents

  • It automatically chooses namespace prefixes if a namespace has not been declared, and avoids using prefixes at all where possible

  • It is in general stricter on (explicit) namespace declarations, to avoid ambiguities

  • It always uses utf-8 ☺

  • It allows explicit flushing

out must be a file-like object supporting both file.write() and file.flush().

If short_empty_elements is true, empty elements are rendered as <foo/> instead of <foo></foo>. The exception is when a flush occurs before the call to endElementNS(): the opening tag is then finished before flushing, so the long form is generated.

If sorted_attributes is true, attributes are emitted in the lexical order of their qualified names (except for namespace declarations, which are always sorted and always precede the normal attributes). The default is not to do this, for performance. During testing, however, it is useful to have a consistent order on the attributes.

All characters in additional_escapes are escaped using XML entities. Note that <, > and & are always escaped. additional_escapes is converted to a dictionary for use with escape() and quoteattr(). Passing a dictionary to additional_escapes or passing multi-character strings as elements of additional_escapes is not supported since it may be (ab-)used to create invalid XMPP XML. additional_escapes affects both CDATA in XML elements as well as attribute values.
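The conversion to a dictionary for escape() and quoteattr() can be pictured with the standard library alone. The following is a minimal sketch, not aioxmpp's implementation: the helper name build_escape_table and the use of hexadecimal character references as the replacement form are assumptions made for illustration.

```python
from xml.sax.saxutils import escape, quoteattr


def build_escape_table(additional_escapes):
    """Hypothetical helper: map each single character to a character reference."""
    table = {}
    for ch in additional_escapes:
        # Multi-character strings are rejected, mirroring the documented restriction.
        if not isinstance(ch, str) or len(ch) != 1:
            raise ValueError("expected 1-codepoint strings, got {!r}".format(ch))
        table[ch] = "&#x{:x};".format(ord(ch))
    return table


table = build_escape_table(["\r", "'"])

# escape() always handles &, < and >, then applies the extra table.
text = escape("a\rb < c", table)

# quoteattr() applies the same table and adds surrounding quotes.
attr = quoteattr("it's", table)
```

Because escape() replaces & before it applies the table, the ampersands introduced by the character references are not escaped a second time.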

Implementation of the SAX content handler interface (see xml.sax.handler.ContentHandler):

startDocument()[source]

Start the document. This method must be called before any other content handler method.

startPrefixMapping(prefix, uri)[source]

Start a prefix mapping which maps the given prefix to the given uri.

Note that prefix mappings are handled transactionally. All announcements of prefix mappings are collected until the next call to startElementNS(). At that point, the collected mappings take effect and override the previously declared mappings until the corresponding endElementNS() call.

Also note that calling startPrefixMapping() is not mandatory; you can use any namespace you like at any time. If you use a namespace whose URI has not been associated with a prefix yet, a free prefix will automatically be chosen. To avoid unnecessary performance penalties, do not use prefixes of the form "ns{:d}".format(n), for any non-negative integer n.

It is, however, required to call endPrefixMapping() after an endElementNS() call for all namespaces which have been announced directly before the startElementNS() call (except for those which have been chosen automatically). Not doing so will result in a RuntimeError at the next startElementNS() or endElementNS() call.

During a transaction, it is not allowed to declare the same prefix multiple times.

startElementNS(name, qname, attributes=None)[source]

Start a sub-element. name must be a tuple of (namespace_uri, localname) and qname is ignored. attributes must be a dictionary mapping attribute tag tuples ((namespace_uri, attribute_name)) to string values. To use unnamespaced attributes, namespace_uri can be false (e.g. None or the empty string).

To use unnamespaced elements, namespace_uri in name must be false and no prefix-less (default) namespace may be currently active. If a prefix-less namespace is active and namespace_uri in name is false, ValueError is raised.

Attribute values are escaped automatically.

characters(chars)[source]

Put character data in the currently open element. Special characters (such as <, > and &) are escaped.

If chars contains any ASCII control character, ValueError is raised.

endElementNS(name, qname)[source]

End a previously started element. name must be a (namespace_uri, localname) tuple and qname is ignored.

endPrefixMapping(prefix)[source]

End a prefix mapping declared with startPrefixMapping(). See there for more details.

endDocument()[source]

This must be called at the end of the document. Note that this does not call flush().
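The order of calls described above is the ordinary SAX content handler sequence. The sketch below uses the standard library's xml.sax.saxutils.XMLGenerator as a stand-in, since it exposes the same methods; it is not aioxmpp code, and XMPPXMLGenerator differs in the ways listed earlier (for example, the stand-in requires startPrefixMapping() before a namespace is used, and it writes text to a StringIO here rather than bytes). The element names and the address are illustrative only.

```python
import io
from xml.sax.saxutils import XMLGenerator

NS = "jabber:client"

out = io.StringIO()
gen = XMLGenerator(out, encoding="utf-8", short_empty_elements=True)

gen.startDocument()
# Announce the mapping; it takes effect at the next startElementNS().
gen.startPrefixMapping(None, NS)
gen.startElementNS((NS, "message"), None, {(None, "to"): "juliet@example.com"})

gen.startElementNS((NS, "body"), None, {})
gen.characters("Tom & Jerry")  # special characters are escaped
gen.endElementNS((NS, "body"), None)

# An element without content is written in the short form.
gen.startElementNS((NS, "thread"), None, {})
gen.endElementNS((NS, "thread"), None)

gen.endElementNS((NS, "message"), None)
# End the mapping after the element which declared it has been closed.
gen.endPrefixMapping(None)
gen.endDocument()

result = out.getvalue()
```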

The following SAX content handler methods have deliberately not been implemented:

setDocumentLocator(locator)[source]

Not supported; there is no use case. Raises NotImplementedError.

skippedEntity(name)[source]

Not supported; there is no use case. Raises NotImplementedError.

ignorableWhitespace(whitespace)[source]

Not supported; could be mapped to characters().

startElement(name, attributes=None)[source]

Not supported; only elements with proper namespacing are supported by this generator.

endElement(name)[source]

Not supported; only elements with proper namespacing are supported by this generator.

These methods produce content which is invalid in XMPP XML streams and thus always raise ValueError:

processingInstruction(target, data)[source]

Not supported; explicitly forbidden in XMPP. Raises ValueError.

In addition to the SAX content handler interface, the following methods are provided:

flush()[source]

Call flush() on the object passed to the out argument of the constructor. In addition, any unfinished opening tags are finished, which can lead to expansion of the generated XML code (see the note on the short_empty_elements argument in the class documentation).

buffer()[source]

Context manager to temporarily buffer the output.

Raises

RuntimeError – If two buffer() context managers are nested.

If the context manager is left without exception, the buffered output is sent to the actual sink. Otherwise, it is discarded.

In addition to the output being buffered, buffer also captures the entire state of the XML generator and restores it to the previous state if the context manager is left with an exception.

This can be used to fail-safely attempt to serialise a subtree and return to a well-defined state if serialisation fails.

flush() is not called automatically.

If flush() is called while a buffer() context manager is active, no actual flushing happens (but unfinished opening tags are closed as usual, see the short_empty_elements parameter).
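The commit-or-discard behaviour of buffer() can be illustrated with a small standard-library context manager. This is a sketch of the pattern only: the name buffered is hypothetical, and unlike the real buffer() it does not capture or restore any generator state.

```python
import contextlib
import io


@contextlib.contextmanager
def buffered(sink):
    """Hypothetical helper: collect writes and forward them only on success."""
    temp = io.BytesIO()
    yield temp
    # Reached only if the with-block finished without an exception;
    # otherwise the exception propagates here and temp is discarded.
    sink.write(temp.getvalue())


sink = io.BytesIO()

with buffered(sink) as buf:
    buf.write(b"<ok/>")

try:
    with buffered(sink) as buf:
        buf.write(b"<broken>")
        raise ValueError("serialisation failed")
except ValueError:
    pass  # nothing from the failed block reached the sink

written = sink.getvalue()
```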

class aioxmpp.xml.XMLStreamWriter(f, to, from_=None, version=(1, 0), nsmap={}, sorted_attributes=False)[source]

A convenience class to write a standards-conforming XML stream.

Parameters
  • f – File-like object to write to.

  • to (aioxmpp.JID) – Address to which the connection is addressed.

  • from_ (aioxmpp.JID) – Optional address from which the connection originates.

  • version (tuple of (int, int)) – Version of the XML stream protocol.

  • nsmap – Mapping of namespaces to declare at the stream header.

Note

The constructor does not send a stream header. start() must be called explicitly to send a stream header.

The generated stream header follows RFC 6120 and has the to and version attributes as well as optionally the from attribute (controlled by from_). In addition, the namespace prefixes defined by nsmap (mapping prefixes to namespace URIs) are declared on the stream header.

Note

Unfortunately, it is not allowed to use namespace prefixes in stanzas which were declared in the stream header, convenient as that would be. The option is thus only useful to declare the default namespace for stanzas.
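To make the shape of such a stream header concrete, here is a minimal string-formatting sketch. The function name sketch_stream_header is hypothetical and plain strings are used instead of aioxmpp.JID objects; the exact bytes XMLStreamWriter emits (attribute order, any XML declaration) may differ. The stream namespace URI and the "major.minor" version format are taken from RFC 6120.

```python
from xml.sax.saxutils import quoteattr

STREAM_NS = "http://etherx.jabber.org/streams"


def sketch_stream_header(to, from_=None, version=(1, 0), nsmap=None):
    """Hypothetical helper: format an opening <stream:stream> tag."""
    attrs = [("xmlns:stream", STREAM_NS)]
    # nsmap maps prefixes to namespace URIs; a false prefix means the default namespace.
    for prefix, uri in (nsmap or {}).items():
        attrs.append(("xmlns:" + prefix if prefix else "xmlns", uri))
    attrs.append(("to", to))
    if from_ is not None:
        attrs.append(("from", from_))
    attrs.append(("version", "{}.{}".format(*version)))
    rendered = "".join(" {}={}".format(key, quoteattr(value)) for key, value in attrs)
    return "<stream:stream" + rendered + ">"


header = sketch_stream_header("example.com", nsmap={None: "jabber:client"})
```

The tag is deliberately left open: the stream footer closes it later, which is why the header cannot be produced as a complete element.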

closed

True if the stream has been closed by abort() or close(). Read-only.

The following methods are used to generate output:

start()[source]

Send the stream header as described above.

send(xso)[source]

Send a single XML stream object.

Parameters

xso (aioxmpp.xso.XSO) – Object to serialise and send.

Raises

Exception – from any serialisation errors, usually ValueError.

Serialise the xso and send it over the stream. If any serialisation error occurs, no data is sent over the stream and the exception is re-raised; the send() method thus provides strong exception safety.

Warning

The behaviour of send() after abort() or close(), or before start(), is undefined.

abort()[source]

Abort the stream.

The stream is flushed and the internal data structures are cleaned up. No stream footer is sent. The stream is closed afterwards.

If the stream is already closed, this method does nothing.

close()[source]

Close the stream.

The stream footer is sent and the internal structures are cleaned up.

If the stream is already closed, this method does nothing.

Processing XML streams

To convert streams of SAX events to XSO instances, the following classes and functions can be used:

class aioxmpp.xml.XMPPXMLProcessor[source]

This class is an xml.sax.handler.ContentHandler. It can be used to parse an XMPP XML stream.

When used with an xml.sax.xmlreader.XMLReader, it gradually processes the incoming XML stream. If any restricted XML is encountered, an appropriate StreamError is raised.

Warning

To achieve compliance with XMPP, it is recommended to use XMPPLexicalHandler as lexical handler, using xml.sax.xmlreader.XMLReader.setProperty():

parser.setProperty(xml.sax.handler.property_lexical_handler,
                   XMPPLexicalHandler)

Otherwise, invalid XMPP XML such as comments, entity references and DTD declarations will not be caught.

Exception handling: When an exception occurs while parsing a stream-level element, such as a stanza, the exception is stored internally and exception handling is invoked. During exception handling, all SAX events are dropped until the stream-level element has been completely processed by the parser. Then, if available, on_exception is called with the stored exception as the only argument. If on_exception is false (e.g. None), the exception is re-raised from the endElementNS() handler, in turn most likely destroying the SAX parser's internal state.
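The store, drop and report sequence can be sketched with a plain standard-library content handler. This is an illustration under assumptions, not XMPPXMLProcessor itself: the class DeferringHandler and the process callback are hypothetical, depth 1 is treated as the stream root and depth 2 as a stream-level element.

```python
import xml.sax
import xml.sax.handler


class DeferringHandler(xml.sax.handler.ContentHandler):
    """Hypothetical handler deferring errors to the end of a stream-level element."""

    def __init__(self, process, on_exception=None):
        super().__init__()
        self.process = process
        self.on_exception = on_exception
        self._depth = 0
        self._pending = None

    def startElementNS(self, name, qname, attrs):
        self._depth += 1
        if self._depth < 2 or self._pending is not None:
            return  # stream root, or event dropped during exception handling
        try:
            self.process(name)
        except Exception as exc:
            self._pending = exc  # store the error and keep consuming events

    def endElementNS(self, name, qname):
        self._depth -= 1
        if self._depth == 1 and self._pending is not None:
            # The stream-level element is complete: report the stored error.
            exc, self._pending = self._pending, None
            if self.on_exception:
                self.on_exception(exc)
            else:
                raise exc


seen, errors = [], []


def process(name):
    if name[1] == "bad":
        raise ValueError("cannot parse <bad/>")
    seen.append(name[1])


parser = xml.sax.make_parser()
parser.setFeature(xml.sax.handler.feature_namespaces, True)
parser.setContentHandler(DeferringHandler(process, on_exception=errors.append))
parser.feed(b"<stream><bad><child/></bad><good/></stream>")
parser.close()
```

Because the error is reported only after the failing element has ended, the parser is positioned cleanly at the next stream-level element and processing can continue.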

on_exception

May be a callable or None. If not false, the value will get called when exception handling has finished, with the exception as the only argument.

on_stream_footer

May be a callable or None. If not false, the value will get called whenever a stream footer is processed.

on_stream_header

May be a callable or None. If not false, the value will get called whenever a stream header is processed.

stanza_parser

An XSOParser object (or compatible) which will receive the sax-ish events used in xso. It is driven using an instance of SAXDriver.

This object can only be set before startDocument() has been called (or after endDocument() has been called).

class aioxmpp.xml.XMPPLexicalHandler[source]

A lexical handler which rejects certain contents which are invalid in an XMPP XML stream:

  • comments,

  • DTD declarations,

  • non-predefined entities.

The class can be used as lexical handler directly; all methods are stateless and can be used both on the class and on objects of the class.
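How a stateless lexical handler can be passed as a class may be easier to see in a standard-library sketch. The class RejectingLexicalHandler and the helper is_acceptable are hypothetical; they raise ValueError rather than a stream error and cover only comments and DTD declarations, not entity handling.

```python
import xml.sax
import xml.sax.handler


class RejectingLexicalHandler:
    """Hypothetical stateless handler; usable as a class or as an instance."""

    @staticmethod
    def comment(content):
        raise ValueError("comments are not allowed")

    @staticmethod
    def startDTD(name, public_id, system_id):
        raise ValueError("DTD declarations are not allowed")

    @staticmethod
    def endDTD():
        pass

    @staticmethod
    def startCDATA():
        pass

    @staticmethod
    def endCDATA():
        pass


def is_acceptable(data):
    """Parse data with a fresh parser; return False if the handler rejects it."""
    parser = xml.sax.make_parser()
    parser.setProperty(
        xml.sax.handler.property_lexical_handler,
        RejectingLexicalHandler,  # the class itself, no instance needed
    )
    try:
        parser.feed(data)
        parser.close()
    except ValueError:
        return False
    return True
```

A new parser is created for every call because an exception raised from a handler leaves the previous parser in an unusable state.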

aioxmpp.xml.make_parser()[source]

Create a parser which is suitably configured for parsing an XMPP XML stream. It comes equipped with XMPPLexicalHandler.

Utility functions

aioxmpp.xml.serialize_single_xso(x)[source]

Serialize a single XSO x to a string. This is potentially very slow and should only be used for debugging purposes. It is generally more efficient to use an XMPPXMLGenerator to stream elements.

aioxmpp.xml.write_single_xso(x, dest)[source]

Write a single XSO x to a binary file-like object dest.

aioxmpp.xml.read_xso(src, xsomap)[source]

Read a single XSO from a binary file-like input src containing an XML document.

xsomap must be a mapping from XSO subclasses to callables. These will be registered with a newly created xso.XSOParser instance which will be used to parse the document in src.

The xsomap is thus used to determine the class parsing the root element of the XML document. This can be used to support multiple versions.

aioxmpp.xml.read_single_xso(src, type_)[source]

Read a single XSO of the given type_ from the binary file-like input src and return the instance.