xml: XML utilities and interfaces for handling XMPP XML streams
This module provides a few classes and functions which are useful when generating and parsing XML streams for XMPP.
Generating XML streams
The most useful class here is XMPPXMLGenerator:
-
class aioxmpp.xml.XMPPXMLGenerator(out, short_empty_elements=True, sorted_attributes=False, additional_escapes=[])
Class to generate XMPP-conforming XML bytes.
Parameters:
- out – File-like object to which the bytes are written.
- short_empty_elements (bool) – Write empty elements as <foo/> instead of <foo></foo>.
- sorted_attributes (bool) – Sort the attributes in the output. Note: this comes with a performance penalty. See below.
- additional_escapes (Iterable of 1-codepoint str objects) – Sequence of characters to escape in CDATA.
XMPPXMLGenerator works similarly to xml.sax.saxutils.XMLGenerator, but has a few key differences:
- It supports only namespace-conforming XML documents.
- It automatically chooses namespace prefixes if a namespace has not been declared, and avoids using prefixes at all where possible.
- It is in general stricter on (explicit) namespace declarations, to avoid ambiguities.
- It always uses utf-8 ☺
- It allows explicit flushing.
out must be a file-like object supporting both file.write() and file.flush().
If short_empty_elements is true, empty elements are rendered as <foo/> instead of <foo></foo>, unless a flush occurs before the call to endElementNS(). In that case the opening tag is finished before flushing, so the long form is generated.
If sorted_attributes is true, attributes are emitted in the lexical order of their qualified names (except for namespace declarations, which are always sorted and always come before the normal attributes). The default is not to do this, for performance. During testing, however, it is useful to have a consistent order on the attributes.
All characters in additional_escapes are escaped using XML entities. Note that <, > and & are always escaped. additional_escapes is converted to a dictionary for use with escape() and quoteattr(). Passing a dictionary to additional_escapes or passing multi-character strings as elements of additional_escapes is not supported since it may be (ab-)used to create invalid XMPP XML. additional_escapes affects both CDATA in XML elements and attribute values.
Implementation of the SAX content handler interface (see xml.sax.handler.ContentHandler):
-
startDocument()
Start the document. This method must be called before any other content handler method.
-
startPrefixMapping(prefix, uri)
Start a prefix mapping which maps the given prefix to the given uri.
Note that prefix mappings are handled transactionally. All announcements of prefix mappings are collected until the next call to startElementNS(). At that point, the collected mappings take effect and override the previously declared mappings until the corresponding endElementNS() call.
Also note that calling startPrefixMapping() is not mandatory; you can use any namespace you like at any time. If you use a namespace whose URI has not been associated with a prefix yet, a free prefix will automatically be chosen. To avoid unnecessary performance penalties, do not use prefixes of the form "ns{:d}".format(n), for any non-negative integer n.
It is however required to call endPrefixMapping() after an endElementNS() call for all namespaces which have been announced directly before the startElementNS() call (except for those which have been chosen automatically). Not doing so will result in a RuntimeError at the next startElementNS() or endElementNS() call.
During a transaction, it is not allowed to declare the same prefix multiple times.
-
startElementNS(name, qname, attributes=None)
Start a sub-element. name must be a tuple of (namespace_uri, localname) and qname is ignored. attributes must be a dictionary mapping attribute tag tuples ((namespace_uri, attribute_name)) to string values. To use unnamespaced attributes, namespace_uri can be false (e.g. None or the empty string).
To use unnamespaced elements, namespace_uri in name must be false and no namespace without prefix may be currently active. If a namespace without prefix is active and namespace_uri in name is false, ValueError is raised.
Attribute values are automatically escaped.
-
characters(chars)
Put character data in the currently open element. Special characters (such as <, > and &) are escaped.
If chars contains any ASCII control character, ValueError is raised.
-
endElementNS(name, qname)
End a previously started element. name must be a (namespace_uri, localname) tuple and qname is ignored.
-
endPrefixMapping(prefix)
End a prefix mapping declared with startPrefixMapping(). See there for more details.
-
endDocument()
This must be called at the end of the document. Note that this does not call flush().
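The call sequence described above can be sketched with the standard library's xml.sax.saxutils.XMLGenerator, the class XMPPXMLGenerator is modelled on. This is only an illustration of the SAX event order (prefix mapping, element start, character data, element end, prefix mapping end); it does not use aioxmpp, and the function name render_message, the jabber:client namespace choice and the address are assumptions made for the example.

```python
import io
from xml.sax.saxutils import XMLGenerator


def render_message(body_text):
    """Emit a tiny namespaced document through the SAX content handler API."""
    out = io.BytesIO()
    # Standard-library generator, used here as a stand-in for illustration.
    gen = XMLGenerator(out, encoding="utf-8", short_empty_elements=True)
    ns = "jabber:client"  # assumed namespace for the example

    gen.startDocument()
    # Announce the mapping before the element it applies to.
    gen.startPrefixMapping(None, ns)
    gen.startElementNS((ns, "message"), None, {(None, "to"): "a@example.test"})
    gen.startElementNS((ns, "body"), None, {})
    gen.characters(body_text)  # special characters are escaped
    gen.endElementNS((ns, "body"), None)
    gen.endElementNS((ns, "message"), None)
    # End the mapping after the element which carried it has been closed.
    gen.endPrefixMapping(None)
    gen.endDocument()
    return out.getvalue()


if __name__ == "__main__":
    print(render_message("1 < 2").decode("utf-8"))
```

With XMPPXMLGenerator the same sequence of calls applies, subject to the stricter rules listed above.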
The following SAX content handler methods have deliberately not been implemented:
-
setDocumentLocator(locator)
Not supported; there is no use case. Raises NotImplementedError.
-
skippedEntity(name)
Not supported; there is no use case. Raises NotImplementedError.
-
ignorableWhitespace(whitespace)
Not supported; could be mapped to characters().
-
startElement(name, attributes=None)
Not supported; only elements with proper namespacing are supported by this generator.
-
endElement(name)
Not supported; only elements with proper namespacing are supported by this generator.
These methods produce content which is invalid in XMPP XML streams and thus always raise ValueError:
-
processingInstruction(target, data)
Not supported; explicitly forbidden in XMPP. Raises ValueError.
In addition to the SAX content handler interface, the following methods are provided:
-
flush()
Call flush() on the object passed to the out argument of the constructor. In addition, any unfinished opening tags are finished, which can lead to expansion of the generated XML code (see the note on the short_empty_elements argument in the class documentation).
-
buffer()
Context manager to temporarily buffer the output.
Raises:
- RuntimeError – If two buffer() context managers are nested.
If the context manager is left without exception, the buffered output is sent to the actual sink. Otherwise, it is discarded.
In addition to the output being buffered, buffer() also captures the entire state of the XML generator and restores it to the previous state if the context manager is left with an exception.
This can be used to fail-safely attempt to serialise a subtree and return to a well-defined state if serialisation fails.
flush() is not called automatically.
If flush() is called while a buffer() context manager is active, no actual flushing happens (but unfinished opening tags are closed as usual, see the short_empty_elements parameter).
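The commit-or-discard behaviour of buffer() can be pictured with a minimal, standard-library-only sketch. The class BufferedSink below is hypothetical and is not how aioxmpp implements it; in particular it only buffers bytes and does not capture or restore any generator state.

```python
import contextlib
import io


class BufferedSink:
    """Hypothetical byte sink with a transactional buffer() context manager."""

    def __init__(self, out):
        self._out = out
        self._buf = None  # set to a BytesIO while buffering is active

    def write(self, data):
        # Route writes to the temporary buffer while one is active.
        target = self._buf if self._buf is not None else self._out
        target.write(data)

    @contextlib.contextmanager
    def buffer(self):
        if self._buf is not None:
            raise RuntimeError("buffer() must not be nested")
        buf = self._buf = io.BytesIO()
        try:
            yield
        except BaseException:
            # Left with an exception: discard everything that was buffered.
            self._buf = None
            raise
        # Left cleanly: forward the buffered bytes to the real sink.
        self._buf = None
        self._out.write(buf.getvalue())


if __name__ == "__main__":
    sink_out = io.BytesIO()
    sink = BufferedSink(sink_out)
    with sink.buffer():
        sink.write(b"<ok/>")
    print(sink_out.getvalue())
```

The same pattern underlies any "all or nothing" write: nothing reaches the real sink until the block has finished without raising.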
-
class aioxmpp.xml.XMLStreamWriter(f, to, from_=None, version=(1, 0), nsmap={}, sorted_attributes=False)
A convenient class to write a standard-conforming XML stream.
Parameters:
- f – File-like object to write to.
- to (aioxmpp.JID) – Address to which the connection is addressed.
- from_ (aioxmpp.JID) – Optional address from which the connection originates.
- version (tuple of (int, int)) – Version of the XML stream protocol.
- nsmap – Mapping of namespaces to declare at the stream header.
Note
The constructor does not send a stream header. start() must be called explicitly to send a stream header.
The generated stream header follows RFC 6120 and has the to and version attributes as well as, optionally, the from attribute (controlled by from_). In addition, the namespace prefixes defined by nsmap (mapping prefixes to namespace URIs) are declared on the stream header.
Note
Namespace prefixes declared in the stream header unfortunately must not be used in stanzas, convenient as that would be. The option is thus only useful to declare the default namespace for stanzas.
The following methods are used to generate output:
-
send(xso)
Send a single XML stream object.
Parameters:
- xso (aioxmpp.xso.XSO) – Object to serialise and send.
Raises:
- Exception – from any serialisation errors, usually ValueError.
Serialise the xso and send it over the stream. If any serialisation error occurs, no data is sent over the stream and the exception is re-raised; the send() method thus provides strong exception safety.
Processing XML streams
To convert streams of SAX events to XSO instances, the following classes and functions can be used:
-
class aioxmpp.xml.XMPPXMLProcessor
This class is a xml.sax.handler.ContentHandler. It can be used to parse an XMPP XML stream.
When used with a xml.sax.xmlreader.XMLReader, it gradually processes the incoming XML stream. If any restricted XML is encountered, an appropriate StreamError is raised.
Warning
To achieve compliance with XMPP, it is recommended to use XMPPLexicalHandler as lexical handler, using xml.sax.xmlreader.XMLReader.setProperty():
parser.setProperty(xml.sax.handler.property_lexical_handler, XMPPLexicalHandler)
Otherwise, invalid XMPP XML such as comments, entity references and DTD declarations will not be caught.
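To see what attaching a lexical handler changes, here is a small sketch using only the standard library's expat-based SAX parser. RejectingLexicalHandler is a hypothetical stand-in written for this example, not aioxmpp's XMPPLexicalHandler, and it only covers comments and DTD declarations; is_accepted is likewise an illustrative helper.

```python
import io
import xml.sax
import xml.sax.handler


class RejectingLexicalHandler:
    """Hypothetical strict lexical handler (illustration only)."""

    def comment(self, content):
        raise ValueError("comments are not allowed")

    def startDTD(self, name, public_id, system_id):
        raise ValueError("DTD declarations are not allowed")

    def endDTD(self):
        pass

    def startCDATA(self):
        pass

    def endCDATA(self):
        pass


def is_accepted(data):
    """Return True if data parses without tripping the lexical handler."""
    parser = xml.sax.make_parser()
    parser.setFeature(xml.sax.handler.feature_namespaces, True)
    # Without this property, comments and DTDs would pass silently.
    parser.setProperty(
        xml.sax.handler.property_lexical_handler,
        RejectingLexicalHandler(),
    )
    try:
        parser.parse(io.BytesIO(data))
    except ValueError:
        return False
    return True


if __name__ == "__main__":
    print(is_accepted(b"<a>text</a>"))
    print(is_accepted(b"<a><!-- hidden --></a>"))
```

The sketch passes an instance; the documentation above passes the XMPPLexicalHandler class itself, which works there because its methods are stateless.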
Exception handling: When an exception occurs while parsing a stream-level element, such as a stanza, the exception is stored internally and exception handling is invoked. During exception handling, all SAX events are dropped until the stream-level element has been completely processed by the parser. Then, if available, on_exception is called with the stored exception as the only argument. If on_exception is false (e.g. None), the exception is re-raised from the endElementNS() handler, in turn most likely destroying the SAX parser's internal state.
-
on_exception
May be a callable or None. If not false, the value will get called when exception handling has finished, with the exception as the only argument.
May be a callable or None. If not false, the value will get called whenever a stream footer is processed.
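The deferred exception handling described for XMPPXMLProcessor can be illustrated with a simplified, standard-library-only content handler. DeferringHandler, its seen list, the rule that an element named bad fails, and the helper run are all assumptions invented for this sketch; the real class processes XSOs and raises stream errors instead.

```python
import io
import xml.sax
import xml.sax.handler


class DeferringHandler(xml.sax.handler.ContentHandler):
    """Hypothetical handler: store an error, drop events, report at element end."""

    def __init__(self, on_exception=None):
        super().__init__()
        self.on_exception = on_exception
        self.seen = []       # local names processed successfully
        self._depth = 0      # 1 = stream root, 2 = stream-level element
        self._stored = None  # exception waiting to be reported

    def _process(self, name):
        if name[1] == "bad":  # made-up failure condition
            raise ValueError("unsupported element: bad")
        self.seen.append(name[1])

    def startElementNS(self, name, qname, attrs):
        self._depth += 1
        if self._stored is not None:
            return  # exception handling active: drop the event
        try:
            self._process(name)
        except ValueError as exc:
            self._stored = exc

    def endElementNS(self, name, qname):
        self._depth -= 1
        if self._depth == 1 and self._stored is not None:
            # The stream-level element is complete: report the error.
            exc, self._stored = self._stored, None
            if self.on_exception:
                self.on_exception(exc)
            else:
                raise exc


def run(data, on_exception=None):
    handler = DeferringHandler(on_exception)
    parser = xml.sax.make_parser()
    parser.setFeature(xml.sax.handler.feature_namespaces, True)
    parser.setContentHandler(handler)
    parser.parse(io.BytesIO(data))
    return handler.seen


if __name__ == "__main__":
    errors = []
    print(run(b"<stream><msg><bad/><x/></msg><ok/></stream>", errors.append))
    print(errors)
```

With a callback installed, parsing continues with the next stream-level element; without one, the exception escapes from the end-of-element handler and aborts the parse.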
-
-
class aioxmpp.xml.XMPPLexicalHandler
A lexical handler which rejects certain contents which are invalid in an XMPP XML stream:
- comments,
- DTD declarations,
- non-predefined entities.
The class can be used as lexical handler directly; all methods are stateless and can be used both on the class and on objects of the class.
-
aioxmpp.xml.make_parser()
Create a parser which is suitably configured for parsing an XMPP XML stream. It comes equipped with XMPPLexicalHandler.
Utility functions
-
aioxmpp.xml.serialize_single_xso(x)
Serialize a single XSO x to a string. This is potentially very slow and should only be used for debugging purposes. It is generally more efficient to use an XMPPXMLGenerator to stream elements.
-
aioxmpp.xml.write_single_xso(x, dest)
Write a single XSO x to a binary file-like object dest.
-
aioxmpp.xml.read_xso(src, xsomap)
Read a single XSO from a binary file-like input src containing an XML document.
xsomap must be a mapping which maps XSO subclasses to callables. These will be registered at a newly created xso.XSOParser instance which will be used to parse the document in src.
The xsomap is thus used to determine the class parsing the root element of the XML document. This can be used to support multiple versions.