This subpackage deals with XML Stream Objects (XSOs). An XSO can be a stanza, but more generally it is anything which is sent after the XML stream header.
The facilities in this subpackage are meant to help developers of XEP plugins as well as the main development of aioxmpp. The subpackage is split into two parts. aioxmpp.xso.model provides facilities for declarative-style parsing and un-parsing of XML subtrees into XSOs. aioxmpp.xso.types provides classes which implement validators and type parsers for content represented as strings in XML.
An XSO is an object whose class inherits from aioxmpp.xso.XSO.
Tags, as used by etree, are used throughout this module. Note that we are representing tags as tuples of (namespace_uri, localname), where namespace_uri may be None.
See also
The functions normalize_tag() and tag_to_str() are useful to convert from and to ElementTree compatible strings.
This module uses suspendable functions, implemented as generators, at several points. These may also be called coroutines, but have nothing to do with coroutines as used by asyncio, which is why we will call them suspendable functions here.
Suspendable functions possibly take arguments and then operate on input which is fed to them in a push-manner step by step (using the send() method). The main usage in this module is to process SAX events: The SAX events are processed step-by-step by the functions, and when the event is fully processed, it suspends itself (using yield) until the next event is sent into it.
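The push-style processing described above can be illustrated with plain generators. The following is a minimal sketch and not code from this module: the event tuples ("start", "text", "end") and the names count_elements and drive are assumptions made for the example.

```python
def count_elements():
    """Suspendable function: count "start" events until the outermost element ends."""
    depth = 0
    count = 0
    while True:
        # Suspend here until the driver sends the next event.
        ev_type, *ev_args = yield
        if ev_type == "start":
            depth += 1
            count += 1
        elif ev_type == "end":
            depth -= 1
            if depth == 0:
                # The return value travels to the driver in StopIteration.value.
                return count


def drive(gen, events):
    """Push events into a suspendable function and return its result."""
    next(gen)  # run up to the first yield so that send() is accepted
    for event in events:
        try:
            gen.send(event)
        except StopIteration as exc:
            return exc.value
    raise RuntimeError("event stream ended before the function returned")


events = [
    ("start", None, "a", {}),
    ("start", None, "b", {}),
    ("end",),
    ("text", "hello"),
    ("end",),
]
result = drive(count_elements(), events)
print(result)
```

The driver never pulls data out of the generator between events; it only pushes events in and picks up the final result when the generator returns.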
Normalize an XML element tree tag into the tuple format. The following input formats are accepted:
Return a two-tuple in the (namespace_uri, localpart) format.
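To make the tuple representation concrete, here is a small sketch of converting between ElementTree-style strings ("{namespace}localname") and (namespace_uri, localname) tuples. The helpers split_etree_tag and join_etree_tag are hypothetical names written for this illustration; they are not normalize_tag() or tag_to_str() and may accept different inputs than those functions do.

```python
def split_etree_tag(tag):
    """Split an ElementTree-style tag string into (namespace_uri, localname)."""
    if tag.startswith("{"):
        namespace_uri, sep, localname = tag[1:].partition("}")
        if not sep:
            raise ValueError("unbalanced braces in tag: {!r}".format(tag))
        return namespace_uri, localname
    # No namespace part: the namespace is represented as None.
    return None, tag


def join_etree_tag(tag):
    """Turn a (namespace_uri, localname) tuple back into an ElementTree string."""
    namespace_uri, localname = tag
    if namespace_uri:
        return "{{{}}}{}".format(namespace_uri, localname)
    return localname


message_tag = split_etree_tag("{jabber:client}message")
plain_tag = split_etree_tag("body")
print(message_tag, plain_tag, join_etree_tag(message_tag))
```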
This module provides facilities to create classes which map to full XML stream subtrees (for example stanzas including payload).
To create such a class, derive from XSO and provide attributes using the Attr, Text, Child and ChildList descriptors.
The following descriptors can be used to load XSO attributes from XML. There are two fundamentally different descriptor types: scalar and non-scalar (e.g. list) descriptors. Scalar descriptor types always accept a value of None, which represents the absence of the object (unless it is required by some means, e.g. Attr(required=True)). Non-scalar descriptors generally have a different way to describe the absence and in addition have a mutable value. Assignment to the descriptor attribute is strictly type-checked.
Many of the arguments and attributes used for the scalar descriptors are similar. They are described in detail on the Attr class and are not repeated in the same detail on the other classes; refer to the documentation of the Attr class in those cases.
When assigned to a class’ attribute, it binds that attribute to the XML attribute with the given tag. tag must be a valid input to normalize_tag().
The following arguments occur at several of the descriptor classes, and are all available at Attr.
Note
The default argument does not need to comply with either type_ or validator. This can be used to convey meaning with the absence of the attribute. Note that assigning the default value is not possible if it does not comply with type_ or validator and the del operator must be used instead.
Convert the given value using the configured type_ and store it in the corresponding attribute of instance.
Handle a missing attribute on instance. This is called whenever no value for the attribute is found during parsing. The call to missing() is independent of the value of required.
If the missing callback is not None, it is called with the instance and the ctx as arguments. If the returned value is not None, it is used as the value of the attribute (validation takes place as if the value had been set from the code, not as if the value had been received from XML) and the handler returns.
If the missing callback is None or returns None, the handling continues as normal: if required is true, a ValueError is raised.
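The decision flow for a missing attribute can be sketched as a standalone function. This is an illustration only, not the implementation of the descriptor: the function name resolve_missing, the Context class and the fallback to a default value when the attribute is not required are assumptions.

```python
def resolve_missing(instance, ctx, *, missing=None, required=False, default=None):
    """Decide which value to use when an attribute was absent from the XML."""
    if missing is not None:
        value = missing(instance, ctx)
        if value is not None:
            # The callback supplied a replacement value; stop here.
            return value
    # No callback, or the callback declined by returning None.
    if required:
        raise ValueError("missing required attribute")
    # Assumption for this sketch: fall back to the configured default.
    return default


class Context:
    """Hypothetical parsing context which carries an inherited language."""
    lang = "en"


def inherit_lang(instance, ctx):
    # Mimics the idea of inheriting xml:lang from the surrounding context.
    return ctx.lang


inherited = resolve_missing(object(), Context(), missing=inherit_lang, required=True)
fallback = resolve_missing(object(), Context(), default="fallback")
print(inherited, fallback)
```

Note that the callback is consulted before required is checked, so a required attribute does not fail if the callback can provide a value.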
The LangAttr is identical to Attr, except that the type_, tag and missing arguments are already bound. The tag is set to the (namespaces.xml, "lang") value to match xml:lang attributes. type_ is a xso.LanguageTag instance and missing is set to lang_attr().
When assigned to a class’ attribute, it collects any child which matches any XSO.TAG of the given classes.
The tags among the classes must be unique, otherwise ValueError is raised on construction.
Unlike Attr, Child does not take a default argument and only supports required: if required is a false value (the default), a missing child is tolerated and None is a valid value for the described attribute. Otherwise, a missing matching child is an error and the attribute cannot be set to None.
Return a dictionary mapping the tags of the supported classes to the classes themselves. Can be used to obtain a set of supported tags.
Detect the object to instantiate from the arguments ev_args of the "start" event. The new object is stored at the corresponding descriptor attribute on instance.
This method is suspendable.
When assigned to a class’ attribute, this descriptor represents the presence or absence of a single child with a tag from a given set of valid tags.
tags must be an iterable of valid arguments to normalize_tag(). If normalize_tag() returns a false value (such as None) as namespace_uri, it is replaced with default_ns (which defaults to None, in which case the replacement has no effect). This improves readability if you have many tags which share the same namespace.
text_policy, child_policy and attr_policy describe the behaviour if the child element unexpectedly has text, children or attributes, respectively. The default for each is to fail with a ValueError.
If allow_none is True, assignment of None to the attribute to which this descriptor belongs is allowed and represents the absence of the child element.
If declare_prefix is not False (note that None is a valid, non-False value in this context!), the namespace is explicitly declared using the given prefix when serializing to SAX.
When assigned to a class’ attribute, it binds that attribute to the XML character data of a child element with the given tag. tag must be a valid input to normalize_tag().
The type_, validate, validator and default arguments behave like in Attr.
child_policy is applied when from_events() encounters an element in the child element of which it is supposed to extract text. Likewise, attr_policy is applied if an attribute is encountered on the element.
declare_prefix works as for ChildTag.
Starting with the element to which the start event information in ev_args belongs, parse text data. If any children are encountered, child_policy is enforced (see UnknownChildPolicy). Likewise, if the start event contains attributes, attr_policy is enforced (c.f. UnknownAttrPolicy).
The extracted text is passed through type_ and validator and if it passes, stored in the attribute on the instance with which the property is associated.
This method is suspendable.
When assigned to a class’ attribute, it collects all character data of the XML element.
Note that this destroys the relative ordering of child elements and character data pieces. This is known and a WONTFIX, as it is not required in XMPP to keep that relative order: Elements either have character data or other elements as children.
The type_, validator, validate and default arguments behave like in Attr.
Convert the given value using the configured type_ and store it in the corresponding attribute of instance.
The ChildList works like Child, with two key differences:
Like Child.from_events(), but instead of replacing the attribute value, the new object is appended to the list.
Like Child.to_node(), but instead of serializing a single object, all objects in the list are serialized.
The ChildMap class works like ChildList, but instead of storing the child objects in a list, they are stored in a map which contains an XSOList of objects for each tag.
key may be callable. If it is given, it is used while parsing to determine the dictionary key under which a newly parsed XSO will be put. For that, the key callable is called with the newly parsed XSO as the only argument and is expected to return the key.
Like ChildList.from_events(), but the object is appended to the list associated with its tag in the dict.
Serialize all objects in the dict associated with the descriptor at instance to the given parent.
The order of elements within a tag is preserved; the order of the tags relative to each other is undefined.
The following utility function is useful when filling data into descriptors using this class:
Take an iterable of items and group it into the given dest dict, using the key function.
The dest dict must either already contain the keys which are generated by the key function for the items in items, or must default them suitably. The values of the affected keys must be sequences or objects with an append() method which does what you want it to do.
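The grouping behaviour described above can be sketched in a few lines. The function name group_into is a placeholder chosen for this example (the name of the actual utility is not shown here), and a collections.defaultdict is used as one way to make the destination dict "default suitably".

```python
import collections


def group_into(items, dest, key):
    """Append each item to the sequence stored under key(item) in dest."""
    for item in items:
        dest[key(item)].append(item)


words = ["apple", "avocado", "banana", "blueberry", "cherry"]
# defaultdict(list) creates an empty list for every key on first access.
groups = collections.defaultdict(list)
group_into(words, groups, key=lambda word: word[0])
print(dict(groups))
```

With a plain dict as dest, every key produced by the key function would have to be present before the call, otherwise the lookup fails with KeyError.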
The ChildLangMap class is a specialized version of the ChildMap, which uses a key function to group the children by their XML language tag.
It is expected that the language tag is available as lang attribute on the objects stored in this map.
When assigned to a class’ attribute, it collects all children which are not known to any other descriptor into a list of XML subtrees.
The default is fixed at an empty list.
Collect the events and convert them to a single XML subtree, which then gets appended to the list at instance. ev_args must be the arguments of the "start" event of the new child.
This method is suspendable.
The child lists in ChildList, ChildMap and ChildLangMap descriptors use a specialized list-subclass which provides advanced capabilities for filtering XSO objects.
A list subclass; it provides the complete list interface with the addition of the following methods:
Return an iterable which produces a sequence of the elements inside this XSOList, filtered by the criteria given as arguments. The function starts with a working sequence consisting of the whole list.
If type_ is not None, elements which are not an instance of the given type are excluded from the working sequence.
If lang is not None, it must be either a LanguageRange or an iterable of language ranges. The set of languages present among the working sequence is determined and used for a call to lookup_language. If the lookup returns a language, all elements whose lang is different from that value are excluded from the working sequence.
Note
If an iterable of language ranges is given, it is evaluated into a list. This may be of concern if a huge iterable is about to be used for language ranges, but it is a requirement of the lookup_language function which is used under the hood.
Note
Filtering by language assumes that the elements have a LangAttr descriptor named lang.
If attrs is not empty, the filter iterates over each key-value pair. For each iteration, all elements which do not have an attribute of the name in key or where that attribute has a value not equal to value are excluded from the working sequence.
In general, the iterable returned from filter() can only be used once. It is dynamic in the sense that changes to elements which are in the list behind the last element returned from the iterator will still be picked up when the iterator is resumed.
This method is a convenience wrapper around filter() which evaluates the result into a list and returns that list.
In the future, methods to add indices to XSOList instances may be added; right now, there is no need for the huge complexity which would arise from keeping the indices up-to-date with changes in the elements' attributes.
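The lazy, single-use nature of the filtered iterable can be shown with a simplified stand-in. The sketch below is not the XSOList implementation: filter_items, Entry and Marker are hypothetical names, and filtering by language is left out entirely.

```python
def _has_attr_value(items, key, value):
    """Yield only items which have attribute `key` equal to `value`."""
    for item in items:
        if hasattr(item, key) and getattr(item, key) == value:
            yield item


def filter_items(items, *, type_=None, attrs=None):
    """Lazily narrow down `items` by type and by attribute values."""
    working = iter(items)
    if type_ is not None:
        working = (item for item in working if isinstance(item, type_))
    for key, value in (attrs or {}).items():
        working = _has_attr_value(working, key, value)
    return working


class Entry:
    def __init__(self, kind, size):
        self.kind = kind
        self.size = size


class Marker:
    pass


items = [Entry("a", 1), Marker(), Entry("b", 1), Entry("a", 2)]
lazy = filter_items(items, type_=Entry, attrs={"kind": "a"})
first = next(lazy)
# Elements added behind the current position are still picked up.
items.append(Entry("a", 3))
rest = list(lazy)
print(first.size, [entry.size for entry in rest])
```

Once exhausted, the iterable yields nothing further, which is why a wrapper that evaluates the result into a list is convenient when the matches are needed more than once.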
To parse XSOs, an asynchronous approach which uses SAX-like events is followed. For this, the suspendable functions explained earlier are used. The main class to parse a XSO from events is XSOParser. To drive that suspendable callable from SAX events, use a SAXDriver.
A generic XSO parser which supports a dynamic set of XSOs to parse. XSOParser objects are callable and behave like suspendable functions: calling an XSOParser returns a generator which parses stanzas from SAX-like events. Use it with SAXDriver.
Example use:
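# let Message be a XSO class, like in the XSO example
import lxml.etree
import lxml.sax
import aioxmpp.xso

result = None

def catch_result(value):
    # rebinds the module-level name; use nonlocal inside a function
    global result
    result = value

parser = aioxmpp.xso.XSOParser()
parser.add_class(Message, catch_result)
sd = aioxmpp.xso.SAXDriver(parser)
# saxify() needs the content handler which receives the events;
# the namespace of the element must match Message.TAG
lxml.sax.saxify(lxml.etree.fromstring(
    "<message xmlns='jabber:client' id='foo' from='bar' type='chat' />"
), sd)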
The following methods can be used to dynamically add and remove top-level XSO classes.
Add a class cls for parsing as root level element. When an object of cls type has been completely parsed, callback is called with the object as argument.
Remove a XSO class cls from parsing. This method raises KeyError with the class's TAG attribute as argument if removing fails because the class is not registered.
Return the internal mapping which maps tags to tuples of (cls, callback).
Warning
The results of modifying this dict are undefined. Make a copy if you need to modify the result of this function.
This is a xml.sax.handler.ContentHandler subclass which only supports namespace-conforming SAX event sources.
dest_generator_factory must be a function which returns a new suspendable method supporting the interface of XSOParser. The SAX events are converted to an internal event format and sent to the suspendable function in order.
on_emit may be a callable. Whenever a suspendable function returned by dest_generator_factory returns, on_emit is called with the return value as sole argument.
When you are done with a SAXDriver, you should call close() to clean up internal parser state.
Clean up all internal state.
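The interplay between a SAX content handler and a suspendable function can be sketched with the standard library alone. This is a simplified illustration under assumptions, not the SAXDriver implementation: the class MiniDriver, the generator collect_text and the event tuples ("start", "text", "end") are made up for the example.

```python
import io
import xml.sax
import xml.sax.handler


class MiniDriver(xml.sax.handler.ContentHandler):
    """Convert namespace-aware SAX callbacks into events for a generator."""

    def __init__(self, dest_generator_factory, on_emit=None):
        super().__init__()
        self._factory = dest_generator_factory
        self._on_emit = on_emit
        self._dest = None

    def _send(self, event):
        if self._dest is None:
            self._dest = self._factory()
            next(self._dest)  # advance to the first yield
        try:
            self._dest.send(event)
        except StopIteration as exc:
            # The suspendable function returned: hand the result to on_emit.
            self._dest = None
            if self._on_emit is not None:
                self._on_emit(exc.value)

    def startElementNS(self, name, qname, attrs):
        namespace_uri, localname = name
        self._send(("start", namespace_uri, localname, dict(attrs.items())))

    def endElementNS(self, name, qname):
        self._send(("end",))

    def characters(self, content):
        self._send(("text", content))

    def close(self):
        if self._dest is not None:
            self._dest.close()
            self._dest = None


def collect_text():
    """Suspendable function: gather all character data of one element."""
    depth = 0
    parts = []
    while True:
        ev_type, *ev_args = yield
        if ev_type == "start":
            depth += 1
        elif ev_type == "text":
            parts.append(ev_args[0])
        elif ev_type == "end":
            depth -= 1
            if depth == 0:
                return "".join(parts)


emitted = []
driver = MiniDriver(collect_text, on_emit=emitted.append)
parser = xml.sax.make_parser()
parser.setFeature(xml.sax.handler.feature_namespaces, True)
parser.setContentHandler(driver)
parser.parse(io.BytesIO(b"<body xmlns='jabber:client'>Hello <b>World</b></body>"))
driver.close()
print(emitted)
```

Namespace processing is switched on explicitly because the handler only implements the namespace-aware callbacks (startElementNS and endElementNS).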
The XSO base class makes use of the XMLStreamClass metaclass and provides implementations for utility methods. For an object to work with this module, it must derive from XSO or provide an identical interface.
XSO is short for XML Stream Object and means an object which represents a subtree of an XML stream. These objects can also be created and validated on-the-fly from SAX-like events using XSOParser.
The constructor does not require any arguments and forwards any arguments it receives directly to the next class in the resolution order. Note that during deserialization, __init__ is not called. It is assumed that all data is loaded from the XML stream and thus no initialization is required.
This is beneficial to applications, as it allows them to define mandatory arguments for __init__. This would not be possible if __init__ was called during deserialization. A way to execute code after successful deserialization is provided through xso_after_load().
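The Python mechanism which makes this possible can be shown in isolation. The sketch below only illustrates creating an object without running __init__ and then invoking a post-load hook; the names Greeting, after_load and load_without_init are hypothetical and the code is not taken from the XSO implementation.

```python
class Greeting:
    def __init__(self, text):
        # Mandatory argument, convenient for application code.
        self.text = text
        self.created_by = "application"

    def after_load(self):
        """Hypothetical hook, comparable in spirit to xso_after_load()."""
        self.created_by = "parser"


def load_without_init(cls, fields):
    """Create an instance without calling __init__, then fill in its fields."""
    obj = cls.__new__(cls)  # allocates the object but skips __init__
    for name, value in fields.items():
        setattr(obj, name, value)
    obj.after_load()
    return obj


loaded = load_without_init(Greeting, {"text": "hi"})
constructed = Greeting("hello")
print(loaded.text, loaded.created_by, constructed.created_by)
```

Because __init__ is bypassed, any state it would normally set up has to come from the loaded data or from the post-load hook.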
XSO objects support copying. As with deserialization, __init__ is not called during copy. The default implementation only copies the values of the XSO descriptors (with deepcopy, they are copied deeply). If you have more attributes to copy, you need to override the __copy__ and __deepcopy__ methods.
Changed in version 0.4: Copy and deepcopy support has been added. Previously, copy copied too little data, while deepcopy copied too much data (including descriptor objects).
To declare an XSO, inherit from XSO and provide the following attributes on your class:
See also
The documentation of xso.model.XMLStreamClass holds valuable information with respect to subclassing and modifying XSO subclasses, as well as restrictions on the use of the said attribute descriptors.
Note
Attributes whose name starts with xso_ are reserved for use by the XSO implementation. Do not use these in your code if you can possibly avoid it.
To further influence the parsing behaviour of a class, two attributes are provided which give policies for unexpected elements in the XML tree:
A value from the UnknownChildPolicy enum which defines the behaviour if a child is encountered for which no matching attribute is found.
Note that this policy has no effect if a Collector descriptor is present, as it takes all children for which no other descriptor exists, thus all children are known.
A value from the UnknownAttrPolicy enum which defines the behaviour if an attribute is encountered for which no matching descriptor is found.
Example:
import aioxmpp.xso

class Body(aioxmpp.xso.XSO):
    TAG = ("jabber:client", "body")

    text = aioxmpp.xso.Text()

class Message(aioxmpp.xso.XSO):
    TAG = ("jabber:client", "message")
    UNKNOWN_CHILD_POLICY = aioxmpp.xso.UnknownChildPolicy.DROP

    type_ = aioxmpp.xso.Attr(tag="type", required=True)
    from_ = aioxmpp.xso.Attr(tag="from", required=True)

    to = aioxmpp.xso.Attr(tag="to")
    id_ = aioxmpp.xso.Attr(tag="id")

    body = aioxmpp.xso.Child([Body])
Beyond the validation of the individual descriptor values, it is possible to implement more complex validation steps by overriding the validate() method:
Validate the object's structure beyond the values of individual fields (which have their own validators).
This first calls _PropBase.validate_contents() recursively on the values of all child descriptors. These may raise (or re-raise) errors which occur during validation of the child elements.
To implement your own validation logic in a subclass of XSO, override this method and call it via super() before doing your own validation.
Validate is called by the parsing stack after an object has been fully deserialized from the SAX event stream. If the deserialization fails due to invalid values of descriptors or due to validation failures in child objects, this method is obviously not called.
The following methods are available on instances of XSO:
The following class methods are provided by the metaclass:
Create an instance of this class, using the events sent into this function. ev_args must be the event arguments of the "start" event.
See also
You probably should not call this method directly, but instead use XSOParser with a SAXDriver.
Note
While this method creates an instance of the class, __init__ is not called. See the documentation of xso.XSO() for details.
This method is suspendable.
Register a new XMLStreamClass instance child_cls for a given Child descriptor prop.
Warning
This method cannot be used after a class has been derived from this class. This is for consistency: the method modifies the bookkeeping attributes of the class. There would be two ways to deal with the situation:
Obviously, (2) is bad, which is why it is not supported anymore. (1) might be supported at some point in the future.
Attempting to use register_child() on a class which already has subclasses results in a TypeError.
Note that first using register_child() and only then deriving classes is a valid use: it will still lead to a consistent inheritance hierarchy and is a convenient way to break reference cycles (e.g. if an XSO may be its own child).
To customize behaviour of deserialization, these methods are provided which can be re-implemented by subclasses:
After an object has been successfully deserialized, this method is called. Note that __init__ is never called on objects during deserialization.
This method is called whenever an error occurs while parsing.
If an exception is raised by the parsing function of a descriptor attribute, such as Attr, the descriptor is passed as first argument, the arguments which led to the descriptor being invoked as second argument and the exc_info tuple as third argument.
If an unknown child is encountered and the UNKNOWN_CHILD_POLICY is set to UnknownChildPolicy.FAIL, descriptor and exc_info are passed as None and ev_args are the arguments to the "start" event of the child (i.e. a triple (namespace_uri, localname, attributes)).
If the error handler wishes to suppress the exception, it must return a true value. Otherwise, the exception is propagated (or a new exception is raised, if the error was not caused by an exception). The error handler may also raise its own exception.
Warning
Suppressing exceptions can cause invalid input to reside in the object or the object in general being in a state which violates the schema.
For example, suppressing exceptions about missing attributes will cause the attribute to remain uninitialized (i.e. left at its default value).
Even if the error handler suppresses an exception caused by a broken child, that child will not be added to the object.
The metaclass takes care of collecting the special descriptors in attributes where they can be used by the SAX event interpreter to fill the class with data. It also provides a class method for late registration of child classes.
There should be no need to use this metaclass directly when implementing your own XSO classes. Instead, derive from XSO.
The following restrictions apply when a class uses the XMLStreamClass metaclass:
Objects of this metaclass (i.e. classes) have some useful attributes. The following attributes are gathered from the namespace of the class, by collecting the different XSO-related descriptors:
The Text descriptor object associated with this class. This is None if no attribute using that descriptor is declared on the class.
The Collector descriptor object associated with this class. This is None if no attribute using that descriptor is declared on the class.
A dictionary mapping element tags to the Child (or similar) descriptor objects which accept these child elements.
A dictionary which defines the namespace mappings which shall be declared when serializing this element. It must map namespace prefixes (such as None or "foo") to namespace URIs.
For maximum compatibility with legacy XMPP implementations (I’m looking at you, ejabberd!), DECLARE_NS is set by this metaclass unless it is provided explicitly when declaring the class:
Warning
It is discouraged to use namespace prefixes of the format "ns{:d}".format(n), for any given number n. These prefixes are reserved for ad-hoc namespace declarations, and attempting to use them may have unwanted side-effects.
Changed in version 0.4: The automatic generation of the DECLARE_NS attribute was added in 0.4.
Note
XSO defines defaults for more attributes which also must be present on objects which are used as XSOs.
When inheriting from XMLStreamClass objects, the properties are merged sensibly.
Rebinding attributes of XMLStreamClass instances (i.e. classes using this metaclass) is somewhat restricted. The following rules cannot be broken; attempting to do so will result in TypeError being raised when setting the attribute:
The values of the following enumerations are used on “magic” attributes of XMLStreamClass instances (i.e. classes).
Describe the event which shall take place whenever a child element is encountered for which no descriptor can be found to parse it.
Raise a ValueError
Drop and ignore the element and all of its children
Describe the event which shall take place whenever a XML attribute is encountered for which no descriptor can be found to parse it.
Raise a ValueError
Drop and ignore the attribute
Describe the event which shall take place whenever XML character data is encountered on an object which does not support it.
Raise a ValueError
Drop and ignore the text
Control which ways to set a value in a descriptor are passed through a validator.
Values which are obtained from XML source are validated.
Values which are set through attribute access are validated.
All values, whether set by attribute or obtained from XML source, are validated.
The following exceptions are generated at some places in this module:
Subclass of ValueError. ev_args must be the arguments of the "start" event and are stored as the ev_args attribute for inspection.
The ev_args passed to the constructor.
The following special value is used to indicate that no default is used with a descriptor:
This is a special value which is used to indicate that no defaulting should take place. It can be passed to the default arguments of descriptors, and usually is the default value of these arguments.
It compares unequal to everything but itself and supports neither ordering nor conversion to bool, float or integer.
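A marker object with these properties is easy to build in Python. The following sketch is an illustration of the idea only: the names _NoDefault, NO_DEFAULT_SKETCH and resolve are made up for the example and this is not the definition used by the module.

```python
class _NoDefault:
    """Marker type: equality is identity, there is no truth value."""

    def __repr__(self):
        return "<no default>"

    def __bool__(self):
        # Plain objects are truthy by default, so this must be forbidden explicitly.
        raise TypeError("the no-default marker has no truth value")

    # Ordering and conversion to float or int already raise TypeError for
    # plain objects, and the inherited __eq__ compares by identity.


NO_DEFAULT_SKETCH = _NoDefault()


def resolve(value=NO_DEFAULT_SKETCH):
    """Tell apart 'no default given' from an explicit default of None."""
    if value is NO_DEFAULT_SKETCH:
        return "nothing supplied"
    return "supplied {!r}".format(value)


print(resolve(), resolve(None))
```

The point of such a marker is that None remains available as a legitimate default value, which would not be possible if None itself signalled "no default".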
This module provides classes whose objects can be used as types and validators in model.
Types are used to convert strings obtained from XML character data or attribute contents to python types. They are valid values for type_ arguments e.g. for Attr.
This is the interface all types must implement.
Force the given value v to be of the type represented by this AbstractType. check() is called when user code assigns values to descriptors which use the type; it is notably not called when values are extracted from SAX events, as these go through parse() and that is expected to return correctly typed values.
If v cannot be sensibly coerced, TypeError is raised (in some rare occasions, ValueError may be ok too).
Return a coerced version of v or v itself if it matches the required type.
Convert the given string v into a value of the appropriate type this class implements and return the result.
If conversion fails, ValueError is raised.
The result of parse() must pass through check().
Convert the value v of the type this class implements to a str.
This conversion does not fail.
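As an illustration of this interface, here is a sketch of an integer type written as a standalone class. It does not derive from the abstract base class, so that it runs without aioxmpp, and the method names coerce and format are assumptions for this example (the text above only names parse() and check() explicitly).

```python
class IntegerSketch:
    """Standalone illustration of a type with coerce/parse/format steps."""

    def coerce(self, v):
        # Called for values assigned from code: reject anything but real ints.
        if isinstance(v, bool) or not isinstance(v, int):
            raise TypeError("expected int, got {}".format(type(v).__name__))
        return v

    def parse(self, v):
        # Called for strings taken from XML; int() raises ValueError on bad input.
        return int(v)

    def format(self, v):
        # Conversion back to the string form used in XML.
        return str(v)


integer = IntegerSketch()
parsed = integer.parse("42")
formatted = integer.format(parsed)
print(parsed, formatted, integer.coerce(parsed))
```

The last call shows the contract stated above: whatever parse() returns is accepted unchanged by the coercion step.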
Interpret the input value as string.
Optionally, a stringprep function prepfunc can be applied on the string. A stringprep function must take the string and prepare it accordingly; if it is invalid input, it must raise ValueError. Otherwise, it shall return the prepared string.
If no prepfunc is given, this type is the identity operation.
Parse the value as boolean:
Parse the value as ISO datetime, possibly including microseconds and timezone information.
Timezones are handled as constant offsets from UTC, and are converted to UTC before the datetime object is returned (which is correctly tagged with UTC tzinfo). Values without timezone specification are not tagged.
This class makes use of pytz.
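The timezone handling described above can be sketched with the standard library. Note the assumptions: this sketch uses datetime.timezone.utc instead of pytz, the function name parse_timestamp is made up, and it only accepts the formats understood by datetime.fromisoformat(), which may differ from what the class itself accepts.

```python
from datetime import datetime, timezone


def parse_timestamp(v):
    """Parse an ISO timestamp; convert timezone-aware values to UTC."""
    dt = datetime.fromisoformat(v)
    if dt.tzinfo is not None:
        # The constant offset is folded into the time and the result tagged as UTC.
        dt = dt.astimezone(timezone.utc)
    # Values without a timezone specification stay untagged (naive).
    return dt


aware = parse_timestamp("2014-01-26T20:40:10+01:00")
naive = parse_timestamp("2014-01-26T20:40:10")
print(aware.isoformat(), naive.isoformat())
```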
Parse the value as base64 and return the bytes object obtained from decoding.
If empty_as_equal is True, an empty value is represented using a single equal sign. This is used in the SASL protocol.
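The effect of empty_as_equal can be sketched with the base64 module from the standard library. The function names format_base64 and parse_base64 are hypothetical; the sketch only mirrors the behaviour described above and is not the class's implementation.

```python
import base64


def format_base64(data, empty_as_equal=False):
    """Encode bytes as base64 text; optionally write empty data as "="."""
    if empty_as_equal and not data:
        return "="
    return base64.b64encode(data).decode("ascii")


def parse_base64(text):
    """Decode base64 text; a lone "=" stands for empty data."""
    if text == "=":
        return b""
    return base64.b64decode(text)


empty_marked = format_base64(b"", empty_as_equal=True)
empty_plain = format_base64(b"")
payload = b"\x00user\x00secret"
round_trip = parse_base64(format_base64(payload, empty_as_equal=True))
print(repr(empty_marked), repr(empty_plain), round_trip == payload)
```

The single equal sign lets a receiver distinguish "empty data was sent" from "no data was sent", which an empty string cannot express.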
Parse the value as hexadecimal blob and return the bytes object obtained from decoding.
Parse the value as Jabber ID using fromstr() and return the aioxmpp.structs.JID object.
Parse the value as a host-port pair, as for example used for Stream Management reconnection location advisories.
Parses the value as Language Tag using fromstr().
Type coercion requires that any value assigned to a descriptor using this type is an instance of LanguageTag.
Validators validate the python values after they have been parsed from XML-sourced strings or even when being assigned to a descriptor attribute (depending on the choice in the validate argument).
They can be useful both for defending against and rejecting incorrect input and for avoiding the production of incorrect output.
This is the interface all validators must implement. In addition, a validator's documentation should clearly state on which types it operates.
Return True if the value adheres to the restrictions imposed by this validator and False otherwise.
By default, this method calls validate_detailed() and returns True if validate_detailed() returned an empty result.
Return an empty list if the value adheres to the restrictions imposed by this validator.
If the value does not comply, return a list of UserValueError instances which each represent a condition which was violated in a human-readable way.
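The relationship between the two methods can be sketched with a standalone class. Assumptions for this example: the class RangeSketch is hypothetical, it does not derive from the abstract validator base class, and plain ValueError instances stand in for UserValueError so that the sketch runs without aioxmpp.

```python
class RangeSketch:
    """Illustrative validator which operates on numbers."""

    def __init__(self, min_, max_):
        self.min_ = min_
        self.max_ = max_

    def validate_detailed(self, value):
        # Collect one error object per violated condition.
        errors = []
        if value < self.min_:
            errors.append(ValueError("{} is less than {}".format(value, self.min_)))
        if value > self.max_:
            errors.append(ValueError("{} is greater than {}".format(value, self.max_)))
        return errors

    def validate(self, value):
        # The value is valid exactly if no violation was reported.
        return not self.validate_detailed(value)


percent = RangeSketch(0, 100)
problems = percent.validate_detailed(150)
print(percent.validate(50), percent.validate(150), [str(p) for p in problems])
```

Implementing only the detailed method and deriving the boolean answer from it keeps the two results consistent by construction.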
Restrict the possible values to the values from values. Operates on any types.
Restrict the possible strings to the NMTOKEN specification of XML Schema Definitions. The validator only works with strings.
Warning
This validator is probably incorrect. It is a good first line of defense to avoid creating obviously incorrect output and should not be used as an input validator.
It most likely falsely rejects valid values and may let through invalid values.
This validator checks that the value is an instance of any of the classes given in valid_classes.
valid_classes is not copied into the IsInstance instance, but instead shared; it can be mutated after the construction of IsInstance to allow addition and removal of classes.
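The consequence of sharing rather than copying the collection can be demonstrated with a small stand-in. The class IsInstanceSketch below is written for this illustration and is not the validator shipped with the module.

```python
class IsInstanceSketch:
    """Accept values which are instances of any class in a shared collection."""

    def __init__(self, valid_classes):
        # Keep a reference to the caller's collection instead of copying it.
        self.classes = valid_classes

    def validate(self, value):
        return isinstance(value, tuple(self.classes))


allowed = [int]
validator = IsInstanceSketch(allowed)
before = validator.validate("text")
allowed.append(str)  # mutate the shared list after construction
after = validator.validate("text")
print(before, after)
```

The validator changes its answer without being touched, because it looks at the same list object which the caller mutated.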
Some patterns reoccur when using this subpackage. For these, base classes are provided which facilitate the use.
One of the recurring patterns when using xso is the use of a XSO subclass to represent an XML node which has only character data and an xml:lang attribute.
This class provides exactly that. It inherits from XSO.
The text and lang arguments to the constructor can be used to initialize the attributes.
The xml:lang of the node, as LanguageTag.
The textual content of the node (XML character data).
Example use as base class:
class Subject(xso.AbstractTextChild):
    TAG = (namespaces.client, "subject")
The full example can also be found in the source code of stanza.Subject.