Base class for parser implementations. As parsers are sitletons, they have to use the sitleton metaclass, Registry.SitletonMeta. You will usually also want to derive from TweakSitleton to support site-wide configuration of your parser:
    # order of inheritance matters!
    class RestParser(Tweaks.TweakSitleton, ParserBase):
        __metaclass__ = Registry.SitletonMeta

        def __init__(self, site):
            super(RestParser, self).__init__(site,
                # pass keyword arguments to TweakSitleton
            )

        def parse(self, fileref):
            # fancy parsing happens here
            return Document.Document()
Check out the TweakSitleton documentation for an example of arguments and their effects.
Parsers have to implement the parse() method.
Take a file name or file-like object in fileref and parse the hell out of it. Return a Document instance with all relevant data filled in.
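As a rough illustration of what "file name or file-like object" can mean in practice, the sketch below leans on the standard library, whose ElementTree.parse() already accepts both. The helper name load_root is hypothetical and not part of PyWeb; a real parser would go on to build a Document from the result.

```python
import io
import xml.etree.ElementTree as ET


def load_root(fileref):
    """Hypothetical helper: return the root element of the XML in ``fileref``."""
    # ElementTree.parse() takes either a file name or an object with read().
    return ET.parse(fileref).getroot()


# A file-like object is handled the same way a path on disk would be.
root = load_root(io.BytesIO(b"<page><title>Hello</title></page>"))
```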
header_offset must be a non-negative integer. That number of header levels will be added to any <h:hN /> elements encountered in the body element tree. A header_offset of 1 will thus convert all <h:h1 /> to <h:h2 />, all <h:h2 /> to <h:h3 /> and so on.
If the conversion would result in a <h:h7 /> or above, the tag is converted into a <h:p /> tag.
Note
This operation is in-place and returns None.
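The header shifting described above can be sketched with the standard library's ElementTree. This is a minimal illustration, not the actual implementation: the function name shift_headers is hypothetical, and it assumes that the h: prefix is bound to the XHTML namespace.

```python
import xml.etree.ElementTree as ET

# Assumption: the "h:" prefix maps to the XHTML namespace.
XHTML_NS = "http://www.w3.org/1999/xhtml"
# Map fully qualified tag names (h1..h6) to their header level.
HEADER_TAGS = {"{%s}h%d" % (XHTML_NS, n): n for n in range(1, 7)}


def shift_headers(body, header_offset):
    """Shift every header element below ``body`` by ``header_offset``, in place."""
    if not isinstance(header_offset, int) or header_offset < 0:
        raise ValueError("header_offset must be a non-negative integer")
    for element in body.iter():
        level = HEADER_TAGS.get(element.tag)
        if level is None:
            continue  # not a header element, leave it alone
        new_level = level + header_offset
        if new_level > 6:
            # There is no h7 or above, so fall back to a paragraph.
            element.tag = "{%s}p" % XHTML_NS
        else:
            element.tag = "{%s}h%d" % (XHTML_NS, new_level)
```

Only the tag names are rewritten; text, attributes and children of each element stay untouched, and nothing is returned, which matches the in-place behaviour noted above.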
This class parses PyWebXML documents. Usually, you don’t create instances of this class yourself; you access it via the parser_registry attribute of your Site instance.
Parse the file referenced by fileref as a PyWebXML document and return the resulting Document instance.
Take the root element of an ElementTree and interpret it as a PyWebXML document. Return the resulting Document instance on success and raise an exception on error.
header_offset works as documented in the base class’ transform_headers() method.