class documentation

class DumperClass(Visitor): (source)

Known subclasses: zim.formats.html.Dumper, zim.formats.plain.Dumper

Base class for dumper classes. Dumper classes serialize the content of a parse tree back to a text representation of the page content. Therefore this class implements the visitor API, so it can be used with any parse tree implementation or parser object that supports this API.

To implement a dumper class, you need to define handlers for all tags that can appear in a page. Tags that are represented by a simple prefix and postfix string can be defined in the dictionary TAGS. For example, to define the italic tag in HTML output, the dictionary should contain an entry like: EMPHASIS: ('<i>', '</i>').
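
A minimal sketch of such a table follows. The tag name values and the wrap() helper are stand-ins defined locally so the snippet runs on its own; a real dumper would use the constants with tag names and leave the wrapping to the base class.

```python
# Stand-in tag name constants (assumed values, for illustration only).
EMPHASIS = 'emphasis'
STRONG = 'strong'

# Each entry maps a tag name to a (prefix, postfix) pair of strings.
TAGS = {
    EMPHASIS: ('<i>', '</i>'),
    STRONG: ('<b>', '</b>'),
}


def wrap(tag, strings):
    """Hypothetical helper: wrap formatted content in the pair from TAGS."""
    prefix, postfix = TAGS[tag]
    return [prefix] + list(strings) + [postfix]
```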

For tags that require more complex logic you can define a method to format the tag. Typical usage is to format link attributes in such a method. The method name should be dump_ + the name of the tag, e.g. dump_link() for links (see the constants with tag names for the other tags). Such a dump method gets 3 arguments: the tag name itself, a dictionary with the tag attributes, and a list of strings that form the tag content. The method should return a list of strings that represents the formatted text.
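
As a rough illustration of the handler signature, the sketch below formats a link as HTML. It is written as a plain function to stay self-contained; in a dumper class it would be a method with self as first argument. The 'href' attribute key is an assumption, and a real handler would probably resolve the target through the linker object.

```python
def dump_link(tag, attrib, strings):
    """Sketch of a dump_ handler: returns a list of strings."""
    href = attrib.get('href', '')  # 'href' is an assumed attribute name
    # Fall back to the link target when the tag has no content.
    text = ''.join(strings) or href
    return ['<a href="%s">' % href, text, '</a>']
```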

This base class takes care of a stack of nested formatting tags and when a tag is closed either picks the appropriate prefix and postfix from TAGS or calls the corresponding dump_ method. As a result tags are serialized depth-first.
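
The stack handling described above can be sketched with a toy class. MiniDumper is hypothetical and is not the real base class: it keeps plain tuples on the stack instead of objects with "tag", "attrib" and "text" attributes, and it leaves out text encoding and line handling.

```python
class MiniDumper:
    """Toy stand-in for the behaviour described above."""

    TAGS = {'emphasis': ('<i>', '</i>')}

    def __init__(self):
        # Each stack entry is a (tag, attrib, strings) triple; the bottom
        # entry collects the top level output.
        self.context = [(None, None, [])]

    def start(self, tag, attrib=None):
        self.context.append((tag, attrib or {}, []))

    def text(self, text):
        self.context[-1][2].append(text)

    def end(self, tag):
        open_tag, attrib, strings = self.context.pop()
        assert open_tag == tag, 'closing tag does not match open tag'
        if tag in self.TAGS:
            prefix, postfix = self.TAGS[tag]
            strings = [prefix] + strings + [postfix]
        else:
            # Tags without a TAGS entry are handled by a dump_<tag> method.
            strings = getattr(self, 'dump_' + tag)(tag, attrib, strings)
        # The formatted child becomes part of the parent's content.
        self.context[-1][2].extend(strings)

    def dump_link(self, tag, attrib, strings):
        return ['<a href="%s">' % attrib.get('href', '')] + strings + ['</a>']


dumper = MiniDumper()
dumper.start('link', {'href': 'Home'})
dumper.text('go ')
dumper.start('emphasis')
dumper.text('home')
dumper.end('emphasis')
dumper.end('link')
output = ''.join(dumper.context[0][2])
```

Because a tag is only formatted when it is closed, the inner emphasis tag is serialized before the surrounding link, which is the depth-first order mentioned above.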

Method __init__ Undocumented
Method append Convenience function to open a tag, append text and close it immediately.
Method dump Format a parse tree to text
Method dump_object Dumps objects defined by InsertedObjectType
Method dump_object_fallback Method to serialize objects that do not have their own handler for this format. Must be implemented in subclasses.
Method encode_text Optional method to encode text elements in the output
Method end End formatted region
Method get_lines Return the dumped content as a list of lines. Should only be called after closing the top level element.
Method isrtl Check for Right To Left script
Method prefix_lines Convenience method to wrap a number of lines with e.g. an indenting sequence.
Method start Start formatted region
Method text Append text. Optional for subclasses.
Constant TAGS Undocumented
Constant TEMPLATE_OPTIONS Undocumented
Instance Variable context the stack of open tags maintained by this class. Can be used in dump_ methods to inspect the parent scope of the format. Elements on this stack have "tag", "attrib" and "text" attributes. Keep in mind that the parent scope is not yet complete when a tag is serialized.
Instance Variable linker the (optional) Linker object, used to resolve links
Instance Variable template_options a ConfigDict with options that may be set in a template (so inherently not safe!) to control the output style. Formats using this need to define the supported keys in the dict TEMPLATE_OPTIONS.
Instance Variable _text Undocumented
def __init__(self, linker=None, template_options=None): (source)

Undocumented

def append(self, tag, attrib=None, text=None): (source)

Convenience function to open a tag, append text and close it immediately.

Can raise VisitorStop or VisitorSkip, see start() for the conditions.

Parameters
tag: the tag name
attrib: optional dict with attributes
text: formatted text
Implementation
optional for subclasses, default implementation calls start(), text(), and end()
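The call order of the default implementation can be sketched with a toy visitor that records what it receives. RecordingVisitor is hypothetical, and skipping text() when no text is given is an assumption of this sketch.

```python
class RecordingVisitor:
    """Toy visitor that records calls, to show the order append() uses."""

    def __init__(self):
        self.calls = []

    def start(self, tag, attrib=None):
        self.calls.append(('start', tag))

    def text(self, text):
        self.calls.append(('text', text))

    def end(self, tag):
        self.calls.append(('end', tag))

    def append(self, tag, attrib=None, text=None):
        # Open the tag, add the text if there is any, and close it again.
        self.start(tag, attrib)
        if text is not None:  # assumption: no text() call without text
            self.text(text)
        self.end(tag)


visitor = RecordingVisitor()
visitor.append('emphasis', text='hello')
```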
def dump(self, tree): (source)
Format a parse tree to text
Parameters
tree: a parse tree object that supports a visit() method
Returns
a list of lines
def dump_object(self, tag, attrib, strings=[]): (source)
Dumps objects defined by InsertedObjectType
def dump_object_fallback(self, tag, attrib, strings=None): (source)
Method to serialize objects that do not have their own handler for this format.
Implementation
must be implemented in subclasses
def encode_text(self, tag, text): (source)
Optional method to encode text elements in the output
Parameters
tag: formatting tag
text: text to be encoded
Returns
encoded text
Note
Do not apply text encoding in the dump_ methods, the list of strings given there may contain prefix and postfix formatting of nested tags.
Implementation
optional, default just returns unmodified input
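To illustrate why encoding belongs in encode_text() and not in the dump_ methods, here is a small sketch for HTML-like output. Both functions are hypothetical and written as plain functions; the point is only that dump_strong() receives markup from a nested tag, which would be damaged if it were escaped again.

```python
import html


def encode_text(tag, text):
    """Sketch of an HTML-style encode_text, shown as a plain function."""
    # Only raw text passes through here, never markup added for tags.
    return html.escape(text, quote=False)


def dump_strong(tag, attrib, strings):
    # No escaping here: strings may already contain markup of nested tags.
    return ['<b>'] + strings + ['</b>']


# Content of a nested emphasis tag, as it would arrive in the parent.
nested = ['<i>', encode_text('emphasis', 'a < b'), '</i>']
result = ''.join(dump_strong('strong', None, nested))
```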
def end(self, tag): (source)
End formatted region
Parameters
tag: the tag name
Raises
AssertionError: when tag does not match current state
Implementation
optional for subclasses
def get_lines(self): (source)
Return the dumped content as a list of lines. Should only be called after closing the top level element.
def isrtl(self, text): (source)
Check for Right To Left script
Parameters
text: the text to check
Returns
True if text starts with characters in a RTL script, or None if direction is not determined.
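One possible way to make such a check is to look at the Unicode bidirectional class of the first strongly directional character. This is a sketch under that assumption, not necessarily how the class implements it; returning False for left-to-right text is also a choice of this sketch.

```python
import unicodedata


def isrtl(text):
    """Sketch: direction of the first strongly directional character.

    Returns True for RTL, False for LTR, and None when the text has no
    strongly directional character at all.
    """
    for char in text:
        direction = unicodedata.bidirectional(char)
        if direction in ('R', 'AL'):  # Hebrew-like and Arabic-like letters
            return True
        if direction == 'L':
            return False
    return None
```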
def prefix_lines(self, prefix, strings): (source)
Convenience method to wrap a number of lines with e.g. an indenting sequence.
Parameters
prefix: a string to prefix each line
strings: a list of pieces of text
Returns
a new list of lines, each starting with prefix
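A minimal sketch of this behaviour, written as a plain function. It assumes that the pieces of text do not have to line up with line boundaries, so they are joined first and then split into lines.

```python
def prefix_lines(prefix, strings):
    """Sketch: join the pieces, split on line ends, prefix every line."""
    lines = ''.join(strings).splitlines(True)  # True keeps the line endings
    return [prefix + line for line in lines]
```

For example, prefix_lines('> ', ['foo\nba', 'r\n']) gives ['> foo\n', '> bar\n'].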
def start(self, tag, attrib=None): (source)

Start formatted region

Visitor objects can raise two exceptions in this method to influence the tree traversal:

  1. VisitorStop will cancel the current parsing, but without raising an error. So code implementing a visit method should catch this.
  2. VisitorSkip can be raised when the visitor wants to skip a node; the implementation should then not descend further into this node.
Parameters
tag: the tag name
attrib: optional dict with attributes
Note
If the visitor modifies the attrib dict on nodes, this will modify the tree. If this is not intended, the implementation needs to take care to copy the attrib to break the reference.
Implementation
optional for subclasses
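How a visit implementation might honour these two exceptions can be sketched on a toy tree of (tag, attrib, children) tuples. The exception classes below are local stand-ins for the real ones, and the tree layout, visit(), _walk() and SkipImages are all hypothetical. Not calling end() for a skipped node is an assumption of this sketch.

```python
class VisitorStop(Exception):
    """Stand-in for the exception of the same name."""


class VisitorSkip(Exception):
    """Stand-in for the exception of the same name."""


def visit(node, visitor):
    """Walk a toy tree; child strings are text, child tuples are nodes."""
    try:
        _walk(node, visitor)
    except VisitorStop:
        pass  # traversal was cancelled on purpose, this is not an error


def _walk(node, visitor):
    tag, attrib, children = node
    try:
        # Pass a copy so the visitor cannot modify the tree by accident.
        visitor.start(tag, dict(attrib))
    except VisitorSkip:
        return  # do not descend into this node
    for child in children:
        if isinstance(child, str):
            visitor.text(child)
        else:
            _walk(child, visitor)
    visitor.end(tag)


class SkipImages:
    """Toy visitor that records tags and text but skips 'img' nodes."""

    def __init__(self):
        self.seen = []

    def start(self, tag, attrib=None):
        if tag == 'img':
            raise VisitorSkip
        self.seen.append(tag)

    def text(self, text):
        self.seen.append(text)

    def end(self, tag):
        pass


tree = ('p', {}, ['hello', ('img', {'src': 'a.png'}, ['alt']), 'world'])
visitor = SkipImages()
visit(tree, visitor)
```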
def text(self, text): (source)
Append text
Parameters
text: text to be appended as string
Implementation
optional for subclasses
TAGS: dict = (source)

Undocumented

Value
{}
TEMPLATE_OPTIONS: dict = (source)

Undocumented

Value
{}
context = (source)
the stack of open tags maintained by this class. Can be used in dump_ methods to inspect the parent scope of the format. Elements on this stack have "tag", "attrib" and "text" attributes. Keep in mind that the parent scope is not yet complete when a tag is serialized.
linker = (source)
the (optional) Linker object, used to resolve links
template_options = (source)
a ConfigDict with options that may be set in a template (so inherently not safe!) to control the output style. Formats using this need to define the supported keys in the dict TEMPLATE_OPTIONS.
_text: list = (source)

Undocumented