firmant.parsers

Parsers create objects that can be manipulated and rendered into the final webpage.

Modules in this package:

feeds
posts
static
staticrst
tags

Functions

firmant.parsers.class_name(cls)

The string representation of a class’s name.

For example, if we define the class Foo, the full name of the class is firmant.utils.Foo, and class_name() returns this as a string.

>>> class Foo(object): pass
...
>>> class_name(Foo)
'firmant.utils.Foo'

If an object that is not a class is passed to class_name(), then a TypeError will be raised.

>>> class_name('Foo')
Traceback (most recent call last):
TypeError: `cls` does not name a type.
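A minimal sketch of how such a helper could be written, assuming the full name is simply the class's module joined to its name. This is an illustration of the documented behavior, not Firmant's actual implementation.

```python
import inspect


def class_name(cls):
    """Return the dotted name of a class, e.g. 'package.module.Foo'."""
    # Assumption: anything that is not a class is rejected with the same
    # message shown in the doctest above.
    if not inspect.isclass(cls):
        raise TypeError('`cls` does not name a type.')
    return '%s.%s' % (cls.__module__, cls.__name__)


class Foo(object):
    pass
```

The module part of the result depends on where Foo is defined, which is why the doctest above reports firmant.utils.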
firmant.parsers.log_uncaught_exceptions(func, log, message, save_traceback=False)

Catch and log exceptions of func.

Returns True if the function succeeds without throwing an exception.

message will be written to log if an exception is thrown, and False is returned. If save_traceback is true, the traceback will be saved to a temporary file.

In the normal case, the func will be called and True will be returned.

>>> def wont_raise_error():
...     print 'Success!'
>>> log_uncaught_exceptions(wont_raise_error, log, 'error!')
Success!
True

When func raises an error, the error will be caught, and message will be written to log as an error.

>>> def raises_error():
...     raise RuntimeError('Intentionally thrown')
>>> log_uncaught_exceptions(raises_error, log, 'error!')
Called log.error('error!')
Called log.info('traceback not saved')
False

If save_traceback is True, then a temporary file created with tempfile.mkstemp will be used to store the traceback.

>>> log_uncaught_exceptions(raises_error, log, 'error!', True) 
Called log.error('error!')
Called mkstemp(prefix='firmant', text=True)
Called log.error('...traceback saved to /...')
False

If an exception is thrown while saving to the file, it will warn about the potential for infinite recursion and stop:

>>> log_uncaught_exceptions(raises_error, log, 'error!', True)
Called log.error('error!')
Called mkstemp(prefix='firmant', text=True)
Called log.error("it's turtles all the way down")
False
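The behavior described above can be sketched roughly as follows. This is a guess at the control flow based only on the doctests, not the actual Firmant source; in particular, the exact wording of the "traceback saved" message is an assumption, since the doctest elides part of it.

```python
import os
import tempfile
import traceback


def log_uncaught_exceptions(func, log, message, save_traceback=False):
    """Call func(); return True on success, log and return False on error."""
    try:
        func()
        return True
    except Exception:
        # Capture the traceback text before doing anything else.
        trace = traceback.format_exc()
        log.error(message)
        if not save_traceback:
            log.info('traceback not saved')
            return False
        try:
            fd, path = tempfile.mkstemp(prefix='firmant', text=True)
            with os.fdopen(fd, 'w') as handle:
                handle.write(trace)
            # Assumed wording; the doctest shows only '...traceback saved to /...'.
            log.error('traceback saved to %s', path)
        except Exception:
            # Do not try to save a traceback about saving the traceback.
            log.error("it's turtles all the way down")
        return False
```

Any object with error() and info() methods can serve as log; a standard logging.Logger is the obvious choice.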
firmant.parsers.publish_programmatically(source_class, source, source_path, destination_class, destination, destination_path, reader, reader_name, parser, parser_name, writer, writer_name, settings, settings_spec, settings_overrides, config_section, enable_exit_status)

Set up & run a Publisher for custom programmatic use. Return the encoded string output and the Publisher object.

Applications should not need to call this function directly. If it does seem to be necessary to call this function directly, please write to the Docutils-develop mailing list <http://docutils.sf.net/docs/user/mailing-lists.html#docutils-develop>.

Parameters:

  • source_class (required): The class for dynamically created source objects. Typically io.FileInput or io.StringInput.
  • source: Type depends on source_class:
    • If source_class is io.FileInput: Either a file-like object (must have ‘read’ and ‘close’ methods), or None (source_path is opened). If neither source nor source_path are supplied, sys.stdin is used.
    • If source_class is io.StringInput (required): The input string, either an encoded 8-bit string (set the ‘input_encoding’ setting to the correct encoding) or a Unicode string (set the ‘input_encoding’ setting to ‘unicode’).
  • source_path: Type depends on source_class:
    • io.FileInput: Path to the input file, opened if no source supplied.
    • io.StringInput: Optional. Path to the file or object that produced source. Only used for diagnostic output.
  • destination_class (required): The class for dynamically created destination objects. Typically io.FileOutput or io.StringOutput.
  • destination: Type depends on destination_class:
    • io.FileOutput: Either a file-like object (must have ‘write’ and ‘close’ methods), or None (destination_path is opened). If neither destination nor destination_path are supplied, sys.stdout is used.
    • io.StringOutput: Not used; pass None.
  • destination_path: Type depends on destination_class:
    • io.FileOutput: Path to the output file. Opened if no destination supplied.
    • io.StringOutput: Path to the file or object which will receive the output; optional. Used for determining relative paths (stylesheets, source links, etc.).
  • reader: A docutils.readers.Reader object.
  • reader_name: Name or alias of the Reader class to be instantiated if no reader supplied.
  • parser: A docutils.parsers.Parser object.
  • parser_name: Name or alias of the Parser class to be instantiated if no parser supplied.
  • writer: A docutils.writers.Writer object.
  • writer_name: Name or alias of the Writer class to be instantiated if no writer supplied.
  • settings: A runtime settings (docutils.frontend.Values) object, for dotted-attribute access to runtime settings. It’s the end result of the SettingsSpec, config file, and option processing. If settings is passed, it’s assumed to be complete and no further setting/config/option processing is done.
  • settings_spec: A docutils.SettingsSpec subclass or object. Provides extra application-specific settings definitions independently of components. In other words, the application becomes a component, and its settings data is processed along with that of the other components. Used only if no settings specified.
  • settings_overrides: A dictionary containing application-specific settings defaults that override the defaults of other components. Used only if no settings specified.
  • config_section: A string, the name of the configuration file section for this application. Overrides the config_section attribute defined by settings_spec. Used only if no settings specified.
  • enable_exit_status: Boolean; enable exit status at end of processing?

Classes

class firmant.parsers.ParsedObject(**kwargs)

Bases: object

A parsed object that represents the structures on disk in a form that is suitable for writing.

The constructor will accept keyword arguments to automatically fill slots.

>>> class SampleObject(ParsedObject):
...     __slots__ = ['someattr']
...     _attributes = property(lambda s: {'someattr': s.someattr})
...
>>> SampleObject(someattr='value').someattr
'value'
>>> SampleObject(permalink='http://').permalink
'http://'

It is an error to provide arguments that are not in the slots of the class (or its base classes).

>>> SampleObject(notinslots=True)
Traceback (most recent call last):
AttributeError: Excess attributes: 'notinslots'
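One way the slot-filling constructor could work is sketched below. It assumes the base class declares a permalink slot (as the doctest above suggests) and that the allowed names are gathered from `__slots__` along the method resolution order; the real ParsedObject may differ.

```python
class ParsedObject(object):
    # Assumption: the base class provides a 'permalink' slot.
    __slots__ = ['permalink']

    def __init__(self, **kwargs):
        # Collect slot names declared by this class and all of its bases.
        allowed = set()
        for klass in type(self).__mro__:
            slots = klass.__dict__.get('__slots__', ())
            if isinstance(slots, str):
                slots = (slots,)
            allowed.update(slots)
        # Reject keyword arguments that do not name a slot.
        excess = sorted(set(kwargs) - allowed)
        if excess:
            raise AttributeError('Excess attributes: ' +
                                 ', '.join(repr(name) for name in excess))
        for name, value in kwargs.items():
            setattr(self, name, value)


class SampleObject(ParsedObject):
    __slots__ = ['someattr']
```

Because every class in the hierarchy uses `__slots__`, instances have no `__dict__`, so a misspelled attribute fails loudly instead of being silently stored.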
class firmant.parsers.Parser(environment, objects, action=None)

Bases: firmant.chunks.AbstractChunk

The base class of all parsers.

This class defines an abstract base class that all parsers are required to adhere to. To use this class in the creation of a parser, create a subclass with all necessary methods and properties overwritten.

See also

Module abc
This module is part of the Python standard library in 2.6+.

To create a new type of parser, inherit from Parser:

>>> class SampleParser(Parser):
...     type = 'objs'
...     paths = '^numbers/[0-9]$'
...     cls = ParsedObject
...     def parse(self, environment, objects, path):
...         objects[self.type].append(str(path))
...     def attributes(self, environment, path):
...         return {'path': path}
...     def root(self, environment):
...          return 'testdata/pristine/sample'

The new parser meets the criteria for two different abstract base classes:

>>> import firmant.chunks
>>> issubclass(SampleParser, firmant.chunks.AbstractChunk)
True
>>> issubclass(SampleParser, Parser)
True

Warning

When creating a parser, do not store state in the parser itself. Although a parser appears to be a single object, during typical usage it is instantiated as two or more separate chunks, so state stored on one instance is not available to the others.

If it is necessary to store state, place it in environment keyed to the parser class:

>>> environment[SampleParser] = 'stored state goes here'

This is because of the split between path selection and parsing.
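The following sketch illustrates the idea with a stand-in class rather than a real Parser subclass. The remember() method is hypothetical and exists only to show two separate instances sharing state through environment.

```python
class SampleParser(object):
    """Stand-in for a parser; instances keep no state of their own."""

    def remember(self, environment, path):
        # State lives in the environment, keyed to the parser class, so
        # every instance of the class sees the same list.
        seen = environment.setdefault(type(self), [])
        seen.append(path)


environment = {}
first, second = SampleParser(), SampleParser()
first.remember(environment, 'numbers/1')
second.remember(environment, 'numbers/2')
```

After both calls, environment[SampleParser] holds both paths even though neither instance stored anything on itself.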

The remainder of this section is devoted to describing the implementation details of Parser's template methods.

Chunks are passed environment and object dictionaries. While it is not technically a chunk, the Parser interface follows the same pattern. When called with an environment and set of objects, a parser will return one more chunk (in addition to the environment and object dictionaries).

>>> environment = {'log': logger
...               }
>>> objects = {}
>>> sp = SampleParser(environment, objects)
>>> sp.scheduling_order
10
>>> pprint(sp(environment, objects)) 
({'log': <logging.Logger instance at 0x...>},
 {},
 [<firmant.parsers.SampleParser object at 0x...>])

Note

The chunks returned do not share any state with the Parser that created them. The fact that the class name is the same is an implementation detail that may change in the future.

The first chunk performs all parsing. In the future, parsing may be broken into more fine-grained steps. Right now, this is unnecessary.

>>> environment, objects, (parse,) = sp(environment, objects)
>>> parse.scheduling_order
200
>>> pprint(parse(environment, objects)) 
({'log': <logging.Logger instance at 0x...>},
 {'objs': ['numbers/1', 'numbers/2', 'numbers/3']},
 [])
cls
The class object that should be used for new parsed objects.
parse(environment, objects, path)

Parse the object at path.

Any new objects that are created during the parsing of the object at path should be added directly to the objects dictionary (this includes the parsed object itself).

paths

Determine which paths should be parsed.

This is a regular expression that will be used to match pathnames relative to root().
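As a rough illustration of path selection, the sketch below walks a root directory and keeps the relative paths that match a pattern. The helper name select_paths and the use of os.walk with re.match are assumptions made for the example; the actual selection code may work differently.

```python
import os
import re


def select_paths(root, pattern):
    """Return the paths under root, relative to root, that match pattern."""
    regex = re.compile(pattern)
    matched = []
    for dirpath, dirnames, filenames in os.walk(root):
        for filename in filenames:
            full = os.path.join(dirpath, filename)
            # Normalize to forward slashes so one pattern works everywhere.
            relative = os.path.relpath(full, root).replace(os.sep, '/')
            if regex.match(relative):
                matched.append(relative)
    return sorted(matched)
```

With the pattern '^numbers/[0-9]$' from the SampleParser example, numbers/1 would be selected while numbers/10 would not, because the pattern is anchored at both ends.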

root(environment)
The root under which all objects to be parsed by this parser reside.
scheduling_order

The following scheduling orders apply to parsers:

10
At timestep 10, the parser will create the chunks for finding paths and parsing.
200
Iterate over the paths in paths() and pass each path to the parse() method. All new objects should be placed in the dictionary by parse().
type

The type of the primary object to be parsed (e.g. posts).

Parsed objects will be added to the objects dictionary under this key.

This value has no impact on secondary objects that are generated (e.g. objects that are created from embedded LaTeX equations).

class firmant.parsers.Publisher(reader=None, parser=None, writer=None, source=None, source_class=docutils.io.FileInput, destination=None, destination_class=docutils.io.FileOutput, settings=None)

A facade encapsulating the high-level logic of a Docutils system.

apply_transforms()
debugging_dumps()
get_settings(usage=None, description=None, settings_spec=None, config_section=None, **defaults)

Set and return default settings (overrides in defaults dict).

Set components first (self.set_reader & self.set_writer). Explicitly setting self.settings disables command line option processing from self.publish().

process_command_line(argv=None, usage=None, description=None, settings_spec=None, config_section=None, **defaults)

Pass an empty list to argv to avoid reading sys.argv (the default).

Set components first (self.set_reader & self.set_writer).

process_programmatic_settings(settings_spec, settings_overrides, config_section)
publish(argv=None, usage=None, description=None, settings_spec=None, settings_overrides=None, config_section=None, enable_exit_status=None)
Process command line options and arguments (if self.settings not already set), run self.reader and then self.writer. Return self.writer's output.
report_Exception(error)
report_SystemMessage(error)
report_UnicodeError(error)
set_components(reader_name, parser_name, writer_name)
set_destination(destination=None, destination_path=None)
set_io(source_path=None, destination_path=None)
set_reader(reader_name, parser, parser_name)
Set self.reader by name.
set_source(source=None, source_path=None)
set_writer(writer_name)
Set self.writer by name.
setup_option_parser(usage=None, description=None, settings_spec=None, config_section=None, **defaults)
class firmant.parsers.RstMetaclass(name, bases, dct)

Bases: abc.ABCMeta

A metaclass for dealing with RstParsedObject.

This class provides the magic that allows the document tree to be mutated and the attributes of an RstParsedObject to update accordingly.

mro
mro() -> list return a type’s method resolution order
register(subclass)
Register a virtual subclass of an ABC.
class firmant.parsers.RstParsedObject(**kwargs)

Bases: firmant.parsers.ParsedObject

The base class of all objects parsed from reStructuredText documents.

class firmant.parsers.RstParser(environment, objects, action=None)

Bases: firmant.parsers.Parser

A parser containing common functionality for parsing reStructuredText.

cls
The class object that should be used for new parsed objects.
parse(environment, objects, path)
Parse the reStructuredText doc at path and pass the relevant pieces to rstparse().
paths

Determine which paths should be parsed.

This is a regular expression that will be used to match pathnames relative to root().

root(environment)
The root under which all objects to be parsed by this parser reside.
rstparse(environment, objects, path, pieces)

The main method to override when creating new reStructuredText parsers.

The pieces dictionary contains the following keys:

metadata
All pieces of metadata pulled from the document using firmant.du.meta_data_transform().
pub_parts
The pieces returned by the html writer.
document
The actual doctree that was produced as a result of parsing and applying transformations.
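A sketch of what an rstparse() override might do with the pieces dictionary is shown below, written as a plain function with stub data so that it runs on its own. The 'title' metadata key and the 'posts' object type are assumptions made for illustration; 'fragment' is one of the parts normally returned by the Docutils HTML writer.

```python
def rstparse(environment, objects, path, pieces):
    """Hypothetical override: build one record from the parsed pieces."""
    record = {
        'path': path,
        # Assumed metadata key; real documents may expose different fields.
        'title': pieces['metadata'].get('title', ''),
        # The rendered body without the surrounding HTML page.
        'content': pieces['pub_parts']['fragment'],
    }
    objects.setdefault('posts', []).append(record)


# Stub input standing in for what the parser would pass.
pieces = {
    'metadata': {'title': 'Hello'},
    'pub_parts': {'fragment': '<p>Body</p>'},
    'document': None,
}
objects = {}
rstparse({}, objects, 'posts/hello.rst', pieces)
```

In a real subclass this logic would live in a method on the parser, and the record would normally be an instance of the parser's cls rather than a dictionary.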
scheduling_order

The following scheduling orders apply to parsers:

10
At timestep 10, the parser will create the chunks for finding paths and parsing.
200
Iterate over the paths in paths() and pass each path to the parse() method. All new objects should be placed in the dictionary by parse().
type

The type of the primary object to be parsed (e.g. posts).

Parsed objects will be added to the objects dictionary under this key.

This value has no impact on secondary objects that are generated (e.g. objects that are created from embedded LaTeX equations).
