opensuse / dbxincluder

Transclusions for DocBook with XInclude 1.1

Home Page: https://opensuse.github.io/dbxincluder

License: GNU General Public License v3.0

Python 100.00%

xml xinclude lxml transclusion

dbxincluder's Introduction

Non-XInclude conformant preprocessor

Development and Setup

Use setup.py to install dbxincluder; afterwards, call dbxincluder to run it. For development, run setup.py develop, and run py.test to execute the test suite.

dbxincluder's Issues

Use Test Coverage with pytest-cov

I would like to see some test coverage reports. :)

Requirements:

Use the pytest-cov package
Distinguish in tox.ini between a "normal" test run and a test run with coverage enabled

The "coverage ..." calls can be replaced by:

{posargs:py.test --cov=dbxincluder --cov-fail-under=90 --cov-report=term-missing -vv}

Support inclusion of XML without root node

According to the spec, it is possible to include an XML file without a root node:

<a />
<b />

It is not clear what the set-xml-id attribute is supposed to do in that case (presumably set the ID on ALL top-level elements, but that produces duplicate IDs and therefore invalid XML).
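One way to at least parse such a file is to wrap its content in a temporary root element and hand back only the children. The sketch below uses the standard library's xml.etree.ElementTree for brevity instead of lxml. The function name parse_rootless and the wrapper element are hypothetical, and the approach assumes the included text carries no XML declaration or DOCTYPE.

```python
import xml.etree.ElementTree as ET


def parse_rootless(text):
    """Parse XML text that may contain several top-level elements.

    The text is wrapped in a temporary root so that the parser accepts
    it; only the children of that wrapper are returned.
    """
    wrapper = ET.fromstring("<wrapper>" + text + "</wrapper>")
    return list(wrapper)


elements = parse_rootless("<a />\n<b />")
print([elem.tag for elem in elements])
```

This does not answer the set-xml-id question; it only shows that the parsing side is manageable.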

Deploy Documentation on GitHub Pages

It would be nice to have up-to-date documentation published on GitHub Pages after each successful Travis build.

Requirements:

A gh-pages branch
Require travis-sphinx in text_requirement.txt or in tox.ini
Update tox.ini with a doc_travis_deploy, see https://github.com/SUSE/kiwi/blob/master/tox.ini#L84 for an example.
Update .travis.yml

The deployment should be limited to the master and develop branches (use the --branches option). You also need to create a token on GitHub and paste it into the Travis settings dialog. Find more information on the travis-sphinx homepage: https://github.com/Syntaf/travis-sphinx

Improve error message (trans:idfixup)

According to the DocBook Transclusion spec, the trans:idfixup can have one of the values none, suffix, or auto.

When using trans:idfixup with foo as an invalid value, I get:

Error at tests/db/blub.xml:7: idfixup type 'foo' not implemented
Included by tests/db/article.01.xml

The error itself is correct, of course, but the message is misleading; I was confused at first glance, too. ;-)
To make it clearer, I propose changing it to:

Error at tests/db/blub.xml:7: idfixup contains 'foo', but only 'none', 'suffix', or 'auto' are allowed.

I think that makes it clearer.
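A minimal sketch of how such a message could be produced. The function name check_idfixup and the constant ALLOWED_IDFIXUP are hypothetical and not taken from the dbxincluder source; only the three allowed values and the message wording come from this issue.

```python
# Values the DocBook Transclusion spec allows for trans:idfixup.
ALLOWED_IDFIXUP = ("none", "suffix", "auto")


def check_idfixup(value, location):
    """Return None if value is allowed, else a descriptive error message."""
    if value in ALLOWED_IDFIXUP:
        return None
    # Quote all but the last allowed value, then append the last one
    # after "or" so that the sentence reads naturally.
    leading = ", ".join(repr(v) for v in ALLOWED_IDFIXUP[:-1])
    return (
        "Error at {}: idfixup contains {!r}, but only {}, or {!r} "
        "are allowed.".format(location, value, leading, ALLOWED_IDFIXUP[-1])
    )


print(check_idfixup("foo", "tests/db/blub.xml:7"))
```

Deriving the message from the tuple keeps the text in sync if the set of allowed values ever changes.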

Use a declarative approach for setup.py/setup.cfg

Situation

The setup.py file shows its age. In the years since it was written, the Python ecosystem has moved to a more declarative approach, which is now the recommended way. Neither setup.py nor setup.cfg uses it yet.

Suggestion

Use the declarative config as shown in the setuptools documentation.
Move specific pytest configuration into pytest.ini
Move specific bumpversion configuration into .bumpversion.cfg

Validate xi:include attributes

Currently dbxincluder accepts invalid xi:include elements, such as
<xi:include asdf='lol' this='wrong'/>

It should verify all non-namespaced attributes before processing.
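The check could look roughly like the sketch below. It uses the standard library's xml.etree.ElementTree instead of lxml to stay self-contained; namespaced attributes appear there as "{namespace}name" keys, which makes them easy to skip. The set of known attributes is my reading of XInclude 1.1 and should be checked against the spec, and the function name unknown_attributes is hypothetical.

```python
import xml.etree.ElementTree as ET

XI_NS = "http://www.w3.org/2001/XInclude"

# Non-namespaced attributes defined for xi:include (assumed from XInclude 1.1).
KNOWN_ATTRIBUTES = {
    "href", "parse", "xpointer", "fragid", "set-xml-id",
    "encoding", "accept", "accept-language",
}


def unknown_attributes(elem):
    """Return the sorted non-namespaced attributes that are not known."""
    return sorted(
        name for name in elem.attrib
        if not name.startswith("{") and name not in KNOWN_ATTRIBUTES
    )


elem = ET.fromstring(
    '<xi:include xmlns:xi="{}" asdf="lol" this="wrong"/>'.format(XI_NS)
)
print(unknown_attributes(elem))
```

A non-empty result would then be reported as an error before any processing starts.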

Implement inheriting transclusion attributes

Although I don't see how it can be useful, the trans:suffix and trans:linkscope attributes can be inherited, but trans:idfixup can't.
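One possible lookup, sketched with the standard library's xml.etree.ElementTree. It assumes that "inherited" means taking the value from the nearest ancestor that carries the attribute, and that the transclusion namespace is http://docbook.org/ns/transclusion; both should be verified against the spec. The function name effective_trans_attribute is hypothetical.

```python
import xml.etree.ElementTree as ET

TRANS_NS = "http://docbook.org/ns/transclusion"  # assumed namespace URI
INHERITED = ("suffix", "linkscope")  # trans:idfixup is not inherited


def effective_trans_attribute(root, elem, name):
    """Look up trans:<name> on elem; walk up the ancestors only for
    attributes that are inherited."""
    key = "{%s}%s" % (TRANS_NS, name)
    # ElementTree has no parent pointers, so build a child-to-parent map.
    parents = {child: parent for parent in root.iter() for child in parent}
    node = elem
    while node is not None:
        if key in node.attrib:
            return node.attrib[key]
        if name not in INHERITED:
            return None
        node = parents.get(node)
    return None
```

With lxml the parent map is unnecessary, because each element offers getparent().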

Support -h option

The option --help works as expected, but not -h.
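If the command line were parsed with argparse from the standard library, both -h and --help would be available without extra work. A minimal sketch, in which build_parser and the positional argument name are hypothetical:

```python
import argparse


def build_parser():
    """Create the command-line parser; argparse adds -h/--help itself."""
    parser = argparse.ArgumentParser(
        prog="dbxincluder",
        description="Transclusions for DocBook with XInclude 1.1",
    )
    parser.add_argument("file", help="XML file to process")
    return parser


args = build_parser().parse_args(["article.xml"])
print(args.file)
```

Passing -h or --help prints the usage text and exits with status 0.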

Make version number consistent (0.0 -> 0.1.0)

The .bumpversion.cfg configuration file contains version 0.1.0, but src/dbxincluder/__init__.py contains "0.0".
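A small consistency check could guard against this in the test suite. The sketch assumes the usual bumpversion layout, a [bumpversion] section with a current_version key, and a plain __version__ = "..." assignment in the module. Both helper names are hypothetical.

```python
import configparser
import re


def bumpversion_version(cfg_text):
    """Read current_version from the text of a .bumpversion.cfg file."""
    # Interpolation is disabled because such files may contain '%' signs.
    config = configparser.ConfigParser(interpolation=None)
    config.read_string(cfg_text)
    return config["bumpversion"]["current_version"]


def module_version(source_text):
    """Extract the __version__ string from Python source text."""
    match = re.search(
        r"""^__version__\s*=\s*["']([^"']+)["']""", source_text, re.MULTILINE
    )
    return match.group(1) if match else None


cfg = "[bumpversion]\ncurrent_version = 0.1.0\n"
src = '__version__ = "0.0"\n'
print(bumpversion_version(cfg) == module_version(src))
```

In a real test, the two texts would be read from .bumpversion.cfg and src/dbxincluder/__init__.py.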

Support DocBook transclusion

Support the trans:* and local:* attributes on <xi:include/> elements, please. For more information, see http://docbook.org/docs/transclusion

The page contains in section "B. Special ID/IDREF processing" some examples which can be used as a test case (abridged, of course).

Support -o/--output option

At the moment, the program has no option to specify where the output is saved.

Of course the output could be redirected to a file, but I think it would be nice to have an additional -o/--output option.
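A sketch of how the option could behave, assuming argparse is used for the command line. The helper write_output and the metavar are hypothetical; when the option is absent, the output still goes to stdout as it does today.

```python
import argparse
import sys


def write_output(text, output=None):
    """Write text to the given file path, or to stdout if none is given."""
    if output is None:
        sys.stdout.write(text)
    else:
        with open(output, "w", encoding="utf-8") as handle:
            handle.write(text)


parser = argparse.ArgumentParser(prog="dbxincluder")
parser.add_argument("-o", "--output", metavar="FILE",
                    help="write the result to FILE instead of stdout")
parser.add_argument("file", help="XML file to process")

args = parser.parse_args(["-o", "result.xml", "input.xml"])
print(args.output, args.file)
```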

Allow referencing to the same document

According to the XInclude spec, the href attribute is optional:

The href attribute is optional; the absence of this attribute is the same as specifying href="", that is, the reference is to the same document.

I think it does make sense to allow referencing and including parts from the same document.

Some ideas:

When there is no href attribute (or href=""), the spec requires either an xpointer or a fragid attribute.
ATM it's enough to support fragid only.
I think we should also check whether set-xml-id is present. We don't want the same IDs twice in the result document.
If there is no such attribute, should dbxincluder print a warning? Abort?
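The ideas above can be sketched as a small check on the attributes of an xi:include element. The function name is hypothetical, and treating a missing set-xml-id as a warning rather than an error is only one of the two options raised here.

```python
def check_same_document_reference(attrib):
    """Return (errors, warnings) for the attributes of one xi:include."""
    errors, warnings = [], []
    # A missing href is the same as href="": the same document is meant.
    if attrib.get("href", "") == "":
        if "xpointer" not in attrib and "fragid" not in attrib:
            errors.append(
                "href is missing or empty, so xpointer or fragid is required"
            )
        if "set-xml-id" not in attrib:
            warnings.append("no set-xml-id: included IDs may be duplicated")
    return errors, warnings


print(check_same_document_reference({"fragid": "intro"}))
```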

Catch lxml.etree.XMLSyntaxError Exception

Assume you have the following XML document with a syntax error (missing close tag </title>):

<book xmlns="http://docbook.org/ns/docbook" version="5.1">
    <title><!--</title>-->
    <index/>
</book>

Running dbxincluder raises the following exception:

dbxincluder tests/cases/wrong.xml 
Traceback (most recent call last):
  File "/local/doc/dbxincluder/.env/bin/dbxincluder", line 9, in <module>
    load_entry_point('dbxincluder==0.1.0', 'console_scripts', 'dbxincluder')()
  File "/local/doc/dbxincluder/src/dbxincluder/__init__.py", line 186, in main
    sys.stdout.write(process_xml(inputxml, base_url, path))
  File "/local/doc/dbxincluder/src/dbxincluder/__init__.py", line 127, in process_xml
    tree = lxml.etree.fromstring(xml, base_url=base_url)
  File "lxml.etree.pyx", line 3103, in lxml.etree.fromstring (src/lxml/lxml.etree.c:70569)
  File "parser.pxi", line 1828, in lxml.etree._parseMemoryDocument (src/lxml/lxml.etree.c:106403)
  File "parser.pxi", line 1716, in lxml.etree._parseDoc (src/lxml/lxml.etree.c:105194)
  File "parser.pxi", line 1086, in lxml.etree._BaseParser._parseDoc (src/lxml/lxml.etree.c:99876)
  File "parser.pxi", line 580, in lxml.etree._ParserContext._handleParseResultDoc (src/lxml/lxml.etree.c:94350)
  File "parser.pxi", line 690, in lxml.etree._handleParseResult (src/lxml/lxml.etree.c:95786)
  File "parser.pxi", line 620, in lxml.etree._raiseParseError (src/lxml/lxml.etree.c:94853)
lxml.etree.XMLSyntaxError: Opening and ending tag mismatch: title line 3 and book, line 5, column 8

You want to catch this exception. 😃
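A minimal sketch of catching the exception around the lxml.etree.fromstring call shown in the traceback. The wrapper name parse_or_report and the message format are hypothetical; how dbxincluder should exit afterwards is left open.

```python
import sys

import lxml.etree


def parse_or_report(xml, base_url=None, err=None):
    """Parse xml; on a syntax error, write a short message and return None."""
    err = sys.stderr if err is None else err
    try:
        return lxml.etree.fromstring(xml, base_url=base_url)
    except lxml.etree.XMLSyntaxError as exc:
        # The exception text already names the offending line and column.
        err.write("Error: {}\n".format(exc))
        return None


tree = parse_or_report("<book><title>Example</title></book>")
print(tree.tag)
```

The caller can then return a non-zero exit code instead of showing a traceback.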

Enhance Example Section in Doc (fragid/set-xml-id)

We have an Example section in usage.rst. I think this section deserves more examples.

Especially a short example showing the new fragid and set-xml-id attributes. One solution would be to name the subsections as a kind of "use case scenario". For example, "Including parts of an XML document" or "Changing ID of Root Element". Something like that.

Another idea would be to place the examples in the manpage.

XPointer support

Originated from openSUSE/daps#229

It would be nice if the dbxincluder could support some XPointer expressions.

Add Manpage to Documentation

I think it would be a good addition to have the manpage included in our (main) documentation as well.

Support xi:fallback element

The XInclude specification allows the additional element xi:fallback. For example:

<xi:include href="something_not_available.xml">
  <xi:fallback>
    <para>This will be included</para>
  </xi:fallback>
</xi:include>

The XInclude processor tries to retrieve something_not_available.xml. If that fails, the xi:fallback is considered.

According to the spec (see above link):

The xi:fallback element appears as a child of an xi:include element. It provides a mechanism for recovering from missing resources. When a resource error is encountered, the xi:include element is replaced with the contents of the xi:fallback element. If the xi:fallback element is empty, the xi:include element is removed from the result. If the xi:fallback element is missing, a resource error results in a fatal error.

An interesting side note: the xi:fallback can contain an xi:include, which in turn can contain an xi:fallback...
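The three cases from the quoted paragraph can be sketched as follows, using the standard library's xml.etree.ElementTree. The function resolve_include and its load callback are hypothetical. The sketch ignores text directly inside xi:fallback and does not process nested xi:include elements, both of which a real implementation would have to handle.

```python
import xml.etree.ElementTree as ET

XI_NS = "http://www.w3.org/2001/XInclude"


def resolve_include(include, load):
    """Return the elements that replace an xi:include element.

    `load` is a callable that takes the href and returns a list of
    elements, raising OSError when the resource cannot be retrieved.
    """
    try:
        return load(include.get("href"))
    except OSError:
        fallback = include.find("{%s}fallback" % XI_NS)
        if fallback is None:
            # No xi:fallback: the resource error is fatal.
            raise
        # An empty xi:fallback yields an empty list, i.e. removal.
        return list(fallback)
```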

Improve Modularization and Maintenance

In order to improve maintenance, I would suggest some changes:

Try to move everything (except __version__) in __init__.py into (a) different file(s).

It's common practice in Python to have mostly empty __init__.py files, maybe with a version string.
Use argparse in the function main() (maybe docopt is also an option).

At the moment we don't have any other options. However, as soon as we add more, it's better to handle them with the right library module for this purpose.
Make sure the project can also be used as a library (which is probably already the case).

Check flake8 output

Flake8 is a modular source code checker for Python. It reports inconsistencies and style issues.

Install the flake8 script with pip install flake8 in your virtual environment and run it with:

$ flake8 src/dbxincluder

Check the output and fix the reported issues where necessary. You will probably need to set the maximum line length a bit higher.

Include it also as a target in tox.ini

Support XML Catalogs

Currently, URLs are resolved through urllib. However, this doesn't take XML catalogs into account.

I would propose the following steps:

Introduce a -c / --catalog option. If the user specifies it, the given catalog is consulted.
Check for the environment variable XML_CATALOG_FILES. This variable can contain one or more catalogs separated by spaces.
If neither the option is given nor the environment variable is set, use the standard root catalog at /etc/xml/catalog.
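The lookup order described in these steps can be sketched as below. The function name select_catalogs and its parameters are hypothetical; actually resolving a URI against the selected catalogs is a separate step and not shown.

```python
import os

DEFAULT_CATALOG = "/etc/xml/catalog"


def select_catalogs(option=None, environ=None):
    """Return the list of catalog files to consult, in priority order."""
    environ = os.environ if environ is None else environ
    if option:
        # 1. An explicit -c / --catalog option wins.
        return [option]
    # 2. XML_CATALOG_FILES may hold several space-separated catalogs.
    from_env = environ.get("XML_CATALOG_FILES", "").split()
    if from_env:
        return from_env
    # 3. Fall back to the standard root catalog.
    return [DEFAULT_CATALOG]


print(select_catalogs(environ={"XML_CATALOG_FILES": "a.xml b.xml"}))
```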

ATM it seems there is no direct way to resolve a URI through lxml (see also http://stackoverflow.com/a/7229470). However, I've found http://lxml.de/resolvers.html, which may be of some value. If there is no easy way, we could call the xmlcatalog tool directly like this:

$ xmlcatalog CATALOG URI_TO_RESOLVE

Add the information about how to resolve URIs into the manpage.

Implement Transclusion properties

Write Documentation

At the moment, we don't have any up-to-date documentation. I think you could improve that with the following:

Write documentation in Sphinx/ReST format.
Use .bumpversion.cfg and add doc/conf.py as configuration.
Create the structure in a docs subdirectory. You can use sphinx-quickstart, which asks you questions about what you want.
Ignore the build and dist directories

The documentation itself should contain the following key points:

What is dbxincluder?
Short introduction about Transclusion
Feature Highlights
Installation Requirements and supported Python version(s)
Supported Attributes
Maybe a concise example
A manpage

Maybe you find additional or better entries. Think about what the user wants to know about this project.

Find some doc examples in KIWI: https://github.com/SUSE/kiwi/tree/master/doc/source

Small code improvements

Collect all namespaces and save them in a separate file (for example, core.py). This avoids any nasty typos. ;) Import them where necessary.
File __init__.py:
- If something is not implemented, I would raise the NotImplementedError exception (see line 117)
File xinclude.py:
- I would move the DBXIException exception class into a separate file
- Same here, include the namespaces from core.py (in that example).
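Such a core.py could look roughly like the sketch below. The DocBook and XInclude URIs are the well-known ones; the transclusion and local-attributes URIs are given from memory and should be checked against the respective specs. The names NSMAP and qname are hypothetical.

```python
"""Namespace constants shared across modules (a hypothetical core.py)."""

DB_NS = "http://docbook.org/ns/docbook"
XI_NS = "http://www.w3.org/2001/XInclude"
TRANS_NS = "http://docbook.org/ns/transclusion"  # assumed, verify
LOCAL_NS = "http://www.w3.org/2001/XInclude/local-attributes"  # assumed, verify

NSMAP = {"db": DB_NS, "xi": XI_NS, "trans": TRANS_NS, "local": LOCAL_NS}


def qname(prefix, local_name):
    """Build a '{namespace}name' string as used by lxml and ElementTree."""
    return "{%s}%s" % (NSMAP[prefix], local_name)


print(qname("xi", "include"))
```

Other modules would then import the constants instead of repeating the URI strings.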

Enough for today. ;-)