dbxincluder's Introduction

Non-XInclude conformant preprocessor

Development and Setup

Use setup.py to install dbxincluder; afterwards, run it with the dbxincluder command. For development, run setup.py develop, then py.test to run the test suite.

dbxincluder's People

Contributors

tomschr, vogtinator

Forkers

vikas-lamba

dbxincluder's Issues

Use Test Coverage with pytest-cov

I would like to see some test coverage reports. :)

Requirements:

  • Use the pytest-cov package

  • Distinguish in tox.ini between a "normal" test run and a test run with coverage enabled

  • The "coverage ..." calls can be replaced by:

    {posargs:py.test --cov=dbxincluder --cov-fail-under=90 --cov-report=term-missing -vv}
    

Support inclusion of XML without root node

According to the spec, it is possible to include an XML file without a root node:

<a />
<b />

It is not clear what the set-xml-id attribute is supposed to do in that case (presumably set it on ALL elements, but that would produce invalid XML because the IDs would no longer be unique).
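
A minimal sketch of how such a rootless fragment could be parsed at all, assuming a temporary wrapper element is acceptable. It uses the standard library's xml.etree.ElementTree as a stand-in for lxml, and parse_fragment is a hypothetical helper name, not part of dbxincluder. The sketch does not handle input that starts with an XML declaration.

```python
import xml.etree.ElementTree as ET


def parse_fragment(text):
    """Parse XML text that may contain several top-level elements.

    The text is wrapped in a temporary root element so that the parser
    accepts it; only the original top-level elements are returned.
    """
    wrapper = ET.fromstring("<wrapper>" + text + "</wrapper>")
    return list(wrapper)


elements = parse_fragment("<a />\n<b />")
print([element.tag for element in elements])
```

With more than one element returned, a set-xml-id value could be rejected or applied to the first element only; which behaviour is right remains the open question of this issue.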

Deploy Documentation on GitHub Pages

It would be nice to have up-to-date documentation published on GitHub Pages after each successful Travis build.

Requirements:

The build should be limited to the master and develop branches (use the --branches option). You also need to create a token on GitHub and paste it into the Travis settings dialog. More information is available on the travis-sphinx homepage: https://github.com/Syntaf/travis-sphinx

Improve error message (trans:idfixup)

According to the DocBook Transclusion spec, the trans:idfixup can have one of the values none, suffix, or auto.

When using trans:idfixup with foo as an invalid value, I get:

Error at tests/db/blub.xml:7: idfixup type 'foo' not implemented
Included by tests/db/article.01.xml

The error itself is correct of course, but the message is misleading: it suggests a missing feature rather than an invalid value. I was confused too at first glance. ;-)
To make it clearer, I would propose changing it like this:

Error at tests/db/blub.xml:7: idfixup contains 'foo', but only 'none', 'suffix', or 'auto' are allowed.

I think that makes it clearer.
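
A minimal sketch of how the proposed message could be produced. The names ALLOWED_IDFIXUP and format_idfixup_error are hypothetical and not taken from the dbxincluder sources; only the three allowed values and the message wording come from this issue.

```python
# Allowed values according to the issue text above.
ALLOWED_IDFIXUP = ("none", "suffix", "auto")


def format_idfixup_error(value, path, line):
    """Return the proposed error message, or None if the value is allowed."""
    if value in ALLOWED_IDFIXUP:
        return None
    quoted = [repr(item) for item in ALLOWED_IDFIXUP]
    # Join as "'none', 'suffix', or 'auto'".
    allowed = ", ".join(quoted[:-1]) + ", or " + quoted[-1]
    return "Error at {}:{}: idfixup contains {!r}, but only {} are allowed.".format(
        path, line, value, allowed
    )


print(format_idfixup_error("foo", "tests/db/blub.xml", 7))
```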

Use a declarative approach for setup.py/setup.cfg

Situation

The setup.py file shows its age. After several years, the Python ecosystem provides a more declarative approach. This is the more recommended way and is not yet covered in setup.py and setup.cfg.

Suggestion

  • Use the declarative config as shown in the setuptools documentation.
  • Move specific pytest configuration into pytest.ini
  • Move specific bumpversion configuration into .bumpversion.cfg

Validate xi:include attributes

Currently dbxincluder accepts invalid xi:include elements, such as
<xi:include asdf='lol' this='wrong'/>

It should verify all non-namespaced attributes before processing.
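
A minimal sketch of such a check, assuming the attribute names defined by XInclude 1.1 are the ones to accept. It uses the standard library's xml.etree.ElementTree as a stand-in for lxml (both store namespaced attribute names as "{namespace}local"), and unknown_attributes is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

XINCLUDE_NS = "http://www.w3.org/2001/XInclude"

# Assumption: the non-namespaced attributes defined by XInclude 1.1.
KNOWN_ATTRIBUTES = {
    "href", "parse", "xpointer", "fragid", "set-xml-id",
    "encoding", "accept", "accept-language",
}


def unknown_attributes(element):
    """Return the sorted non-namespaced attribute names that are not known."""
    return sorted(
        name for name in element.attrib
        if not name.startswith("{") and name not in KNOWN_ATTRIBUTES
    )


include = ET.fromstring(
    '<xi:include xmlns:xi="%s" asdf="lol" this="wrong"/>' % XINCLUDE_NS
)
print(unknown_attributes(include))
```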

Support -o/--output option

At the moment, the program has no option to specify where the output is saved.

Of course the output could be redirected to a file, but I think it would be nice to have an additional -o/--output option.
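
A minimal sketch of what this could look like with argparse. The function names build_parser and write_result and the positional argument name are hypothetical; only the -o/--output option itself comes from this issue.

```python
import argparse
import sys


def build_parser():
    """Build a command-line parser with an optional -o/--output argument."""
    parser = argparse.ArgumentParser(prog="dbxincluder")
    parser.add_argument("input", help="XML file to process")
    parser.add_argument(
        "-o", "--output", metavar="FILE", default=None,
        help="write the result to FILE instead of standard output",
    )
    return parser


def write_result(text, output=None):
    """Write text to the given file, or to standard output if none is given."""
    if output is None:
        sys.stdout.write(text)
    else:
        with open(output, "w", encoding="utf-8") as handle:
            handle.write(text)


args = build_parser().parse_args(["article.xml", "-o", "result.xml"])
print(args.input, args.output)
```

Redirecting standard output keeps working because the option defaults to None.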

Allow referencing to the same document

According to the XInclude spec, the href attribute is optional:

The href attribute is optional; the absence of this attribute is the same as specifying href="", that is, the reference is to the same document.

I think it does make sense to allow referencing and including parts from the same document.

Some ideas:

  • When there is no href attribute (or href=""), the spec requires either an xpointer or a fragid attribute.
  • ATM it's enough to support fragid only.
  • I think we should also check whether set-xml-id is there. We don't want the same IDs twice in the result document.
  • If there is no such attribute, should dbxincluder print a warning? Abort?
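
The ideas above can be sketched as a small check on the attributes of an xi:include element. The function name and the decision to treat a missing set-xml-id as a warning rather than an error are assumptions made for illustration; the issue leaves that question open.

```python
def check_same_document_reference(attributes):
    """Check an xi:include attribute mapping that may lack an href.

    Returns a tuple (errors, warnings), each a list of messages.
    """
    errors = []
    warnings = []
    if attributes.get("href", "") != "":
        return errors, warnings  # reference to another document: nothing to check
    if "xpointer" not in attributes and "fragid" not in attributes:
        errors.append("same-document reference needs an xpointer or fragid attribute")
    if "set-xml-id" not in attributes:
        # Assumption: warn only; aborting would be the stricter alternative.
        warnings.append("no set-xml-id given, the result may contain duplicate IDs")
    return errors, warnings


print(check_same_document_reference({"fragid": "intro"}))
```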

Catch lxml.etree.XMLSyntaxError Exception

Assume you have the following XML document with a syntax error (missing close tag </title>):

<book xmlns="http://docbook.org/ns/docbook" version="5.1">
    <title><!--</title>-->
    <index/>
</book>

Running dbxincluder on it ends with an uncaught exception:

$ dbxincluder tests/cases/wrong.xml
Traceback (most recent call last):
  File "/local/doc/dbxincluder/.env/bin/dbxincluder", line 9, in <module>
    load_entry_point('dbxincluder==0.1.0', 'console_scripts', 'dbxincluder')()
  File "/local/doc/dbxincluder/src/dbxincluder/__init__.py", line 186, in main
    sys.stdout.write(process_xml(inputxml, base_url, path))
  File "/local/doc/dbxincluder/src/dbxincluder/__init__.py", line 127, in process_xml
    tree = lxml.etree.fromstring(xml, base_url=base_url)
  File "lxml.etree.pyx", line 3103, in lxml.etree.fromstring (src/lxml/lxml.etree.c:70569)
  File "parser.pxi", line 1828, in lxml.etree._parseMemoryDocument (src/lxml/lxml.etree.c:106403)
  File "parser.pxi", line 1716, in lxml.etree._parseDoc (src/lxml/lxml.etree.c:105194)
  File "parser.pxi", line 1086, in lxml.etree._BaseParser._parseDoc (src/lxml/lxml.etree.c:99876)
  File "parser.pxi", line 580, in lxml.etree._ParserContext._handleParseResultDoc (src/lxml/lxml.etree.c:94350)
  File "parser.pxi", line 690, in lxml.etree._handleParseResult (src/lxml/lxml.etree.c:95786)
  File "parser.pxi", line 620, in lxml.etree._raiseParseError (src/lxml/lxml.etree.c:94853)
lxml.etree.XMLSyntaxError: Opening and ending tag mismatch: title line 3 and book, line 5, column 8

You want to catch this exception. 😃
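
A minimal sketch of catching the parse error and printing a short message instead of a traceback. parse_or_report is a hypothetical helper, not the existing process_xml function, and the message format is an assumption. The standard-library fallback exists only so that the sketch also runs where lxml is not installed.

```python
import sys

try:
    from lxml import etree
    ParseFailure = etree.XMLSyntaxError
except ImportError:  # fallback so the sketch runs without lxml
    import xml.etree.ElementTree as etree
    ParseFailure = etree.ParseError


def parse_or_report(xml, path):
    """Return the parsed root element, or None after reporting a syntax error."""
    try:
        return etree.fromstring(xml)
    except ParseFailure as exc:
        # Assumed message format; the parser's own text carries line and column.
        sys.stderr.write("Error: {}: {}\n".format(path, exc))
        return None


broken = '<book xmlns="http://docbook.org/ns/docbook"><title></book>'
if parse_or_report(broken, "tests/cases/wrong.xml") is None:
    print("parsing failed, no traceback shown")
```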

Enhance Example Section in Doc (fragid/set-xml-id)

We have an Example section in usage.rst. I think this section deserves some more examples.

Especially a short example showing the new fragid and set-xml-id attributes. One solution would be to name the subsections as a kind of "use case scenario". For example, "Including parts of an XML document" or "Changing ID of Root Element". Something like that.

Another idea would be to place the examples in the manpage.

Support xi:fallback element

The XInclude specification allows the additional element xi:fallback. For example:

<xi:include href="something_not_available.xml">
  <xi:fallback>
    <para>This will be included</para>
  </xi:fallback>
</xi:include>

The XInclude processor tries to retrieve something_not_available.xml. If that fails, the xi:fallback is considered.

According to the spec (see above link):

The xi:fallback element appears as a child of an xi:include element. It provides a mechanism for recovering from missing resources. When a resource error is encountered, the xi:include element is replaced with the contents of the xi:fallback element. If the xi:fallback element is empty, the xi:include element is removed from the result. If the xi:fallback element is missing, a resource error results in a fatal error.

An interesting side note: the xi:fallback can contain a xi:include which can contain an xi:fallback...
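
A minimal sketch of the fallback rule quoted above, including the nested case. It uses the standard library's xml.etree.ElementTree as a stand-in for lxml. ResourceError, resolve_include and the load callable are hypothetical names; the sketch ignores text content inside xi:fallback and only looks at xi:include elements that are direct children of the fallback.

```python
import xml.etree.ElementTree as ET

XI = "{http://www.w3.org/2001/XInclude}"


class ResourceError(Exception):
    """Hypothetical error raised when a resource cannot be retrieved."""


def resolve_include(include, load):
    """Return the list of elements that replaces one xi:include element.

    load is a callable that takes an href and returns an element,
    or raises ResourceError if the resource is not available.
    """
    try:
        return [load(include.get("href"))]
    except ResourceError:
        fallback = include.find(XI + "fallback")
        if fallback is None:
            raise  # no xi:fallback: the resource error is fatal
        replacement = []
        for child in fallback:
            if child.tag == XI + "include":
                # A fallback may itself contain an xi:include.
                replacement.extend(resolve_include(child, load))
            else:
                replacement.append(child)
        return replacement  # empty list if xi:fallback is empty


def always_missing(href):
    raise ResourceError(href)


include = ET.fromstring(
    '<xi:include xmlns:xi="http://www.w3.org/2001/XInclude" '
    'href="something_not_available.xml">'
    "<xi:fallback><para>This will be included</para></xi:fallback>"
    "</xi:include>"
)
print([element.tag for element in resolve_include(include, always_missing)])
```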

Improve Modularization and Maintenance

In order to improve maintenance, I would suggest some changes:

  • Try to move everything (except __version__) in __init__.py into (a) different file(s).

    It's common practice in Python to have mostly empty __init__.py files, maybe with a version string.

  • Use argparse in the function main() (maybe docopt is also an option).

    At the moment we don't have any other options. However, as soon as we add more, it's better to handle them with a library module made for this purpose.

  • Make sure the project can also be used as a library (which is probably already the case).

Check flake8 output

Flake8 is a modular source code checker for Python. It reports inconsistencies and style issues.

Install the flake8 script with pip install flake8 in your virtual environment and run it with:

$ flake8 src/dbxincluder

Check the output and fix the reported issues, if necessary. You will probably need to set the maximum line length a bit higher.

Also include it as a target in tox.ini.

Support XML Catalogs

Currently, URLs are resolved through the urllib. However, this doesn't take XML catalogs into account.

I would propose the following steps:

  1. Introduce a -c/--catalog option. If the user passes it, the given catalog is consulted.
  2. Check for the environment variable XML_CATALOG_FILES. This variable can contain one or more catalogs separated by spaces.
  3. If neither the option is given nor the environment variable is set, use the standard root catalog at /etc/xml/catalog.
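
The lookup order in these steps can be sketched as follows. The function name catalog_files is hypothetical; the sketch only decides which catalog files to consult and does not resolve any URI.

```python
import os

DEFAULT_CATALOG = "/etc/xml/catalog"


def catalog_files(option=None, environ=None):
    """Return the catalog files to consult, in the order of the steps above."""
    environ = os.environ if environ is None else environ
    if option:
        return [option]  # step 1: -c/--catalog wins
    from_env = environ.get("XML_CATALOG_FILES", "").split()
    if from_env:
        return from_env  # step 2: space-separated list from the environment
    return [DEFAULT_CATALOG]  # step 3: standard root catalog


print(catalog_files(environ={"XML_CATALOG_FILES": "a.xml b.xml"}))
```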

ATM it seems there is no direct way to resolve a URI through lxml (see also http://stackoverflow.com/a/7229470). However, I've found http://lxml.de/resolvers.html which may be of some value. If there is no easy way, we could call the xmlcatalog tool directly like this:

$ xmlcatalog CATALOG URI_TO_RESOLVE

Add the information about how to resolve URIs into the manpage.

Write Documentation

At the moment, we don't have any up-to-date documentation. I think you could improve that with the following:

  • Write documentation in Sphinx/ReST format.
  • Use .bumpversion.cfg and add doc/conf.py as configuration.
  • Create the structure in a docs subdirectory. You can use sphinx-quickstart, which asks you questions about what you want.
  • Ignore the build and dist directories.

The documentation itself should contain the following key points:

  • What is dbxincluder?
  • Short introduction about Transclusion
  • Feature Highlights
  • Installation Requirements and supported Python version(s)
  • Supported Attributes
  • Maybe a concise example
  • A manpage

Maybe you find additional or better entries. Think about what the user wants to know about this project.

Find some doc examples in KIWI: https://github.com/SUSE/kiwi/tree/master/doc/source

Small code improvements

  • Collect all namespaces and save them in a separate file (for example, core.py). This avoids any nasty typos. ;) Import them where necessary.
  • File __init__.py:
    • If something is not implemented, I would raise the NotImplementedError exception (see line 117)
  • File xinclude.py:
    • I would move the DBXIException exception class into a separate file
    • Same here, include the namespaces from core.py (in that example).
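
A minimal sketch of what such a core.py could contain. The constant names and the qname helper are hypothetical, and the transclusion namespace URI is an assumption that should be checked against the DocBook Transclusion spec before use.

```python
"""Hypothetical core.py: one place for all namespace URIs."""

XINCLUDE_NS = "http://www.w3.org/2001/XInclude"
DOCBOOK_NS = "http://docbook.org/ns/docbook"
# Assumption: verify this URI against the DocBook Transclusion spec.
TRANSCLUSION_NS = "http://docbook.org/ns/transclusion"
XML_NS = "http://www.w3.org/XML/1998/namespace"


def qname(namespace, local):
    """Return a name in the "{namespace}local" form used by lxml."""
    return "{%s}%s" % (namespace, local)


print(qname(XINCLUDE_NS, "include"))
```

Other modules would then import the constants instead of repeating the URI strings.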

Enough for today. ;-)
