stylesheets's Introduction

Stylesheets

TEI XSL Stylesheets

This is a family of XSLT 3.0 stylesheets to transform TEI XML documents to various formats, including XHTML, LaTeX, XSL Formatting Objects, ePub, plain text, RDF, JSON; and to/from Word OOXML (docx) and OpenOffice (odt). They concentrate on the core TEI modules which are used for simple transcription and "born digital" writing. It is important to understand that they do not:

  • cover all TEI elements and possible attribute values
  • attempt to define a standard TEI processing or rendering model

and should not be treated as the definitive view of the TEI Consortium.

For more information, see https://tei-c.org/tools/stylesheets/

Prerequisites

The package assumes that you have several additional tools installed. Their availability on your system can be verified by issuing the command make check.

In particular, the Stylesheets assume that you use ant version 1.9.x+. If for some reason you need to use ant 1.8.x, you should remove all occurrences of the attribute @zip64Mode from the file common/teianttasks.xml.

It is helpful to have the TEI environment installed locally. Please refer to http://www.tei-c.org/Guidelines/P5/get.xml for hints on how to do that.

It is also possible to avoid manual installation of additional tools, by resorting to the pre-built test environment in Docker described in https://teic.github.io/Documentation/TCW/testing_and_building.html .

Usage

The bin/ directory contains several executable files, which can be run on Linux, OS X, or other Unix operating systems. These perform a variety of transformations and are very useful for, e.g., generating a schema from a TEI ODD. Some examples:

bin/teitorelaxng --odd ../TEI/P5/Exemplars/tei_all.odd tei_all.rng

Assuming you have a copy of the TEI Guidelines repository alongside your copy of the Stylesheets, this will take the tei_all ODD and generate a RelaxNG XML schema for you. Similarly,

bin/teitornc --odd ../TEI/P5/Exemplars/tei_lite.odd tei_lite.rnc

will produce a RelaxNG Compact Syntax schema for TEI Lite.

Documentation

To build the documentation, run:

make doc

It will then be available at release/xslcommon/doc/tei-xsl/index.html.

About the Text Encoding Initiative (TEI)

The Text Encoding Initiative (TEI) is a community of practice in the area now known as textual digital humanities. Since 1994 the primary output of the TEI has been the TEI/XML guidelines, a standard for the interchange of textual data. A main focus of the TEI is the TEI-L mailing list; the TEI is also present on GitHub and Docker, and has a repository called TAPAS and an academic journal, the jTEI.

TEI/XML can be thought of as a sibling of HTML (they're approximately the same age, depending on how you measure it) which evolved with a focus on defined textual semantics rather than defined display semantics. TEI by Example is a good introduction to TEI/XML. The Text Encoding Initiative Wikipedia article contains some short examples. The TEI/XML standard is used by content-based projects such as the British National Corpus, the Perseus Project, the Women Writers Project, the Oxford Text Archive, the Digital Tripitaka and SARIT, and tool-based projects such as CorrespSearch, EpiDoc, Anthologize, Versioning Machine, and many more diverse projects.

stylesheets's People

Contributors

ahankinson, bansp, bleekere, bwbohl, cyocum, dmj, ebeshero, eduarddrenth, hcayless, helenasabel, jamescummings, janellejenstad, jbampton, joeytakeda, jure, lambdafu, lb42, martinascholger, martindholmes, mpetris, peterstadler, raffazizzi, rvdb, sabineseifert, sebastianrahtz, sydb, tomazerjavec, trishaoconnor, tuurma, vvasuki


stylesheets's Issues

A language that is needed for xml any where it might be

Revive XQIB (XQuery in the Browser) and develop it further as a programming language:

http://www.xqib.org/

http://archive.xmlprague.cz/2011/presentations/xqib.pdf

https://websci.informatik.uni-freiburg.de/teaching/ws201112/xmldb/sheets/xqib-documentation.pdf

http://cs.brown.edu/~kraskat/pub/www09-xqib.pdf

Navigating the XDM and the DOM is a complex ordeal that in itself calls for a language to do it (XLink, XPointer, XQuery, and XPath). The APIs are a mess, each built on top of previous ones and on feature requests. XML already has languages for navigating it; make it possible to use them for this purpose. Rename XQIB if you want: XPL (XML programming language) or XNL (XML navigation language). Please consider this. Thank you for reading.

ODDs invoking nested RNGs

I've got an odd file...
<moduleRef url="example-a.rng"/>
...that (as seen from the snippet above) invokes an RNG file...
<include href="example-b.rng">
  <define name="pattern-x">
    <interleave>
      <ref name="pattern-y"/>
      <optional>
        <ref name="pattern-z"/>
      </optional>
    </interleave>
  </define>
</include>
...that (as seen from the snippet above) itself invokes and redefines another RNG file that has a snippet like this:
<define name="pattern-x"> <empty/> </define>
My odd transformations are successfully pulling in example-b.rng via the intermediary, but they are not using example-a.rng to redefine example-b.rng's patterns. That is, in this example, I'm getting a customized TEI schema that renders (in RNC syntax) pattern-x = empty not pattern-x = pattern-y & pattern-z?

It seems this is a bug, but I could be wrong. Thoughts?
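The behaviour the reporter expects can be modelled very roughly as follows. This is a simplified sketch of the RELAX NG rule that definitions given inside an include replace same-named definitions in the included grammar; it ignores combine attributes and start patterns. The function name `resolve_include` and the definition of `pattern-y` are assumptions made for illustration, not taken from the Stylesheets or from the report.

```python
def resolve_include(included_defines, overriding_defines):
    """Definitions given inside <include> replace same-named ones
    from the included grammar; all other definitions are kept."""
    merged = dict(included_defines)
    merged.update(overriding_defines)
    return merged

# example-b.rng as described in the report (pattern-y is a hypothetical filler)
example_b = {"pattern-x": "empty", "pattern-y": "text"}
# the redefinition carried by the <include> element in example-a.rng
example_a_overrides = {"pattern-x": "pattern-y & pattern-z?"}

resolved = resolve_include(example_b, example_a_overrides)
print("pattern-x =", resolved["pattern-x"])
```

Under this model the customized schema would carry the interleave pattern for pattern-x, which is the result the reporter says is missing.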

embedding fonts in ePub

This comes from a TEI feature request by Pablo Rodrigues. He notes that embedding a font requires adding the following code to the CSS file (adapted from http://johnmacfarlane.net/pandoc/README.html):
@font-face {
font-family: FreeSans;
font-style: normal;
font-weight: normal;
src:url("/OPS/FreeSans.otf");
}
@font-face {
font-family: FreeSans;
font-style: normal;
font-weight: bold;
src:url("/OPS/FreeSansBold.otf");
}
@font-face {
font-family: FreeSans;
font-style: italic;
font-weight: normal;
src:url("/OPS/FreeSansOblique.otf");
}
@font-face {
font-family: FreeSans;
font-style: italic;
font-weight: bold;
src:url("/OPS/FreeSansBoldOblique.otf");
}
body { font-family: "FreeSans"; }
and actually including the four font files in the said directory.
For files that include text in the serif typeface (such as the one cited as an example), it must also be possible to embed those fonts (and the same applies to the monospaced typeface).

duplicate creation of constraintSpec

ok, here's another one (bug?) from today:

<schemaSpec ident="test" start="TEI">
    <moduleRef key="core"/>
    <moduleRef key="tei"/>
    <moduleRef key="header"/>
    <moduleRef key="textstructure"/>
    <moduleRef key="transcr"/>

    <elementSpec ident="addSpan" mode="change" module="transcr">
        <attList>
            <attDef ident="spanTo" mode="change" usage="req"/>
        </attList>
    </elementSpec>   

</schemaSpec>

gives me a duplicate "spanTo-constraint-spanTo-2" (in the resulting RelaxNG) resulting in a choking schematron validation.

closing square bracket removed from label of gloss list

In teitoslides conversion (testing only in oxford profile but don't see anything in that which would modify this behaviour) when I have a gloss list and include [...] in the <label> in generating pdf slides, the square bracket is moved out of the gloss label and put in the gloss item.

<list type="gloss">
<label>[...]</label>
<item>item here</item>
...
</list>

At a guess, I suspect this is likely to be because of the way the closing square brackets appear in the intermediate latex.

Flow chart conversion issue

When converting a Word 2014 document that includes a flow chart to XHTML, we did not get the flow chart. Only the text inside the flow chart appears, on its own.

Building on paths with spaces produces a near-empty RNG file

When building the MEI source using teitorelaxng, a path to the MEI source with a space in it will cause the script to output a nearly empty file.

To reproduce:

-- cd to a checkout of the MEI source with a space in it, e.g., /home/joeblow/music encoding
-- create an output directory, e.g., build
-- run the teitorelaxng script: teitorelaxng --localsource=source/driver.xml customizations/mei-all.xml build/mei-all.rng

You should get a notice:

using /home/joeblow/music encoding/trunk/source/driver.xml as default source
Convert mei-all.xml to /home/joeblow/music encoding/trunk/build/mei-all.rng (tei to relaxng) using profile default

BUILD FAILED
Target "encoding/trunk/source/driver.xml" does not exist in the project "teitorelaxng".

Total time: 0 seconds

A partial fix for this is to place quotes around the path on line 177 of teitorelaxng (note the swapping of the single and double quotes):

localsource=' -DdefaultSource="$REALSOURCE" '

This will cause the script to exit successfully; however, the schema file generated in the build directory is only 661 bytes long, and contains an (essentially) empty RNG schema.

If you rename the directory to one without spaces, and revert the change on line 177, everything will work.

I have not been able to verify if this is the same with a TEI build.
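The BUILD FAILED message is consistent with ordinary shell word splitting: when the path is not quoted, the part after the space arrives as a separate argument, which Ant then reads as a target name. The sketch below only illustrates that mechanism, using Python's shlex to approximate POSIX shell splitting. The helper `ant_arguments` is hypothetical and not part of teitorelaxng, and the sketch does not explain why the quoted variant still yields a near-empty schema.

```python
import shlex

SOURCE = "/home/joeblow/music encoding/trunk/source/driver.xml"

def ant_arguments(command_line):
    """Split a command line roughly the way a POSIX shell would."""
    return shlex.split(command_line)

# Unquoted: the space in the path splits the property into two words.
unquoted = ant_arguments("ant -DdefaultSource=" + SOURCE)

# Quoted: the whole path stays inside a single argument.
quoted = ant_arguments('ant -DdefaultSource="' + SOURCE + '"')

print(unquoted)
print(quoted)
```

In the unquoted case the stray final word is the same string that appears as the missing target in the error message above.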

ODD processing anomaly

Changing the valList for an attribute inherited from a TEI attribute class is possible only if the element concerned is in the TEI name space. Since TEI attributes are not namespaced, this seems like a mistake.

ODD by inclusion removes macros

… and there is no chance to include them explicitly?!

simple example:
try <moduleRef key="tagdocs" include="att"/> in your schemaSpec. This will remove macro.anyXML and macro.schemaPattern from the resulting schema.

While this is not very severe for creating TEI schemas (since most macros are defined in module TEI), it's a real show stopper for creating MEI schemas where macros are spread across several modules.

java.lang.ClassNotFoundException: net.sf.saxon.TransformerFactoryImpl

Dear all,
I am trying to convert TEI to LaTeX with teitolatex. I work on Fedora with a TeX Live installation. After running make ..., installing jing and saxon, and re-running make, I get this error:

 BUILD FAILED
/home/delaye/github/Stylesheets/docx/build-from.xml:33: java.lang.ClassNotFoundException:  net.sf.saxon.TransformerFactoryImpl
    at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:191)
    at org.apache.tools.ant.taskdefs.optional.TraXLiaison.getFactory(TraXLiaison.java:414)
    at org.apache.tools.ant.taskdefs.optional.TraXLiaison.getSource(TraXLiaison.java:247)
    at org.apache.tools.ant.taskdefs.optional.TraXLiaison.readTemplates(TraXLiaison.java:299)
    at org.apache.tools.ant.taskdefs.optional.TraXLiaison.createTransformer(TraXLiaison.java:317)
    at org.apache.tools.ant.taskdefs.optional.TraXLiaison.transform(TraXLiaison.java:178)
    at org.apache.tools.ant.taskdefs.XSLTProcess.process(XSLTProcess.java:850)
    at org.apache.tools.ant.taskdefs.XSLTProcess.execute(XSLTProcess.java:388)
    at org.apache.tools.ant.UnknownElement.execute(UnknownElement.java:292)
    at sun.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.tools.ant.dispatch.DispatchUtils.execute(DispatchUtils.java:106)
    at org.apache.tools.ant.Task.perform(Task.java:348)
    at org.apache.tools.ant.Target.execute(Target.java:435)
    at org.apache.tools.ant.Target.performTasks(Target.java:456)
    at org.apache.tools.ant.Project.executeSortedTargets(Project.java:1393)
    at org.apache.tools.ant.Project.executeTarget(Project.java:1364)
    at org.apache.tools.ant.helper.DefaultExecutor.executeTargets(DefaultExecutor.java:41)
    at org.apache.tools.ant.Project.executeTargets(Project.java:1248)
    at org.apache.tools.ant.Main.runBuild(Main.java:851)
    at org.apache.tools.ant.Main.startAnt(Main.java:235)
    at org.apache.tools.ant.launch.Launcher.run(Launcher.java:280)
    at org.apache.tools.ant.launch.Launcher.main(Launcher.java:109)

Total time: 0 seconds
make[1]: *** [test-from-docx] Erreur 1
make[1] : on quitte le répertoire « /home/delaye/github/Stylesheets/Test »
make: *** [test] Erreur 2

Can someone guide me through the installation? Or is there a manual?

thank you and best regards

E.

Suggestions for documentation

I think it would help if each profile directory contained a simple readme that explains what that particular profile is all about. For instance, I happen to know what dhoxss and ota are, but many people won't; I don't have any idea what acm or jsi might be. minimal may seem self-explanatory, but I have no idea when it might be appropriate to use it. What is podcasts about -- is this for transcriptions of podcasts in TEI, or does it convert podcast atom feeds to and from TEI?

Also, there are a couple of directories that lurk among the others which should perhaps be distinguished in some way -- I'd suggest using capitals. They are:

doc: this contains documentation, but I initially thought it related to MS doc files. I suggest this be called DOCUMENTATION.

profiles: this has a different function from its sibling directories; perhaps PROFILES would make this clearer.

ref @target (#inthetext) does not pass to html

I realise that <ref target="#OCED2012">OCED 2012</ref> gives no HTML link to a bibl element whose xml:id="OCDE2012" through xsl:import href="/usr/share/xml/tei/stylesheet/html/html.xsl".
It does work if the target is a p.
Is this a recent bug in the stylesheet or a recent change to the TEI? Is a bibl an illegitimate target?
[Can’t see where to attach a file. Thus here a snippet:]
<p rend="alinea">
Certaines sources d’énergie sont intermittentes. L’article <ref target="#OCED2012">OCED 2012
</ref> parle d’énergies « variables » et d’énergies « programmables », voulant dire par là
que les secondes sont sous le contrôle de l’humanité, par opposition aux premières.
</p>
<bibl xml:id="OCDE2012">
<hi rend="gras">OCDE 2012</hi>
« Énergies nucléaire et renouvelables : Effets systémiques dans les réseaux électriques
bas carbone », Synthèse, OCDE 2012. Voir aussi <ref target="http://www.oecdbookshop.org/"
>www.oecdbookshop.org</ref> pour une version complète en anglais.
</bibl>
[Parent is a div.]
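One thing worth ruling out in a report like this is a target that does not match any xml:id. In the snippet above the target is spelled OCED2012 while the bibl carries xml:id="OCDE2012". The following sketch lists local targets that resolve to nothing; the function name `dangling_targets` and the cut-down sample document are assumptions for illustration, and the check says nothing about how html.xsl itself resolves links.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:id under the fixed XML namespace URI.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

def dangling_targets(root):
    """Return every local @target ('#...') with no matching @xml:id."""
    ids = {el.get(XML_ID) for el in root.iter() if el.get(XML_ID) is not None}
    missing = []
    for el in root.iter():
        target = el.get("target", "")
        if target.startswith("#") and target[1:] not in ids:
            missing.append(target)
    return missing

# Cut-down version of the reported snippet (no namespace, as in the report).
sample = ET.fromstring(
    '<div>'
    '<p>See <ref target="#OCED2012">OCED 2012</ref>.</p>'
    '<bibl xml:id="OCDE2012">OCDE 2012</bibl>'
    '</div>'
)
missing = dangling_targets(sample)
print(missing)
```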

separating elements in tei.css, margin or line height

Some block elements should be separated by vertical whitespace. In HTML+CSS, this is usually done with margin-top and margin-bottom. This is the case for the p element in HTML; it is not obvious because it is the default, but it shows up when experimenting with margin: 0px;, whereas line-height cannot reduce the whitespace above and below without also affecting the contents of the p.
In tei.css, this is obtained for titles with line-height. But the result is not only emphasis on the title by a separation from other blocks, but also a separation of all lines in the title, which in fact reduces the effect.
Some authors or publishers may have decided to compose titles in more than one line, which calls for a lb in TEI, giving a br in HTML. In such books, the title is emphasized by some whitespace before and after, but not between the lines. See http://www.d-meeus.be/marxisme/classiques/Capital-IVchap17para8.html for an example of the unintended effect. The main title contains lb elements in the TEI source.
I could of course handle the line-height in my own CSS, but I think it more constructive for everybody to suggest the use of margin-top and margin-bottom instead of line-height for h1, h2, h3 in tei.css.

tcp/tcp2tei.xsl should consider reversing order of revisionDesc

The revisionDesc entries in the TCP-converted texts seem to be in chronological order, whereas the TEI Guidelines note that "Conventionally change elements should be given in reverse date order, with the most recent change at the start of the list." While this is clearly a minor thing, perhaps the order should be reversed in the TCP conversion stylesheet for the next time we run a full conversion?
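A rough sketch of the reordering being suggested, written in Python rather than XSLT so it can be run on its own. It assumes the existing change elements are already in chronological order and simply reverses them; the function name `reverse_changes` and the sample dates are hypothetical, and the real fix would live in tcp2tei.xsl.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"

def reverse_changes(revision_desc):
    """Reverse the order of the <change> children of a <revisionDesc> in place."""
    tag = "{%s}change" % TEI_NS
    changes = [child for child in revision_desc if child.tag == tag]
    for change in changes:
        revision_desc.remove(change)
    # Re-append newest first, assuming the input was oldest first.
    for change in reversed(changes):
        revision_desc.append(change)
    return revision_desc

sample = ET.fromstring(
    '<revisionDesc xmlns="%s">'
    '<change when="2009-01-01">created</change>'
    '<change when="2011-06-15">corrected</change>'
    '</revisionDesc>' % TEI_NS
)
reverse_changes(sample)
order = [change.get("when") for change in sample]
print(order)
```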

A w:t in a specific location in a Docx file disappears in the conversion

Having a hard time debugging this one. It's basically a bit like this:

    <w:p w:rsidR="007C794E" w:rsidRPr="007C794E" w:rsidRDefault="007C794E" w:rsidP="007C794E">
      <w:pPr>
        <w:pStyle w:val="EndNoteBibliography"/>
        <w:spacing w:after="0" w:line="360" w:lineRule="auto"/>
        <w:ind w:left="720" w:hanging="720"/>
        <w:rPr>
          <w:sz w:val="24"/>
        </w:rPr>
      </w:pPr>
      <w:r w:rsidRPr="007C794E">
        <w:rPr>
          <w:rFonts w:cs="Arial"/>
          <w:sz w:val="24"/>
          <w:szCs w:val="24"/>
        </w:rPr>
        <w:fldChar w:fldCharType="begin"/>
      </w:r>
      <w:r w:rsidR="000E1C70" w:rsidRPr="001A62B7">
        <w:rPr>
          <w:rFonts w:cs="Arial"/>
          <w:sz w:val="24"/>
          <w:szCs w:val="24"/>
        </w:rPr>
        <w:instrText xml:space="preserve"> ADDIN EN.REFLIST </w:instrText>
      </w:r>
      <w:r w:rsidRPr="007C794E">
        <w:rPr>
          <w:rFonts w:cs="Arial"/>
          <w:sz w:val="24"/>
          <w:szCs w:val="24"/>
        </w:rPr>
        <w:fldChar w:fldCharType="separate"/>
      </w:r>
      <w:r w:rsidRPr="007C794E">
        <w:rPr>
          <w:sz w:val="24"/>
        </w:rPr>
        <w:t>1. Stenberg P, Larsson J (2011) Buffering and the evolution of chromosome-wide gene regulation. Chromosoma 120: 213-225.</w:t>
      </w:r>
    </w:p>

The text in w:t ("1. Stenberg P, ...") is already missing from the TEI; the result for this paragraph is:

<p rend="EndNote Bibliography"><?biblio ADDIN EN.REFLIST?></p>

I know it has to do with the instruction processing, because of those fldChar elements, but I haven't really figured out what the cause is. I'm working on solving it as we speak, but any pointers are more than welcome!

table incorrectly processed by jTEI-odt conversion

There seems to be a problem with tables in which the @cols attribute is used to get spanning cells. If all the cells in a row are spanned (as in the example in the documentation) this works OK, but if only some of them are, the other columns are suppressed.

converting ODD to HTML doc: doc shows incorrect relationships for elements outside the TEI namespace

Summary

converting ODD to HTML doc: doc shows incorrect relationships for elements outside the TEI namespace

Versions

OS: Debian testing (up to date as of 2013-12-27)

Reproducible with:

  • tei-xsl 7.6.0
  • github repository at commit 37919b6

How to Reproduce

  1. Clone or otherwise download the file tree at: https://github.com/lddubeau/teibugs/tree/master/teibug33

  2. In the directory teibug33, issue:

    $ roma2 --doc --dochtml --nodtd --noxsd myTEI.xml out
    

(You can use --tei=... if you want to reproduce against a checked out version of the github repository.)

Actual Results

The documentation shows that element flerbl in the http://foo.foo/foo namespace is:

  • Contained by "empty element".

And that the element cit in the http://foo.foo/foo namespace:

  • May contain "empty element".
  • Has a declaration element cit { foo_flerbl }, without links to foo_flerbl.

Expected Results

The element flerbl in the http://foo.foo/foo namespace should be documented to:

  • Be contained by the element cit in the http://foo.foo/foo namespace.

The element cit in the http://foo.foo/foo namespace should be documented to:

  • Contain the element flerbl in the http://foo.foo/foo namespace.
  • Have a declaration element cit { foo_flerbl }, which links to foo_flerbl, just like all the elements in the TEI namespace have links in their declarations.

fo:marker contents

The fo:marker elements contain superfluous number separators when $numberHeadings = 'true'.
This is because the separators are already stored in the local variable $Number of the template "NumberHeadings".
There is no need to add the separator again.

"Delete" mode in customizations does not work as expected

In the MEI project, we have a customization for generating a schema only for Common Music Notation. In this customization we have four statements that are expected to delete classes in the mei-source.xml document. These statements can be seen here:

https://code.google.com/p/music-encoding/source/browse/trunk/customizations/mei-CMN.xml#53

When running Roma or the teitorelaxng script, the delete mode does not actually delete the definition from the resulting RelaxNG. Changing the mode to replace, however, does remove the definition from the resulting RNG schema.

Is this expected behaviour, or is there a bug in the deletion mode?

(Tagging @pe-ro since he's the one who discovered the bug. Also @raffazizzi might want to know about this too.)

rendering of element specs in HTML documentation for customizations

In the HTML documentation of Tite and Lite, the element specs include the text of the <desc>, followed by the section number of P5 referenced by the <ptr>(s) that occur in <listRef>. However, it does not display the <head> of the referenced section, nor is there a hyperlink.

Missing text pages of TEI composite to HTML pages split at div level under text

The TEI is http://www.d-meeus.be/linux/sandbox/sandbox.xml, the stylesheet is http://www.d-meeus.be/linux/sandbox/splitLevel1/sandbox1.xsl and the result http://www.d-meeus.be/linux/sandbox/splitLevel1/index.html. The pages actually produced are listed in the text file http://www.d-meeus.be/linux/sandbox/splitLevel1/foldercontents.

Pages are produced for the top-level divs of the texts that are children of group, but the pages for those texts themselves are missing.

In the table of contents and elsewhere, links to sub-divs point not to anchors in parent divs but to anchors in the nonexistent text pages. (This was my first surprise in my real-world project.)

Pages are produced for a div in the main front (Foreword) and for a div in the main back, but these pages are missing in the Table of contents.

7.7.0 regression: generating RNG from ODD fails for elements in foreign namespace

I suspect this problem is related to #2 or #7 but I prefer to create a new issue.

Versions

OS: Debian testing (up to date as of 2014-01-07)

See below for the relevant TEI versions.

How to reproduce

(teibug33 was created for bug report #7 but it illustrates the problem I've found so I'm reusing it here.)

  1. Clone or otherwise download the file tree at: https://github.com/lddubeau/teibugs/tree/master/teibug33

  2. With tei-xsl 7.6.0 installed, in the directory teibug33, issue:

    $ roma2 --doc --dochtml --nodtd --noxsd myTEI.xml out_7.6.0
    
  3. With tei-xsl 7.7.0 installed, in the directory teibug33, issue:

    $ roma2 --doc --dochtml --nodtd --noxsd myTEI.xml out_7.7.0
    
  4. Compare the rng files:

    $ diff -u out_7.6.0/foo.rng out_7.7.0/foo.rng
    

Actual results

The rng file created by 7.7.0 expects {http://foo.foo/foo}cit to be empty whereas the rng file created by 7.6.0 expects it to contain a {http://foo.foo/foo}flerbl element.

The relevant part of the diff:

@@ -6736,17 +6736,7 @@
   <define name="foo_cit">
     <element name="cit" ns="http://foo.foo/foo">
       <a:documentation xmlns:a="http://relaxng.org/ns/compatibility/annotations/1.0">This element encodes a citation according to the foo standard. </a:documentation>
-      <ref name="foo_flerbl"/>
-    </element>
-  </define>
-  <define name="foo_flerbl">
-    <element name="flerbl" ns="http://foo.foo/foo">
-      <a:documentation xmlns:a="http://relaxng.org/ns/compatibility/annotations/1.0">This element encodes a flerbl according to the foo standard. </a:documentation>
-      <ref name="tei_ref"/>
-      <oneOrMore>
-        <ref name="tei_macro.specialPara"/>
-      </oneOrMore>
-      <text/>
+      <rng:empty xmlns:rng="http://relaxng.org/ns/structure/1.0"/>
     </element>
   </define>
   <start>

Expected results

The rng produced by 7.7.0 should expect the same structure of document as the rng produced by 7.6.0.

Missing group in Table of contents of TEI composite to HTML pages split at text level

The TEI is http://www.d-meeus.be/linux/sandbox/sandbox.xml, the stylesheet is http://www.d-meeus.be/linux/sandbox/splitLevel0/sandbox0.xsl and the result http://www.d-meeus.be/linux/sandbox/splitLevel0/index.html. The pages actually produced are listed in the text file http://www.d-meeus.be/linux/sandbox/splitLevel0/foldercontents.

The entire group of texts is missing from the table of contents (though the pages do exist). Only a div of the main front matter and a div of the main back matter appear.

odd2dtd problem with overriding global attributes

This emerges out of an attempt to carry out https://sourceforge.net/p/tei/bugs/460/ by overriding the global @rend attribute to add a suggested value list on the TEI <list> element. DTD generation failed when that was done. That particular approach was abandoned, but the general case that one should be able to override a global attribute at the <elementSpec> level is not unreasonable. Council discussion has thrown up two opinions: one that this is simply a processing bug, and the other that there is something intrinsic to this scenario which makes it impossible to accomplish. I can't see how the latter can be true, so I think odd2dtd could be fixed to allow this.

xpath typo

Incidentally, I found an XPath typo in the file https://github.com/TEIC/Stylesheets/blob/master/rdf/crm.xsl, line 100: teiCorous instead of teiCorpus. I wonder if the second part of parent::TEI/teiHeader/fileDesc|parent::teiCorpus/teiHeader/fileDesc should be removed instead of fixing the typo:
In crm.odd and in tei2rdf.xsl, the named template E31 only concerns the text element, but text is not allowed as a child of teiCorpus, so there seems to be no way for parent::teiCorpus/teiHeader/fileDesc to match.

Deleting a model class removes it from schema but not from generated doc

My ODD file contains the following
<elementRef key="title"/>
<classSpec type="model" ident="model.emphLike" mode="delete"/>
so as to retain <title> only where it is explicitly referenced in a content model. The RELAXNG schema generated from it works as expected. However, the HTML generated includes documentation for the class I have deleted.

The ODD file is in the Oxford SVN repo at Talks/2014-07-dhoxss-tei/mb-minimal.odd

I am using the most recent release of the oXygen framework.

Start tagging stylesheet releases

This came up during an EpiDoc meeting: since the TEI stylesheets are a crucial dependency for us and any major changes to them may require adjustments on our part, we'd be happier if there were a more established release process on this side. Specifically, if we could peg our releases to a particular tag on the TEI Stylesheet side, we could be certain of stability. Right now we can't, because the same EpiDoc XSLT release might be built with different versions of the TEI XSLTs and thus produce different outputs. This doesn't have to be terribly formal, but maybe tag the Stylesheets when there's a Guidelines release, for example...

storing tei tag in docx to TEI

I want the TEI tag name to be stored in document.xml when converting TEI to docx.
When a TEI document is converted to docx, the <term> markup around TEST is removed (only the text reaches document.xml) and the result is <w:t>TEST</w:t>.
I want the tag name (for example, term) inserted into <w:t />.
For example:
TEI =======> Docx
<term>TEST</term> =======> <w:t tagName="term" >TEST</w:t>

Lou Burnard wrote

I think this is a request to retain the name of a tag in the docx output. Why would that be useful? To implement it would be fairly simple -- just add an XSLT template to a special profile. But what about attribute names and values?

on Debian-type systems teitornc needs the tei-oxygen package installed to run

Versions

Ubuntu 13.10

I get this result using tei-xsl 7.4.0 and using a clone of the github repository.

Steps to reproduce

You need a system on which tei-oxygen is not installed and which does not somehow have trang.jar installed at /usr/share/oxygen/lib/trang.jar.

Run teitornc on any file.

Actual results

Eventually, this error:

     [java] Could not find com.thaiopensource.relaxng.translate.Driver. Make sure you have it in your classpath
     [java]     at org.apache.tools.ant.taskdefs.ExecuteJava.execute(ExecuteJava.java:138)

Expected results

No error.

Observations

libtrang-java's size: 887k (which would be enough to satisfy the dependency)
tei-oxygen's size: 264M

Perhaps the teitoX script could examine the environment and pass the location of the trang's jar to ant. The path to saxon's jar is passed to the ant task by the script.
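The suggestion could look something like the lookup below. It is a sketch only: the first path is the one named in this report, the second is an assumed location for the jar shipped by libtrang-java, and the function `find_jar` is hypothetical rather than anything the teitoX scripts currently do.

```python
import os

CANDIDATE_JARS = [
    "/usr/share/oxygen/lib/trang.jar",  # location mentioned in this report
    "/usr/share/java/trang.jar",        # assumption: where libtrang-java installs it
]

def find_jar(candidates, exists=os.path.isfile):
    """Return the first candidate path that exists, or None if none does."""
    for path in candidates:
        if exists(path):
            return path
    return None

trang = find_jar(CANDIDATE_JARS)
if trang:
    print("would pass to ant:", trang)
else:
    print("trang.jar not found in any known location")
```

The `exists` parameter is there only so the lookup can be exercised without touching the file system.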

White space not preserved in docx-to-tei transform

Specifically my problem lies here: https://github.com/TEIC/Stylesheets/blob/master/docx/from/textruns.xsl#L405

The first case handles a text run which is purely white space, with xml:space="preserve" by substituting a single space. That doesn't match the semantics of xml:space="preserve" at all.

In the second case, where a text run includes at least one non-whitespace character, the white space is preserved, but because xml:space="preserve" is not propagated into the TEI, this white space is liable to be stripped by a downstream processor.

I actually don't quite understand the logic here; perhaps this is just a simple bug but perhaps I'm missing something, and this is intentional.

But my own thinking is that white space in a docx document is significant, and should in general be preserved; white space should not be discarded, and should be marked with xml:space="preserve" in the TEI (unless there is no actual space to preserve).

I have made an experimental change like so, which implements the above.

<!--
 <xsl:when test="@xml:space='preserve' and string-length(normalize-space(.))=0">
      <seg><xsl:text> </xsl:text></seg>
 </xsl:when>
 <xsl:when test="@xml:space='preserve'">
      <xsl:value-of select="."/>
 </xsl:when>
 -->
<!-- preserve significant white space which might otherwise be lost -->
<xsl:when test="@xml:space='preserve' and normalize-space(.) != string(.)"
><seg xml:space="preserve"><xsl:value-of select="."/></seg></xsl:when>

I'm not entirely happy with it though, because it is overly cautious about preserving spaces. Sometimes a seg[@xml:space='preserve'] will be generated only because a text run ends with a space character, even though it may be followed in the TEI output by another text node which would have prevented that trailing white space from being trimmed anyway. I'm not sure how best to deal with that case; perhaps by post-processing?
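To make the test in the experimental xsl:when easier to reason about, here is a minimal Python sketch of the same comparison. The helper names normalize_space and needs_preserve and the sample text runs are illustrative assumptions, not part of the stylesheets; normalize_space only approximates XPath's normalize-space().

```python
import re

# XPath normalize-space() treats only space, tab, CR and LF as white space.
_XML_WS = re.compile(r"[ \t\r\n]+")

def normalize_space(s: str) -> str:
    """Approximate XPath normalize-space(): collapse runs, trim both ends."""
    return _XML_WS.sub(" ", s).strip(" ")

def needs_preserve(run_text: str) -> bool:
    """True when normalization would change the run, i.e. when it has
    leading, trailing, or repeated white space that could be lost."""
    return normalize_space(run_text) != run_text

# Hypothetical text runs: the second one is flagged only because of its
# trailing space, which is the over-cautious case described above.
for run in ["cui", "cui ", "  ", "a  b", "a b"]:
    print(repr(run), needs_preserve(run))
```

A run is flagged whenever normalization would alter it, so an empty run or a run with only single interior spaces is left alone.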

fo:marker position

According to the W3C XSL-FO specification, fo:marker elements always have to be the initial children of their parent. At the moment the TEI stylesheets insert fo:marker elements for headings after the respective heading text.
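A quick way to spot the problem in generated output is to scan the FO file for markers that are preceded by text or by another element. The following Python sketch does that with the standard library; the function name misplaced_markers and the sample fragment are illustrative assumptions, not actual stylesheet output.

```python
import xml.etree.ElementTree as ET

FO_NS = "http://www.w3.org/1999/XSL/Format"
MARKER = f"{{{FO_NS}}}marker"

def misplaced_markers(root):
    """Return the parents of fo:marker elements that are preceded by
    non-whitespace text or by a non-marker sibling."""
    offenders = []
    for parent in root.iter():
        # Text before the first child already counts as content.
        seen_content = bool(parent.text and parent.text.strip())
        for child in parent:
            if child.tag == MARKER:
                if seen_content:
                    offenders.append(parent)
                    break
            else:
                seen_content = True
            # Text following a child counts as content for later markers.
            if child.tail and child.tail.strip():
                seen_content = True
    return offenders

# Hypothetical fragment: one heading with the marker first, one with it last.
sample = f"""<fo:flow xmlns:fo="{FO_NS}" flow-name="xsl-region-body">
  <fo:block id="good"><fo:marker marker-class-name="section">Intro</fo:marker>Intro</fo:block>
  <fo:block id="bad">Intro<fo:marker marker-class-name="section">Intro</fo:marker></fo:block>
</fo:flow>"""

root = ET.fromstring(sample)
print([p.get("id") for p in misplaced_markers(root)])
```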

global <constraintSpec> elements in ODD aren't processed for HTML documentation

When an ODD file's <schemaSpec> element contains global <constraintSpec> elements (that is, as direct children of <schemaSpec>), those constraints aren't processed when generating HTML documentation, e.g.:

<TEI xmlns="http://www.tei-c.org/ns/1.0" 
  xmlns:sch="http://purl.oclc.org/dsdl/schematron" 
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- ... -->
  <schemaSpec ident="jtei" start="TEI">
    <!-- ... -->
    <constraintSpec ident="hyphens" scheme="isoschematron">
      <constraint>
        <sch:rule context="text()">
          <sch:assert test="not(contains(., '--'))"> Double hyphens should not be used for dashes.
            Please use the EM Dash (U+2014) instead. </sch:assert>
        </sch:rule>
      </constraint>
    </constraintSpec>
    <!-- ... -->    
  </schemaSpec>
</TEI>

This <constraintSpec> element isn't processed at all when generating HTML documentation.

It would be nice if the TEI stylesheets generating ODD documentation provided processing of <constraintSpec> elements in some way. If the problem is that <constraintSpec> can occur in too many unpredictable places, wouldn't it be possible to support a default case for global Schematron constraints (say, when they are expressed as direct children of <schemaSpec>)?

ODD: the resolution of unprefixed names in <rng:ref> has changed

Summary

ODD: the resolution of unprefixed names in <rng:ref> has changed

Versions

OS: Debian testing (up to date as of 2014-01-16)

tei-xsl: see below

How to Reproduce

  1. Clone or otherwise download the file tree at: https://github.com/lddubeau/teibugs/tree/master/teibug34

  2. Install tei-xsl 7.6.0.

  3. In the directory teibug34, issue:

    $ roma --nodtd --noxsd myTEI.xml out_7.6.0
    
  4. Install tei-xsl 7.8.0.

  5. In the directory teibug34, issue:

    $ roma --nodtd --noxsd myTEI.xml out_7.8.0
    
  6. Run:

    $ diff -u out_7.6.0/foo.rnc out_7.8.0/foo.rnc
    

Actual Results

The output of the diff (abbreviated to remove the timestamp difference) looks like this:

--- out_7.6.0/foo.rnc   2014-01-16 12:52:02.626985982 -0500
+++ out_7.8.0/foo.rnc   2014-01-16 12:52:27.734940939 -0500
@@ -4227,9 +4227,5 @@
 foo_a =

   ##
-  element ns2:a { foo_b }
-foo_b =
-
-  ##
-  element ns2:b { text }
+  element ns2:a { empty }
 start = tei_TEI

Expected Results

No difference in the expected structure of the XML document which foo.rnc would validate.

Observations

It seems that in earlier versions roma would resolve <rng:ref name="b"/> by first looking for an unprefixed name and then looking for a prefixed name which, without its prefix, would be "b". It no longer does that.

At any rate, the net result is that an ODD which inadvertently hit on the earlier behavior will, from at least tei-xsl 7.8.0 onwards, generate a different schema.

improvements to extract-isosch.xsl needed

Two improvements still needed for the new, improved extract-isosch.xsl:

  1. When an "isosch" constraint is a descendant of a 〈classSpec type="model"〉 but does not have a 〈sch:rule〉, the generated rule does not have the right context=.
  2. When an @validUntil occurs on any of the following, it is not processed.
    classSpec
    constraintSpec
    macroSpec
    moduleSpec
    classSpec//valItem

Note (on #2): @validUntil is also not processed on the following, but I, at least, don't know what kind of Schematron rule we would want in these cases:
    schemaSpec (what does that mean?)
    valDesc
    valList

Date conversion in xlsxtotei

I have an xlsx spreadsheet that has a column of date times formatted as:
"25/05/2014 10:09:15" which get converted to a slightly odd number of: "41784.4230902778" or similar. I'm assuming something mathematical is happening to the date time?
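For context: Excel stores date-times as serial numbers, where the integer part counts days and the fractional part is the time of day. The Python sketch below converts the reported value back to a date. It assumes the workbook uses Excel's default 1900 date system (not the 1904 system) and a date on or after 1 March 1900; the function name excel_serial_to_datetime is illustrative and not part of xlsxtotei.

```python
from datetime import datetime, timedelta

# Assumption: 1900 date system. For dates from 1 March 1900 onwards,
# serial numbers count days from 1899-12-30.
EXCEL_EPOCH = datetime(1899, 12, 30)

def excel_serial_to_datetime(serial: float) -> datetime:
    """Convert an Excel serial date-time to a Python datetime."""
    # Round to whole seconds to absorb floating-point noise in the fraction.
    return EXCEL_EPOCH + timedelta(seconds=round(serial * 86400))

converted = excel_serial_to_datetime(41784.4230902778)
print(converted.strftime("%d/%m/%Y %H:%M:%S"))  # prints 25/05/2014 10:09:15
```

So the number is the original date-time in Excel's internal representation; a conversion of this kind would have to happen somewhere in the transform for the formatted value to appear in the TEI.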

FO line-height configuration

Currently, transforming TEI to FO and PDF doesn't allow specifying line-height. I am considering implementing corresponding independent parameters for front, body and back.
If I set the line-height attribute on the corresponding page-sequences, this will affect all content in the corresponding fo:flow, including footnotes. Besides footnotes, cit[@rend='block'] might be another case where increased line-height is unwanted.
So alternatively:

  • line-heights could either be set on div or p elements (excluding such things as footnotes and citations), or
  • other parameters might be introduced to configure line-height for footnotes and cit[@rend='block'] respectively

Any thoughts or opinions?

JSI profile for docxtotei only picks up some italicization

From document.xml:

This gets converted into <hi rend="italic">cui</hi>.

            <w:r>
                <w:rPr>
                    <w:rFonts w:ascii="Times New Roman" w:hAnsi="Times New Roman"
                        w:cs="Times New Roman"/>
                    <w:i/>
                </w:rPr>
                <w:t>cui</w:t>
            </w:r>
            <w:r>

yet this doesn't:

            <w:r w:rsidR="003B5C4D" w:rsidRPr="00E82F15">
                <w:rPr>
                    <w:i/>
                </w:rPr>
                <w:t>covient</w:t>
            </w:r>

I don't know how to force Word to apply the former either. Any ideas please?

Missing main front and main back in Table of contents of TEI composite to one HTML page

Warning. I tried to use composite text to merge two existing TEIs. Instead of group-ing text A and text B, I could have em-body-ed div type="part" n="A" and div type="part" n="B" just as well, and even better. If nobody else is interested in composite texts to HTML, this is very low priority. As far as I am concerned, it is no priority at all. I was surprised by the results of my experiment. I then made a test to see the extent of the problem with minimal TEI and minimal XSL, to reduce the risk of errors in my real projects. This being done, I give this post (and other similar posts) purely as information, to make my test work available. Furthermore, the main problem may be a weakness in the concept of the elements text and group (what about the title of a text?), and these elements are maybe not worth an investment in stylesheet work until some deeper questions are clarified.

The TEI is http://www.d-meeus.be/linux/sandbox/sandbox.xml, the stylesheet is http://www.d-meeus.be/linux/sandbox/splitLevel-1/sandbox-1.xsl and the result http://www.d-meeus.be/linux/sandbox/splitLevel-1/index.html.

The two texts children of group have a poor presence in the table of contents as simply 1 and 2 without a link. The main front matter and the main back matter are completely missing from the table of contents. (A hand-made table of contents is to be found in the Foreword div of the main front matter.)

By the way, back matter may consist of appendixes, but it may just as well be an Afterword, which should not be called an appendix. I am not sure that prefixing all back matter with Appendix N by default is appropriate (composite or not). This test shows three instances of Appendix A in one HTML page. After all, divs in front have no prefix.

Provide xsl:output with indent="yes" for html output

When I generate an HTML documentation file from an ODD file, I get a nice web-page output and there are lots of useful parameters for adding CSS etc. But the output is all on one line (indent="no"). The W3C spec does not appear to supply a default for this parameter, but Saxon defaults to "no". A large unindented doc file takes a long time for the browser to load -- indented files seem to load much more quickly -- so I'd like to be able to specify this as a parameter, or to switch the default to "yes". I can't actually find the xsl:output element which controls the serialization of a transformation using html/html.xsl.

<affiliation> element not properly handled by teitolatex

Following recommendations of the jtei package, I have a title statement like this:

     <titleStmt>
            <title type="main">How many standards do we need to model reality?</title>
            <author>
              <name><forename>Lou</forename>
                  <surname>Burnard</surname></name>
               <affiliation>Lou Burnard Consulting</affiliation> 
               <email>[email protected]</email>
            </author>
         </titleStmt>

This generates the following in my Latex file:

\usepackage[pdftitle={How many standards do we need to model reality?},
 pdfauthor={ {\name  Lou Burnard} \mbox{}\\ Lou Burnard Consulting }]{hyperref}
\hyperbaseurl{}

...

\author{ {\name  Lou Burnard} \mbox{}\\ Lou Burnard Consulting }\makeatletter 

which upsets xetex. The upset takes the form of going into a tight loop, soaking up 100% of my cpu and not saying anything at all, except in the log which after much hand waving says

I suspect you've forgotten a `}', causing me to apply this
control sequence to too much text. How can we recover?
My plan is to forget the whole thing and hope for the best.

Alas, not a good decision. Removing the <affiliation> element works fine, of course, but the JTEI schema requires it to be there.

conversion of hand in p4top5

A P4 document containing <hand xml:id="RG" style="ink correction" ink="black" character="regular" first="yes" resp="eds"/>
gets converted to <handNote xml:id="BG" style="typewritten" medium="black" character="schooled" first="no" resp="#eds"/> by the default p4top5 stylesheet used by oxgarage. The result is invalid: @character and @first are not available on <handNote>.

odd2html.xsl: incorrect output if elements in two different namespaces have the same local-name()

Summary

odd2html.xsl: incorrect output if elements in two different namespaces have the same local-name()

Versions

OS: Ubuntu 13.10

tei-p5-xsl2 6.34

How to Reproduce

  1. Clone or otherwise download the file tree at: https://github.com/lddubeau/teibugs/tree/master/teibug32

  2. In the directory teibug32, issue:

    $ roma2 --doc --dochtml --nodtd --noxsd myTEI.xml out

Actual Results

The file out/foo.doc.html does not make it possible to distinguish, on the basis of a simple HTML link, the documentation for TEI's cit from the documentation for the new element cit in the namespace http://foo.foo/foo (henceforth referred to with the foo: prefix). The documentation fragments for these two elements are given the same HTML id: "cit".

Note that in effect this makes some of the documentation incorrect because when a content model means to link to foo:cit, the browser brings the user to TEI's cit element instead.

Expected Results

A structure that allows distinguishing foo:cit from TEI's cit using simple HTML links.
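One possible shape for such a structure is an id derived from both the namespace and the local name. The Python sketch below is purely illustrative: the function html_id and its "ns" plus hash prefix are hypothetical and do not reflect how odd2html.xsl builds ids.

```python
import hashlib

TEI_NS = "http://www.tei-c.org/ns/1.0"

def html_id(namespace: str, local_name: str) -> str:
    """Hypothetical id scheme: TEI names stay as they are, names in any
    other namespace get a short, stable prefix derived from the namespace."""
    if namespace == TEI_NS:
        return local_name
    digest = hashlib.sha1(namespace.encode("utf-8")).hexdigest()[:8]
    return f"ns{digest}-{local_name}"

# The two cit elements from the report now receive different ids.
tei_cit = html_id(TEI_NS, "cit")
foo_cit = html_id("http://foo.foo/foo", "cit")
print(tei_cit, foo_cit)
```

Keeping TEI ids unchanged would leave existing links into the documentation intact, while links generated from content models could target the namespace-qualified id.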

Test/Makefile: foo target looks spurious

The foo target in Test/Makefile seems unused. If it is useful to keep it there, I'd suggest a) renaming it to something other than foo and b) adding a comment to explain why it needs to be kept even though it is not normally used.

result of html2tei is Invalid

I transformed an HTML file which contains these tags:
google

test

After the transformation I passed the result to a validation method (based on tei_all.xsd), which reported the following:



invalid
http://www.tei-c.org/ns/1.0
8
cvc-complex-type.3.2.2: Attribute 'type' is not allowed to appear in element 'graphic'.
cvc-complex-type.3.2.2: Attribute 'align' is not allowed to appear in element 'table'.
cvc-complex-type.3.2.2: Attribute 'border' is not allowed to appear in element 'table'.
cvc-complex-type.3.2.2: Attribute 'cellpadding' is not allowed to appear in element 'table'.
cvc-complex-type.3.2.2: Attribute 'cellspacing' is not allowed to appear in element 'table'.
cvc-complex-type.3.2.2: Attribute 'width' is not allowed to appear in element 'table'.



What should I do?
thanks

Stylesheet for FO output generates invalid FO output

Version 7.24.0 of fo/fo.xsl (the TEI stylesheet for FO output) creates invalid FO output from TEI P5 files generated from an ODD schema. For example the FO created by the fo/fo.xsl stylesheet from the TEI Lite XML file (with a file size of 2 MB) generated from the following ODD file:

<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <fileDesc>
      <titleStmt>
        <title>Title</title>
      </titleStmt>
      <publicationStmt>
        <p>Publication Information</p>
      </publicationStmt>
      <sourceDesc>
        <p>Information about the source</p>
      </sourceDesc>
    </fileDesc>
  </teiHeader>
  <text>
    <body>
      <schemaSpec ident="oddex1" start="TEI">
        <moduleRef key="header"/>
        <moduleRef key="core"/>
        <moduleRef key="tei"/>
        <moduleRef key="textstructure"/>
      </schemaSpec>
    </body>
  </text>
</TEI>

is a FO file (with a file size of 3 MB) which has this FO validation error:

The column-number or number of cells in the row overflows the number of fo:table-columns specified for the table. (See position 562:-1)
