ceurws / ceur-make
A set of scripts to semi-automatically generate workshop proceedings for CEUR-WS.org
License: GNU General Public License v3.0
For now I'm going to blame this on CEUR-WS.org, whose template has always been invalid. See http://validator.w3.org/check?uri=http://ceur-ws.org/Vol-XXX/index.html&charset=(detect+automatically)&doctype=Inline&group=0&user-agent=W3C_Validator/1.3+http://validator.w3.org/services
In future I should discuss with CEUR-WS.org whether there is a way of making index.html valid without disrupting their publication workflow.
Standardise a way to mark them up in toc.xml (I expect little to no support from the EasyChair metadata) and choose one or a few "good practice" HTML output formats. Here are examples of recent joint proceedings:
… upon agreement with CEUR-WS.org
But simply matching 999* doesn't work, as it would also match the 9999copyrights directory, so we'd either need to skip that directory in the for loops, or use extended globbing to match digits only.
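The "digits only" rule can be illustrated outside the shell. The following is a minimal sketch in Python, not the Makefile's actual loop; the function name and the sample directory names other than 9999copyrights are assumptions made for illustration.

```python
import re

# Assumption: the directories of interest are named with digits only.
DIGITS_ONLY = re.compile(r"[0-9]+")

def select_numeric_dirs(names, prefix="999"):
    """Keep names that start with the prefix and consist of digits only."""
    return [
        name for name in names
        if name.startswith(prefix) and DIGITS_ONLY.fullmatch(name)
    ]

# "9999copyrights" starts with 999 but is rejected because of its letters.
print(select_numeric_dirs(["9991", "9992", "9999copyrights", "README"]))
```

A plain prefix match corresponds to dropping the fullmatch test, which is exactly what lets 9999copyrights slip through.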
Instead of ceur-ws/temp.bib, the Makefile should create a BibTeX file named by the workshop ID. Find out how the name of a make target can be parameterized; maybe using a second pass.
Use BIBO etc.
Depending on how one generates the proceedings volume from toc.xml (via toc.tex), the first paper doesn't start on page 1, as a title page, table of contents, etc. may precede it. As the page numbering in toc.xml (which is taken from EasyChair) propagates to ceur-ws/temp.bib as well as ceur-ws/index.html, it would make sense to specify an offset (e.g. “5 pages”), which is respected when generating toc.xml.
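A minimal sketch of how such an offset could be applied, assuming each paper is held as a record with integer first and last pages. The record layout and the key names "from" and "to" are hypothetical and do not reflect how ceur-make actually stores page numbers.

```python
def apply_page_offset(papers, offset):
    """Return copies of the paper records with both page numbers shifted by offset."""
    return [
        {**paper, "from": paper["from"] + offset, "to": paper["to"] + offset}
        for paper in papers
    ]

# Hypothetical page numbers as they might come from EasyChair, starting at 1.
papers = [
    {"id": "paper-01", "from": 1, "to": 8},
    {"id": "paper-02", "from": 9, "to": 14},
]

# Assumption: four pages of front matter precede the first paper.
for paper in apply_page_offset(papers, 4):
    print(paper["id"], paper["from"], paper["to"])
```

Applying the shift once, at the point where toc.xml is generated, would keep the BibTeX file and index.html consistent, since both are derived from it.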
Jyrki Nummenmaa:
The instructions related to the toc.xml generated from an EasyChair project do not particularly mention what to do with front matter. Maybe it is self-evident to the editors?
We could consider auto-generating a LaTeX preface.pdf and adding it to the table of contents.
To avoid dependency issues with libraries
i.e. one subdirectory per paper
Currently we use the display name for the author. Should we move to, or also incorporate, givenName and familyName specifically? It would mean that the toc needs to have a field for it. Another thing to investigate: does the EasyChair metadata make that distinction, or does it only give the display name?
Editors can have their own homepage URL; should it also be possible to add the affiliations' URLs (something like workplaceHomepage)?
Jyrki Nummenmaa:
When I view the generated index.html file, it does not find the ceur-ws css file since the reference is relative. I do not think it would hurt to make the reference absolute in which case the file would automatically look ok with the right style.
Two possible solutions to make the relative link work:
ceur-make intends to output sane RDFa, i.e. RDFa that uses reasonable URIs for things and that is valid w.r.t. the vocabularies used.
However,
An easy way of implementing this would be a combination of
e.g. toc.xml
For each of them there should be a Relax NG schema.
The Makefile should (optionally?) invoke a validator, e.g. xmllint.
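As a rough sketch of the validation step, the following Python wrapper builds an xmllint call that checks a file against a Relax NG schema. The schema file name toc.rng is hypothetical, since no such schema is named here, and the wrapper itself is only an illustration of what a Makefile rule might invoke.

```python
import shutil
import subprocess

def xmllint_command(xml_file, schema_file):
    """Build an xmllint call that validates xml_file against a Relax NG schema."""
    # --noout suppresses the serialised document; only validation messages remain.
    return ["xmllint", "--noout", "--relaxng", schema_file, xml_file]

def validate(xml_file, schema_file):
    """Return True/False for valid/invalid, or None if xmllint is not installed."""
    if shutil.which("xmllint") is None:
        return None
    result = subprocess.run(
        xmllint_command(xml_file, schema_file),
        capture_output=True,
        text=True,
    )
    return result.returncode == 0

# "toc.rng" is an assumed schema name used for illustration only.
print(" ".join(xmllint_command("toc.xml", "toc.rng")))
```

Returning None when xmllint is missing is one way to keep the validation optional, as the parenthesised question above suggests.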
… so that users recognise the workshop when downloading papers.
Given an existing XML output and HTML output, try replacing the processing with Python.
I believe the toc2ceurindex.xsl is at CEURVERSION=2015-12-02. The current version of the Vol-XXX/index.html file is at CEURVERSION=2020-07-09.
Would it be possible please to update the XSL file? I've seen most new additions to CEUR-WS follow the 2020 template, as requested by CEUR-WS ("Always use the latest template"), probably based on manual editing of the HTML template, but these obviously don't benefit from your RDFa annotations, which is quite a pity.
Requirements:
Better alternatives than mere documentation:
Upon releasing this change, manually strip existing volumes from such comments.
As suggested by Olaf Hartig, it's not the proceedings but the workshop event that's a subevent of some conference. Model the RDFa this way (even though it's not straightforward given the existing HTML structure).
Check whether the title already starts with "Proceedings of the ". If so, don't add it.
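The check amounts to a conditional prefix. A minimal sketch in Python follows; the function name and the sample titles are assumptions, and the actual stylesheet would express the same test in XPath.

```python
PREFIX = "Proceedings of the "

def proceedings_title(title):
    """Prepend the prefix unless the title already starts with it."""
    return title if title.startswith(PREFIX) else PREFIX + title

# Hypothetical workshop titles, one without and one with the prefix.
print(proceedings_title("Example Workshop 2013"))
print(proceedings_title("Proceedings of the Example Workshop 2013"))
```

Both calls yield the same string, so a title that already carries the prefix is not given a second one.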
Solution: use different XPath expression
See @csarven's draft implementation in https://github.com/ceurws/ceur-make/blob/linked-research/toc2ceurindex.xsl#L188. This will require the toc.xml ad hoc schema to be extended (so maybe a task for @csarven and @clange to work on together).
From EasyChair we probably won't get session information. But we could extend the documentation of ceur-make as follows:
make toc.xml
to generate toc.xml
.toc.xml
Michael Cochez reported:
I am publishing a CEUR volume, and wanted to use index2main. Now it
appears the volume has been created using ceur-make and the script is
not able to extract the information from the index.html file. Is that
a known issue?
ceurws@mars:~/www/Vol-2849$ index2main index.html
line 12 column 7 - Error:
I did now manually create the block for the homepage.
i.e. incorporate the diff between Vol-XXX/index.html and Vol-AIXIA/index.html into ceur-make and make it switchable by a flag.
A request from user Pascal Fontaine:
I think the process can be further automated, by automatically compiling
(with the right page numbers, uniform title emphasis style,
letter format, CEUR footnotes, etc...), creating the zip, maybe checking
correspondence of titles and authors between TeX and Easychair (I know
and fix various little things). You could restrict the input style to a
few of them.
If toc.xml contains sessions, the papers are not included; only the proceedings entry is generated.
Reproduce by creating a toc.xml with sessions and running 'make ceur-ws/temp.bib'.
http://ceur-ws.org/Vol-994/ demonstrates how to mark them up in RDFa, but this is not part of the ceur-make workflow.
Hi, the generated index.html has some discrepancies with the one available at http://ceur-ws.org/Vol-XXX/index.html as an example, especially regarding the copyright information.
Thanks in advance,
Sonsoles
After running make, for some reason the link for each paper in index.html was "$pdf" instead of "paper-01.pdf" &c.
The editor's affiliation country value is a string, which makes it possible to enter any value, e.g. the full name of the country, an ISO 3166-1 alpha-2 code, etc. It may be preferable to standardise on one format. It would mean that the values should be entered as ISO 3166-1 alpha-2, e.g. CA, or as a Wikipedia URL, e.g. http://en.wikipedia.org/wiki/Canada (which we can map to DBpedia during transformation, like we do for the location of the event). Both can be mapped in any case.
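A minimal sketch of mapping both accepted forms to a DBpedia resource URI, written in Python for illustration. The two-entry country table is an assumption that stands in for a full ISO 3166-1 alpha-2 list, and the function name is hypothetical.

```python
from urllib.parse import urlparse

DBPEDIA = "http://dbpedia.org/resource/"

# Illustrative subset only; a real table would cover all ISO 3166-1 alpha-2 codes.
ISO_TO_RESOURCE = {"CA": "Canada", "DE": "Germany"}

def country_to_dbpedia(value):
    """Map an English Wikipedia URL or an ISO alpha-2 code to a DBpedia URI."""
    value = value.strip()
    parsed = urlparse(value)
    if parsed.netloc == "en.wikipedia.org" and parsed.path.startswith("/wiki/"):
        # DBpedia resource names mirror English Wikipedia article names.
        return DBPEDIA + parsed.path[len("/wiki/"):]
    resource = ISO_TO_RESOURCE.get(value.upper())
    return DBPEDIA + resource if resource else None

print(country_to_dbpedia("http://en.wikipedia.org/wiki/Canada"))
print(country_to_dbpedia("CA"))
```

Values that match neither form return None, which leaves room for reporting a free-text country name instead of silently passing it through.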
Basic functionality to implement (sync with the latest version of the top mistakes list at http://ceur-ws.org/HOWTOSUBMIT.html#TOPERRORS):
Should output a command line for error-report. Initially just with --error parameters for each error encountered, later with arguments (e.g. name of erroneous paper file).
Find out the right ISO standard for this, or implement a translation from ISO to what CEURLANG uses.
Use BibLaTeX; support both Biber and BibTeX.
make retex currently runs Perl on its own to find out the “command to create document” (actually just the main LaTeX source, not the command, but that's a separate issue) from each paper's README_EASYCHAIR file. This is something that easychair2xml.pl could easily do.
TODO fix this ticket to link to the relevant sources
TODO create ticket for the separate issue
The Makefile rule
ceur-ws/paper-01.pdf: ceur-ws ID
is re-executed whenever ceur-ws is newer than ceur-ws/paper-01.pdf. The timestamp of the directory ceur-ws gets updated whenever a directory entry is added, deleted or renamed.
What we actually mean is that the directory ceur-ws must exist before ceur-ws/paper-01.pdf is created.
This could be controlled by creating, in the same rule that creates the directory, a hidden file ceur-ws/.directory, and depending on that file. However, this file would have to be excluded from the ZIP.
output: one affiliation footnote shared by all authors having this affiliation
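The sharing of footnotes can be sketched as follows, assuming authors arrive as (name, affiliation) pairs. This is an illustration in Python with made-up names, not the stylesheet logic that ceur-make uses.

```python
def number_affiliations(authors):
    """Assign one footnote number per distinct affiliation.

    authors is a list of (name, affiliation) pairs. Returns the authors
    labelled with footnote numbers, and the affiliation-to-number mapping.
    """
    footnotes = {}
    labelled = []
    for name, affiliation in authors:
        # The first occurrence of an affiliation gets the next free number.
        number = footnotes.setdefault(affiliation, len(footnotes) + 1)
        labelled.append((name, number))
    return labelled, footnotes

# Hypothetical authors; the first and third share an affiliation.
authors = [
    ("Author A", "University X"),
    ("Author B", "Institute Y"),
    ("Author C", "University X"),
]
labelled, footnotes = number_affiliations(authors)
print(labelled)
print(footnotes)
```

Numbering by first occurrence keeps the footnotes in the order in which the affiliations appear in the author list.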
We currently say that a proceedings volume was "presented at" a workshop event. However, wasn't it rather the individual papers that were presented at the workshop?
i.e. use foaf:name and foaf:homepage
… and correct output to
RDFa:
time: http://www.w3.org/2006/time#
timeline: http://purl.org/NET/c4dm/timeline.owl#
<span rel="event:time" typeof="time:Interval"><span property="timeline:beginsAtDateTime" content="2013-07-09">July 9th</span>—<span property="timeline:endsAtDateTime" content="2013-07-10">10th</span></span>.