dviasm's People

Contributors

aminophen, blipvert, khaledhosny, muzimuzhi, reutenauer, trueroad

dviasm's Issues

Dumping and compiling \special

dviasm has problems dumping and compiling the contents of \special. Minimal working example:

\documentclass[a4paper,dvipdfmx]{article}
\usepackage{hyperref}
\usepackage{pxjahyper}

\hypersetup{%
  pdftitle={日本語の表現テスト},
  pdfsubject={カタカナのソ},
  pdfauthor={全角 空白},
  pdfkeywords={キーワードたち},
  pdfdisplaydoctitle=true}

\begin{document}

\section{こことか}
\section{そことか}

\end{document}

pTeX

Japanese characters in \special commands are encoded in pTeX's internal kanji encoding (Windows: sjis, Unix: eucjp). However, dviasm with option -p (or --ptex) does not know how to decode them.

$ platex spcj-p.tex
$ dviasm -p spcj-p.dvi -o spcj-p.dump
Traceback (most recent call last):
  File "/path/to/dviasm", line 1250, in <module>
    aDVI.Load(args[0])
  File "/path/to/dviasm", line 311, in Load
    self.LoadFromFile(fp)
  File "/path/to/dviasm", line 326, in LoadFromFile
    page = self.ProcessPage(fp)
  File "/path/to/dviasm", line 512, in ProcessPage
    q = fp.read(p).decode('utf8')
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xc6 in position 20: invalid continuation byte
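
The failing byte 0xc6 at position 20 is consistent with the EUC-JP encoding of 日 (0xc6 0xfc) right after the 20 ASCII characters of "pdf:docinfo<</Title(". A minimal sketch of a more tolerant decoder follows, assuming Python 3; decode_special and its candidate list are hypothetical and not part of dviasm. Guessing the encoding is only a heuristic, so an explicit encoding option would be more reliable.

```python
def decode_special(raw, encodings=("utf-8", "euc_jp", "shift_jis")):
    """Decode the bytes of a DVI special, trying each candidate encoding in turn.

    Hypothetical helper for illustration. Returns (text, encoding_used).
    """
    for enc in encodings:
        try:
            return raw.decode(enc), enc
        except UnicodeDecodeError:
            continue
    # latin-1 maps every byte to exactly one code point, so no data is lost
    # and the original bytes can be recovered with .encode("latin-1").
    return raw.decode("latin-1"), "latin-1"


# Example: a special as pTeX on Unix might write it (EUC-JP bytes).
raw = "pdf:docinfo<</Title(日本語)>>".encode("euc_jp")
text, used = decode_special(raw)
print(used, text)
```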

upTeX

Japanese characters in \special commands are encoded in utf8, and dviasm seems to dump them correctly. However, the DVI compiled back from the dump seems to be corrupt.

$ uplatex spcj.tex
$ dviasm spcj.dvi -o spcj.dump
$ dviasm spcj.dump -o spcj.dump.dvi
$ dvipdfmx spcj.dump.dvi
spcj.dump.dvi -> spcj.dump.pdf
[1
dvipdfmx:warning: Could not find any valid object.
dvipdfmx:warning: Could not find a value in dictionary object.
dvipdfmx:warning: Dictionary object expected but not found.
dvipdfmx:warning: Interpreting special command docinfo (pdf:) failed.
dvipdfmx:warning: >> at page="1" position="(72, 769.89)" (in PDF)
dvipdfmx:warning: >> xxx "pdf:docinfo<</Title(\xe6\x97\xa5\xe6\x9c\xac\xe8\xaa\x9e\xe3"
dvipdfmx:warning: >> Reading special command stopped around >><</Title(\xe6\x97\xa5\xe6\x9c\xac\xe8\xaa\x9e\xe3\x81\xae\xe.<<
dvipdfmx:fatal: No font selected!

No output PDF file written.
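
The report does not identify the cause. One possibility, stated here only as an assumption, is that the length field of the xxx command is computed from the number of characters instead of the number of encoded bytes, which would truncate any special containing multi-byte UTF-8. The sketch below shows the byte-based length in Python 3; put_special is a hypothetical helper, not dviasm's actual routine.

```python
XXX1, XXX4 = 239, 242  # DVI opcodes for specials with 1-byte and 4-byte lengths


def put_special(text, encoding="utf-8"):
    """Serialize a special as an xxx1 or xxx4 command (hypothetical helper)."""
    payload = text.encode(encoding)
    n = len(payload)  # length in bytes, not in characters
    if n < 256:
        return bytes([XXX1, n]) + payload
    return bytes([XXX4]) + n.to_bytes(4, "big") + payload


special = "pdf:docinfo<</Title(日本語)>>"
# The character count and the UTF-8 byte count differ for non-ASCII text.
print(len(special), len(special.encode("utf-8")))
```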

Wrong length

I've found that dviasm writes some lengths incorrectly when it outputs DVI.
Here's an example.

Input file:

[preamble]
id: 2
numerator: 25400000
denominator: 473628672
magnification: 1000
comment: ''

[postamble]
maxv: 43725786sp
maxh: 30785863sp
maxs: 1
pages: 1

[font definitions]
fntdef: cmr10 at 10pt

[page 1 0 0 0 0 0 0 0 0 0]
right: 1310720sp
fnt: cmr10 at 10pt
set: 'a'
w: 65536sp
set: 'a'
w: 65535sp
set: 'a'
w: 32768sp
set: 'a'
w: 32767sp
set: 'a'

Commands:

$ dviasm -u sp foobar.txt > foobar.dvi
$ dviasm -u sp foobar.dvi > foobar.2.txt
$ diff -u foobar.txt foobar.2.txt

Result:

--- foobar.txt
+++ foobar.2.txt
@@ -20,9 +20,9 @@
 set: 'a'
 w: 65536sp
 set: 'a'
-w: 65535sp
+w: -1sp
 set: 'a'
-w: 32768sp
+w: -32768sp
 set: 'a'
 w: 32767sp
 set: 'a'

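65536sp and 32767sp round-trip correctly.
65535sp and 32768sp come back as -1sp and -32768sp, which is consistent with the value being written in two bytes and read back as a signed two-byte integer.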
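
The operands of the w1..w4 commands are signed, so the byte width has to be chosen from the signed range. The following is a minimal sketch of that selection in Python 3; the function names are hypothetical and this is not dviasm's actual code.

```python
def signed_byte_width(value):
    """Smallest number of bytes (1 to 4) that holds value in two's complement."""
    for width in (1, 2, 3, 4):
        limit = 1 << (8 * width - 1)
        if -limit <= value < limit:
            return width
    raise ValueError("value does not fit in a signed 4-byte DVI parameter")


def put_w(value):
    """Encode a w movement as w1..w4 (opcodes 148..151) plus a signed operand."""
    width = signed_byte_width(value)
    return bytes([147 + width]) + value.to_bytes(width, "big", signed=True)


def get_w(data):
    """Decode the output of put_w back to an integer."""
    width = data[0] - 147
    return int.from_bytes(data[1:1 + width], "big", signed=True)


# The four values from the report now survive a round trip.
for v in (65536, 65535, 32768, 32767):
    assert get_w(put_w(v)) == v
```

With this rule 65535sp and 32768sp take three bytes, while 32767sp still fits in two.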

Subfont routine is not working

Test cases "ajt02omega" and "ajt06kr" (taken from the documentation mentioned in README) have been broken since support for XDV native font commands was added.

I will add tests for XDV and examine what to do.

Problematic DVI format of dviasm: FNTDEF before BOP

$ tex \\relax a\\end                           % => "texput.dvi" is written
$ dviasm texput.dvi >texput.dump
$ dviasm texput.dump -o texput.dump.dvi

The resulting "texput.dump.dvi" seems valid (at least for dvitype), but FNTDEF comes before BOP; I think such a DVI format is valid, as dvitype and most of the major DVI drivers accept it. However, there are some issues:

(1) dvi2tty hangs after an error

$ dvi2tty texput.dump.dvi
dvi2tty: Missing beginning-of-page command

It waits for user [Return] input, instead of exiting right away.

(2) dvispc fails to print FNTDEF before BOP.

$ dvispc -a texput.dump.dvi >texput.dump.spc.txt

It ignores the FNTDEF that precedes BOP in "texput.dump.dvi".

(3) dvi2tty crashes while processing dvispc's result

$ dvispc -x texput.dump.spc.txt texput.dump.spc.dvi
$ dvi2tty texput.dump.spc.dvi
Segmentation fault: 11

The dvispc output "texput.dump.spc.dvi" contains no FNTDEF before BOP or before the first SETCHAR, so it is actually problematic (dvipdfmx seems ok, but dvips fails)

$ dvips texput.dump.spc.dvi
"This is dvips(k) 2022.1 (TeX Live 2022)  Copyright 2022 Radical Eye Software (www.radicaleye.com)
' TeX output 2022.03.10:2237' -> texput.spc.ps
Font number 0 not found
dvips: ! no font selected

but dvitype does not complain about it, so I'm not sure whether such a DVI file is valid. In any case, dvi2tty should not crash with a segmentation fault.
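
To check whether a given file has this layout, the opcode that follows the preamble can be inspected directly. The sketch below assumes the standard preamble layout (pre, id[1], num[4], den[4], mag[4], k[1], comment[k]) and ignores any nop commands that might sit in between; the function names are hypothetical.

```python
PRE, BOP = 247, 139
FNT_DEF1, FNT_DEF4 = 243, 246


def first_opcode_after_preamble(data):
    """Return the opcode that follows the DVI preamble in data (bytes)."""
    if len(data) < 16 or data[0] != PRE:
        raise ValueError("not a DVI file: missing or truncated preamble")
    k = data[14]  # length of the comment string
    return data[15 + k]


def fntdef_precedes_bop(data):
    """True if a font definition comes directly after the preamble."""
    return FNT_DEF1 <= first_opcode_after_preamble(data) <= FNT_DEF4


def make_preamble(comment=b""):
    """Build a preamble with the numerator, denominator, and magnification TeX uses."""
    return (bytes([PRE, 2])
            + (25400000).to_bytes(4, "big")
            + (473628672).to_bytes(4, "big")
            + (1000).to_bytes(4, "big")
            + bytes([len(comment)]) + comment)


print(fntdef_precedes_bop(make_preamble() + bytes([FNT_DEF1])))
print(fntdef_precedes_bop(make_preamble(b" TeX output") + bytes([BOP])))
```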

Conversion issues

I'm getting unexpected and incorrect DVI codes after converting to and from dviasm.

dvipdf output:
[image: run_1]
dvipdf output, after converting to & from dviasm:
[image: run_2]

Steps to reproduce:
I have a .dvi file, built with groff.

> echo ".PP\nabcdefåäö" > min.ms
> groff -Tdvi -k -ms min.ms > min.dvi

Now converting it to and from dviasm:

> dviasm -o min.dviasm min.dvi
> dviasm -o min_conv.dvi min.dviasm

Displaying min.dvi with dvipdf gives output displayed in pic 1.
Displaying min_conv.dvi with dvipdf gives output displayed in pic 2.

I also confirmed that min.dvi and min_conv.dvi differ.

I'm not sure whether I can embed files in this report. If someone is looking into it, I can send the DVI built by groff. I don't think this is groff related.

"put" command not dumped

  • The 'put' commands (put1, put2, put3, put4) cannot be properly dumped:
$ dviasm test/putj.dump.dvi
Traceback (most recent call last):
  File "dviasm.py", line 1252, in <module>
    else:              aDVI.DumpToFile(sys.stdout, tabsize=options.tabsize, encoding=options.encoding)
  File "dviasm.py", line 967, in DumpToFile
    fp.write("put: %s\n" % PutStr(cmd[1]))
  File "dviasm.py", line 237, in PutStrUTF8
    for o in t:
TypeError: 'int' object is not iterable
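
The traceback shows the dump routine iterating over an operand that is a single integer. A minimal sketch of one way to normalize the operand before formatting, assuming a put command carries one character code while set may carry several; as_codepoints is a hypothetical helper, not the fix applied in dviasm.

```python
def as_codepoints(operand):
    """Return the operand of a set or put command as a list of code points."""
    if isinstance(operand, int):
        return [operand]  # a single character code, as put1..put4 carry
    return list(operand)  # already a sequence of codes


print(as_codepoints(0x3042))
print(as_codepoints([97, 98]))
```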

dviasm does not work with characters beyond 0x10000

I used unicode-math to convert the characters on the paper to Unicode. After that dviasm ceased to work.

Traceback:
Traceback (most recent call last):
  File "../dviasm/dviasm.py", line 1214, in <module>
    else: aDVI.DumpToFile(sys.stdout, tabsize=options.tabsize, encoding=options.encoding)
  File "../dviasm/dviasm.py", line 936, in DumpToFile
    fp.write("set: %s\n" % PutStr(cmd[1]))
  File "../dviasm/dviasm.py", line 230, in PutStrUTF8
    else: s += unichr(o).encode('utf8')
ValueError: unichr() arg not in range(0x10000) (narrow Python build)

Many mathematical alphanumeric symbols lie outside this range. If you are willing to explain how this works, I can help you fix it.
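
The error comes from unichr() on a narrow Python 2 build, which rejects code points of 0x10000 and above. In Python 3, chr() accepts the full range up to 0x10FFFF, so the same conversion works without surrogate handling. A minimal sketch, assuming Python 3; put_str_utf8 is a hypothetical stand-in and not dviasm's PutStrUTF8.

```python
def put_str_utf8(codepoints):
    """Encode a sequence of Unicode code points as UTF-8 bytes (Python 3)."""
    return "".join(chr(o) for o in codepoints).encode("utf-8")


# U+1D400 MATHEMATICAL BOLD CAPITAL A lies beyond the 16-bit range.
print(put_str_utf8([0x1D400]))
```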

[WIP] command-line encoding option

In the current public version 20200905, the -e sjis option causes an error.

  • I fixed the issue in the current dev version, which now also accepts -e eucjp.
  • On the other hand, unlike the original ChoF version, dumping with -e sjis to stdout is no longer supported. That usage can probably be ignored, and supporting DVI->dump through stdout is difficult after the Python 2 -> Python 3 migration (strict distinction between strings and bytes), so this should not be a problem.
# Note: The original ChoF version supported only
#         * DVI ->(-e sjis -p)-> dump(file)
#         * DVI ->(-e sjis -p)-> dump(stdout)
#       The current version by HY supports
#         * DVI ->(-e sjis/eucjp -p)-> dump(file)
#         * dump(file) ->(-e sjis/eucjp -p)-> DVI

Support request for XDV_ID other than 6

A recent version of XeTeX (TeX Live 2017) produces XDV files with id = 7, but dviasm accepts only id = 6. According to xetex.web, XeTeX has used ids 4, 5, 6, and 7 over its history, so it would be nice if all of them were accepted.
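
A minimal sketch of the acceptance check follows, using only the ids named in this report; the constant and function names are hypothetical. It covers only the id test: XDV versions may differ in how some commands are encoded, so accepting an id is not by itself enough to read such a file correctly.

```python
DVI_ID = 2                           # id of a standard DVI file
XDV_IDS = frozenset({4, 5, 6, 7})    # ids the report attributes to xetex.web


def classify_id(id_byte):
    """Return 'dvi' or 'xdv' for a supported id byte, else raise ValueError."""
    if id_byte == DVI_ID:
        return "dvi"
    if id_byte in XDV_IDS:
        return "xdv"
    raise ValueError("unsupported DVI id: %d" % id_byte)


print(classify_id(7))
```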

DVI with zero page, stackdepth

According to section A.1 of "The DVI Driver Standard, Level 0" (dvistd0.pdf),

A DVI file consists of a "preamble," followed by a sequence of
one or more "pages," followed by a "postamble."

thus dumping or compiling a DVI with zero pages is invalid; I added a warning about that (2f6ee6f).


BTW, while compiling a DVI with dviasm, the maximum stack depth in the postamble is set to "stackdepth+1" instead of the exact "stackdepth" value. Why?

fp.write(b''.join([bytes.fromhex('%02x' % POST), PutSignedQuad(loc), PutSignedQuad(self.numerator), PutSignedQuad(self.denominator), PutSignedQuad(self.mag), PutSignedQuad(self.max_v), PutSignedQuad(self.max_h), Put2Bytes(stackdepth+1), Put2Bytes(len(self.pages))]))

I noticed this behavior of dviasm when comparing between the two:

  • The output of DVIasm containing only one empty page
  • The output of Knuth TeX with the command tex -output-comment='' '\shipout\vbox{}\end'

Both outputs are exactly 100 bytes (the smallest DVI file that conforms to the standard); they differ only in the maximum stack depth, which is 1 in the dviasm output and 0 in the TeX output.
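
For reference, the maximum stack depth is the largest excess of push commands over pop commands. The sketch below computes that exact value from a list of command names; the list representation is an assumption made for illustration and differs from dviasm's internal page structure. It does not explain why dviasm adds 1.

```python
def max_stack_depth(commands):
    """Largest excess of 'push' over 'pop' in a sequence of command names."""
    depth = deepest = 0
    for cmd in commands:
        if cmd == "push":
            depth += 1
            deepest = max(deepest, depth)
        elif cmd == "pop":
            if depth == 0:
                raise ValueError("pop without matching push")
            depth -= 1
    return deepest


print(max_stack_depth([]))                       # an empty page
print(max_stack_depth(["push", "set", "pop"]))   # one level of nesting
```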

dviasm and python 3

dviasm doesn't work with python 3. Would it be possible to update it or add a python 3 variant?

Duplicated backslash char, wrong math characters

It seems that the upstream is now archived (https://github.com/khaledhosny/dviasm), so I don't know where to report this. This is just a note. (@khaledhosny)

\documentclass{article}
\begin{document}

\verb+\usepackage{amsmath}+
$-1$

\begin{itemize}
\item First
\item Second
\end{itemize}

\end{document}

Original DVI

$ latex test.tex
$ dvipdfmx test.dvi

[image: 20200830-dviasm-0]

After processing with DVIasm

$ dviasm test.dvi >test-out.txt
$ dviasm -o test-out.dvi test-out.txt
$ dvipdfmx test-out.dvi

[image: 20200830-dviasm-1]

Option -p discards some characters

With option -p (ISO-2022-JP-encoded DVI for Japanese pTeX), some characters are removed. It seems that bullet (0x88) is discarded by DecodeISO2022JP().

Original DVI

Same as #12.

After processing with DVIasm

$ dviasm -p test.dvi >test-out.txt
$ dviasm -p -o test-out.dvi test-out.txt
$ dvipdfmx test-out.dvi

[image: 20200830-dviasm-2]
