Comments (4)
Looking back at LZXFMT.DOC from CABSDK.EXE, you're right: it does include extra == 0 logic in the pseudocode, exactly as seen on the page you linked. I must have misread it. It's also hiding in the LZX DELTA specification, where the logic would run verbatim_bits = readbits(0); aligned_bits = 0;
And yes, there's a discrepancy with the Java implementation, which has the logic you see in lzxd.c:
int extra = extra_bits[position_slot];
match_offset = position_base[position_slot];
if (extra > 3) match_offset += (readbits(extra - 3) << 3) + aligned_tree.readsym();
else if (extra == 3) match_offset += aligned_tree.readsym();
else if (extra != 0) match_offset += readbits(extra);
else match_offset = 1;
I have thousands of cab files with extra == 0 cases. cabextract, EXTRACT.EXE (from 1997) and EXPAND.EXE (from the 2000s) all agree on the decoded result. The smallest one is 305 bytes and I've printed it in base64 below. If I change lzxd.c to follow the specification, it starts to get the result wrong.
$ base64 < 0625.cab
TVNDRgAAAAAxAQAAAAAAACwAAAAAAAAAAwEBAAEAAAAAAAAASQAAAAEAAxX+AgAAAAAAAAAA6yJg
pwAAYXJyb3dfcmwuY3VyAPRpnfLgAP4CW4CAjQAg6C+NlADBAAADAAA0NAC2DoTSCjtLYidxLIHm
/yd1Iv9iT8QEXLoAIAAAZgDAjGEKCt9fOIXvSta7Fp3AUuICkRAKQpAQCuiY/U39/f/9gGgAAAAA
AAAMjICDBu4GEfv94vczBnjmSNJfZX1w+llcmmvGkPP/+LZ928xgk3+1jbc89tyWC07HWE9tZ+jf
uV2n6FtW7WQkrGciPVHl/IxMDt/omBEgIwAAAAAAAAIA/5EAoGYTN/eyUZSI1IJoddEsEqqlNLTO
WRfA5RFRJcvSQoqniAJRjqSUsII=
$ wine extract.exe /e 0625.cab
Microsoft (R) Diamond Extraction Tool - Version (32) 1.00.0601 (03/18/97)
Copyright (c) Microsoft Corp 1994-1997. All rights reserved.
Cabinet 0625.cab
Extracting arrow_rl.cur
$ md5sum arrow_rl.cur
c8f980bf103d1c781ed71b25e46e7219 arrow_rl.cur
$ WINEPREFIX=$HOME/.wine64 WINEARCH=win64 wine expand.exe 0625.cab file.out
Microsoft (R) File Expansion Utility Version 6.1.7600.16385
Copyright (c) Microsoft Corporation. All rights reserved.
Adding Z:\tmp\file.out to Extraction Queue
Expanding Files ....
Expanding Files Complete ...
$ md5sum file.out
c8f980bf103d1c781ed71b25e46e7219 file.out
$ cabextract -t 0625.cab
Testing cabinet: 0625.cab
arrow_rl.cur OK c8f980bf103d1c781ed71b25e46e7219
All done, no errors.
$ ./cabextract-to-spec -t 0625.cab
Testing cabinet: 0625.cab
arrow_rl.cur OK 2188d750947c1fbb24dba4080dd7b51d
from libmspack.
That is bizarre to me, because it implies they're trying to encode a match_offset of 1 but are using a position_slot other than 3 (slot 3 already has 0 extra bits). But it seems like that's the correct behavior regardless?
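For illustration, here is a minimal C sketch of how the two tables are usually built, assuming the pairwise construction used for extra_bits and position_base in LZX decoders. The names lzx_build_tables, count_zero_extra_slots and the 12-slot bound are hypothetical and chosen only to keep the example short. Under that assumption, slot 3 is the only non-repeat slot with zero extra bits, and its base minus 2 is already 1.

```c
#include <assert.h>

/* Hypothetical bound: only the first few slots, enough to illustrate. */
enum { LZX_SLOTS = 12 };

static unsigned extra_bits[LZX_SLOTS];
static unsigned position_base[LZX_SLOTS];

/* Slots come in pairs that share an extra-bit count (capped at 17);
 * each slot's base is the previous base plus 2^extra_bits. */
static void lzx_build_tables(void)
{
    unsigned i, j = 0, base = 0;

    for (i = 0; i < LZX_SLOTS; i += 2) {
        extra_bits[i] = extra_bits[i + 1] = j;
        if (i != 0 && j < 17) j++;
    }
    for (i = 0; i < LZX_SLOTS; i++) {
        position_base[i] = base;
        base += 1u << extra_bits[i];
    }
}

/* Count slots above the three repeat-offset slots (0..2)
 * that carry no extra bits at all. */
static unsigned count_zero_extra_slots(void)
{
    unsigned i, n = 0;

    for (i = 3; i < LZX_SLOTS; i++)
        if (extra_bits[i] == 0) n++;
    return n;
}
```

After calling lzx_build_tables(), count_zero_extra_slots() reports a single slot, and position_base[3] - 2 evaluates to 1, so an explicit extra == 0 branch and the plain table lookup give the same offset for slot 3.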
from libmspack.
Thanks for your recommendation to look at it. I rewrote the decoding from scratch in 202ab41, which now correctly decodes files, including the example above. I'm not entirely sure where the discrepancy lies, but there is no longer a special case for extra == 0, and it does work. Thanks!
from libmspack.
Nice, that's very similar to what I do now, except I handle the match_offset == 3 case explicitly in a separate test, which means I never have to test for extra > 0. It also means I can fold the -2 into the position_base_table, and omit the first four elements as well.
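One possible reading of that approach, as a hedged C sketch rather than the actual lzxd.c code: slot 3 is tested first, so the remaining slots always have at least one extra bit, and the tables start at slot 4 with the -2 already subtracted. The fake_input struct, read_bits, read_aligned_sym and the table names are hypothetical stand-ins for the real bitstream and aligned-offset tree, and only slots 3..11 are covered.

```c
#include <assert.h>

/* Hypothetical stand-in for the bitstream and the aligned-offset tree:
 * it hands back queued values so the control flow can be exercised. */
struct fake_input {
    const unsigned *values;
    unsigned next;
};

static unsigned read_bits(struct fake_input *in, unsigned nbits)
{
    (void)nbits; /* a real reader would consume nbits from the stream */
    return in->values[in->next++];
}

static unsigned read_aligned_sym(struct fake_input *in)
{
    return in->values[in->next++];
}

/* Tables for slots 4..11, indexed by (slot - 4), with the -2 folded in. */
static const unsigned extra_bits_from4[]  = {1, 1, 2, 2, 3, 3, 4, 4};
static const unsigned base_minus2_from4[] = {2, 4, 6, 10, 14, 22, 30, 46};

/* Decode a match offset in an aligned-offset block.
 * Precondition: 3 <= slot <= 11; slots 0..2 (repeat offsets) are
 * assumed to be handled by the caller. */
static unsigned decode_aligned_offset(struct fake_input *in, unsigned slot)
{
    unsigned extra, offset;

    if (slot == 3) return 1; /* explicit case, no bits consumed */

    extra  = extra_bits_from4[slot - 4];
    offset = base_minus2_from4[slot - 4];
    if (extra > 3) {
        /* high bits verbatim first, then low 3 bits from the aligned tree */
        offset += read_bits(in, extra - 3) << 3;
        offset += read_aligned_sym(in);
    } else if (extra == 3) {
        offset += read_aligned_sym(in);
    } else {
        offset += read_bits(in, extra); /* extra is 1 or 2 here */
    }
    return offset;
}
```

The two reads in the extra > 3 branch are separate statements because C does not fix the evaluation order of the operands of +, and the verbatim bits have to be consumed before the aligned symbol.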
from libmspack.