ning / compress Goto Github PK

High-performance, streaming/chunking Java LZF codec, compatible with standard C LZF package

License: Other

Shell 0.27% Java 99.73%

compress's Introduction

LZF Compressor

Overview

LZF-compress is a Java library for encoding and decoding data in LZF format, written by Tatu Saloranta ([email protected])

Data format and algorithm based on original LZF library by Marc A Lehmann. See LZF Format Specification for full description.

Format differs slightly from some other adaptations, such as the one used by H2 database project (by Thomas Mueller); although internal block compression structure is the same, block identifiers differ. This package uses the original LZF identifiers to be 100% compatible with existing command-line lzf tool(s).

LZF algorithm itself is optimized for speed, with somewhat more modest compression. Compared to the standard Deflate (algorithm gzip uses) LZF can be 5-6 times as fast to compress, and twice as fast to decompress. Compression rate is lower since no Huffman-encoding is used after lempel-ziv substring elimination.

License

Apache License 2.0

Requirements

Versions up to 1.0.4 require JDK 6; versions from 1.1 on require JDK 8.

Library has no external dependencies.

Usage

See Wiki for more details; here's a "TL;DNR" version.

Both compression and decompression can be done either by streaming approach:

InputStream in = new LZFInputStream(new FileInputStream("data.lzf"));
OutputStream out = new LZFOutputStream(new FileOutputStream("results.lzf"));
InputStream compIn = new LZFCompressingInputStream(new FileInputStream("stuff.txt"));

or by block operation:

byte[] compressed = LZFEncoder.encode(uncompressedData);
byte[] uncompressed = LZFDecoder.decode(compressedData);

and you can even use the LZF jar as a command-line tool (it has manifest that points to 'com.ning.compress.lzf.LZF' as the class having main() method to call), like so:

java -jar compress-lzf-1.1.2.jar

(which will display necessary usage arguments for -c(ompressing) or -d(ecompressing) files.

Adding as Dependency

Maven

<dependency>
  <groupId>com.ning</groupId>
  <artifactId>compress-lzf</artifactId>
  <version>1.1.2</version>
</dependency>

Module info (JPMS)

Starting with version 1.1, module-info.class is included; module name is com.ning.compress.lzf so you will need to use:

requires com.ning.compress.lzf

Parallel processing

Since the compression is more CPU-heavy than decompression, it could benefit from concurrent operation. This works well with LZF because of its block-oriented nature, so that although there is need for sequential processing within block (of up to 64kB), encoding of separate blocks can be done completely independently: there are no dependencies to earlier blocks.

The main abstraction to use is PLZFOutputStream which a FilterOutputStream and implements java.nio.channels.WritableByteChannel as well. It use is like that of any OutputStream:

PLZFOutputStream output = new PLZFOutputStream(new FileOutputStream("stuff.lzf"));
// then write contents:
output.write(buffer);
// ...
output.close();

Interoperability

Besides Java support, LZF codecs / bindings exist for non-JVM languages as well:

C: liblzf (the original LZF package!)
C#: C# LZF
Go: Golly
Javascript(!): freecode LZF (or via SourceForge)
Perl: Compress::LZF
Python: Python-LZF
Ruby: glebtv/lzf, LZF/Ruby

Check out jvm-compress-benchmark for comparison of space- and time-efficiency of this LZF implementation, relative other available Java-accessible compression libraries.

Project Wiki.

Alternative High-Speed Lempel-Ziv Compressors

LZF belongs to a family of compression codecs called "simple Lempel-Ziv" codecs. Since LZ compression is also the first part of deflate compression (which is used, along with simple framing, for gzip), it can be viewed as "first-part of gzip" (second part being Huffman-encoding of compressed content).

There are many other codecs in this category, most notable (and competitive being)

Snappy
LZ4

all of which have very similar compression ratios (due to same underlying algorithm, differences coming from slight encoding variations, and efficiency differences in back-reference matching), and similar performance profiles regarding ratio of compression vs uncompression speeds.

compress's People

Contributors

Stargazers

Watchers

Forkers

abramsm iikku whoschek strategist922 jpbempel almubinsultan francoisforster willb hellblazer ejchet apsaltis wewela szymonwartak congmo thmd idelpivnitskiy rjernst serverperformance achun2080 tianshangjun zhangjinde hhzzhz rishards digideskio liinnux akashshakya ioucn brucezu lllding flyingbeans ralic zhangwusheng maurice-betzel kogupta vijayeluri alexsoog carryxyh wtbian mailaender marcono1234

compress's Issues

Implement skip() efficiently, without needing to decode if possible

Currently InputStream.skip(...) functionality uses same code path as reads, which works functionally, but does unnecessary work of decoding. For some use cases it would be really great to do "coarse skip", since format allows one to completely skip a chunk without decoding, as the length is known on per-chunk basis. Depending on underlying storage format, this could possibly allow skipping some reads too.

Once implemented, it would be also be good to see if a file-based approach could do even better and use random-access for access. But first steps first.

`Unsafe` needs support in `module-info.java`

In order to access sun.misc.Unsafe on Java 9+ (I am testing on Java 17), you have to add this line to module-info.java:

requires jdk.unsupported;

Unsafe clean up of Thread Local Value

This might end up being a serious bug in multithreaded enviornment. GZIPRecycler instance is stored as Thread Local variable. The GZIPRecycler.allocDeflater() instantiates Deflater if not already initialized and the GZIPRecycler.releaseDeflater() release/clears it from GZIPRecycler instance. Between these two calls if any app error occurs then the Deflater is still there with the thread, which holds states of works. Any subsequent call of OptimizedGZIPOutputStream with the dirty thread will end up with nasty output.

same scenarios goes for the Inflater too.

Unsafe-based decompressor of 0.9.7 fails on 2 sample files from 'maxcomp' data set

Looks like latest version of "Unsafe" (aka optimal) decompressor has a small problem with specific test data files, found from test files from "maximum compression" data sets, as per:

https://github.com/ning/jvm-compressor-benchmark/wiki

and specifically, "vcfiu.hlp" and "ohs.doc". Since sizes are not affected, I am guessing it must be a specific overlapping back reference. Need to add a unit test to reproduce, issue a critical fix.

Add Java 9+ module info using Moditect

Let's add JPMS module-info.java, build with Java 8; and then release 1.1 with that.

UnsafeChunkDecoder in ning-compress-0.9.0 leads to SIGSEV in gc

Hi Tatu,

I ran across this bug here: Use of UnsafeChunkDecoder in ning-compress-0.9.0 leads to SIGSEV in gc. Workaround: use VanillaChunkDecoder.

[junit] Testsuite: TestLZFOnFiles
[junit] #
[junit] # A fatal error has been detected by the Java Runtime Environment:
[junit] #
[junit] # SIGSEGV (0xb) at pc=0x00007f1a73258829, pid=18898, tid=139751499331328
[junit] #
[junit] # JRE version: 6.0_29-b11
[junit] # Java VM: Java HotSpot(TM) 64-Bit Server VM (20.4-b02 mixed mode linux-amd64 compressed oops)
[junit] # Problematic frame:
[junit] # V [libjvm.so+0x760829] PSPromotionManager::copy_to_survivor_space(oopDesc*)+0x89
[junit] #
[junit] # An error report file with more information is saved as:
[junit] # /export/skytide/version2-main/hs_err_pid18898.log
[junit] #
[junit] # If you would like to submit a bug report, please visit:
[junit] # http://java.sun.com/webapps/bugreport/crash.jsp
[junit] #
[junit] Tests FAILED (crashed)

Reproducible always with ning-compress-0.9.0 on the following platforms:
OracleJava-1.7.0u1 on Ubuntu (server VM)
OracleJava-1.6.0_29 on Ubuntu (server VM)
OracleJava-1.6.0_26 on OSX 10.6.8 (server VM)

Note that the test passes with the VanillaChunkDecoder (to test this I had to add a system property "com.ning.compress.lzf.ChunkDecoderFactory.disableUnsafe" to ChunkDecoderFactory).

Testcase is attached below. I can't figure out how to attach the test data files to this github issue tracker system. I could send them via email, though.

import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.DataInputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.Writer;

import junit.framework.TestCase;

import com.ning.compress.lzf.LZFInputStream;
import com.ning.compress.lzf.LZFOutputStream;

public class TestLZFOnFiles extends TestCase {

public TestLZFOnFiles(String name) {
    super(name);
}

public void testLZFCompressionOnTestFiles() throws Throwable {
    for (int i = 0; i < 1000; i++) {
        log("iteration: " + i);
        testLZFCompressionOnDir(new File("shakespeare"));
    }
}

private void testLZFCompressionOnDir(File dir) throws Throwable {
    File[] files = dir.listFiles();
    for (int i = 0; i < files.length; i++) {
        File file = files[i];
        if (!file.isDirectory()) {

            testLZFCompressionOnFile(file);
        } else {
            testLZFCompressionOnDir(file);
        }
    }
}

private void testLZFCompressionOnFile(File file) throws Throwable {
    // compress
    log("  compressing file " + file);
    // File compressedFile = createEmptyFile("test.lzf");
    File compressedFile = new File("/tmp/test.lzf");
    InputStream in = new BufferedInputStream(new FileInputStream(file));
    OutputStream out = new LZFOutputStream(new BufferedOutputStream(
            new FileOutputStream(compressedFile)));
    byte[] buf = new byte[64 * 1024];
    int len;
    while ((len = in.read(buf, 0, buf.length)) >= 0) {
        out.write(buf, 0, len);
    }
    in.close();
    out.close();

    // decompress and verify bytes haven't changed
    log("decompressing and verifying file " + file);
    in = new BufferedInputStream(new FileInputStream(file));
    DataInputStream compressedIn = new DataInputStream(new LZFInputStream(
            new BufferedInputStream(new FileInputStream(compressedFile))));
    while ((len = in.read(buf, 0, buf.length)) >= 0) {
        byte[] buf2 = new byte[len];
        compressedIn.readFully(buf2, 0, len);

        byte[] trimmedBuf = new byte[len];
        System.arraycopy(buf, 0, trimmedBuf, 0, len);

        assertEquals(trimmedBuf, buf2);
    }
    assertEquals(-1, compressedIn.read());
    in.close();
    compressedIn.close();
}

private void log(String msg) throws IOException
{
    System.out.println(msg);
    writeFile(new File("/tmp/log"), msg + "\n", true);
}

private static void writeFile(File file, String fileData, boolean append)
        throws IOException {
    if (!file.getParentFile().exists()) {
        file.getParentFile().mkdir();
    }

    Writer w = new OutputStreamWriter(new FileOutputStream(file, append),
            "UTF-8");
    try {
        w.write(fileData);
    } finally {
        w.close();
    }
}

private void assertEquals(byte[] expected, byte[] actual) {
    if (expected == null && actual != null) {
        fail();
    }
    if (expected != null && actual == null) {
        fail();
    }
    if (expected.length != actual.length) {
        fail();
    }
    for (int i = 0; i < expected.length; i++) {
        if (expected[i] != actual[i]) {
            fail();
        }
    }
}

}

Incorrect de-serialization leading to stream corruption in Big Endian systems

The below test case (runnable in a SPARK environment using hadoop libraries in the classpath, can provide a stand-alone test case if need be) fails with an error:
java.io.InvalidClassException: org.apache.spark.SerializableWritable; local class incompatible: stream classdesc serialVersionUID = 6301214776158303468, local class serialVersionUID = -7785455416944904980

import java.io.*;
import org.apache.hadoop.mapred.*;
import org.apache.spark.serializer.*;
import com.ning.compress.lzf.*;

public class Test
{
   public static void main(String [] args)
   {
      org.apache.spark.SerializableWritable<JobConf> foo = new org.apache.spark.SerializableWritable<JobConf>(new JobConf());
      try
      {

        ByteArrayOutputStream bout = new ByteArrayOutputStream();
        ObjectOutputStream stream = new ObjectOutputStream(bout);
        stream.writeObject(foo);
        byte []buf = bout.toByteArray();

        ByteArrayOutputStream cout = new ByteArrayOutputStream();
        LZFOutputStream  comp = new LZFOutputStream (cout);
        comp.write(buf);
        comp.close();

        byte[] compressed = cout.toByteArray();

        ByteArrayInputStream in = new ByteArrayInputStream(compressed);
        LZFInputStream decompress = new LZFInputStream(in);

        ByteArrayOutputStream str = new ByteArrayOutputStream();
        decompress.readAndWrite(str);
        ObjectInputStream reader = new ObjectInputStream(new ByteArrayInputStream(str.toByteArray()));
        org.apache.spark.SerializableWritable<JobConf> bar = (org.apache.spark.SerializableWritable<JobConf>) reader.readObject();
        decompress.close();
        System.out.println(bar);

      }catch(Exception i)
      {
          i.printStackTrace();
      }
   }
}

with a call stack:

    at java.io.ObjectStreamClass.initNonProxy(ObjectStreamClass.java:630)
    at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1600)
    at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1513)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1749)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1346)
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:365)
    at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:40)
    at org.apache.spark.broadcast.HttpBroadcast$.read(HttpBroadcast.scala:165)
    at org.apache.spark.broadcast.HttpBroadcast.readObject(HttpBroadcast.scala:56)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:60)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:37)
    at java.lang.reflect.Method.invoke(Method.java:611)
    at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1039)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1866)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1770)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1346)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1964)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1888)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1770)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1346)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1964)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1888)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1770)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1346)
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:365)
    at scala.collection.immutable.$colon$colon.readObject(List.scala:362)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:60)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:37)
    at java.lang.reflect.Method.invoke(Method.java:611)
    at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1039)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1866)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1770)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1346)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1964)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1888)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1770)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1346)
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:365)
    at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:40)
    at org.apache.spark.scheduler.ResultTask$.deserializeInfo(ResultTask.scala:63)
    at org.apache.spark.scheduler.ResultTask.readExternal(ResultTask.scala:139)
    at java.io.ObjectInputStream.readExternalData(ObjectInputStream.java:1809)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1768)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1346)
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:365)
    at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:40)
    at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:62)
    at org.apache.spark.executor.Executor$TaskRunner$$anonfun$run$1.apply$mcV$sp(Executor.scala:195)
    at org.apache.spark.deploy.SparkHadoopUtil.runAsUser(SparkHadoopUtil.scala:49)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:178)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:897)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:919)
    at java.lang.Thread.run(Thread.java:738)

OptimizedGZIPInputStream fails on chunked stream

Hi,

we are using your library since 3 years and we think you have done a great job.
Recently we need to download a file in gzip format from a remote server.
The file is quite big so the server uses the transfer-encoding: chunked mechanism.
We download the file and save it in a temporary directory, then we unzip it.
If we use the java.util.zip.GZIPInputStream everything works fine and we get the whole file unzipped.
But if we use your com.ning.compress.gzip.OptimizedGZIPInputStream implementation we get only the first chunk unzipped.
It would be great if you could fix this issue so we can keep using your implementation.

Thanks
Massimo

Improve 'DataHandler` callback to allow early termination

Although content skipping is difficult to support efficiently for 'push' mode (used by Uncompressor interface), it should be possible to at least support early termination.
This can be achieved by adding return value for DataHandler.handleData() and Uncompressor.feedCompressedData methods; returning of false indicates desire to terminate the processing to avoid having to access any more source data.

Deserialize directly into a ByteBuffer

It'd be awesome to be able to deserialize directly into a provided ByteBuffer. For this, I think it makes sense to make the assumption that the ByteBuffer is large enough to hold the output and if it's not, the code can do whatever it wants as long as I can figure out that my ByteBuffer wasn't large enough somehow.

RFE: add alternate FileInputStream/FileOutputStream-based streams

While existing LZFInputStream/LZFOutputStream work fine functionally, there are cases where expectation is that an instance of FileInputStream/FileOutputStream is passed. And even though another RFE exists (and will be implemented) to allow accessing underlying stream, this is not always enough.

So since it is quite easy to sub-class FileInputStream/FileOutputStream, it would be good to add wrappers to get something like LZFFileOutputStream / LZFFileInputStream. Not necessarily cleanest thing to do, but from practical perspective a very useful thing to add.

Mistaken Code in k8s

i got Mistaken Code in k8s which use utf-8 as base language when i use LZFDecoder to decompress
but when i in normal linux web server then it shows correctly how can i solve it

README links are broken

Broken links :
http://freshmeat.net/projects/liblzf
https://github.com/ning/compress/wiki/LZFFormat

round tripping byte array lengths from 0 to 1000 fails

Hi Tatu, haven't talked to you in a while! Hope you and your family are fine after leaving Amazon for new waters.

I'm considering using your LZF code. It looks great!

As far as I can see it (0.7.0) mostly works, but I've bumped into some corner cases that don't work. Here is a test case that tests round tripping byte array lengths from 0 to 1000, and it fails at array length 2 for some reason.

Best regards,
Wolfgang Hoschek

package foo.test;

import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.io.OutputStream;

import junit.framework.TestCase;

import com.ning.compress.lzf.LZFInputStream;
import com.ning.compress.lzf.LZFOutputStream;

public class LZFCompressionTest extends TestCase {

public LZFCompressionTest(String name) {
    super(name);
}     

public void testLZFCompression() throws Throwable {
    for (int size = 0; size < 1000; size++)
    {
        System.out.println("testing size: " + size);
        byte[] expected = new byte[size];
        for (int i = 0; i < size; i++) expected[i] = (byte) i;

        ByteArrayList compressed = new ByteArrayList(size);
        OutputStream out = new LZFOutputStream(compressed.asOutputStream());
        out.write(expected);

        InputStream in = new LZFInputStream(compressed.asInputStream());
        byte[] actual = new byte[expected.length];
        int n;
        int len = 0;
        while (len < expected.length && (n = in.read(actual, len, actual.length - len)) >= 0)
        {
            len += n;
        }
        assertEquals(expected, actual);
    }        
}    

private void assertEquals(byte[] expected, byte[] actual)
{
    if (expected == null && actual != null)
    {
        printBytes(expected, actual);
        fail();
    }
    if (expected.length != actual.length)
    {
        printBytes(expected, actual);
        fail();
    }
    for (int i = 0; i < expected.length; i++)
    {
        if (expected[i] != actual[i])
        {
            printBytes(expected, actual);
            fail();
        }
    }
}

private void printBytes(byte[] expected, byte[] actual) {
    ByteArrayList list = new ByteArrayList();
    list.add(expected, 0, expected.length);
    System.out.println("expected: " + list);

    list = new ByteArrayList();
    list.add(actual, 0, actual.length);
    System.out.println("actual  : " + list);
}


private static final class ByteArrayList {

    private byte[] elements;
    private int size;

    public ByteArrayList() {
        this(64);
    }

    public ByteArrayList(int initialCapacity) {
        elements = new byte[initialCapacity];
        size = 0;
    }

    public void add(byte elem) {
        if (size == elements.length) ensureCapacity(size + 1);
        elements[size++] = elem;
    }

    public void add(byte[] elems, int offset, int length) {
        if (offset < 0 || length < 0 || offset + length > elems.length) 
            throw new IndexOutOfBoundsException("offset: " + offset + 
                ", length: " + length + ", elems.length: " + elems.length);

        ensureCapacity(size + length);
        System.arraycopy(elems, offset, this.elements, size, length);
        size += length;
    }    

    public void ensureCapacity(int minCapacity) {
        if (minCapacity > elements.length) {
            // biggest we can get is Integer.MAX_VALUE, be careful not to overflow
            int growthTarget = 2 * elements.length + 1;
            if (growthTarget <= 0) { // we overflowed
                growthTarget = Integer.MAX_VALUE;
            }
            int newCapacity = Math.max(minCapacity, growthTarget);
            elements = subArray(0, size, newCapacity);
        }
    }

    public int size() {
        return size;
    }

    private byte[] subArray(int from, int length, int capacity) {
        byte[] subArray = new byte[capacity];
        System.arraycopy(elements, from, subArray, 0, length);
        return subArray;
    }

    public String toString() {
        StringBuffer buf = new StringBuffer(4*size);
        buf.append("[");
        for (int i = 0; i < size; i++) {
            buf.append(elements[i]);
            if (i < size-1) buf.append(", ");
        }
        buf.append("]");
        return buf.toString();
    }

    public OutputStream asOutputStream() {
        return new OutputStream() {         
            public void write(int b) {
                add((byte) b);
            }
            public void write(byte b[], int off, int len) {
                add(b, off, len);
            }
        };
    }

    public InputStream asInputStream() {
        return new ByteArrayInputStream(elements, 0, size);
    }

}

}

Consider using Unsafe for efficient long-access

As per example of snappy-java, looks like we might be able to get performance benefits, esp. on uncompress side, by using fast(er) long (64-bit) reads. Simple testing shows that we might get +20% speedup (on JDK 6, mac os), but the real trick is doing this in a way that does not require Unsafe (i.e. graceful degradation).
Naive approach didn't quite work, but there should be a way to make it all work.

Implement encoder (compressor) that makes use of sun.misc.Unsafe

As with decoder (uncompressor), there should be ways to make use of performance tricks that sun.misc.Unsafe offers. Current "safe" implementations can and needs to be retained as fallback.

Document parallel compression task

While parallel lzf compressor exists (yay!), README does not explain its usage. It should.

Command-line tool out of memory

For big files better way is use Input/Output Streams. Very useful will be to add parametr sending results into standard output.

Example:
-d parameter: decompress to file
-c parameter: compress to file
-o parameter: decompress to output

Example Java code:

void process(String[] args) throws IOException
    {
        if (args.length == 2) {
            String oper = args[0];
            boolean toSystemOutput = false;
            boolean compress = "-c".equals(oper);
            if ("-o".equals(oper)) {
                toSystemOutput = true;
            }
            if (compress || "-d".equals(oper) || "-o".equals(oper)) {
                String filename = args[1];
                File src = new File(filename);
                if (!src.exists()) {
                    System.err.println("File '"+filename+"' does not exist.");
                    System.exit(1);
                }
                if (!compress && !filename.endsWith(SUFFIX)) {
                    System.err.println("File '"+filename+"' does end with expected suffix ('"+SUFFIX+"', won't decompress.");
                    System.exit(1);
                }

                InputStream is;
                OutputStream out;

                if (!compress) {
                    is = new LZFFileInputStream(src);
                    if (toSystemOutput) {
                        out = System.out;
                    } else {
                        out = new FileOutputStream(new File(filename.substring(0, filename.length() - SUFFIX.length())));
                    }
                } else {
                    is = new FileInputStream(src);
                    out = new LZFFileOutputStream(new File(filename+SUFFIX));
                }

                byte[] buffer = new byte[8192];
                int bytesRead = 0;
                while ((bytesRead = is.read(buffer, 0, buffer.length)) != -1) { 
                    out.write(buffer, 0, bytesRead); 
                }
                out.flush();
                out.close();

                return;
            }
        }
        System.err.println("Usage: java "+getClass().getName()+" -c/-d/-o file");
        System.exit(1);
    }

LZFInputStream.read(buffer, offset, len) ignores 'len' (or reports wrong number read)

Looks like return value of LZFInputStream.read() is wrong in cases where offset > 0 or length less than buffer size -- reported value is size of the buffer.

Add IOException sub-class for reliable catching of only LZF de/encoding problems

Currently plain IOExceptions are used for signalling problems with LZF issues (specifically, decoding, corrupt data).
This works, but makes it difficult to tell LZF encoding problems from more generic IO problems.
A simple solution would be to use a specific sub-class of IOException (LZFIOException).

Add LZFInputStream.readAndWrite(...) for copy-through

A minor optimization that would be useful is ability to ask LZFInputStream (decompressing input stream) to simply decode all content there is, and write it to given OutputStream. This avoids the intermediate copy to a temporary buffer between read and write (alternative would be exposing decoding of chunks; but this seems bit simpler to add, safer).
Same should be added in LZFFileInputStream utility class.

Possible compression performance optimization

There is a possible optimization in the compression main loop. But really, you need to try out yourself if it's faster in the average case for you. Current code (ChunkEncoder.java):

if (ref >= inPos // can't refer forward (i.e. leftovers)
|| ref < firstPos // or to previous block
|| (off = inPos - ref - 1) >= MAX_OFF
|| in[ref+2] != p2 // must match hash
|| in[ref+1] != (byte) (hash >> 8)
|| in[ref] != (byte) (hash >> 16)) {

Possible optimization (replace the last two lines with):

|| ((in[ref+1] << 8) | in[ref]) != ((hash >> 8) & 0xffff)) {

The reason why this might be faster is that conditional operations are much slower than arithmetic operations.

Corrupt input data when using LZFInputStream and jets3t

I came across this while using ning-compress 0.7. If I have some code like new LZFInputStream(inputStream) where the InputStream is provided by the jets3t library (for retrieval from S3), I get:

Caused by: java.io.IOException: Corrupt input data, block did not start with 'ZV' signature bytes
at com.ning.compress.lzf.LZFDecoder.decompressChunk(LZFDecoder.java:132)
at com.ning.compress.lzf.LZFInputStream.readyBuffer(LZFInputStream.java:130)
at com.ning.compress.lzf.LZFInputStream.read(LZFInputStream.java:76)
at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:264)
at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:306)
at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:158)
at java.io.InputStreamReader.read(InputStreamReader.java:167)
at java.io.BufferedReader.fill(BufferedReader.java:136)
at java.io.BufferedReader.readLine(BufferedReader.java:299)
at java.io.BufferedReader.readLine(BufferedReader.java:362)
at org.apache.commons.io.LineIterator.hasNext(LineIterator.java:96)

However, if I instead use the following, everything works fine:

new LZFInputStream(new ByteArrayInputStream(IOUtils.toByteArray(inputStream)))

I've used LZFInputStream with FileInputStreams and never had issues. Same for jets3t's InputStream. As such, it seems like it's a weird interaction between LZFInputStream and jets3t's InputStream.

Tatu said in a private email:

"Interesting. No, I have never seen this. It might be good idea to
improve error message to list 2 bytes it sees (or lack of enough
input).

Can you add a bug entry for the project? It does sound like an issue
with interaction of two components for sure."

Document LZF format

There does not seem to be good (or any?) specification for LZF format; so although it is almost trivially simple, the only way to figure it out currently is by reading code.
This should not be the case: we should just go ahead and write basic specification, based on code written.

Parallel LZF

Hi Tatu,

I think using wide instructions via Unsafe looks very promising for decompression. I was wondering to what extent this might also be applied to compression. Compression is becoming more and more the bottleneck.

I am also thinking it might be good to use the plentiful cores available on today's production systems. Here we typically use servers with 12 cores each (2 sockets with 6 cores each). It's a bit of a pity to see only one out of 12 cores being used for compression. When the 20 core Intel CPUs (2 sockets with 10 cores each) become cheaper folks will switch to those. Using many cores for compression might work very well. Decompression is already so fast that it might not benefit as much.

Here, we are already using pigz - a parallel gzip implementation (really: drop-in replacement for gzip) for some things like backup - http://www.zlib.net/pigz/. Using this with many cores the I/O subsystem becomes the bottleneck even with gzip compression, which is, as you know, computationally much more demanding than LZF compression.

Here is an idea for your project. Perhaps ning compress could have one reader thread, one writer thread, and N worker threads. The reader feeds a queue of buffers. The worker threads grab a buffer from the queue, compress it (or uncompress it) and hand an output buffer to another queue where a writer thread grabs it, serializes the order in which output buffers are written (PriorityQueue), and then writes the logically next buffer to the output stream once that logically next buffer becomes available. Each buffer could contain an LZF chunk. To keep the CPUs busy the reader queue might maintain a read-ahead of 2 buffers for each of the N worker threads. Once the output buffer is written it is recycled via a buffer pool or queue (no gc). Same recycling for input buffers. Perhaps N=1 could be special cased to use no threads at all.

The pigz C source code available at http://www.zlib.net/pigz/ contains good comments in this direction.

Regards,
Wolfgang.

estimateMaxWorkspaceSize() is too small

// len = 20
byte[] in = new byte[] {0, 0, 0, 0, 1, 0, 0, 0, 2, 0, 0, 0, 3, 0, 0, 0, 4, 0, 0, 0};
// 27, actually needs 28
int outSize = LZFEncoder.estimateMaxWorkspaceSize(in.length);
// IndexOutOfBoundsException
LZFEncoder.appendEncoded(in, 0, in.length, new byte[outSize], 0);

Problem with LZFDecoder.decode()

(via email by T.Effland)

Please check the Wrapper function in LZFDecoder

public static byte[] decode(final byte[] inputBuffer, int inputPtr, int inputLen) throws IOException {
return ChunkDecoderFactory.optimalInstance().decode(inputBuffer);
}

I think the parameters int inputPtr, int inputLen should also be used!

Expose number of bytes read from `InputStream`, via `LZFInputStream`

Although LZFInputStream keeps track of number of input bytes it has read from the underlying source input stream, this is not currently exposed via API. It should be.

Maintenance, Contributor access?

Hi! I was hoping to reach out to individuals who have commit access.

I am the original author of this package and have been maintaining it for years, but have lost access sometime during 2023 (I pushed last release in January 2023).
I was wondering if either:

There are active maintainers who can handle PRs and do releases as needed, or
If not (1), I could get back enough contributor/admin access to this repo to handle maintenance tasks.
Something else?

RFE: Make LZFInput/OutputStream expose underlying stream

Sometimes it is necessary to be able to access whatever stream is the underlying stream for LZF encoding/decoding stream. There should be an accessor for this.

Add new variants for "compress only if comp rate at least N"

A relatively common use case for my own use is one where compression is only used if it makes significant difference. For example, if compression rate is only 2%, it hardly makes sense to store compressed version as there is still additional decompression overhead when reading content.

While caller can verify compression rate and choose to discard content in such case, it would be bit more efficient if library had a method that could avoid allocating result buffer in such cases. So let's add a variant or two that allows such "opportunistic" compression.

If possible, this should be added both for LZF and convenience Deflate (gzip) compression methods.

Add convenience method(s) for GZIP read/write

Although benefits of use of gzip via this package are negligible (turns out deflate reuse is either nowadays done by JDK, or not that important?), it would be nice to expose convenient methods to compress into / uncompress from byte arrays. Those are useful in their own right.

Fix issues outlined by "lgtm.com"'s static analysis

Looks like there are a few minor but legit issues with Woodstox code, as per:

https://lgtm.com/projects/g/ning/compress/

and it'd make sense to go through the list. Creating this issue as a placeholder; may create individual issues if actual severe reproducible problems are found; smaller code hygiene changes can be tracked under this one.

Fatal error in Java Runtime Environment with LZF 0.9.5 Library in Solaris 11 on sparc

Hello,

I use LZF librery to compress and uncompress file, no problem with Windows, Linux, Solaris 11 ( x86 ), but when i use Solaris 11 on sparc i have JVM FATAL error

this FATAL error occured in readLine method when i use BufferedReader :

BufferedReader brBufferedReader = new BufferedReader( new InputStreamReader( new LZFInputStream(
  new FileInputStream( fToUnCompress ), true ) ) );

String sLine = brBufferedReader.readLine();

NOTE : this work when i use VanillaChunkDecoder in ChunkDecoderFactory class.

Can you please help me about this issue.

JVM Logs

#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGBUS (0xa) at pc=0xffffffff76de06d4, pid=20352, tid=2
#
# JRE version: 7.0_04-b20
# Java VM: Java HotSpot(TM) 64-Bit Server VM (23.0-b21 mixed mode solaris-sparc compressed oops)    
# Problematic frame:
# V  [libjvm.so+0xae06d4]  Unsafe_GetLong+0x154
#
# Core dump written. Default location: /opt/mycom/shell/support/test15/core or core.20352
#
# If you would like to submit a bug report, please visit:
#   http://bugreport.sun.com/bugreport/crash.jsp
#

--------------- T H R E A D ---------------

Current thread (0x00000001001b6800): JavaThread "main" [_thread_in_vm, id=2, stack(0xffffffff7e000000,0xffffffff7e100000)]

siginfo:si_signo=SIGBUS: si_errno=0, si_code=1 (BUS_ADRALN), si_addr=0x00000007d6700a93

Registers:
G1=0xffffffff77136870 G2=0xffffffff77136878 G3=0x0000000000084870 G4=0x0000000000084878
G5=0x0000000000084800 G6=0x0000000000000000 G7=0xffffffff7df00240 Y=0x0000000000000000
O0=0x000000001001b680 O1=0x0000000000000000 O2=0x0000000000000000 O3=0x0000000000000000
O4=0x000000000008a400 O5=0x0000000000000000 O6=0xffffffff7e0fdd31 O7=0x0000000000000000
L0=0xffffffff7713c508 L1=0x0000000000000000 L2=0x0000000000000000 L3=0x0000000030000000
L4=0x0000000000000006 L5=0x00000007d6700a80 L6=0x0000000000001ffc L7=0xffffffff770b2000
I0=0x000000000008a508 I1=0xffffffff7e0ffb00 I2=0xffffffff7e0fe890 I3=0x0000000000000013
I4=0xffffffff77136888 I5=0x00000001001b6800 I6=0xffffffff7e0fdde1 I7=0xffffffff71c10844
PC=0xffffffff76de06d4 nPC=0xffffffff76de06d8

Top of Stack: (sp=0xffffffff7e0fe530)
0xffffffff7e0fe530: ffffffff7713c508 0000000000000000
0xffffffff7e0fe540: 0000000000000000 0000000030000000
0xffffffff7e0fe550: 0000000000000006 00000007d6700a80
0xffffffff7e0fe560: 0000000000001ffc ffffffff770b2000
0xffffffff7e0fe570: 000000000008a508 ffffffff7e0ffb00
0xffffffff7e0fe580: ffffffff7e0fe890 0000000000000013
0xffffffff7e0fe590: ffffffff77136888 00000001001b6800
0xffffffff7e0fe5a0: ffffffff7e0fdde1 ffffffff71c10844
0xffffffff7e0fe5b0: 0000000000000000 ffffffff77118b88
0xffffffff7e0fe5c0: 0000000100195580 0000001500000000
0xffffffff7e0fe5d0: ffffffff71c05410 0000000000014d36
0xffffffff7e0fe5e0: 0000000000000000 0000000000000000
0xffffffff7e0fe5f0: 000000077e0f7ac0 ffffffff7e0fe898
0xffffffff7e0fe600: 0000000000000000 ffffffff71c0d1ac
0xffffffff7e0fe610: ffffffff7e0fe790 00000001001b6800
0xffffffff7e0fe620: 00000001001b6800 ffffffff7e0fe898

Instructions: (pc=0xffffffff76de06d4)
0xffffffff76de06b4: 90 10 00 1d a8 10 20 06 e8 27 62 50 0a c6 80 04
0xffffffff76de06c4: f2 5f 60 48 10 80 00 04 f0 5e e0 00 ea 5e a0 00
0xffffffff76de06d4: f0 5d 40 1b d0 5e 60 10 ec 5e 60 08 f4 5a 20 00
0xffffffff76de06e4: 02 c6 80 05 b6 10 20 07 7f dc 1e 43 01 00 00 00

G1=0xffffffff77136870: _1cTMaskFillerForNativeG__vtbl+0xad0 in /usr/jdk/instances/jdk1.7.0/jre/lib/sparcv9/server/libjvm.so at 0xffffffff76300000
G2=0xffffffff77136878: _1cTMaskFillerForNativeG__vtbl+0xad8 in /usr/jdk/instances/jdk1.7.0/jre/lib/sparcv9/server/libjvm.so at 0xffffffff76300000
G3=0x0000000000084870 is an unknown value
G4=0x0000000000084878 is an unknown value
G5=0x0000000000084800 is an unknown value
G6=0x0000000000000000 is an unknown value
G7=0xffffffff7df00240 is an unknown value

O0=0x000000001001b680 is an unknown value
O1=0x0000000000000000 is an unknown value
O2=0x0000000000000000 is an unknown value
O3=0x0000000000000000 is an unknown value
O4=0x000000000008a400 is an unknown value
O5=0x0000000000000000 is an unknown value
O6=0xffffffff7e0fdd31 is pointing into the stack for thread: 0x00000001001b6800
O7=0x0000000000000000 is an unknown value

L0=0xffffffff7713c508: _1cRDeoptimizedRFrameG__vtbl+0x1c0 in /usr/jdk/instances/jdk1.7.0/jre/lib/sparcv9/server/libjvm.so at 0xffffffff76300000
L1=0x0000000000000000 is an unknown value
L2=0x0000000000000000 is an unknown value
L3=0x0000000030000000 is an unknown value
L4=0x0000000000000006 is an unknown value
L5=0x00000007d6700a80 is an oop
[B

klass: {type array byte}
length: 65535
L6=0x0000000000001ffc is an unknown value
L7=0xffffffff770b2000: GLOBAL_OFFSET_TABLE+0 in /usr/jdk/instances/jdk1.7.0/jre/lib/sparcv9/server/libjvm.so at 0xffffffff76300000

I0=0x000000000008a508 is an unknown value
I1=0xffffffff7e0ffb00 is pointing into the stack for thread: 0x00000001001b6800
I2=0xffffffff7e0fe890 is pointing into the stack for thread: 0x00000001001b6800
I3=0x0000000000000013 is an unknown value
I4=0xffffffff77136888: _1cTMaskFillerForNativeG__vtbl+0xae8 in /usr/jdk/instances/jdk1.7.0/jre/lib/sparcv9/server/libjvm.so at 0xffffffff76300000
I5=0x00000001001b6800 is a thread
I6=0xffffffff7e0fdde1 is pointing into the stack for thread: 0x00000001001b6800
I7=0xffffffff71c10844 is an Interpreter codelet
method entry point (kind = native) [0xffffffff71c106a0, 0xffffffff71c10a20] 896 bytes

Stack: [0xffffffff7e000000,0xffffffff7e100000], sp=0xffffffff7e0fe530, free space=1017k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
V [libjvm.so+0xae06d4] Unsafe_GetLong+0x154
j sun.misc.Unsafe.getLong(Ljava/lang/Object;J)J+0
j sun.misc.Unsafe.getLong(Ljava/lang/Object;J)J+0
j com.ning.compress.lzf.impl.UnsafeChunkDecoder.copyUpTo32([BI[BII)V+318
j com.ning.compress.lzf.impl.UnsafeChunkDecoder.decodeChunk([BI[BII)V+26
j com.ning.compress.lzf.impl.UnsafeChunkDecoder.decodeChunk(Ljava/io/InputStream;[B[B)I+104
j com.ning.compress.lzf.LZFInputStream.readyBuffer()Z+39
j com.ning.compress.lzf.LZFInputStream.read([BII)I+8
j sun.nio.cs.StreamDecoder.readBytes()I+135
j sun.nio.cs.StreamDecoder.implRead([CII)I+112
j sun.nio.cs.StreamDecoder.read([CII)I+180
j java.io.InputStreamReader.read([CII)I+7
j java.io.BufferedReader.fill()V+145
j java.io.BufferedReader.readLine(Z)Ljava/lang/String;+44
j java.io.BufferedReader.readLine()Ljava/lang/String;+2
j com.ning.CompressUncompressNing.UnCompress(Ljava/io/File;)V+198
j com.ning.CompressUncompressNing.Start(Ljava/lang/String;Ljava/io/File;)V+28
j com.ning.CompressUncompressNing.main([Ljava/lang/String;)V+26
v ~StubRoutines::call_stub
V [libjvm.so+0x21c26c] void JavaCalls::call_helper(JavaValue_,methodHandle_,JavaCallArguments_,Thread_)+0x304
V [libjvm.so+0x228f54] void JavaCalls::call(JavaValue_,methodHandle,JavaCallArguments_,Thread*)+0x44
V [libjvm.so+0x2cca74] jni_CallStaticVoidMethod+0x618
C [libjli.so+0x23f8] JavaMain+0x7c8

Java frames: (J=compiled Java code, j=interpreted, Vv=VM code)
j sun.misc.Unsafe.getLong(Ljava/lang/Object;J)J+0
j com.ning.compress.lzf.impl.UnsafeChunkDecoder.copyUpTo32([BI[BII)V+318
j com.ning.compress.lzf.impl.UnsafeChunkDecoder.decodeChunk([BI[BII)V+26
j com.ning.compress.lzf.impl.UnsafeChunkDecoder.decodeChunk(Ljava/io/InputStream;[B[B)I+104
j com.ning.compress.lzf.LZFInputStream.readyBuffer()Z+39
j com.ning.compress.lzf.LZFInputStream.read([BII)I+8
j sun.nio.cs.StreamDecoder.readBytes()I+135
j sun.nio.cs.StreamDecoder.implRead([CII)I+112
j sun.nio.cs.StreamDecoder.read([CII)I+180
j java.io.InputStreamReader.read([CII)I+7
j java.io.BufferedReader.fill()V+145
j java.io.BufferedReader.readLine(Z)Ljava/lang/String;+44
j java.io.BufferedReader.readLine()Ljava/lang/String;+2
j com.ning.CompressUncompressNing.UnCompress(Ljava/io/File;)V+198
j com.ning.CompressUncompressNing.Start(Ljava/lang/String;Ljava/io/File;)V+28
j com.ning.CompressUncompressNing.main([Ljava/lang/String;)V+26
v ~StubRoutines::call_stub

--------------- P R O C E S S ---------------

Java Threads: ( => current thread )
0x0000000105414000 JavaThread "Service Thread" daemon [_thread_blocked, id=34, stack(0xffffffff78200000,0xffffffff78300000)]
0x0000000105407000 JavaThread "C2 CompilerThread1" daemon [_thread_blocked, id=33, stack(0xffffffff78400000,0xffffffff78500000)]
0x0000000105404800 JavaThread "C2 CompilerThread0" daemon [_thread_blocked, id=32, stack(0xffffffff78800000,0xffffffff78900000)]
0x0000000105402800 JavaThread "Signal Dispatcher" daemon [_thread_blocked, id=31, stack(0xffffffff78a00000,0xffffffff78b00000)]
0x00000001053ac000 JavaThread "Finalizer" daemon [_thread_blocked, id=30, stack(0xffffffff78d00000,0xffffffff78e00000)]
0x00000001053a5000 JavaThread "Reference Handler" daemon [_thread_blocked, id=29, stack(0xffffffff78f00000,0xffffffff79000000)]
=>0x00000001001b6800 JavaThread "main" [_thread_in_vm, id=2, stack(0xffffffff7e000000,0xffffffff7e100000)]

Other Threads:
0x000000010539c800 VMThread [stack: 0xffffffff79100000,0xffffffff79200000] [id=28]
0x000000010541e000 WatcherThread [stack: 0xffffffff77f00000,0xffffffff78000000] [id=35]

VM state:not at safepoint (normal execution)

VM Mutex/Monitor currently owned by a thread: None

Heap
PSYoungGen total 301056K, used 5161K [0x00000007d6400000, 0x00000007eb400000, 0x0000000800000000)
eden space 258048K, 2% used [0x00000007d6400000,0x00000007d690a440,0x00000007e6000000)
from space 43008K, 0% used [0x00000007e8a00000,0x00000007e8a00000,0x00000007eb400000)
to space 43008K, 0% used [0x00000007e6000000,0x00000007e6000000,0x00000007e8a00000)
ParOldGen total 684032K, used 0K [0x0000000783000000, 0x00000007acc00000, 0x00000007d6400000)
object space 684032K, 0% used [0x0000000783000000,0x0000000783000000,0x00000007acc00000)
PSPermGen total 24576K, used 2754K [0x000000077e000000, 0x000000077f800000, 0x0000000783000000)
object space 24576K, 11% used [0x000000077e000000,0x000000077e2b09d0,0x000000077f800000)

Code Cache [0xffffffff71c00000, 0xffffffff72000000, 0xffffffff74c00000)
total_blobs=197 nmethods=16 adapters=135 free_code_cache=48728Kb largest_free_block=49869952

Compilation events (10 events):
Event: 3.678 Thread 0x0000000105404800 nmethod 11 0xffffffff71c649d0 code [0xffffffff71c64b20, 0xffffffff71c64d50]
Event: 3.983 Thread 0x0000000105407000 nmethod 7 0xffffffff71c6af90 code [0xffffffff71c6b220, 0xffffffff71c6c448]
Event: 4.245 Thread 0x0000000105404800 12 java.lang.Object:: (1 bytes)
Event: 4.267 Thread 0x0000000105404800 nmethod 12 0xffffffff71c64510 code [0xffffffff71c64640, 0xffffffff71c64730]
Event: 4.274 Thread 0x0000000105407000 13 java.lang.String::lastIndexOf (68 bytes)
Event: 4.294 Thread 0x0000000105407000 nmethod 13 0xffffffff71c64050 code [0xffffffff71c641a0, 0xffffffff71c64410]
Event: 4.904 Thread 0x0000000105404800 14 java.io.UnixFileSystem::normalize (75 bytes)
Event: 5.010 Thread 0x0000000105404800 nmethod 14 0xffffffff71c66110 code [0xffffffff71c66260, 0xffffffff71c665b8]
Event: 5.630 Thread 0x0000000105407000 15 sun.nio.cs.UTF_8$Encoder::encode (361 bytes)
Event: 5.841 Thread 0x0000000105407000 nmethod 15 0xffffffff71c65410 code [0xffffffff71c65580, 0xffffffff71c65bf8]

GC Heap History (0 events):
No events

Deoptimization events (1 events):
Event: 4.078 Thread 0x00000001001b6800 Uncommon trap -12 fr.pc 0xffffffff71c6c138

Internal exceptions (10 events):
Event: 3.455 Thread 0x00000001001b6800 Threw 0x00000007d65ef430 at /HUDSON/workspace/jdk7u4-2-build-solaris-sparcv9-product/jdk7u4/hotspot/src/share/vm/prims/jvm.cpp:1166
Event: 4.078 Thread 0x00000001001b6800 Implicit null exception at 0xffffffff71c6b328 to 0xffffffff71c6c12c
Event: 4.090 Thread 0x00000001001b6800 Threw 0x00000007d662bc18 at /HUDSON/workspace/jdk7u4-2-build-solaris-sparcv9-product/jdk7u4/hotspot/src/share/vm/prims/jvm.cpp:1166
Event: 4.090 Thread 0x00000001001b6800 Threw 0x00000007d662bd40 at /HUDSON/workspace/jdk7u4-2-build-solaris-sparcv9-product/jdk7u4/hotspot/src/share/vm/prims/jvm.cpp:1166
Event: 4.269 Thread 0x00000001001b6800 Threw 0x00000007d663eba0 at /HUDSON/workspace/jdk7u4-2-build-solaris-sparcv9-product/jdk7u4/hotspot/src/share/vm/prims/jvm.cpp:1166
Event: 4.269 Thread 0x00000001001b6800 Threw 0x00000007d663ecc8 at /HUDSON/workspace/jdk7u4-2-build-solaris-sparcv9-product/jdk7u4/hotspot/src/share/vm/prims/jvm.cpp:1166
Event: 4.487 Thread 0x00000001001b6800 Threw 0x00000007d664cad0 at /HUDSON/workspace/jdk7u4-2-build-solaris-sparcv9-product/jdk7u4/hotspot/src/share/vm/prims/jvm.cpp:1166
Event: 4.487 Thread 0x00000001001b6800 Threw 0x00000007d664cbf8 at /HUDSON/workspace/jdk7u4-2-build-solaris-sparcv9-product/jdk7u4/hotspot/src/share/vm/prims/jvm.cpp:1166
Event: 4.489 Thread 0x00000001001b6800 Threw 0x00000007d664dae0 at /HUDSON/workspace/jdk7u4-2-build-solaris-sparcv9-product/jdk7u4/hotspot/src/share/vm/prims/jvm.cpp:1166
Event: 4.489 Thread 0x00000001001b6800 Threw 0x00000007d664dc08 at /HUDSON/workspace/jdk7u4-2-build-solaris-sparcv9-product/jdk7u4/hotspot/src/share/vm/prims/jvm.cpp:1166

Events (10 events):
Event: 5.717 loading class 0x0000000100188c20 done
Event: 5.717 loading class 0x00000001001aa440 done
Event: 5.721 loading class 0x000000010024b960
Event: 5.722 loading class 0x000000010024b960 done
Event: 5.722 loading class 0x000000010563ead0
Event: 5.723 loading class 0x0000000105786970
Event: 5.723 loading class 0x0000000105786970 done
Event: 5.723 loading class 0x000000010563ead0 done
Event: 5.726 loading class 0x000000010578a900
Event: 5.726 loading class 0x000000010578a900 done

Dynamic libraries:
0x0000000100000000 /usr/jdk/instances/jdk1.7.0/jre/bin/sparcv9/java
0xffffffff7d2f4000 /lib/64/libthread.so.1
0xffffffff77200000 /usr/jdk/instances/jdk1.7.0/jre/bin/sparcv9/../../lib/sparcv9/jli/libjli.so
0xffffffff7d2fe000 /lib/64/libdl.so.1
0xffffffff7ee00000 /lib/64/libc.so.1
0xffffffff76300000 /usr/jdk/instances/jdk1.7.0/jre/lib/sparcv9/server/libjvm.so
0xffffffff7a100000 /lib/64/libsocket.so.1
0xffffffff7d2fc000 /usr/lib/64/libsched.so.1
0xffffffff76100000 /lib/64/libm.so.1
0xffffffff75f00000 /usr/lib/64/libCrun.so.1
0xffffffff785fe000 /lib/64/libdoor.so.1
0xffffffff75d00000 /usr/lib/64/libdemangle.so.1
0xffffffff7a300000 /lib/64/libkstat.so.1
0xffffffff7e600000 /lib/64/libm.so.2
0xffffffff7d900000 /lib/64/libnsl.so.1
0xffffffff75b00000 /lib/64/libmd.so.1
0xffffffff75900000 /lib/64/libmp.so.2
0xffffffff75700000 /usr/jdk/instances/jdk1.7.0/jre/lib/sparcv9/libverify.so
0xffffffff75500000 /usr/jdk/instances/jdk1.7.0/jre/lib/sparcv9/libjava.so
0xffffffff75300000 /lib/64/libscf.so.1
0xffffffff75100000 /lib/64/libuutil.so.1
0xffffffff79d00000 /lib/64/libgen.so.1
0xffffffff7db00000 /lib/64/libnvpair.so.1
0xffffffff74f00000 /usr/jdk/instances/jdk1.7.0/jre/lib/sparcv9/libzip.so

VM Arguments:
jvm_args: -Detl-writ -Xms1000m -Xmx2000m -Djava.ext.dirs=/opt/mycom/shell/support/test15/jar -Djava.io.tmpdir=/opt/mycom/data/tmp
java_command: com.ning.CompressUncompressNing Server.HomeDir=/opt/mycom/config -d /opt/mycom/shell/support/test15/1338304562-34176-11839086-Id.mis.lzf
Launcher Type: SUN_STANDARD

Environment Variables:
PATH=/opt/mycom/3rd_party/oracle/app/oracle/products/11.2.0/bin:/usr/bin:/bin:/usr/openwin/bin:/usr/ccs/bin:/usr/local/bin
LD_LIBRARY_PATH=/opt/mycom/lib:/usr/local/lib:/opt/mycom/3rd_party/oracle/app/oracle/products/11.2.0/lib
SHELL=/usr/bin/bash
DISPLAY=localhost:19.0

Signal Handlers:
SIGSEGV: [libjvm.so+0xb16b18], sa_mask[0]=0xffbffeff, sa_flags=0x0000000c
SIGBUS: [libjvm.so+0xb16b18], sa_mask[0]=0xffbffeff, sa_flags=0x0000000c
SIGFPE: [libjvm.so+0x2891d8], sa_mask[0]=0xffbffeff, sa_flags=0x0000000c
SIGPIPE: [libjvm.so+0x2891d8], sa_mask[0]=0xffbffeff, sa_flags=0x0000000c
SIGXFSZ: [libjvm.so+0x2891d8], sa_mask[0]=0xffbffeff, sa_flags=0x0000000c
SIGILL: [libjvm.so+0x2891d8], sa_mask[0]=0xffbffeff, sa_flags=0x0000000c
SIGUSR1: SIG_DFL, sa_mask[0]=0x00000000, sa_flags=0x00000000
SIGUSR2: SIG_DFL, sa_mask[0]=0x00000000, sa_flags=0x00000000
SIGQUIT: [libjvm.so+0x9b97e8], sa_mask[0]=0xffbffeff, sa_flags=0x00000004
SIGHUP: [libjvm.so+0x9b97e8], sa_mask[0]=0xffbffeff, sa_flags=0x00000004
SIGINT: [libjvm.so+0x9b97e8], sa_mask[0]=0xffbffeff, sa_flags=0x00000004
SIGTERM: [libjvm.so+0x9b97e8], sa_mask[0]=0xffbffeff, sa_flags=0x00000004
SIG39: [libjvm.so+0x9bd6c8], sa_mask[0]=0x00000000, sa_flags=0x00000008
SIG40: [libjvm.so+0x2891d8], sa_mask[0]=0xffbffeff, sa_flags=0x0000000c

--------------- S Y S T E M ---------------

uname:SunOS 5.11 11.0 sun4v (T2 libthread)
rlimit: STACK 8192k, CORE infinity, NOFILE 65536, AS infinity
load average:68.38 72.81 64.03

CPU:total 64 v9, popc, vis1, vis2, vis3, blk_init, cbcond, sun4v, niagara_plus

Memory: 8k page, physical 267911168k(158934736k free)

vm_info: Java HotSpot(TM) 64-Bit Server VM (23.0-b21) for solaris-sparc JRE (1.7.0_04-b20), built on Apr 12 2012 02:23:35 by "" with Sun Studio 12u1

time: Fri Jun 1 03:37:59 2012
elapsed time: 7 seconds

Thanks.

did not start with 'ZV' signature bytes

Hi,
I am running this on android, parsing incoming packets compressed with the original c implementation, but it seems I am hitting a potential interoperability problem, as I am seeing:

Corrupt input data, block #0 (at offset 0): did not start with 'ZV' signature bytes

On the originating side I do test compression/decompressing on the same target buffer and that succeeds, as I can read the data back.
Are there any known issues working with that c implementation?

regards,

Close() not working correctly with input, output streams

As Dain S pointed out, currently Input-/OutputStream deal with close() in a way that is both incompatible with default JDK behavior, and potentially wrong wrt buffer recycling. The thing is that JDK actually expects an IOException to be thrown, when read/write is done on closed stream; and although there may deviations (I think System.out/err do not do this for example; nor StringWriter), I think we should throw an exception (and at most allow exception throwing be disabled as and option).

Hope it will be helpful for users and maintainers of the library. The report is generated by the https://github.com/lvc/japi-tracker tool for jars found at http://central.maven.org/maven2/com/ning/compress-lzf/ according to https://wiki.eclipse.org/Evolving_Java-based_APIs_2.

Thank you.