libarchivejs's Introduction

Libarchivejs

Overview

Libarchivejs is an archive tool for the browser and NodeJS that can extract and create various types of compressed archives. It is a port of libarchive to WebAssembly, with a JavaScript wrapper to make it easier to use. Since it runs on WebAssembly, performance should be near native. Supported formats: ZIP, 7-Zip, RAR v4, RAR v5, TAR, etc. Supported compression: GZIP, DEFLATE, BZIP2, LZMA, etc.

Version 2.0 highlights!

  • Create archives
  • Use it in NodeJS

How to use

Install with npm i libarchive.js and use it as an ES module.

The library consists of two parts: an ES module and a web worker bundle. The ES module is your interface to the library; use it like any other module. The web worker bundle lives in the libarchive.js/dist folder. It is already bundled, so a bundler will not pick it up: make sure it is available in your public folder and pass the correct path to the Archive.init() method.

If the libarchive.js file is in the same directory as the bundle file, then you don't need to call Archive.init() at all.

import {Archive} from 'libarchive.js/main.js';

Archive.init({
    workerUrl: 'libarchive.js/dist/worker-bundle.js'
});

document.getElementById('file').addEventListener('change', async (e) => {
    const file = e.currentTarget.files[0];

    const archive = await Archive.open(file);
    let obj = await archive.extractFiles();
    
    console.log(obj);
});

// outputs
{
    ".gitignore": {File},
    "addon": {
        "addon.py": {File},
        "addon.xml": {File}
    },
    "README.md": {File}
}
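
Because the result mirrors the folder structure of the archive, listing every file means walking the object recursively. The following is a minimal sketch, assuming every leaf is a File (which extends Blob) and every folder is a plain object. flattenFiles is a hypothetical helper and not part of libarchive.js.

```javascript
// Hypothetical helper, not part of libarchive.js: flattens the nested object
// returned by extractFiles() into a list of { path, file } pairs.
function flattenFiles(tree, isFile = (value) => value instanceof Blob, prefix = "") {
    const result = [];
    for (const [name, value] of Object.entries(tree)) {
        const path = prefix + name;
        if (isFile(value)) {
            result.push({ path, file: value });
        } else if (value !== null && typeof value === "object") {
            // Anything that is not a file is treated as a folder.
            result.push(...flattenFiles(value, isFile, path + "/"));
        }
    }
    return result;
}
```

For objects whose leaves are not Blobs, a different predicate can be passed as the second argument, for example one that checks for an extract method.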

More options

To get a file listing without actually decompressing the archive, use one of these methods:

    await archive.getFilesObject();
    // outputs
    {
        ".gitignore": {CompressedFile},
        "addon": {
            "addon.py": {CompressedFile},
            "addon.xml": {CompressedFile}
        },
        "README.md": {CompressedFile}
    }

    await archive.getFilesArray();
    // outputs
    [
        {file: {CompressedFile}, path: ""},
        {file: {CompressedFile}, path: "addon/"},
        {file: {CompressedFile}, path: "addon/"},
        {file: {CompressedFile}, path: ""}
    ]

If these methods are called after archive.extractFiles(), they will contain the actual files as well.

Decompression might take a while for larger files. To track each file as it gets extracted, archive.extractFiles accepts a callback:

    archive.extractFiles((entry) => { // { file: {File}, path: {String} }
        console.log(entry);
    });

Extract single file from archive

To extract a single file from the archive you can use the extract() method on the returned CompressedFile.

    const filesObj = await archive.getFilesObject();
    const file = await filesObj['.gitignore'].extract();

Check for encrypted data

    const archive = await Archive.open(file);
    await archive.hasEncryptedData();
    // true - yes
    // false - no
    // null - can not be determined

Extract encrypted archive

    const archive = await Archive.open(file);
    await archive.usePassword("password");
    let obj = await archive.extractFiles();

Create new archive

Note: pathname is optional in the browser but required in NodeJS.

    const archiveFile = await Archive.write({
        files: [
            { file: file, pathname: 'folder/file.zip' }
        ],
        outputFileName: "test.tar.gz",
        compression: ArchiveCompression.GZIP,
        format: ArchiveFormat.USTAR,
        passphrase: null,
    });

Use it in NodeJS

    import fs from "fs"; // needed for readFileSync below
    import { Archive, ArchiveCompression, ArchiveFormat } from "libarchivejs/dist/libarchive-node.mjs";

    let buffer = fs.readFileSync("test/files/archives/README.md");
    let blob = new Blob([buffer]);

    const archiveFile = await Archive.write({
      files: [{ 
        file: blob,
        pathname: "README.md",
      }],
      outputFileName: "test.tar.gz",
      compression: ArchiveCompression.GZIP,
      format: ArchiveFormat.USTAR,
      passphrase: null,
    });

How it works

Libarchivejs is a port of the popular libarchive C library to WASM. Since WASM runs in the current thread, the library uses web workers for the heavy lifting. The ES module (the Archive class) is just a client for the web worker. It's tiny and doesn't take up much space.

The web worker is spawned and the WASM module is downloaded only when you actually open an archive file. Each Archive.open call gets its own web worker.

After extractFiles is called, the worker is terminated to free up memory. The client will still work with cached data.

libarchivejs's People

Contributors

btzr-io, dbolton, dependabot[bot], nika-begiashvili, sharkymcdongles, tranquilmarmot

libarchivejs's Issues

Error: Buffer exhausted

I'm working with libarchive.js in the browser, and the worker seems to load correctly. When I try to create a ZIP using this code:

[Screenshot 2024-01-22 at 13 22 25]

I get Error: Buffer exhausted

[Screenshot 2024-01-22 at 13 23 54]

And in the worker reference, I get this code:

[Screenshot 2024-01-22 at 13 23 59]

I should add that I'm unsure which compression and format options work together, and how to get a ZIP file.

Way to extract just one file from an archive?

You can call getFilesObject and getFilesArray to get a list of all of the files in an archive, but it seems like the only option to extract any files is to call extractFiles which extracts all of the files.

Would it be possible to add a way to only extract one of the files in the array?

Terminate worker manually

After calling an extractFiles worker, it will be terminated to free up memory. The client will still work with cached data.

I need to use extractFile instead. How can I free up memory without calling extractFiles?

Something like archive.close() would be nice to have.

'worker-bundle' file always throws an error like 'Uncaught SyntaxError: Unexpected token '<''

My Vue project needs to unzip some zip files, so I imported your 'libarchive.js' to do this,
but the browser always throws an error like Uncaught SyntaxError: Unexpected token '<'.
Here is my code:
Archive.init('../../node_modules/libarchive.js/dist/worker-bundle.js');
const archive = await Archive.open(zipFile);
let obj = await archive.extractFiles();
console.log(obj);
zipFile is my zip file. Whether the init path is correct or empty, it always throws this error.
My English is not good; awaiting your reply.

Streaming API

I am considering using this library with big files (read: archives >4GB). Is it possible to stream the output of a file extraction without storing it in memory? Otherwise I'll probably end up with multiple GB of RAM usage only to hold the data that the library extracted.

Can't extract gz

Version: 1.3.0
Symptom: extracting a gz file fails with Memory read error -30

I found a related issue and MR in libarchive, and it is fixed in v3.4.0.

Do you have any plans to upgrade?

Thanks

Decompress single compressed .gz file

Can someone give me an example of how to decompress a file using libarchive?
It is just a .gz file but it's not a tar archive. I'm assuming libarchive can do that as well.

I get an "Unrecognized archive format" error when I try to call getFilesArray after opening.

Cannot extract encrypted 7Z archive with encrypted filenames

Hi,
I am trying to extract a 7Z archive with encrypted filenames. Upon calling getFilesObject or extractFiles all I get is an empty object back. Is this feature not (yet?) supported?

I am attaching such an archive. The password is 'abc' (without quotes). I had to append '.zip' to the filename to enable attaching to this issue. Upon downloading, rename it to make sure .7z is its extension.

Thanks!

PS: If I don't choose to encrypt the filenames (just the contents) then everything works as expected.

TwoFileArchive-EncFNs-pass-is-abc.7z.zip

Promise.withResolvers

When trying to create an archive in the browser (Chrome v120), I get:

ERROR TypeError: Promise.withResolvers is not a function

This is probably because it is not yet supported in most browsers. I suggest an alternative implementation like:

let resolve, reject;

const promise = new Promise((res, rej) => {
  resolve = res;
  reject = rej;
});

I know it doesn't look as nice in the code, but it will have much better browser support.
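
The same idea can be wrapped in a small feature-detecting helper, so the native method is used where it exists and the manual pattern is used everywhere else. This is only a sketch; withResolvers is a hypothetical name and not something the library exposes.

```javascript
// Hypothetical helper: uses the native Promise.withResolvers when available
// and falls back to the manual pattern otherwise.
function withResolvers() {
    if (typeof Promise.withResolvers === "function") {
        return Promise.withResolvers();
    }
    let resolve, reject;
    const promise = new Promise((res, rej) => {
        resolve = res;
        reject = rej;
    });
    return { promise, resolve, reject };
}
```

Either branch returns an object with the same three properties, so calling code does not need to know which one was taken.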

How to extract to local directory?

Sorry for asking a newbie question again, but it keeps bugging me.
I can list the files inside an archive and log them to the console, but I can't see the files in my local directory. Am I missing something?

This is my code:

function finish() {
    const d = document.createElement('div');
    d.setAttribute('id', 'done');
    d.textContent = 'Done.';
    document.body.appendChild(d);
}

document.getElementById('file').addEventListener('change', async e => {
    let obj = null;

    try {
        const file = e.currentTarget.files[0];
        const archive = await Archive.open(file);
        //obj = await archive.extractFiles();
        await archive.extractFiles();
        //console.log(obj);
    } catch (err) {
        console.error(err);
    } finally {
        //window.obj = obj;
        finish();
    }
});

Help please, thanks.
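
For context: a web page cannot silently write into a local directory, so the File objects produced by extractFiles() stay in memory until they are handed to the user. One common option is to offer each file as a download. The sketch below assumes a browser environment; downloadFile is a hypothetical helper and not part of libarchive.js.

```javascript
// Hypothetical helper: offers a File or Blob to the user as a download.
// `doc` defaults to the page's document; it is a parameter only so the
// function can be exercised outside a browser.
function downloadFile(file, name = file.name, doc = document) {
    const url = URL.createObjectURL(file);
    const link = doc.createElement("a");
    link.href = url;
    link.download = name; // suggested file name for the save dialog
    doc.body.appendChild(link);
    link.click();
    link.remove();
    // Release the object URL once the click has been handled.
    setTimeout(() => URL.revokeObjectURL(url), 1000);
}
```

With the nested object from extractFiles(), this would be called once per File, for example downloadFile(obj["README.md"]).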

Multiple compressedFile.extract() return wrong content

  • Problem
    Concurrent compressedFile.extract() calls break the correspondence between files and their contents.
    When a message comes from the web worker, the current implementation always calls the last element of _callbacks. Multiple extractSingleFile calls cause _callbacks to stack up, so contents are returned in the wrong order.

  • How to reproduce

Modify test/files/test-single.html :

@@ -65,7 +65,7 @@
                     const file = e.currentTarget.files[0];
                     const archive = await Archive.open(file);
                     const files =  await archive.getFilesArray();
-                    fileObj = await files[0].file.extract();
+                    fileObj = (await Promise.all(files.map(f =>f.file.extract()) ))[0];
                 }catch(err){
                     console.error(err);
                 }finally{

and npm run test

  • How to fix
    Attach a message id to each message sent to the WebWorker, and route each response by its message id.
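
The proposed fix can be sketched as follows. This is an illustration only: RequestRouter and the { id, payload } / { id, result, error } message shapes are assumptions, not the library's actual protocol.

```javascript
// Sketch only: the class name and message shapes are hypothetical.
class RequestRouter {
    constructor(postMessage) {
        this._postMessage = postMessage; // function that sends a message to the worker
        this._pending = new Map();       // id -> { resolve, reject }
        this._nextId = 0;
    }

    // Sends a payload and returns a promise for the matching response.
    request(payload) {
        const id = this._nextId++;
        return new Promise((resolve, reject) => {
            this._pending.set(id, { resolve, reject });
            this._postMessage({ id, payload });
        });
    }

    // Intended to be called from the worker's "message" handler.
    handleMessage({ id, result, error }) {
        const entry = this._pending.get(id);
        if (!entry) return; // unknown or already settled id
        this._pending.delete(id);
        if (error !== undefined) entry.reject(new Error(error));
        else entry.resolve(result);
    }
}
```

With a real worker, the router would be created with (message) => worker.postMessage(message) and fed from worker.onmessage via handleMessage(event.data). Because each response is looked up by id, the order in which the worker answers no longer matters.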

ISO file creation returns an archive size of 0, while gzip creation works fine

When attempting to create an ISO file using libarchive.js, the returned archive size is 0 bytes.
However, when creating a gzip file with the same process and files, the output is correct and functional.
Below is the code snippet used for creating the ISO file:

import {
  Archive,
  ArchiveFormat,
  ArchiveCompression,
} from 'libarchive.js/dist/libarchive.js';

// `files` is the FileList from an <input type="file"> element.
const allFiles = [];
for (let i = 0; i < files.length; i++) {
  const file = files[i];
  const relativePath =
    file.webkitRelativePath || file.relativePath || file.name;
  allFiles.push({ file, pathname: relativePath });
}

const archiveFile = await Archive.write({
  files: allFiles,
  outputFileName: 'mount.iso',
  compression: ArchiveCompression.NONE,
  format: ArchiveFormat.ISO9660,
  passphrase: null,
});

Steps to Reproduce:

Prepare a list of files to be included in the archive (allFiles).
Use the above code snippet to attempt to create an ISO file.
Check the size of the generated mount.iso file.

Expected Behavior:

The ISO file should be created with the appropriate size, containing all the specified files.

Actual Behavior:

The generated ISO file (mount.iso) has a size of 0 bytes.

Additional Information:

  • The same process works correctly when creating a gzip file.
  • The issue seems to be specific to the ISO creation format.
  • No errors or warnings are thrown during the process.

Environment:

  • Library Version: [email protected]
  • Browser: Chrome 125.0.6422.113
  • Operating System: Windows 10

Issue with zip created by mac (with __MACOSX folder)

When I extract the following file (test.zip) using native tools on Ubuntu, all files are openable and usable.

When I extract the same file using libarchivejs, all of the files under the __MACOSX folder work properly, however all other files are corrupted (unable to open them in their respective formats).

Do you have any guidance on how to fix this?

Support zstd?

ZSTD support exists for tar, could you support it?

Better Bundling support

Passing the URL to worker may not work due to bundlers mangling file names:

this._worker = new Worker(options.workerUrl);

Maybe that could be solved by allowing passing the worker directly:

this._worker = options.worker
    ? options.worker
    : new Worker(options.workerUrl);

Can it be used with NodeJS?

Hi.

Are there any plans to make it available on NodeJS?

Since it is WASM, I think porting it to NodeJS should be easy,
but I'm not sure.

I hope it can be used with NodeJS.

Unsupported block header size (was 5, max is 2)

Hello,

I come from Vietnam. Thank you for developing and sharing such a wonderful library. However, I encountered an issue with a password-protected RAR file: "Unsupported block header size (was 5, max is 2)".

[Screenshot 2024-03-22 090053]

I used await archive.hasEncryptedData(); to check, but it returned null instead of true.

I am using version v1.3.0 in an Angular application. I would greatly appreciate your assistance!

Best regards,
Anh Duc Le

Archive.open() is freezing and gives no error or results

import { Archive } from 'libarchive.js/main.js';

Archive.init({
    workerUrl: 'libarchive.js/dist/worker-bundle.js'
});


export const handleFile = async (file) => {
  console.log('file: ', file)
  console.log('_7zOpen !! ')

  const archive = await Archive.open(f);
  console.log('archive: ', archive)

  const filesObject = await archive.getFilesObject();
  console.log("filesObject: ", filesObject)

  const filesArray = await archive.getFilesArray();
  console.log("filesArray: ", filesArray)

  return filesArray
}

This runs, but the console output only prints file and _7zOpen !! and then stops without any further response. No error is thrown, and the line console.log('archive: ', archive) never gets executed.

console log output:

file:  File {path: 'CE027001-120011101924-T100.7z', 
name: 'CE027001-120011101924-T100.7z', 
lastModified: 1599377977254, 
lastModifiedDate: Sun Sep 06 2020 16:39:37 GMT+0900, webkitRelativePath: '', …}
lastModified: 1599377977254
lastModifiedDate: Sun Sep 06 2020 16:39:37 GMT+0900 {}
name: "CE027001-120011101924-T100.7z"
path: "CE027001-120011101924-T100.7z"
size: 75602083
type: "application/x-7z-compressed"
webkitRelativePath: ""
[[Prototype]]: File

_7zOpen !! 

I suspect that the Archive object is never really running despite the fact that it is installed and imported with no problem.

What is going on?
How can I move forward to debug this issue?

wasm streaming compile failed: TypeError: WebAssembly: Response has unsupported MIME type

Hello.
Trying out this lib for the first time and I'm getting errors. At first I tried linking through some CDN but the workerUrl did not like loading remote content. I then downloaded "Latest Release" and unpacked everything into a folder (edwardleuf.org/js/libarchivejs...) and tried again, but I get this compile error. Searching around tells me I need to add the wasm mime type to a server config file, but I don't have that access. I also don't have npm access so that is why I did not install it in that way.

Current error message:

"wasm streaming compile failed: TypeError: WebAssembly: Response has unsupported MIME type '' expected 'application/wasm'"        [worker-bundle.js:1:69897]
"falling back to ArrayBuffer instantiation"                                                                                       [worker-bundle.js:1:69897]
message: "FileReader.readAsArrayBuffer: Argument 1 is not an object."
stack: "open@https://edwardleuf.org/js/libarchivejs/dist/worker-bundle.js:1:493
self.onmessage@https://edwardleuf.org/js/libarchivejs/dist/worker-bundle.js:1:71325
EventHandlerNonNull*@https://edwardleuf.org/js/libarchivejs/dist/worker-bundle.js:1:71172
@https://edwardleuf.org/js/libarchivejs/dist/worker-bundle.js:1:72016"

Quick implementation for testing purposes:

<html>
<body>
<script type="importmap">
{
	"imports":
	{
		"ARC": "/js/libarchivejs/main.js"
	}
}
</script>
<script type="module">

import { Archive } from "ARC";
Archive.init({workerUrl: "/js/libarchivejs/dist/worker-bundle.js"});

const arc = await Archive.open("ponedward.7z");

</script>
</body>
</html>

Error accessing libarchive.js

I'm trying to use the libarchive.js package with Node, but I get the following error:
Message:
[image: alt text]

I also tried require instead of import, but still no luck!
I'm using Node v10.16.0.
This is my code:

import {Archive} from 'libarchive.js/main.js';

Archive.init({
    workerUrl: 'libarchivejs/dist/worker-bundle.js'
});

const zipFile = 'zip/files.zip';
 
const files = async (e) => {
    const archive = await Archive.open(zipFile);
    let obj = await archive.getFilesArray();
    
    console.log(obj);
};

Am I doing it wrong or something?

Create ZIP file

Thanks for the great library.

Is there any way I can create a ZIP file with signature 50 4b 03 04? The configuration below results in 42 5A 68 (bzip2) for example:

const outputFile = await Archive.write({
  files: outFiles,
  outputFileName: "test.epub",
  compression: ArchiveCompression.BZIP2,
  format: ArchiveFormat.ZIP
});
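
The signatures mentioned above can be checked programmatically by reading the first bytes of the output. The sketch below is a hypothetical helper, not part of libarchive.js, and it only knows the two signatures quoted in this issue plus the gzip magic bytes.

```javascript
// Hypothetical helper: identifies a few formats by their leading magic bytes.
const SIGNATURES = [
    { name: "zip", bytes: [0x50, 0x4b, 0x03, 0x04] },
    { name: "bzip2", bytes: [0x42, 0x5a, 0x68] },
    { name: "gzip", bytes: [0x1f, 0x8b] },
];

function detectFormat(data) {
    // Accepts a Uint8Array or an ArrayBuffer.
    const view = data instanceof Uint8Array ? data : new Uint8Array(data);
    for (const { name, bytes } of SIGNATURES) {
        if (bytes.every((byte, index) => view[index] === byte)) return name;
    }
    return "unknown";
}
```

Assuming Archive.write resolves to a File or Blob, the check would be detectFormat(await outputFile.arrayBuffer()). If it reports "bzip2", the ZIP data has presumably been wrapped in an outer bzip2 stream; ArchiveCompression.NONE, which appears elsewhere on this page, may be worth trying, though that is untested here.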

LZMA Corrupted Input Data

I'm struggling to decompress LZMA data using this library. The data comes as part of a proprietary file.
Here are three examples of the data that needs to be decompressed.
I've included the first 16 bytes so you can see the header and the start of the raw data:

1. 5d 00 00 00 04 00 00 68 80 f9 08 72 b3
2. 5d 00 00 00 04 00 38 8f 41 4c 35 9a 6a
3. 5d 00 00 00 04 00 00 68 9a a5 37 83 51

I know that I can successfully decompress this using the lzma utility on Ubuntu, but only when I add the decompressed size (the real size or -1) at offset 5. The same approach works with another JS package, which is what I have been using until now. However, its performance is very bad and it is not well maintained, which is why I'm trying to migrate.

But when passing this to libarchive.js it will not even recognize that it is compressed using lzma.

So how am I supposed to pass the data in this scenario?
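
For reference, the header fix described above (inserting the size at offset 5) can be sketched like this. The legacy .lzma header is 13 bytes: one properties byte, a 4-byte dictionary size, and an 8-byte little-endian uncompressed size, where all bits set means "unknown". addLzmaSizeField is a hypothetical helper, and whether libarchive.js recognizes the resulting stream has not been verified here.

```javascript
// Hypothetical helper: turns "5-byte header + raw LZMA data" into a stream
// with a full 13-byte header by inserting the 8-byte size field at offset 5.
// `data` must be a Uint8Array.
function addLzmaSizeField(data, uncompressedSize = -1) {
    const out = new Uint8Array(data.length + 8);
    out.set(data.subarray(0, 5), 0); // properties byte + dictionary size
    const view = new DataView(out.buffer);
    // -1 wraps to 0xFFFFFFFFFFFFFFFF, which marks the size as unknown.
    view.setBigUint64(5, BigInt.asUintN(64, BigInt(uncompressedSize)), true);
    out.set(data.subarray(5), 13);   // compressed payload
    return out;
}
```

The returned Uint8Array could then be wrapped in a Blob or File before being passed on.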
