
Comments (10)

gildas-lormeau commented on June 2, 2024

I guess this might be related to your filesystem performance. On my end, I cannot reproduce the issue. See the test at https://jsfiddle.net/uLwjpons/, reproduced below.

<!doctype html>

<html>

<head>
  <title>Test getData in zip.js</title>
  <style>
    body {
      font-family: monospace
    }
  </style>
</head>

<body>
  <script type="module">

    import {
      BlobReader,
      BlobWriter,
      ZipReader,
      ZipWriter,
    } from "https://deno.land/x/zipjs/index.js";

    main().catch(console.error);

    async function main() {
      await log("INIT");
      const zipData = await createFile();
      await log("RUN");
      await runTest(zipData);
      await log("END");
    }

    async function createFile() {
      await log("STEP 1/2 (creating data)");
      const DATA_64_MB = new Array(64 * 1024 * 1024).fill(Math.floor(Math.random() * 128) * 2);
      const ENTRY_DATA = new Blob(new Array(8).fill(DATA_64_MB));
      await log("STEP 2/2 (zipping data)");
      const zipFileWriter = new BlobWriter();
      const entryDataReader = new BlobReader(ENTRY_DATA);
      const zipWriter = new ZipWriter(zipFileWriter);
      await zipWriter.add("test.bin", entryDataReader);
      return zipWriter.close();
    }

    async function runTest(zipData) {
      const zipFileReader = new BlobReader(zipData);
      const zipReader = new ZipReader(zipFileReader);
      const firstEntry = (await zipReader.getEntries()).shift();
      const iterations = new Array(9).fill().map((_, index) => index + 1);
      for (const iteration of iterations) {
        const startTime = performance.now();
        await firstEntry.getData(new BlobWriter());
        await log(`TEST ${iteration}/9 => ${performance.now() - startTime} ms`);
      }
    }

    async function log(value) {
      document.body.innerHTML += `${value}<br>`;
      await pause();
    }

    function pause() {
      return new Promise(resolve => setTimeout(resolve, 500));
    }

  </script>
</body>

</html>

Here are the logs when I run this test in Chrome. Performance is constant.

INIT
STEP 1/2 (creating data)
STEP 2/2 (zipping data)
RUN
TEST 1/9 => 3353 ms
TEST 2/9 => 3126.2000000476837 ms
TEST 3/9 => 3136.100000023842 ms
TEST 4/9 => 3173.699999988079 ms
TEST 5/9 => 3133.800000011921 ms
TEST 6/9 => 3098.800000011921 ms
TEST 7/9 => 3147.600000023842 ms
TEST 8/9 => 3120.699999988079 ms
TEST 9/9 => 3082.199999988079 ms
END
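
To put a number on "constant", the timing loop in `runTest` can be factored into a small helper. This is only a sketch: `timeRuns` and `summarize` are hypothetical names, not zip.js functions, and the injectable `now` clock is an assumption added so the helper can be exercised without a real workload.

```javascript
// Hypothetical helper (not part of zip.js): times repeated runs of an async task.
// `now` defaults to performance.now() and can be swapped for a fake clock.
async function timeRuns(task, runs = 9, now = () => performance.now()) {
  const durations = [];
  for (let run = 0; run < runs; run++) {
    const startTime = now();
    await task(run);
    durations.push(now() - startTime);
  }
  return durations;
}

// Summarizes the timings. A slowdown close to 1 means the last run
// cost about as much as the first one.
function summarize(durations) {
  const first = durations[0];
  const last = durations[durations.length - 1];
  return {
    min: Math.min(...durations),
    max: Math.max(...durations),
    slowdown: last / first,
  };
}
```

In the test page this could be called as `summarize(await timeRuns(() => firstEntry.getData(new BlobWriter())))`. For the nine timings logged above, `slowdown` is roughly 3082.2 / 3353, or about 0.92, so there is no sign of degradation between runs.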

from zip.js.

phuong5 commented on June 2, 2024

@gildas-lormeau
I uploaded a 2 GB file. The first time entry.getData ran, it took 44 seconds, but from the second run onwards it consistently took about 10 minutes. This behavior persisted across 10 attempts. I also noticed that the onprogress callback was running slowly from the second run on. My code snippet is below; please check it and help me.

export const isUsingPasswordOrInvalidFileZip = async (
    file: File,
    password?: string
): Promise<...> => {
    let reader: undefined | ZipReader<Blob>;
    try {
        reader = new ZipReader(new BlobReader(file), { password });
        const entries = await reader.getEntries();
        const pathAndFiles = new Map();

        for (const entry of entries) {
            if (!entry.directory) {
                const encoding = detect(entry.rawFilename);
                const textDecoder = new TextDecoder(encoding as string);
                const utf8Path = textDecoder.decode(entry.rawFilename);
                pathAndFiles.set(utf8Path, entry);
            }

            const startTime = performance.now();
            if (entry.getData) {
                await entry.getData(new BlobWriter(), {
                    // for debug
                    onprogress: async (progress, total) => {
                        console.log(progress);
                        console.log(total);
                    },
                });
            }
            console.log(`get data: => ${performance.now() - startTime} ms`);
        }

        if (pathAndFiles.size === 0) {
            return {...}
        }
        return {...}
    } catch (err: any) {
        if (err.message === ERR_ENCRYPTED || err.message === ERR_INVALID_PASSWORD) {
            console.log('password error!');
            return {...}
        }

        if (err.message === ERR_EOCDR_NOT_FOUND) {
            console.log('mime type error!');
            return {...}
        }

        console.log('error when reader:', err.message);
        return ...
    } finally {
        if (reader) {
            await reader.close();
        }
    }
};


export const isUsingPasswordOrInvalidFileZip$ = (
    file: File,
    password?: string
): Observable<{ ... }> => {
    return defer(() => isUsingPasswordOrInvalidFileZip(file, password));
};


gildas-lormeau commented on June 2, 2024

If you pass a Blob instead of a File as parameter to the two exported functions, do you see the same issue? Did you try to run your code on multiple machines?


phuong5 commented on June 2, 2024

@gildas-lormeau
The file itself is a blob; I don't think there's a need to convert it. Furthermore, I also need to retrieve the entry and save it back, so I believe there's no need to perform any conversion.


phuong5 commented on June 2, 2024

[image attachment not preserved]


gildas-lormeau commented on June 2, 2024

I know that. I suspect the problem is coming from your filesystem, when reading the compressed data in the ZIP file. That's why I asked you to do a test and pass a Blob. That's also why I asked you if you have tested your code on multiple machines.


phuong5 commented on June 2, 2024

@gildas-lormeau
I have modified the code as follows:

const fileBlob = new Blob([file], { type: file.type });
reader = new ZipReader(new BlobReader(fileBlob));

I just tried again and noticed that the time has dropped a bit, but the problem remains: each attempt takes longer than the previous one, and I don't know why:

First attempt: 35s
Second attempt: 2m 03s
Third attempt: 5m 24s
Fourth attempt: 10m 47s

Additionally, when running, occasionally the following error appears:

[screenshot of the error; image not preserved]


gildas-lormeau commented on June 2, 2024

Thank you, maybe you are leaking memory and using the swap too much? Have you looked at what's happening at this level?
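
If memory pressure is indeed the cause, one thing worth trying is to avoid retaining the decompressed bytes when the output is not needed (in the snippet earlier in the thread, the result of each `getData(new BlobWriter())` call is discarded). The sketch below builds a sink that only counts bytes. `createCountingSink` is a hypothetical helper, and passing its `writable` to `entry.getData` assumes that `getData` accepts a standard `WritableStream` in place of a `BlobWriter`; check this against the zip.js version in use.

```javascript
// Hypothetical sink (not part of zip.js): counts the bytes written to it
// and keeps no reference to the chunks, so nothing accumulates in memory.
function createCountingSink() {
  const stats = { bytes: 0 };
  const writable = new WritableStream({
    write(chunk) {
      // Chunks are assumed to be Uint8Array (or another typed array).
      stats.bytes += chunk.byteLength;
    },
  });
  return { writable, stats };
}
```

Under that assumption, usage would look like `const { writable, stats } = createCountingSink(); await entry.getData(writable);`, after which `stats.bytes` holds the uncompressed size that was read. If the timings stop growing with this sink, the retained Blobs are a likely suspect.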


gildas-lormeau commented on June 2, 2024

Can you reproduce the issue with this test https://run.plnkr.co/preview/clq8gh5r400033b6ti2ixuutk/ in Chrome? If the link is broken, go to https://plnkr.co/edit/C8QoHl0kBD3dQMxV?preview and open the test page in a new tab by clicking the corresponding button in the upper right of the preview page. It's using the filesystem API in order to create the ZIP file on the disk. On my end, I'm still getting constant results.

INIT
STEP 1/1 (creating and zipping data)
RUN
TEST 1/9 => 3232.699999988079 ms
TEST 2/9 => 3199.099999964237 ms
TEST 3/9 => 3300 ms
TEST 4/9 => 3421.100000023842 ms
TEST 5/9 => 3297.199999988079 ms
TEST 6/9 => 3262.2999999523163 ms
TEST 7/9 => 3423.7999999523163 ms
TEST 8/9 => 3212 ms
TEST 9/9 => 3200.199999988079 ms
END

<!doctype html>
<html>

<head>
  <title>Perf test of Entry#getData in zip.js</title>
  <style>
    body {
      font-family: monospace;
    }
  </style>
</head>

<body>
  <button id=runTestButton>Run</button>
  <script type=module>

import {
  BlobReader,
  BlobWriter,
  ZipReader,
  ZipWriter,
} from "https://deno.land/x/zipjs/index.js";

const ZIP_EXTENSIONS_ACCEPT = {
  "application/zip": [".zip"],
};

const ONE_MB = 1024 * 1024;

runTestButton.addEventListener("click", async () => {
  let fileHandle;
  try {
    const suggestedName = [...new Array(16)].map(() => Math.floor(Math.random() * 16).toString(16)).join("") + ".zip";
    fileHandle = await showSaveFilePicker({
      suggestedName,
      mode: "readwrite",
      startIn: "downloads"
    });
    runTestButton.remove();
    await log("INIT");
    const writable = await fileHandle.createWritable();
    const zipWriter = new ZipWriter(writable);
    const addFilePromises = [];
    const writers = [];
    for (let i = 0; i < 4; i++) {
      const transformStream = new TransformStream();
      addFilePromises.push(zipWriter.add(`test${i}.bin`, transformStream.readable));
      writers.push(transformStream.writable.getWriter());
    }
    await log("STEP 1/1 (creating and zipping data)");
    await Promise.all([
      ...writers.map(writer => fillData(writer)),
      ...addFilePromises
    ]);
    await zipWriter.close();
    const file = await fileHandle.getFile();
    const zipReader = new ZipReader(new BlobReader(file));
    const firstEntry = (await zipReader.getEntries()).shift();
    const iterations = new Array(9).fill().map((_, index) => index + 1);
    await log("RUN");
    for (const iteration of iterations) {
      const startTime = performance.now();
      await firstEntry.getData(new BlobWriter());
      await log(`TEST ${iteration}/9 => ${performance.now() - startTime} ms`);
    }
    await log("END");
  } finally {
    if (fileHandle) {
      await fileHandle.remove();
    }
  }
});

async function fillData(writer, currentSize = 0, maxSize = Math.floor((Math.random() * 256) + 512) * ONE_MB) {
  const chunkSize = ONE_MB;
  const chunk = new Uint8Array(chunkSize);
  for (let i = 0; i < chunkSize; i++) {
    chunk[i] = Math.floor(Math.random() * 128) * 2;
  }
  await writer.write(chunk);
  currentSize += chunkSize;
  if (currentSize < maxSize) {
    await fillData(writer, currentSize, maxSize);
  } else {
    await writer.close();
  }
}

async function log(value) {
  document.body.innerHTML += `${value}<br>`;
  await pause();
}

function pause() {
  return new Promise(resolve => setTimeout(resolve, 500));
}

  </script>
</body>

</html>


phuong5 commented on June 2, 2024

Thank you, Gildas Lormeau. I used a different approach by using the 'encrypt' variable, and it resolved the issue.
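
The final code was not posted. One plausible reading is that the 'encrypt' variable refers to the `encrypted` flag that zip.js exposes on each entry, which comes from the ZIP headers and can be read without calling `getData` at all. A minimal sketch under that assumption (`hasEncryptedEntry` is a hypothetical name):

```javascript
// Returns true if any file entry is flagged as encrypted.
// `entries` is assumed to be the array resolved by ZipReader#getEntries();
// only the `directory` and `encrypted` fields of each entry are read,
// so no entry data is decompressed.
function hasEncryptedEntry(entries) {
  return entries.some(entry => !entry.directory && Boolean(entry.encrypted));
}
```

This only detects that a password is required. Verifying that a given password is correct would still need some decryption work, which this sketch does not attempt.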

