zip4jvm's Introduction

zip4jvm - a Java library for working with zip files

Features

  • Add regular files or directories to new or existing zip archive;
  • Extract regular files or directories from zip archive;
  • Encryption algorithms support: PKWare, AES (128, 192 and 256 bit keys);
  • Compression support: store, deflate, enhanced deflate, bzip2, LZMA;
  • Individual settings for each zip entry (e.g. some files can be encrypted and others not);
  • Streaming support for adding and extracting;
  • Read/Write password protected Zip files and streams;
  • ZIP64 format support;
  • Multi-volume zip archive support:
    • PKWare, i.e. filename.zip, filename.z01, filename.z02
    • 7-Zip, i.e. filename.zip.001, filename.zip.002, filename.zip.003 (read-only)
  • Unicode for comments and file names.
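
The two multi-volume naming schemes listed above can be sketched with plain JDK code. This is only an illustration of the naming pattern, not zip4jvm's implementation; the class and method names are hypothetical, and the sketch assumes the archive name has an extension and that in the PKWare scheme the last part keeps the .zip name.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of zip4jvm: builds the part names of a split archive.
final class SplitVolumeNames {

    /** PKWare style: parts are .z01, .z02, ...; the last part keeps the original .zip name. */
    static List<String> pkware(String zipName, int totalParts) {
        String base = zipName.substring(0, zipName.lastIndexOf('.'));
        List<String> names = new ArrayList<>();
        for (int i = 1; i < totalParts; i++)
            names.add(String.format("%s.z%02d", base, i));
        names.add(zipName);
        return names;
    }

    /** 7-Zip style: a three-digit counter is appended to the full file name. */
    static List<String> sevenZip(String zipName, int totalParts) {
        List<String> names = new ArrayList<>();
        for (int i = 1; i <= totalParts; i++)
            names.add(String.format("%s.%03d", zipName, i));
        return names;
    }

    public static void main(String[] args) {
        System.out.println(pkware("filename.zip", 3));   // [filename.z01, filename.z02, filename.zip]
        System.out.println(sevenZip("filename.zip", 3)); // [filename.zip.001, filename.zip.002, filename.zip.003]
    }
}
```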

Gradle

implementation 'ru.oleg-cherednik.zip4jvm:zip4jvm:1.9'

Maven

<dependency>
    <groupId>ru.oleg-cherednik.zip4jvm</groupId>
    <artifactId>zip4jvm</artifactId>
    <version>1.9</version>
</dependency>

Usage

To simplify usage, zip4jvm provides the following classes:

  • ZipIt - add files to archive;
  • UnzipIt - extract files from archive;
  • ZipMisc - other zip file activities;
  • ZipInfo - zip file information and diagnostics.

ZipIt

Regular files and directories can be represented as Path

Create (or open existing) zip archive and add regular file /cars/bentley-continental.jpg
Path zip = Paths.get("filename.zip");
Path file = Paths.get("/cars/bentley-continental.jpg");
ZipIt.zip(zip).add(file);
/-
|-- cars
|    |-- bentley-continental.jpg
|    |-- ferrari-458-italia.jpg
|    |-- wiesmann-gt-mf5.jpg
|-- bikes
|    |-- ducati-panigale-1199.jpg
|    |-- kawasaki-ninja-300.jpg
|    |-- honda-cbr600rr.jpg
|-- saint-petersburg.jpg
filename.zip
|-- bentley-continental.jpg

Note: regular file is added to the root of the zip archive.

Create (or open existing) zip archive and add directory /cars
Path zip = Paths.get("filename.zip");
Path dir = Paths.get("/cars");
ZipIt.zip(zip).add(dir);
/-
|-- cars
|    |-- bentley-continental.jpg
|    |-- ferrari-458-italia.jpg
|    |-- wiesmann-gt-mf5.jpg
|-- bikes
|    |-- ducati-panigale-1199.jpg
|    |-- kawasaki-ninja-300.jpg
|    |-- honda-cbr600rr.jpg
|-- saint-petersburg.jpg
filename.zip
|-- cars
     |-- bentley-continental.jpg
     |-- ferrari-458-italia.jpg
     |-- wiesmann-gt-mf5.jpg

Note: directory is added to the root of the zip archive keeping the initial structure.

Create (or open existing) zip archive and add some regular files and/or directories
Path zip = Paths.get("filename.zip");
Collection<Path> paths = Arrays.asList(
        Paths.get("/bikes/ducati-panigale-1199.jpg"),
        Paths.get("/bikes/honda-cbr600rr.jpg"),
        Paths.get("/cars"),
        Paths.get("/saint-petersburg.jpg"));
ZipIt.zip(zip).add(paths);
/-
|-- cars
|    |-- bentley-continental.jpg
|    |-- ferrari-458-italia.jpg
|    |-- wiesmann-gt-mf5.jpg
|-- bikes
|    |-- ducati-panigale-1199.jpg
|    |-- kawasaki-ninja-300.jpg
|    |-- honda-cbr600rr.jpg
|-- saint-petersburg.jpg
filename.zip
|-- cars
|    |-- bentley-continental.jpg
|    |-- ferrari-458-italia.jpg
|    |-- wiesmann-gt-mf5.jpg
|-- ducati-panigale-1199.jpg
|-- honda-cbr600rr.jpg
|-- saint-petersburg.jpg

Note: each regular file from the list is added to the root of the zip archive.

Note: each directory from the list is added to the root of the zip archive keeping the initial structure.
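
The two notes above describe how source paths map to entry names. A minimal sketch of that mapping using only the JDK follows; it is an assumption about the naming rule as described here, not zip4jvm code. The class name is hypothetical, empty directories are ignored, and the path is assumed to have a parent directory.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical illustration: which entry names a file or a directory would produce.
final class EntryNameSketch {

    static List<String> entryNames(Path path) throws IOException {
        Path abs = path.toAbsolutePath();

        // A regular file goes to the root of the archive: only its file name is kept.
        if (Files.isRegularFile(abs))
            return Collections.singletonList(abs.getFileName().toString());

        // A directory keeps its structure: names are relative to the directory's parent.
        Path parent = abs.getParent();
        try (Stream<Path> walk = Files.walk(abs)) {
            return walk.filter(Files::isRegularFile)
                       .map(file -> parent.relativize(file).toString().replace(File.separatorChar, '/'))
                       .sorted()
                       .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        Path root = Files.createTempDirectory("zip4jvm-sketch");
        Path cars = Files.createDirectory(root.resolve("cars"));
        Files.createFile(cars.resolve("bentley-continental.jpg"));
        Files.createFile(cars.resolve("ferrari-458-italia.jpg"));

        System.out.println(entryNames(cars));
        System.out.println(entryNames(cars.resolve("bentley-continental.jpg")));
    }
}
```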

Regular files and empty directories are available as InputStream

Create (or open existing) zip archive and add input streams content as regular files
Path zip = Paths.get("filename.zip");

try (ZipFile.Writer zipFile = ZipIt.zip(zip).open()) {
    zipFile.add(ZipFile.Entry.builder()
                             .inputStreamSupplier(() -> new FileInputStream("/cars/bentley-continental.jpg"))
                             .fileName("my_cars/bentley-continental.jpg")
                             .uncompressedSize(Files.size(Paths.get("/cars/bentley-continental.jpg"))).build());

    zipFile.add(ZipFile.Entry.builder()
                             .inputStreamSupplier(() -> new FileInputStream("/bikes/kawasaki-ninja-300.jpg"))
                             .fileName("my_bikes/kawasaki.jpg")
                             .uncompressedSize(Files.size(Paths.get("/bikes/kawasaki-ninja-300.jpg"))).build());
}
/-
|-- cars
|    |-- bentley-continental.jpg
|    |-- ferrari-458-italia.jpg
|    |-- wiesmann-gt-mf5.jpg
|-- bikes
|    |-- ducati-panigale-1199.jpg
|    |-- kawasaki-ninja-300.jpg
|    |-- honda-cbr600rr.jpg
|-- saint-petersburg.jpg
filename.zip
|-- my_cars
|    |-- bentley-continental.jpg
|-- my_bikes
|    |-- kawasaki.jpg

Note: each entry is treated as separate input stream of the regular file.

UnzipIt

Regular files and directories to Path destination

Extract all entries into given directory
Path zip = Paths.get("filename.zip");
Path destDir = Paths.get("/filename_content");
UnzipIt.zip(zip).destDir(destDir).extract();
filename.zip
|-- cars
|    |-- bentley-continental.jpg
|    |-- ferrari-458-italia.jpg
|    |-- wiesmann-gt-mf5.jpg
|-- bikes
|    |-- ducati-panigale-1199.jpg
|    |-- kawasaki-ninja-300.jpg
|-- saint-petersburg.jpg
/filename_content
 |-- cars
 |    |-- bentley-continental.jpg
 |    |-- ferrari-458-italia.jpg
 |    |-- wiesmann-gt-mf5.jpg
 |-- bikes
 |    |-- ducati-panigale-1199.jpg
 |    |-- kawasaki-ninja-300.jpg
 |-- saint-petersburg.jpg

Note: all entries (i.e. regular files and empty directories) are extracted into the destination directory keeping the initial structure.

Extract regular file's entry into given directory
Path zip = Paths.get("filename.zip");
Path destDir = Paths.get("/filename_content");
UnzipIt.zip(zip).destDir(destDir).extract("/cars/bentley-continental.jpg");
filename.zip
|-- cars
|    |-- bentley-continental.jpg
|    |-- ferrari-458-italia.jpg
|    |-- wiesmann-gt-mf5.jpg
|-- bikes
|    |-- ducati-panigale-1199.jpg
|    |-- kawasaki-ninja-300.jpg
|-- saint-petersburg.jpg
/filename_content
 |-- bentley-continental.jpg

Note: regular file's entry is extracted into the root of the destination directory.

Extract directory entries into given directory
Path zip = Paths.get("filename.zip");
Path destDir = Paths.get("/filename_content");
UnzipIt.zip(zip).destDir(destDir).extract("cars");
filename.zip
|-- cars
|    |-- bentley-continental.jpg
|    |-- ferrari-458-italia.jpg
|    |-- wiesmann-gt-mf5.jpg
|-- bikes
|    |-- ducati-panigale-1199.jpg
|    |-- kawasaki-ninja-300.jpg
|-- saint-petersburg.jpg
/filename_content
 |-- cars
 |    |-- bentley-continental.jpg
 |    |-- ferrari-458-italia.jpg
 |    |-- wiesmann-gt-mf5.jpg

Note: all entries belonging to the given directory are extracted; their content is added to the destination directory keeping the initial structure.

Extract some entries into given directory
Path zip = Paths.get("filename.zip");
Path destDir = Paths.get("/filename_content");
Collection<String> fileNames = Arrays.asList("cars", "bikes/ducati-panigale-1199.jpg", "saint-petersburg.jpg");
UnzipIt.zip(zip).destDir(destDir).extract(fileNames);
filename.zip
|-- cars
|    |-- bentley-continental.jpg
|    |-- ferrari-458-italia.jpg
|    |-- wiesmann-gt-mf5.jpg
|-- bikes
|    |-- ducati-panigale-1199.jpg
|    |-- kawasaki-ninja-300.jpg
|-- saint-petersburg.jpg
/filename_content
 |-- cars
 |    |-- bentley-continental.jpg
 |    |-- ferrari-458-italia.jpg
 |    |-- wiesmann-gt-mf5.jpg
 |-- ducati-panigale-1199.jpg
 |-- saint-petersburg.jpg

Note: a directory is extracted keeping the initial structure; a regular file is extracted into the root of the destination directory.

Regular files as InputStream source

Get input stream for regular file's entry
Path zip = Paths.get("filename.zip");
Path destFile = Paths.get("/filename_content/bentley.jpg");
try (InputStream in = UnzipIt.zip(zip).stream("/cars/bentley-continental.jpg");
     OutputStream out = new FileOutputStream(destFile.toFile())) {
    IOUtils.copyLarge(in, out);
}
filename.zip
|-- cars
|    |-- bentley-continental.jpg
|    |-- ferrari-458-italia.jpg
|    |-- wiesmann-gt-mf5.jpg
|-- bikes
|    |-- ducati-panigale-1199.jpg
|    |-- kawasaki-ninja-300.jpg
|-- saint-petersburg.jpg
/filename_content
 |-- bentley.jpg

Note: the input stream for a regular file's entry should be closed correctly so that all data is flushed.

Use password to unzip

For all unzip operations a password can optionally be set. It can be either a single password or a password provider that uses the fileName of the entry as a key.

Unzip with single password for entries

Path zip = Paths.get("filename.zip");
char[] password = "1".toCharArray();
Path destDir = Paths.get("/filename_content");
List<String> fileNames = Arrays.asList("cars", "bikes/ducati-panigale-1199.jpg", "saint-petersburg.jpg");
UnzipIt.zip(zip).destDir(destDir).password(password).extract(fileNames);
filename.zip  --> password: 1
|-- cars
|    |-- bentley-continental.jpg
|    |-- ferrari-458-italia.jpg
|    |-- wiesmann-gt-mf5.jpg
|-- bikes
|    |-- ducati-panigale-1199.jpg
|    |-- kawasaki-ninja-300.jpg
|-- saint-petersburg.jpg
/filename_content
 |-- cars
 |    |-- bentley-continental.jpg
 |    |-- ferrari-458-italia.jpg
 |    |-- wiesmann-gt-mf5.jpg
 |-- ducati-panigale-1199.jpg
 |-- saint-petersburg.jpg

Or separate password for each entry. The key is the fileName of the entry:

Unzip with separate password for each entry

Path zip = Paths.get("filename.zip");
Path destDir = Paths.get("/filename_content");

Function<String, char[]> passwordProvider = fileName -> {
    if (fileName.startsWith("cars/"))
        return "1".toCharArray();
    if (fileName.startsWith("bikes/ducati-panigale-1199.jpg"))
        return "2".toCharArray();
    if (fileName.startsWith("saint-petersburg.jpg"))
        return "3".toCharArray();
    return null;
};

UnzipSettings settings = UnzipSettings.builder().password(passwordProvider).build();
List<String> fileNames = Arrays.asList("cars", "bikes/ducati-panigale-1199.jpg", "saint-petersburg.jpg");
UnzipIt.zip(zip).destDir(destDir).settings(settings).extract(fileNames);
filename.zip
 |-- cars
 |    |-- bentley-continental.jpg   --> password: 1
 |    |-- ferrari-458-italia.jpg    --> password: 1
 |    |-- wiesmann-gt-mf5.jpg       --> password: 1
 |-- bikes
 |    |-- ducati-panigale-1199.jpg  --> password: 2
 |    |-- kawasaki-ninja-300.jpg
 |-- saint-petersburg.jpg           --> password: 3
/filename_content
 |-- cars
 |    |-- bentley-continental.jpg
 |    |-- ferrari-458-italia.jpg
 |    |-- wiesmann-gt-mf5.jpg
 |-- ducati-panigale-1199.jpg
 |-- saint-petersburg.jpg
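
The if-chain in the password provider can also be kept as data. Below is a minimal sketch of a prefix-based provider that still fits the Function<String, char[]> shape used above; the class PrefixPasswordProvider and its with method are hypothetical and not part of zip4jvm.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical helper: resolves a password by the first matching entry name prefix.
final class PrefixPasswordProvider implements Function<String, char[]> {

    // LinkedHashMap keeps registration order, so earlier prefixes win.
    private final Map<String, char[]> passwordByPrefix = new LinkedHashMap<>();

    PrefixPasswordProvider with(String prefix, String password) {
        passwordByPrefix.put(prefix, password.toCharArray());
        return this;
    }

    @Override
    public char[] apply(String fileName) {
        for (Map.Entry<String, char[]> entry : passwordByPrefix.entrySet())
            if (fileName.startsWith(entry.getKey()))
                return entry.getValue().clone(); // hand out a copy, keep the original intact
        return null; // no password registered for this entry
    }

    public static void main(String[] args) {
        Function<String, char[]> provider = new PrefixPasswordProvider()
                .with("cars/", "1")
                .with("bikes/ducati-panigale-1199.jpg", "2")
                .with("saint-petersburg.jpg", "3");

        System.out.println(new String(provider.apply("cars/ferrari-458-italia.jpg"))); // 1
        System.out.println(provider.apply("bikes/kawasaki-ninja-300.jpg") == null);    // true
    }
}
```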

ZipMisc

Modify zip archive comment

Path zip = Paths.get("filename.zip");
ZipMisc zipFile = ZipMisc.zip(zip);

zipFile.getComment();           // get current comment (null if it's not set)
zipFile.setComment("comment");  // set comment to 'comment'
zipFile.setComment(null);       // remove comment
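
To cross-check an archive comment independently of zip4jvm, the JDK's own java.util.zip classes can be used. The sketch below writes a small archive with a comment and reads the comment back; the class and method names are hypothetical, and the archive is created in a temporary file only for the demonstration.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import java.util.zip.ZipOutputStream;

// Hypothetical cross-check using only the JDK, not the zip4jvm API.
final class ArchiveCommentCheck {

    /** Writes a one-entry archive with the given global comment into a temporary file. */
    static Path writeZip(String comment) throws IOException {
        Path zip = Files.createTempFile("comment-check", ".zip");
        try (OutputStream out = Files.newOutputStream(zip);
             ZipOutputStream zos = new ZipOutputStream(out)) {
            zos.setComment(comment);
            zos.putNextEntry(new ZipEntry("saint-petersburg.jpg"));
            zos.closeEntry();
        }
        return zip;
    }

    /** Reads the global archive comment. */
    static String readComment(Path zip) throws IOException {
        try (ZipFile zipFile = new ZipFile(zip.toFile())) {
            return zipFile.getComment();
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(readComment(writeZip("comment"))); // comment
    }
}
```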

Get all entries

Path zip = Paths.get("filename.zip");
ZipMisc zipFile = ZipMisc.zip(zip);
List<ZipFile.Entry> entries = zipFile.getEntries().collect(Collectors.toList());

/*
 * [entryNames]
 * cars/bentley-continental.jpg
 * cars/ferrari-458-italia.jpg
 * cars/wiesmann-gt-mf5.jpg
 * bikes/ducati-panigale-1199.jpg
 * bikes/kawasaki-ninja-300.jpg
 * saint-petersburg.jpg
 */
filename.zip
|-- cars
|    |-- bentley-continental.jpg
|    |-- ferrari-458-italia.jpg
|    |-- wiesmann-gt-mf5.jpg
|-- bikes
|    |-- ducati-panigale-1199.jpg
|    |-- kawasaki-ninja-300.jpg
|-- saint-petersburg.jpg

Note: zipFile.getEntries() returns a Stream of immutable ZipFile.Entry objects that represent all entries in the zip archive.

Remove entry by name

Path zip = Paths.get("filename.zip");
ZipMisc zipFile = ZipMisc.zip(zip);
zipFile.removeEntryByName("cars/bentley-continental.jpg");
filename.zip (before)
 |-- cars
 |    |-- bentley-continental.jpg
 |    |-- ferrari-458-italia.jpg
 |    |-- wiesmann-gt-mf5.jpg
 |-- bikes
 |    |-- ducati-panigale-1199.jpg
 |    |-- kawasaki-ninja-300.jpg
 |-- saint-petersburg.jpg
filename.zip (after)
 |-- cars
 |    |-- ferrari-458-italia.jpg
 |    |-- wiesmann-gt-mf5.jpg
 |-- bikes
 |    |-- ducati-panigale-1199.jpg
 |    |-- kawasaki-ninja-300.jpg
 |-- saint-petersburg.jpg

Note: exactly one entry is removed, provided an entry with exactly this name exists.

Remove some entries by name

Path zip = Paths.get("filename.zip");
ZipMisc zipFile = ZipMisc.zip(zip);
Collection<String> entryNames = Arrays.asList("cars/ferrari-458-italia.jpg", "bikes/ducati-panigale-1199.jpg");
zipFile.removeEntryByName(entryNames);
filename.zip (before)
 |-- cars
 |    |-- bentley-continental.jpg
 |    |-- ferrari-458-italia.jpg
 |    |-- wiesmann-gt-mf5.jpg
 |-- bikes
 |    |-- ducati-panigale-1199.jpg
 |    |-- kawasaki-ninja-300.jpg
 |-- saint-petersburg.jpg
filename.zip (after)
 |-- cars
 |    |-- bentley-continental.jpg
 |    |-- wiesmann-gt-mf5.jpg
 |-- bikes
 |    |-- kawasaki-ninja-300.jpg
 |-- saint-petersburg.jpg

Remove entry by name prefix

Path zip = Paths.get("filename.zip");
ZipMisc zipFile = ZipMisc.zip(zip);
zipFile.removeEntryByNamePrefix("cars");
filename.zip (before)
 |-- cars
 |    |-- bentley-continental.jpg
 |    |-- ferrari-458-italia.jpg
 |    |-- wiesmann-gt-mf5.jpg
 |-- bikes
 |    |-- ducati-panigale-1199.jpg
 |    |-- kawasaki-ninja-300.jpg
 |-- saint-petersburg.jpg
filename.zip (after)
 |-- bikes
 |    |-- ducati-panigale-1199.jpg
 |    |-- kawasaki-ninja-300.jpg
 |-- saint-petersburg.jpg

Note: multiple entries can be removed, i.e. all entries whose names start with the given prefix.

Check whether zip archive is split or not

Path zip = Paths.get("filename.zip");
ZipMisc zipFile = ZipMisc.zip(zip);
boolean split = zipFile.isSplit();

Merge split archive into solid one

Path zipSrc = Paths.get("split.zip");
Path zip = Paths.get("filename.zip");
ZipMisc zipFile = ZipMisc.zip(zipSrc);
zipFile.merge(zip);
/- (before)
|-- split.z01
|-- split.z02
|-- split.z03
|-- split.zip
/- (after)
|-- filename.zip

ZipInfo

Print content of zip file into console

Path zip = Paths.get("filename.zip");
ZipInfo.zip(zip).printShortInfo();
filename.zip
 |-- cars
 |    |-- bentley-continental.jpg
 |-- saint-petersburg.jpg
--- console output ---
(PK0506) End of Central directory record
========================================
   - location:                                     2365537 (0x00241861) bytes
   - size:                                         22 bytes
   part number of this part (0000):                1
   part number of start of central dir (0000):     1
   number of entries in central dir in this part:  3
   total number of entries in central dir:         3
   size of central dir:                            299 (0x0000012B) bytes
   relative offset of central dir:                 2365238 (0x00241736) bytes
   zipfile comment length:                         0 bytes

(PK0102) Central directory
==========================
   - location:                                     2365238 (0x00241736) bytes
   - size:                                         303 bytes
   total entries:                                  3

#1 (PK0102) [UTF-8] cars/
-------------------------
   - location:                                     2365238 (0x00241736) bytes
   - size:                                         87 bytes
   part number of this part (0000):                1
   relative offset of local header:                0 (0x00000000) bytes
   version made by operating system (00):          MS-DOS, OS/2, NT FAT
   version made by zip software (31):              3.1
   operat. system version needed to extract (00):  MS-DOS, OS/2, NT FAT
   unzip software version needed to extract (10):  1.0
   general purpose bit flag (0x0000) (bit 15..0):  0000.0000 0000.0000
     file security status  (bit 0):                not encrypted
     data descriptor       (bit 3):                no
     strong encryption     (bit 6):                no
     UTF-8 names          (bit 11):                no
   compression method (00):                        none (stored)
   file last modified on (0x5024 0x7EC0):          2020-01-04 15:54:00
   32-bit CRC value:                               0x00000000
   compressed size:                                0 bytes
   uncompressed size:                              0 bytes
   length of filename:                             5
                                                   UTF-8
   63 61 72 73 2F                                  cars/
   length of file comment:                         0 bytes
   internal file attributes:                       0x0000
     apparent file type:                           binary
   external file attributes:                       0x00000010
     WINDOWS   (0x10):                             dir
     POSIX (0x000000):                             none
   extra field:                                    2365289 (0x00241769) bytes
     - size:                                       36 bytes (1 record)
   (0x000A) NTFS Timestamp:                        2365289 (0x00241769) bytes
     - size:                                       36 bytes
     - total tags:                                 1
     (0x0001) Tag1:                                24 bytes
       Creation Date:                              2020-01-04 12:50:54
       Last Modified Date:                         2020-01-04 12:54:00
       Last Accessed Date:                         2020-01-04 12:54:00

#2 (PK0102) [UTF-8] cars/bentley-continental.jpg
------------------------------------------------
   - location:                                     2365325 (0x0024178D) bytes
   - size:                                         110 bytes
   part number of this part (0000):                1
   relative offset of local header:                35 (0x00000023) bytes
   version made by operating system (00):          MS-DOS, OS/2, NT FAT
   version made by zip software (31):              3.1
   operat. system version needed to extract (00):  MS-DOS, OS/2, NT FAT
   unzip software version needed to extract (20):  2.0
   general purpose bit flag (0x0000) (bit 15..0):  0000.0000 0000.0000
     file security status  (bit 0):                not encrypted
     data descriptor       (bit 3):                no
     strong encryption     (bit 6):                no
     UTF-8 names          (bit 11):                no
   compression method (08):                        deflate
     compression sub-type (deflation):             normal
   file last modified on (0x4F24 0x3D6D):          2019-09-04 07:43:26
   32-bit CRC value:                               0x71797968
   compressed size:                                1380544 bytes
   uncompressed size:                              1395362 bytes
   length of filename:                             28
                                                   UTF-8
   63 61 72 73 2F 62 65 6E 74 6C 65 79 2D 63 6F 6E cars/bentley-con
   74 69 6E 65 6E 74 61 6C 2E 6A 70 67             tinental.jpg
   length of file comment:                         0 bytes
   internal file attributes:                       0x0000
     apparent file type:                           binary
   external file attributes:                       0x00000020
     WINDOWS   (0x20):                             arc
     POSIX (0x000000):                             none
   extra field:                                    2365399 (0x002417D7) bytes
     - size:                                       36 bytes (1 record)
   (0x000A) NTFS Timestamp:                        2365399 (0x002417D7) bytes
     - size:                                       36 bytes
     - total tags:                                 1
     (0x0001) Tag1:                                24 bytes
       Creation Date:                              2020-01-04 12:50:54
       Last Modified Date:                         2019-09-04 04:43:27
       Last Accessed Date:                         2020-01-04 12:50:54

#3 (PK0102) [UTF-8] saint-petersburg.jpg
----------------------------------------
   - location:                                     2365435 (0x002417FB) bytes
   - size:                                         102 bytes
   part number of this part (0000):                1
   relative offset of local header:                1380637 (0x0015111D) bytes
   version made by operating system (00):          MS-DOS, OS/2, NT FAT
   version made by zip software (31):              3.1
   operat. system version needed to extract (00):  MS-DOS, OS/2, NT FAT
   unzip software version needed to extract (20):  2.0
   general purpose bit flag (0x0000) (bit 15..0):  0000.0000 0000.0000
     file security status  (bit 0):                not encrypted
     data descriptor       (bit 3):                no
     strong encryption     (bit 6):                no
     UTF-8 names          (bit 11):                no
   compression method (08):                        deflate
     compression sub-type (deflation):             normal
   file last modified on (0x4F24 0x3D6D):          2019-09-04 07:43:26
   32-bit CRC value:                               0x5F2EEF84
   compressed size:                                984551 bytes
   uncompressed size:                              1074836 bytes
   length of filename:                             20
                                                   UTF-8
   73 61 69 6E 74 2D 70 65 74 65 72 73 62 75 72 67 saint-petersburg
   2E 6A 70 67                                     .jpg
   length of file comment:                         0 bytes
   internal file attributes:                       0x0000
     apparent file type:                           binary
   external file attributes:                       0x00000020
     WINDOWS   (0x20):                             arc
     POSIX (0x000000):                             none
   extra field:                                    2365501 (0x0024183D) bytes
     - size:                                       36 bytes (1 record)
   (0x000A) NTFS Timestamp:                        2365501 (0x0024183D) bytes
     - size:                                       36 bytes
     - total tags:                                 1
     (0x0001) Tag1:                                24 bytes
       Creation Date:                              2020-01-04 12:50:54
       Last Modified Date:                         2019-09-04 04:43:27
       Last Accessed Date:                         2020-01-04 12:50:54

(PK0304) ZIP entries
====================
   total entries:                                  3

#1 (PK0304) [UTF-8] cars/
-------------------------
   - location:                                     0 (0x00000000) bytes
   - size:                                         35 bytes
   operat. system version needed to extract (00):  MS-DOS, OS/2, NT FAT
   unzip software version needed to extract (10):  1.0
   general purpose bit flag (0x0000) (bit 15..0):  0000.0000 0000.0000
     file security status  (bit 0):                not encrypted
     data descriptor       (bit 3):                no
     strong encryption     (bit 6):                no
     UTF-8 names          (bit 11):                no
   compression method (00):                        none (stored)
   file last modified on (0x5024 0x7EC0):          2020-01-04 15:54:00
   32-bit CRC value:                               0x00000000
   compressed size:                                0 bytes
   uncompressed size:                              0 bytes
   length of filename:                             5
                                                   UTF-8
   63 61 72 73 2F                                  cars/

#2 (PK0304) [UTF-8] cars/bentley-continental.jpg
------------------------------------------------
   - location:                                     35 (0x00000023) bytes
   - size:                                         58 bytes
   operat. system version needed to extract (00):  MS-DOS, OS/2, NT FAT
   unzip software version needed to extract (20):  2.0
   general purpose bit flag (0x0000) (bit 15..0):  0000.0000 0000.0000
     file security status  (bit 0):                not encrypted
     data descriptor       (bit 3):                no
     strong encryption     (bit 6):                no
     UTF-8 names          (bit 11):                no
   compression method (08):                        deflate
     compression sub-type (deflation):             normal
   file last modified on (0x4F24 0x3D6D):          2019-09-04 07:43:26
   32-bit CRC value:                               0x71797968
   compressed size:                                1380544 bytes
   uncompressed size:                              1395362 bytes
   length of filename:                             28
                                                   UTF-8
   63 61 72 73 2F 62 65 6E 74 6C 65 79 2D 63 6F 6E cars/bentley-con
   74 69 6E 65 6E 74 61 6C 2E 6A 70 67             tinental.jpg

#3 (PK0304) [UTF-8] saint-petersburg.jpg
----------------------------------------
   - location:                                     1380637 (0x0015111D) bytes
   - size:                                         50 bytes
   operat. system version needed to extract (00):  MS-DOS, OS/2, NT FAT
   unzip software version needed to extract (20):  2.0
   general purpose bit flag (0x0000) (bit 15..0):  0000.0000 0000.0000
     file security status  (bit 0):                not encrypted
     data descriptor       (bit 3):                no
     strong encryption     (bit 6):                no
     UTF-8 names          (bit 11):                no
   compression method (08):                        deflate
     compression sub-type (deflation):             normal
   file last modified on (0x4F24 0x3D6D):          2019-09-04 07:43:26
   32-bit CRC value:                               0x5F2EEF84
   compressed size:                                984551 bytes
   uncompressed size:                              1074836 bytes
   length of filename:                             20
                                                   UTF-8
   73 61 69 6E 74 2D 70 65 74 65 72 73 62 75 72 67 saint-petersburg
   2E 6A 70 67                                     .jpg

Note: additional method ZipInfo.printShortInfo(PrintStream) could be used to print this info to required PrintStream
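
The "file last modified on" lines in the output above show the raw MS-DOS date and time words next to the decoded value. A minimal sketch of that decoding, following the standard MS-DOS date/time bit layout, is given below; the class name is hypothetical and this is not zip4jvm code.

```java
import java.time.LocalDateTime;

// Hypothetical helper: decodes the 16-bit MS-DOS date and time words stored in zip headers.
final class DosDateTime {

    static LocalDateTime decode(int dosDate, int dosTime) {
        int year = ((dosDate >> 9) & 0x7F) + 1980; // bits 15..9: years since 1980
        int month = (dosDate >> 5) & 0x0F;         // bits 8..5
        int day = dosDate & 0x1F;                  // bits 4..0
        int hour = (dosTime >> 11) & 0x1F;         // bits 15..11
        int minute = (dosTime >> 5) & 0x3F;        // bits 10..5
        int second = (dosTime & 0x1F) * 2;         // bits 4..0: stored in 2-second steps
        return LocalDateTime.of(year, month, day, hour, minute, second);
    }

    public static void main(String[] args) {
        System.out.println(decode(0x5024, 0x7EC0)); // 2020-01-04T15:54
        System.out.println(decode(0x4F24, 0x3D6D)); // 2019-09-04T07:43:26
    }
}
```

Both calls reproduce the timestamps printed for cars/ and cars/bentley-continental.jpg in the listing above.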

Decompose zip file into Path destination

Path zip = Paths.get("filename.zip");
Path destDir = Paths.get("/filename_decompose");
ZipInfo.zip(zip).decompose(destDir);
filename.zip
 |-- cars
 |    |-- bentley-continental.jpg
 |-- saint-petersburg.jpg
/filename_decompose
 |-- central_directory
 |    |-- #1 - cars
 |    |    |-- extra_fields
 |    |    |    |-- (0x000A)_NTFS_Timestamp.txt
 |    |    |    |-- (0x000A)_NTFS_Timestamp.data
 |    |    |-- file_header.txt
 |    |    |-- file_header.data
 |    |-- #2 - cars_-_bentley-continental.jpg
 |    |    |-- extra_fields
 |    |    |    |-- (0x000A)_NTFS_Timestamp.txt
 |    |    |    |-- (0x000A)_NTFS_Timestamp.data
 |    |    |-- file_header.txt
 |    |    |-- file_header.data
 |    |-- #3 - saint-petersburg.jpg
 |    |    |-- extra_fields
 |    |    |    |-- (0x000A)_NTFS_Timestamp.txt
 |    |    |    |-- (0x000A)_NTFS_Timestamp.data
 |    |    |-- file_header.txt
 |    |    |-- file_header.data
 |    |-- central_directory.txt
 |-- entries
 |    |-- #1 - cars
 |    |    |-- local_file_header.txt
 |    |    |-- local_file_header.data
 |    |-- #2 - cars_-_bentley-continental.jpg
 |    |    |-- local_file_header.txt
 |    |    |-- local_file_header.data
 |    |-- #3 - saint-petersburg.jpg
 |    |    |-- local_file_header.txt
 |    |    |-- local_file_header.data
 |-- end_central_directory.txt
 |-- end_central_directory.data
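
The End of Central Directory record (signature PK0506, 22 bytes without a comment) is the starting point for this kind of diagnostics. The sketch below locates it with plain JDK code and reads the total number of entries; it follows the generic ZIP record layout, handles neither Zip64 nor split archives, and is not zip4jvm's parser. All names in it are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

// Hypothetical illustration of locating the End of Central Directory (EOCD) record.
final class EndOfCentralDirectoryDemo {

    static final int EOCD_SIGNATURE = 0x06054b50; // bytes "PK\5\6", read as little-endian int
    static final int EOCD_MIN_SIZE = 22;          // record size when the comment is empty

    /** Scans backwards from the end of the archive; returns the EOCD offset or -1. */
    static int findEocd(byte[] zip) {
        ByteBuffer buf = ByteBuffer.wrap(zip).order(ByteOrder.LITTLE_ENDIAN);
        for (int pos = zip.length - EOCD_MIN_SIZE; pos >= 0; pos--)
            if (buf.getInt(pos) == EOCD_SIGNATURE)
                return pos;
        return -1;
    }

    /** Reads "total number of entries in central dir" (2 bytes at offset 10 of the record). */
    static int totalEntries(byte[] zip) {
        int pos = findEocd(zip);
        if (pos < 0)
            throw new IllegalArgumentException("End of Central Directory record not found");
        ByteBuffer buf = ByteBuffer.wrap(zip).order(ByteOrder.LITTLE_ENDIAN);
        return Short.toUnsignedInt(buf.getShort(pos + 10));
    }

    /** Builds a tiny in-memory archive with two entries, just for the demonstration. */
    static byte[] sampleZip() throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (ZipOutputStream zos = new ZipOutputStream(out)) {
            zos.putNextEntry(new ZipEntry("cars/"));
            zos.closeEntry();
            zos.putNextEntry(new ZipEntry("cars/bentley-continental.jpg"));
            zos.write("not a real image".getBytes(StandardCharsets.US_ASCII));
            zos.closeEntry();
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] zip = sampleZip();
        System.out.println("EOCD offset: " + findEocd(zip));
        System.out.println("total entries: " + totalEntries(zip));
    }
}
```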

Model

Zip settings: ZipSettings

All zip operations accept ZipSettings; default settings are used when none are set explicitly. The settings contain archive-scope properties as well as a provider of entry-specific settings. The key for entry settings is fileName.

Note: the user does not need to worry about the directory marker /, because zip4jvm does not support duplicate file names, so a file and a directory cannot share the same name.

  • splitSize - size of each part in split archive
    • -1 - no split or solid archive
    • min size - 64Kb i.e. 65_536
    • max size - ~2Gb i.e. 2_147_483_647
  • comment - global archive comment
    • no comment - null or empty string
    • max length - 65_535 symbols
  • zip64 - use (true) or not (false) zip64 format for global zip structure
    • Note: zip64 is switched on automatically if needed
    • Note: it does not mean that entry structure is in zip64 format as well
  • entrySettingsProvider - file name base provider of settings for entry
    • Note: each entry could have different settings
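
The limits in the list above can be expressed as simple checks. This is a sketch under the assumption that the ~2Gb value is the upper bound of splitSize and that the comment limit is counted in characters; the class and its methods are hypothetical and do not reflect how zip4jvm validates settings internally.

```java
// Hypothetical validation helper based on the limits listed in this section.
final class ZipSettingsLimits {

    static final long NO_SPLIT = -1;                  // solid archive
    static final long MIN_SPLIT_SIZE = 65_536;        // 64Kb
    static final long MAX_SPLIT_SIZE = 2_147_483_647; // ~2Gb
    static final int MAX_COMMENT_LENGTH = 65_535;

    static boolean isValidSplitSize(long splitSize) {
        return splitSize == NO_SPLIT || (splitSize >= MIN_SPLIT_SIZE && splitSize <= MAX_SPLIT_SIZE);
    }

    static boolean isValidComment(String comment) {
        // null or empty string means "no comment"
        return comment == null || comment.length() <= MAX_COMMENT_LENGTH;
    }

    public static void main(String[] args) {
        System.out.println(isValidSplitSize(-1));        // true
        System.out.println(isValidSplitSize(1_024));     // false: below 64Kb
        System.out.println(isValidSplitSize(1_048_576)); // true
        System.out.println(isValidComment(null));        // true
    }
}
```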

Zip settings defaults

  • splitSize - -1, i.e. off or solid archive
  • comment - null, i.e. no comment
  • zip64 - false, i.e. standard format for global zip structure
  • entrySettingsProvider - default, i.e. all entries have same default entry settings
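
The statement that zip64 is switched on automatically if needed can be illustrated with the field limits of the standard ZIP format: 16 bits for the entry count and 32 bits for sizes and offsets, where the all-ones values are reserved as "see zip64" markers. The thresholds below come from the ZIP format; whether zip4jvm applies exactly these checks is an assumption, and the class name is hypothetical.

```java
// Hypothetical sketch: when the standard (non-zip64) fields can no longer hold a value.
final class Zip64Need {

    static final long UINT16_MARKER = 0xFFFFL;      // reserved value in 16-bit fields
    static final long UINT32_MARKER = 0xFFFF_FFFFL; // reserved value in 32-bit fields

    /** Global zip structure: entry count, central directory size and offset. */
    static boolean archiveNeedsZip64(long totalEntries, long centralDirSize, long centralDirOffset) {
        return totalEntries >= UINT16_MARKER
                || centralDirSize >= UINT32_MARKER
                || centralDirOffset >= UINT32_MARKER;
    }

    /** Entry structure: sizes and local header offset. */
    static boolean entryNeedsZip64(long compressedSize, long uncompressedSize, long localHeaderOffset) {
        return compressedSize >= UINT32_MARKER
                || uncompressedSize >= UINT32_MARKER
                || localHeaderOffset >= UINT32_MARKER;
    }

    public static void main(String[] args) {
        // Values taken from the ZipInfo listing earlier in this document.
        System.out.println(archiveNeedsZip64(3, 299, 2_365_238));     // false
        System.out.println(archiveNeedsZip64(70_000, 299, 2_365_238)); // true
        System.out.println(entryNeedsZip64(1_380_544, 1_395_362, 35)); // false
    }
}
```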

Zip entry settings: ZipEntrySettings

Each entry has its own settings, which can differ from entry to entry. If these settings are not explicitly set, the default entry settings are used for all added entries.

  • compression - compression algorithm
    • store - no compression
    • deflate - use DEFLATE compression algorithm
    • enhanced_deflate - use ENHANCED DEFLATE compression algorithm
    • bzip2 - use BZIP2 compression algorithm
    • lzma - use LZMA compression algorithm
  • compressionLevel - compression level
    • super_fast, fast, normal, maximum
  • encryption - encryption algorithm
    • off - no encryption
    • pkware - PKWare encryption algorithm
    • aes_128, aes_192, aes_256 - AES encryption algorithm with 128, 192 or 256 bit key strength
  • comment - comment for entry
    • no comment - null or empty string
    • max length - 65_535 symbols
  • zip64 - use (true) or not (false) zip64 format for entry structure
    • Note: zip64 is switched on automatically if needed
  • utf8 - when true, use UTF-8 charset for file name and comment; when false, use IBM-437
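
The effect of the utf8 setting can be shown with plain JDK charsets: the same raw name bytes read back differently depending on which charset the reader assumes. This is only an illustration of the charset choice, with hypothetical names, and it assumes the IBM437 charset is available in the running JDK.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration of how the utf8 flag changes the way raw name bytes are read.
final class EntryNameCharsetDemo {

    static String decode(byte[] rawName, boolean utf8) {
        Charset charset = utf8 ? StandardCharsets.UTF_8 : Charset.forName("IBM437");
        return new String(rawName, charset);
    }

    public static void main(String[] args) {
        String name = "caf\u00e9.txt"; // contains one non-ASCII character
        byte[] raw = name.getBytes(StandardCharsets.UTF_8);

        System.out.println(decode(raw, true).equals(name));  // true: bytes read back as written
        System.out.println(decode(raw, false).equals(name)); // false: the two UTF-8 bytes become two IBM-437 symbols
    }
}
```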

Zip entry settings defaults

  • compression - deflate
  • compressionLevel - normal
  • encryption - off, i.e. no encryption
  • comment - null, i.e. no comment
  • zip64 - false, i.e. standard format for entry structure
  • utf8 - true, i.e. entry's name and comment are stored using UTF-8 charset

zip4jvm's People

Contributors

oleg-cherednik

zip4jvm's Issues

ZipInputStream stops finding next entry if entry is a folder with Data Descriptor

ZipInputStream stops finding the next entry if the current entry is a folder with a Data Descriptor. Such a zip can be extracted by unzip in Linux.

This seems to be caused by ZipInputStream.readUntilEndOfEntry:

    if (localFileHeader.isDirectory()
        || (localFileHeader.getCompressedSize() == 0 && !localFileHeader.isDataDescriptorExists())) {
      return;
    }

Decompress improvement

When we decompress the zip file, we rely on the CentralDirectory records. But there could be additional data between LocalEntries. Yes, this is not up to the standard, but still, we should extract this data as well.

Dynamically set password after ZipFile creation

Thanks for maintaining such a top notch library. I wanted to upgrade from 1.3.2 to 2.1.2, but have a question concerning the API change.

Use case:
I have to process incoming zip files. They can be encrypted or not; I do not know upfront. If they are encrypted, I know how to generate the password from the file name.

With the old API, I could create a ZipFile and check its encryption status, and if true I could set the password:

ZipFile zip = new ZipFile(filename);
if (zip.isEncrypted()) {
    String password = generatePassword(filename);
    zip.setPassword(password);
}
// ... process zip

AFAIK, with the new API I can only set the password when creating the ZipFile. I cannot set it afterwards, not even via ZipParameters, which was possible with the old API.

Question:
How can I handle this use case with the new API?
Or, in other words, what happens when I preemptively supply a password but the zip is not actually encrypted? Does extraction still work correctly?

String meaninglessPassword = generatePassword(nonEncryptedFilename);
ZipFile zip = new ZipFile(nonEncryptedFilename, meaninglessPassword);
// ... process zip correctly?

Thanks for looking into this.

Split ZipInfoTest

ZipInfoTest should be split into two tests: ZipInfoPrintShortInfoTest and ZipInfoDecomposeTest.

GeneralPurposeFlag bit3 is not checked when writing

There is a problem with saving a zip with STORE compression: WinRar cannot open it. The problem is that the state of GeneralPurposeFlag bit3 is not consistent with the values written to the entry.

In tests, another tool reports that only a Deflated entry may have an EXT part.

Reduce zip size for STORE compression

With STORE compression we know the compressed size in advance (it equals the uncompressed size), so there is no need to add blocks like DataDescriptor.
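
The same idea can be illustrated with the JDK's own `java.util.zip` classes (not zip4jvm): when size and CRC are known before the entry is written, the header can carry them directly and general purpose flag bit 3 stays cleared. `StoredEntryDemo` is a hypothetical class name, and the flag check assumes the archive starts with the local file header of its first entry.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.zip.CRC32;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

// Hypothetical demo class built on java.util.zip; not part of zip4jvm.
final class StoredEntryDemo {

    /** Writes one entry, either STORED with known sizes or DEFLATED with unknown sizes. */
    static byte[] zipSingleEntry(String name, byte[] data, boolean stored) {
        ZipEntry entry = new ZipEntry(name);
        if (stored) {
            // For STORE the sizes and CRC are known before any byte is written.
            CRC32 crc = new CRC32();
            crc.update(data);
            entry.setMethod(ZipEntry.STORED);
            entry.setSize(data.length);
            entry.setCompressedSize(data.length);
            entry.setCrc(crc.getValue());
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(out)) {
            zip.putNextEntry(entry);
            zip.write(data);
            zip.closeEntry();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    /** Reads general purpose flag bit 3 from the first local file header (offset 6). */
    static boolean usesDataDescriptor(byte[] zip) {
        int flags = (zip[6] & 0xFF) | ((zip[7] & 0xFF) << 8);
        return (flags & 0x0008) != 0;
    }
}
```

In this sketch the STORED entry is written without a data descriptor, while the DEFLATED entry, whose sizes are unknown up front, gets one.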

ZipInfo not working for Zip64

It looks like the total disk count is not correct when reading Zip64. After investigation, I found that totalDisk differs between EndCentralDirectory and Zip64.EndCentralDirectoryLocator.

Add encryption and compression settings for all entries

Usually we use the same encryption and compression settings for all entries, so we should be able to provide these settings directly in ZipSettings:

ZipSettings settings = ZipSettings.builder()
                                  .encryption(Encryption.AES_256, password)
                                  .compression(Compression.DEFLATE, CompressionLevel.NORMAL)
                                  .build();

Additionally, we should not allow setting a password in ZipEntrySettings without providing an encryption algorithm.
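
The validation rule could look roughly like the sketch below. All types here are hypothetical and exist only for illustration; they do not mirror the real ZipEntrySettings or Encryption classes, and the second check (encryption without a password) is an added assumption.

```java
import java.util.Objects;

// Hypothetical types for illustration only; they do not mirror the zip4jvm API.
final class EntrySettingsSketch {
    enum Encryption { OFF, AES_256 }

    private final Encryption encryption;
    private final char[] password;

    private EntrySettingsSketch(Encryption encryption, char[] password) {
        this.encryption = encryption;
        this.password = password;
    }

    /** Creates settings and rejects a password that comes without an algorithm. */
    static EntrySettingsSketch of(Encryption encryption, char[] password) {
        Objects.requireNonNull(encryption, "encryption");
        boolean hasPassword = password != null && password.length > 0;
        if (encryption == Encryption.OFF && hasPassword)
            throw new IllegalArgumentException("password given without an encryption algorithm");
        if (encryption != Encryption.OFF && !hasPassword)
            throw new IllegalArgumentException("encryption requires a password");
        // Copy the array so later changes by the caller do not leak in.
        return new EntrySettingsSketch(encryption, hasPassword ? password.clone() : null);
    }

    Encryption getEncryption() {
        return encryption;
    }

    boolean hasPassword() {
        return password != null;
    }
}
```

Failing at construction time keeps an inconsistent combination from reaching the code that writes the entry.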

Add support of not standard charsets

I have found that many programs, such as WinRar or WinZip, use the default system charset when zipping and do not set the utf8 flag.

This issue cannot be solved in general. By default zip4jvm uses utf8 encoding, unless the user has cleared the flag (in which case it uses ibm437).

  • I think it could be useful to add the ability to set the charset externally in order to extract such a zip archive correctly (but not to create one).
  • Additionally, add a convert API to convert such archives to correct ones using utf8.
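
A minimal sketch of the first point, assuming the caller supplies the fallback charset: general purpose flag bit 11 marks names stored as UTF-8, and only when it is cleared does the externally provided charset apply. `EntryNameDecoder` is a hypothetical name, not zip4jvm API. The JDK exposes a comparable option through the `ZipFile` and `ZipInputStream` constructors that accept a `Charset`.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; not part of zip4jvm.
final class EntryNameDecoder {
    /** General purpose flag bit 11: file name and comment are UTF-8 encoded. */
    private static final int UTF8_FLAG = 1 << 11;

    /**
     * Decodes raw file name bytes. The UTF-8 flag wins when it is set;
     * otherwise the charset chosen by the caller is used.
     */
    static String decode(byte[] rawName, int generalPurposeFlag, Charset fallback) {
        Charset charset = (generalPurposeFlag & UTF8_FLAG) != 0 ? StandardCharsets.UTF_8 : fallback;
        return new String(rawName, charset);
    }
}
```

For example, the single byte 0xE9 with a cleared flag and an ISO-8859-1 fallback decodes to "é", while the same name stored as UTF-8 (0xC3 0xA9) with the flag set decodes to "é" regardless of the fallback.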

Add symbol link like solution

Toy example:

I created one directory with a symlink to it and a regular file with a symlink to it like so:

$ mkdir tmp
$ cd tmp
$ mkdir a; ln -s a b; touch c; ln -s c d
$ cd ..
$ tree tmp
tmp
├── a
├── b -> a
├── c
└── d -> c

Ideally what would happen is the following:

$ zip -yr tmp.zip tmp
$ unzip tmp.zip -d newtmp
Archive:  tmp.zip
   creating: newtmp/tmp/
   creating: newtmp/tmp/a/
 extracting: newtmp/tmp/c            
    linking: newtmp/tmp/d            -> c 
    linking: newtmp/tmp/b            -> a 
finishing deferred symbolic links:
  newtmp/tmp/d           -> c
  newtmp/tmp/b           -> a
$ tree newtmp 
newtmp
└── tmp
    ├── a
    ├── b -> a
    ├── c
    └── d -> c

I then attempted to use Zip4j to zip these files.

First attempt is a naive approach. Just setSymbolicLinkAction to INCLUDE_LINK_ONLY:

import net.lingala.zip4j.ZipFile;
import net.lingala.zip4j.exception.ZipException;
import net.lingala.zip4j.model.ZipParameters;

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.stream.Stream;

public class ZipTest {
    public static void main(String[] args) throws IOException {
        try (ZipFile zipFile = new ZipFile("test.zip");
             Stream<Path> paths = Files.walk(Paths.get("tmp"))) {
            paths.forEach(path -> {
                ZipParameters zp = new ZipParameters();
                if ((!Files.isSymbolicLink(path)) && Files.isDirectory(path)) {
                    zp.setFileNameInZip(path + "/");
                } else {
                    zp.setFileNameInZip(path.toString());
                }
                zp.setSymbolicLinkAction(ZipParameters.SymbolicLinkAction.INCLUDE_LINK_ONLY);
                try {
                    zipFile.addFile(path.toFile(), zp);
                } catch (ZipException e) {
                    throw new RuntimeException(e);
                }
            });
        }
    }
}

Let's see what it contains:

$ unzip test.zip -d test && tree test
Archive:  test.zip
   creating: test/tmp/
   creating: test/tmp/a/
 extracting: test/tmp/c              
    linking: test/tmp/d              -> c 
 extracting: test/tmp/b              
finishing deferred symbolic links:
  test/tmp/d             -> c
test
└── tmp
    ├── a
    ├── b
    ├── c
    └── d -> c

2 directories, 3 files

That didn't work. The symlink b to directory a did not stay a symlink.

Then I tried only setting INCLUDE_LINK_ONLY for paths tested to be symlinks, and manually set CompressionLevel.NO_COMPRESSION and CompressionMethod.STORE.

import net.lingala.zip4j.ZipFile;
import net.lingala.zip4j.exception.ZipException;
import net.lingala.zip4j.model.ZipParameters;
import net.lingala.zip4j.model.enums.CompressionLevel;
import net.lingala.zip4j.model.enums.CompressionMethod;

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.stream.Stream;

public class ZipTest {
    public static void main(String[] args) throws IOException {
        try (ZipFile zipFile = new ZipFile("test.zip");
             Stream<Path> paths = Files.walk(Paths.get("tmp"))) {
            paths.forEach(path -> {
                ZipParameters zp = new ZipParameters();
                if ((!Files.isSymbolicLink(path)) && Files.isDirectory(path)) {
                    zp.setFileNameInZip(path + "/");
                } else {
                    zp.setFileNameInZip(path.toString());
                }
                if (Files.isSymbolicLink(path)) {
                    zp.setSymbolicLinkAction(ZipParameters.SymbolicLinkAction.INCLUDE_LINK_ONLY);
                    zp.setCompressionLevel(CompressionLevel.NO_COMPRESSION);
                    zp.setCompressionMethod(CompressionMethod.STORE);
                }
                try {
                    zipFile.addFile(path.toFile(), zp);
                } catch (ZipException e) {
                    throw new RuntimeException(e);
                }
            });
        }
    }
}

Unzipping output:

$ unzip test.zip -d test   
Archive:  test.zip
   creating: test/tmp/
   creating: test/tmp/a/
 extracting: test/tmp/c              
    linking: test/tmp/d              -> c 
 extracting: test/tmp/b              
finishing deferred symbolic links:
  test/tmp/d             -> c

It still won't link b to a!

Originally posted by @ajpfahnl in srikanth-lingala/zip4j#486

Add validity check for missing z* files

Is it possible to add (or change the existing) validity check so that it verifies that all split files exist?

Sometimes one or more zip split files (z01, z02, etc.) are missing from the disk (for various reasons not related to zip4j). The extractAll method starts the extraction and fails only when it encounters the missing file:
Caused by: java.io.FileNotFoundException: zip split file does not exist: /home/ec2-user/bigZip.z11

It would be nice if I could test for missing files in advance (for example, alongside zipFile.isValidZipFile(), add zipFile.isValidSplitZipFile()).
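
Such a pre-check could be sketched as follows, assuming the PKWare naming scheme (filename.z01, filename.z02, ..., with filename.zip as the last part) and assuming the caller already knows the total number of parts, for instance from the end of central directory record. `SplitZipCheck` and `findMissingParts` are hypothetical names, not an existing zip4j or zip4jvm API.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of zip4j or zip4jvm.
final class SplitZipCheck {

    /**
     * Returns the parts of a PKWare-style split archive that are missing on disk.
     * totalParts counts the final .zip file as well.
     */
    static List<Path> findMissingParts(Path zip, int totalParts) {
        String fileName = zip.getFileName().toString();
        int dot = fileName.lastIndexOf('.');
        String baseName = dot < 0 ? fileName : fileName.substring(0, dot);

        List<Path> missing = new ArrayList<>();
        for (int i = 1; i < totalParts; i++) {
            // Parts are numbered from 01; the last part keeps the .zip extension.
            Path part = zip.resolveSibling(String.format("%s.z%02d", baseName, i));
            if (!Files.isRegularFile(part))
                missing.add(part);
        }
        if (!Files.isRegularFile(zip))
            missing.add(zip);
        return missing;
    }
}
```

An empty result means every expected part is present, so extraction can start without running into a FileNotFoundException halfway through.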

Add support of stream not using Files

Right now we support streams, but only those generated from files.
We need to add support for real streams. In this case it is not possible to read the CentralDirectory, so all metadata should be read from LocalFileHeader.
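
For comparison, the JDK's `java.util.zip.ZipInputStream` already works this way: it reads any `InputStream` sequentially and takes entry metadata from the local headers, never from the CentralDirectory. The sketch below uses only JDK classes; `StreamOnlyReader` is a hypothetical name and nothing here is zip4jvm API.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical demo class built on java.util.zip; not part of zip4jvm.
final class StreamOnlyReader {

    /** Reads every entry from a forward-only stream into a name-to-text map. */
    static Map<String, String> readAll(InputStream in) {
        Map<String, String> contents = new LinkedHashMap<>();
        byte[] buffer = new byte[4096];
        try (ZipInputStream zip = new ZipInputStream(in)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                ByteArrayOutputStream data = new ByteArrayOutputStream();
                int read;
                while ((read = zip.read(buffer)) != -1)
                    data.write(buffer, 0, read);
                contents.put(entry.getName(), new String(data.toByteArray(), StandardCharsets.UTF_8));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return contents;
    }

    /** Builds a small in-memory archive so the demo needs no file on disk. */
    static byte[] sampleZip() {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(out)) {
            zip.putNextEntry(new ZipEntry("a.txt"));
            zip.write("first".getBytes(StandardCharsets.UTF_8));
            zip.closeEntry();
            zip.putNextEntry(new ZipEntry("b.txt"));
            zip.write("second".getBytes(StandardCharsets.UTF_8));
            zip.closeEntry();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }
}
```

The trade-off of this approach is that information stored only in the CentralDirectory is not available while reading.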

zstd + aes images unpack problem

Unzipping a zstd archive with AES encryption does not work for images. It works for simple text files, but for images the MAC does not match.
