harvester's Introduction

Harvester -- High-level ARchiVe ExSTactoR

This library exposes a very high-level interface to libarchive for extracting file archives.

Basic interface

It consists of one class, Directory, that manages the lifetime of the extracted files:

namespace Harvester
	{
	class Directory
		{
		public:
			template<class ExceptionHandler>
			Directory(const char* dir_name,ExceptionHandler&& eh);

			Directory(Directory&& dir) noexcept;
			Directory& operator=(Directory&& dir) noexcept;
			Directory(const Directory&)=delete;
			Directory& operator=(const Directory&)=delete;
			~Directory();

			const char* name() const noexcept;
			void contentRelease() noexcept;
		};
	}

The library also provides functions that extract files into a directory. The first one extracts all files accepted by the ExecutionPolicy. The other two are for cherry-picking individual files.

namespace Harvester
	{
	template<class ExecutionPolicy>
	Directory extract(const char* src_file,const char* dest_dir,ExecutionPolicy&& exec_policy);

	template<class ExecutionPolicy>
	Directory extract(const char* src_file,const char* dest_dir
		,ExecutionPolicy& exec_policy
		,const char** files_begin
		,const char** files_end);

	template<class ExecutionPolicy,class ... Args>
	Directory extract(const char* src_file,const char* dest_dir
		,ExecutionPolicy&& exec_policy,const char* file,Args... files);
	}

An ExecutionPolicy must have two members:

  • void raise(const char* message), which is called upon error and must not return. The function must not throw message directly, since the buffer may be deallocated during stack unwinding. Instead, copy the message into a fixed-size buffer such as std::array<char,512> with strncpy, and throw that object.
  • ProgressStatus progress(double x, const char* name), which is called for each file during the extraction process. It may return one of the following ProgressStatus values:
    • SKIP: Do not extract this file
    • EXTRACT: Extract this file
    • STOP: Do not process any more files, including the current file
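The two requirements above can be sketched as a small policy class. This is only an illustration: the class name SkipTempFiles, the alias ErrorMessage, and the ".tmp" filter are hypothetical, and the ProgressStatus enum is redeclared locally as a stand-in so the sketch compiles without the library headers. The value of x is printed as-is, since its range is not specified here.

```cpp
#include <array>
#include <cstdio>
#include <cstring>

// Stand-in for the library's enum so that this sketch is self-contained.
// With Harvester available, Harvester::ProgressStatus would be used instead.
enum class ProgressStatus{SKIP,EXTRACT,STOP};

// Fixed-size buffer that owns a copy of the error message
using ErrorMessage=std::array<char,512>;

// Hypothetical policy: extract everything except entries ending in ".tmp"
class SkipTempFiles
	{
	public:
		[[noreturn]] void raise(const char* message)
			{
			// Zero-initialized, and at most size()-1 bytes are copied,
			// so the buffer is always null-terminated.
			ErrorMessage msg{};
			std::strncpy(msg.data(),message,msg.size()-1);
			throw msg; // throws the copy, never the original pointer
			}

		ProgressStatus progress(double x,const char* name)
			{
			std::fprintf(stderr,"%g %s\n",x,name);
			const auto n=std::strlen(name);
			if(n>=4 && std::strcmp(name+n-4,".tmp")==0)
				{return ProgressStatus::SKIP;}
			return ProgressStatus::EXTRACT;
			}
	};
```

A caller would then catch const ErrorMessage& around the call to extract and read the text through data().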

The archive content is extracted to a unique directory inside dest_dir, which must exist. The unique directory name is accessible through the name method called on the returned Directory object. By default, the destructor will remove the created directory. If the directory should be kept, the method contentRelease needs to be called before the Directory object goes out of scope. Notice that contentRelease does not release the directory name from the object. Thus the caller must not try to deallocate the name.

The extraction process is transactional. That is, if extract does not succeed, all created files are removed.
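The ownership rule described above (remove on destruction unless the content has been released) can be illustrated with a small stand-alone class. This is a sketch of the idea only, written with std::filesystem from C++17. ScopedDir is a hypothetical name and is not how Harvester implements Directory.

```cpp
#include <filesystem>
#include <system_error>
#include <utility>

// Hypothetical illustration of the ownership rule; not Harvester's implementation.
class ScopedDir
	{
	public:
		explicit ScopedDir(std::filesystem::path p):
			m_path(std::move(p)),m_owns(true)
			{std::filesystem::create_directories(m_path);}

		// Moving transfers the responsibility for cleanup
		ScopedDir(ScopedDir&& other) noexcept:
			m_path(std::move(other.m_path)),m_owns(other.m_owns)
			{other.m_owns=false;}

		ScopedDir(const ScopedDir&)=delete;
		ScopedDir& operator=(const ScopedDir&)=delete;
		ScopedDir& operator=(ScopedDir&&)=delete;

		~ScopedDir()
			{
			if(m_owns)
				{
				// Non-throwing overload, since a destructor should not throw
				std::error_code ec;
				std::filesystem::remove_all(m_path,ec);
				}
			}

		// The name stays valid after contentRelease; the object still owns the string
		const std::filesystem::path& name() const noexcept
			{return m_path;}

		// Keep the directory on disk when the object goes out of scope
		void contentRelease() noexcept
			{m_owns=false;}

	private:
		std::filesystem::path m_path;
		bool m_owns;
	};
```

With this shape, a failed operation that leaves scope through an exception cleans up automatically, which is one way to get the transactional behavior mentioned above.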

Path validation policy

This library is designed to avoid messing up file systems. Therefore, any absolute path inside an archive will generate an exception, and the previously extracted files are removed from disk. An absolute path is any path that

  • Begins with / (A POSIX style absolute path)
  • Begins with \ (A UNC path or an absolute path on the current drive)
  • Begins with regex [A-Za-z]:\\ (A drive letter)

Paths that contain a .. element also generate an exception.
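The rules above can be expressed as a single predicate. The following is a minimal sketch, not the library's own check: the function name is_rejected_path is hypothetical, and treating both / and \ as element separators when looking for .. is an assumption.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical helper mirroring the listed rules; the library's check may differ.
inline bool is_rejected_path(std::string_view path)
	{
	if(path.empty())
		{return false;}

	// POSIX absolute path, UNC path, or absolute path on the current drive
	if(path[0]=='/' || path[0]=='\\')
		{return true;}

	// Drive letter followed by a colon and a backslash, as in "C:\dir"
	const char c=path[0];
	const bool is_letter=(c>='A' && c<='Z') || (c>='a' && c<='z');
	if(path.size()>=3 && is_letter && path[1]==':' && path[2]=='\\')
		{return true;}

	// Any element that is exactly ".."
	// (assumption: both '/' and '\\' separate elements)
	std::size_t pos=0;
	while(pos<=path.size())
		{
		auto next=path.find_first_of("/\\",pos);
		if(next==std::string_view::npos)
			{next=path.size();}
		if(path.substr(pos,next-pos)=="..")
			{return true;}
		pos=next+1;
		}
	return false;
	}
```

Note that only whole elements are matched, so a name such as a..b is accepted while docs/../secret is rejected.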

Example usage

A simple example can be found in main.cpp. It is a fully working program for extracting files from a file archive.

harvester's People

Contributors

milasudril
