
zbox's People

Contributors

ad-m, amiraeva, atallahade, burmecia, philippgille, reiniermaas, xmclark, youanden


zbox's Issues

Load and Store the serialized version of the memory FS

Hey there! ZboxFS looks like an awesome virtual file system with a lot of potential. However, I have noticed that there is no way to store and restore the created in-memory file system, which makes it totally volatile. That makes using Zbox in scenarios like WASM as an embedded file system nearly impossible without additional hacks.
Am I missing something? Thanks! Great work so far!

hashed key should use SafeBox

The hashed key is the password hash derived from the user-specified plaintext password using crypto_pwhash. It is a hash value but is used as a key, so it must be protected by SafeBox just like the other keys.

Although the current password hash implements Drop to clear itself, it is allocated on the stack, which is weaker than the secured heap and may persist in memory. It should therefore use the same SafeBox as a normal key to enhance security.
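The zero-on-drop part of this can be sketched in plain Rust. This is a simplified illustration, not zbox's SafeBox: libsodium's sodium_malloc additionally places the buffer in a guarded, mlock'ed heap region, which this sketch does not attempt. The `SecretKey` name is hypothetical.

```rust
// Simplified illustration of a key container that wipes its memory on drop.
// `SecretKey` is a hypothetical stand-in for zbox's SafeBox.
struct SecretKey {
    bytes: Vec<u8>,
}

impl SecretKey {
    fn new(bytes: Vec<u8>) -> Self {
        SecretKey { bytes }
    }

    fn as_slice(&self) -> &[u8] {
        &self.bytes
    }
}

impl Drop for SecretKey {
    fn drop(&mut self) {
        // Overwrite the buffer before it is freed. Volatile writes keep the
        // compiler from optimising the wipe away.
        for b in self.bytes.iter_mut() {
            unsafe { std::ptr::write_volatile(b, 0) };
        }
    }
}

fn main() {
    let key = SecretKey::new(vec![0xAB; 32]);
    assert_eq!(key.as_slice().len(), 32);
    drop(key); // buffer is zeroed before deallocation
}
```

The point of the issue is that stack-allocated secrets cannot get this treatment reliably, since copies may be left behind by moves and spills; a secured-heap allocation wiped on drop avoids that.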

Exclusive repo access is not implemented

Currently, exclusive repo access is not implemented for any storage except the zbox storage.

For example, the code below should not allow simultaneous access to the same repo:

let repo = RepoOpener::new().create(true).open("mem://foo", "pwd").unwrap();
let repo2 = RepoOpener::new().create(true).open("mem://foo", "pwd").unwrap();

Document File thread safety

It seems that writing to a file repeatedly from multiple threads is not possible; it terminates with a NotInTrans error.

I stumbled over this while trying to do so from an async fn.

Use merkle tree for content hash

The content hash is currently a hash of the whole content, which means that each time the content is updated, the whole content must be read through to recompute its hash. That is not very efficient, especially when the updated portion is much smaller than the whole content.

A Merkle tree (hash tree, https://en.wikipedia.org/wiki/Merkle_tree) is a better solution: it can efficiently recompute the content hash even when only a tiny part is updated.
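The idea can be sketched with a toy hash tree over fixed-size chunks. This is only an illustration of the technique, not zbox's implementation: a real content hash would use a cryptographic hash, whereas `DefaultHasher` here just keeps the sketch dependency-free.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Hash one fixed-size chunk (a Merkle leaf).
fn leaf_hash(chunk: &[u8]) -> u64 {
    let mut h = DefaultHasher::new();
    chunk.hash(&mut h);
    h.finish()
}

// Fold a level of hashes pairwise until a single root remains.
fn merkle_root(mut level: Vec<u64>) -> u64 {
    if level.is_empty() {
        return 0;
    }
    while level.len() > 1 {
        level = level
            .chunks(2)
            .map(|pair| {
                let mut h = DefaultHasher::new();
                pair.hash(&mut h);
                h.finish()
            })
            .collect();
    }
    level[0]
}

fn main() {
    const CHUNK: usize = 4;
    let data = b"hello world, merkle!";
    let mut leaves: Vec<u64> = data.chunks(CHUNK).map(leaf_hash).collect();
    let root1 = merkle_root(leaves.clone());

    // Updating a single chunk only requires rehashing that leaf and the
    // O(log n) path to the root, not re-reading the whole content.
    leaves[1] = leaf_hash(b"XXXX");
    let root2 = merkle_root(leaves);
    assert_ne!(root1, root2);
}
```

With N chunks, an update costs O(log N) hash operations instead of O(N), which is exactly the saving the issue asks for.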

OpenOptions read option doesn't work for write-only mode

OpenOptions for file creation and opening has a read option, but it doesn't work when set to false. The purpose is to open a write-only file, but currently it is impossible to open a file in that mode. This needs to be rectified.

Wrong file content version retirement order

When retiring a file content version, the correct order is to add the new version first and then retire the oldest one, not the other way around. This ensures the linking between content and segment stays correct throughout.
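The corrected ordering can be sketched with a plain version queue. The `Version` type and `add_version` helper are illustrative stand-ins, not zbox's internals; the point is only that the new version is linked in before the oldest is unlinked.

```rust
use std::collections::VecDeque;

// Stand-in for a file content version.
#[derive(Debug, PartialEq)]
struct Version(u64);

// Append the new version first, then retire the oldest if over the limit,
// so content-to-segment links stay valid at every step.
fn add_version(
    history: &mut VecDeque<Version>,
    new: Version,
    limit: usize,
) -> Option<Version> {
    history.push_back(new); // 1. link the new version in
    if history.len() > limit {
        history.pop_front() // 2. only then retire the oldest
    } else {
        None
    }
}

fn main() {
    let mut history: VecDeque<Version> = (1..=3).map(Version).collect();
    let retired = add_version(&mut history, Version(4), 3);
    assert_eq!(retired, Some(Version(1)));
    assert_eq!(history.front(), Some(&Version(2)));
    assert_eq!(history.back(), Some(&Version(4)));
}
```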

Example in README.md uses wrong method name

Your README.md uses the following code in the example:

use zbox::{zbox_init, RepoOpener, OpenOptions};
[...]
zbox_init();

The init method is actually called init_env.

I think zbox_init would be the better name: from looking at it, you know that zbox will be initialized, while init_env might be confusing (which env will be initialized?).

So I think you should rename the method to match the README; but if you won't change the API, you should at least make the docs match the code.

File read/write should not be allowed after repo is closed

Currently the repo holds an Arc reference to its file system internally, and a file obtains an Arc reference to the same file system when it is opened. That means a file can still be read and written even after the repo object is dropped.

This is not semantically correct: the repo should act as the guard of its contents, and when it is closed, all access to its contents should be shut down completely.

SIGABRT on opening Repo

I'm getting a SIGABRT on opening a repo.
The error happens both with libsodium-bundled and with the system libsodium.

Host: Arch Linux
Zbox: 0.8.3

fn main() {
    let mut repo = zbox::RepoOpener::new()
        .open("mem://", "password")
        // .open("file://file.zbox", "password")
        .unwrap();
}
Program received signal SIGABRT, Aborted.
0x00007ffff7dd1755 in raise () from /usr/lib/libc.so.6
(gdb) bt
#0  0x00007ffff7dd1755 in raise () from /usr/lib/libc.so.6
#1  0x00007ffff7dbc851 in abort () from /usr/lib/libc.so.6
#2  0x0000555555587245 in sodium_misuse () at sodium/core.c:199
#3  0x0000555555715f7d in _sodium_malloc (size=<optimized out>)
    at sodium/utils.c:578
#4  sodium_malloc (size=<optimized out>) at sodium/utils.c:610
#5  0x00005555555ec5fa in zbox::base::crypto::SafeBox<T>::new_empty ()
    at /home/theduke/.cargo/registry/src/github.com-1ecc6299db9ec823/zbox-0.8.3/src/base/crypto.rs:163
#6  0x000055555592178a in zbox::volume::storage::file::file_armor::FileArmor<T>::new (base=0x555555a74b60)
    at /home/theduke/.cargo/registry/src/github.com-1ecc6299db9ec823/zbox-0.8.3/src/volume/storage/file/file_armor.rs:183
#7  0x0000555555595cc9 in zbox::volume::storage::file::file::FileStorage::new (
    base=0x55555599d007)
    at /home/theduke/.cargo/registry/src/github.com-1ecc6299db9ec823/zbox-0.8.3/src/volume/storage/file/file.rs:40
#8  0x00005555555feaff in zbox::volume::storage::storage::parse_uri (uri=...)
    at /home/theduke/.cargo/registry/src/github.com-1ecc6299db9ec823/zbox-0.8.3/src/volume/storage/storage.rs:45
#9  0x00005555555fec48 in zbox::volume::storage::storage::Storage::new (uri=...)
    at /home/theduke/.cargo/registry/src/github.com-1ecc6299db9ec823/zbox-0.8.3/src/volume/storage/storage.rs:144
#10 0x00005555555b328f in zbox::volume::volume::Volume::new (uri=...)
    at /home/theduke/.cargo/registry/src/github.com-1ecc6299db9ec823/zbox-0.8.3/src/volume/volume.rs:42
#11 0x00005555555fc91d in zbox::fs::fs::Fs::open (uri=..., pwd=...,
--Type <RET> for more, q to quit, c to continue without paging--
     at /home/theduke/.cargo/registry/src/github.com-1ecc6299db9ec823/zbox-0.8.3/src/fs/fs.rs:162
#12 0x000055555559087f in zbox::repo::Repo::open (uri=..., pwd=..., read_only=false) at /home/theduke/.cargo/registry/src/github.com-1ecc6299db9ec823/zbox-0.8.3/src/repo.rs:682
#13 0x0000555555590266 in zbox::repo::RepoOpener::open (self=0x7fffffffe5b8, uri=..., pwd=...) at /home/theduke/.cargo/registry/src/github.com-1ecc6299db9ec823/zbox-0.8.3/src/repo.rs:252
#14 0x0000555555587520 in zbox_repro::main () at src/main.rs:2

Cow deref() method is not correct

The Deref implementation of Cow isn't correct. It returns whatever the current switch points to, regardless of whether the inner object of the cow has been copied. If the inner object has been copied by mutation, an in-thread deref() should point to the copy, while an out-of-thread deref() should still return the original.

Pre-bundled libsodium

Using zbox could be a little easier if libsodium were downloaded and built during cargo build. Although simple, installing libsodium is an extra configuration step, and it introduces complexity for local development and automated CI builds.

The rust_sodium project does this in a custom build script.

Zbox could offer an optional pre-bundled version behind a feature flag, e.g. cargo build --features libsodium-bundled, which would download and build a recent stable release of libsodium.

Absolute file URLs on Windows

I'm able to use the file storage on Windows 10 with URLs like file://../data/foo, but file://D:/data/foo fails on the open call with the following error message:

thread 'main' panicked at 'opening archive failed: Io(Os { code: 123, kind: Other, message: "The filename, directory name, or volume label syntax is incorrect." })', src\libcore\result.rs:1189:5

I'm generating the file URLs using Url::from_file_path, so the syntax should be fine. I can also paste the URL into Chrome and it opens the folder view.
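The likely ambiguity here is that in a generic URL, the text after `file://` is an authority (host), so `D:` in `file://D:/data/foo` parses as a host name rather than a drive letter. A hedged sketch of one way a URI-to-path conversion could accept both forms; zbox's actual parser may differ, and `file_uri_to_path` is a hypothetical helper:

```rust
// Convert a file:// URI into a filesystem path, tolerating Windows drive
// letters. Illustrative only; not zbox's actual URI handling.
fn file_uri_to_path(uri: &str) -> Option<String> {
    let rest = uri.strip_prefix("file://")?;
    let b = rest.as_bytes();
    // `file:///D:/data/foo` (the form Url::from_file_path produces) leaves
    // `/D:/data/foo`; drop the leading slash before the drive letter.
    if b.len() >= 3 && b[0] == b'/' && b[1].is_ascii_alphabetic() && b[2] == b':' {
        return Some(rest[1..].to_string());
    }
    // Relative forms like `file://../data/foo` pass through unchanged.
    Some(rest.to_string())
}

fn main() {
    assert_eq!(
        file_uri_to_path("file:///D:/data/foo").as_deref(),
        Some("D:/data/foo")
    );
    assert_eq!(
        file_uri_to_path("file://../data/foo").as_deref(),
        Some("../data/foo")
    );
}
```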

Password Reset

How do I reset the password for a ZboxFS repository if I need to change it? I looked for a solution in the docs but there is none.

Deeper explanation in website or docs

The current explanation on the website is shallow and does not go deep into the internal workings of the library. For instance, "Multiple storages, including memory, OS file system, RDBMS, key-value object store" does not explain how to interact with storage systems such as memory or an RDBMS.

Another example: "State-of-the-art cryptography: AES-256-GCM (hardware), XChaCha20-Poly1305, Argon2 and etc." does not explain which cryptography is used by default or how it all fits together.

It would be great if you explained more, or provided a document detailing the internal architecture of zbox.

Also, the chat on Gitter seems abandoned.

Why is Zbox so slow?

The benchmark results in the README show that Zbox's read/write throughput is an order of magnitude lower than the baseline. My impression is that AES-GCM can easily achieve a throughput of over 1 GB/s, so encryption/decryption should not be the root of the problem. What, then, is the performance bottleneck? Is it the Merkle trees (if Zbox uses them internally)? Is improving performance at the top of your TODO list?

I believe Zbox is a very good idea, and it could be even more useful if the performance overhead were reduced to a minimum.

There are no design docs about the internals of Zbox, and I haven't had the time to read the source code, so I hope you can shed some light on the performance issue. Thanks.

ZBox's claims are misleading and inaccurate. Please consider rewording the claims about lack of leakage.

SpiderOak made the same mistake with its misuse of the term "zero knowledge" and has since migrated to "no knowledge", which hasn't fixed the semantic issue but has avoided the overlap of terms, by coining another inaccurate term, which you've adopted.

An extract from https://zbox.io/about/ , emphasis mine.

Only encryption is not enough to protect privacy. In addition to strong encryption, Zbox also stores all data in a number of same-sized blocks, which leaves no details to any underlying storages, including Zbox cloud storage. By using this method, only [the] application itself can hold the key and have access to their data.

"Same-sized blocks" may reduce some leakage but this does not imply that nothing remains leaked.

  1. The server knows which users interact with the service and when.
  2. The server knows which blocks the user reads.
  3. The server knows which blocks the user writes.

To prevent (1), users must identify themselves to the service using a credential that demonstrates their authority without revealing their identity.

To prevent (2), a Private Information Retrieval (PIR) or Oblivious Transfer (OT) protocol must be used, so that the server learns neither which blocks the client asks for nor which values it responds with. PIR may reveal other blocks, potentially owned by other users; even if they are encrypted, this is probably not desired, which limits (2) to OT.

To prevent (3), the dual of PIR (whose name I've forgotten) may be used, enabling a client to upload a block without revealing to the server which block it has pushed into the set.

As far as I've read, there is no mention of PIR, OT, or any form of anonymous credentials in ZBox's documentation. When users read your claims, they may be led into a false sense of security by interpreting what you've actually written rather than what you intended.

Please consider rewording these claims, or invest in the cryptographic protocols that make them true. Obviously there are costs in the latter, and the choice of protocol depends on how many "cloud" replicas exist and who controls them. For example, if there are several replicas owned by distinct parties, you can use some of the cheaper PIR protocols that rely on a fraction of the servers not colluding. If you have only one server, you'll need CPIR (computationally bounded), which builds on Diffie-Hellman, or a computationally bounded OT.


BTW: The /about/ page says "ZBos" instead of "ZBox" in the 5th paragraph.

File content span split should be on chunk boundary

Currently a file content span cannot be split as a whole, which causes a lot of wasted space, especially when a span is large and crosses many chunks. For example, calling set_len(1) on a 16MB file with a one-version limit will not shrink the segment, because the whole span still references it.

To solve this, we need to go deeper than the span, down to the chunk level, so that a span can be split on a chunk boundary. This reduces unnecessary chunk references and makes segment shrinking possible.
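The chunk-boundary arithmetic can be sketched as follows. The `Span` type, fixed 4 KB chunk size, and `truncate_span` helper are illustrative assumptions, not zbox's actual layout; the sketch only shows that a cut rounded up to the next chunk boundary keeps the partially used chunk and frees everything after it.

```rust
// Illustrative chunk size; zbox's real chunking is content-defined.
const CHUNK_SIZE: u64 = 4096;

// A contiguous byte range within segment data (end is exclusive).
#[derive(Debug, PartialEq)]
struct Span {
    begin: u64,
    end: u64,
}

// Truncate a span to `new_len` bytes and report the first chunk index that
// no longer holds live data and can therefore be reclaimed.
fn truncate_span(span: &Span, new_len: u64) -> (Span, u64) {
    let new_end = span.begin + new_len.min(span.end - span.begin);
    // Round the cut up to the next chunk boundary: the partially used chunk
    // must be kept, everything after it can be released.
    let first_free_chunk = (new_end + CHUNK_SIZE - 1) / CHUNK_SIZE;
    (Span { begin: span.begin, end: new_end }, first_free_chunk)
}

fn main() {
    let span = Span { begin: 0, end: 16 * 1024 * 1024 }; // 16 MB = 4096 chunks
    let (kept, first_free) = truncate_span(&span, 1);
    assert_eq!(kept, Span { begin: 0, end: 1 });
    assert_eq!(first_free, 1); // chunks 1..4096 become reclaimable
}
```

In the set_len(1) example from the issue, all but the first chunk could then be dereferenced, allowing the segment to shrink.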

Concurrency

It seems like zbox does not currently allow for any multi-threaded concurrency at all.

Sadly this limits usage to very simple use cases.

Are there plans for introducing concurrency?

A "simple" first step could be to allow concurrent reads, with a global write lock. Although I'd really love to have concurrent reads and writes as well.
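The "concurrent reads with a global write lock" step maps directly onto a readers-writer lock. A minimal sketch of that locking discipline using std's RwLock, standing in for zbox's internals (which this is not):

```rust
use std::sync::{Arc, RwLock};
use std::thread;

fn main() {
    // Shared filesystem-like state guarded by a readers-writer lock.
    let fs = Arc::new(RwLock::new(vec![0u8; 4]));

    // Several reader threads can hold read guards simultaneously.
    let readers: Vec<_> = (0..4)
        .map(|_| {
            let fs = Arc::clone(&fs);
            thread::spawn(move || fs.read().unwrap().len())
        })
        .collect();
    for r in readers {
        assert_eq!(r.join().unwrap(), 4);
    }

    // A write guard is exclusive: it waits until all readers are done.
    fs.write().unwrap().push(42);
    assert_eq!(fs.read().unwrap().len(), 5);
}
```

Fully concurrent reads and writes would need something finer-grained (per-file locks or MVCC on top of zbox's versioning), which is a much larger change than the global RwLock shown here.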

Zbox characteristics as a filesystem

Most of the README is about encryption and security, plus a little about backend support.

But I don't see typical filesystem metrics such as maximum file size, maximum filesystem size, maximum number of files, overhead, fragmentation resilience, and so on.

Should the README also mention those parameters?

For example, is zbox suitable for storing a very large number of small files? How slow is it to delete those files afterwards?

Unable to build on Windows 10 using MSVC toolchain

I tried using this in a toy project on Windows and got this error when trying to build both the latest release on crates.io and the master branch:

PS > cargo build --features libsodium-bundled

error: failed to run custom build command for `zbox v0.7.1 (R:\zbox_master)`
Caused by:
  process didn't exit successfully: `R:/cargo_build/debug\build\zbox-4db8f77a6c10c304\build-script-build` (exit code: 101)
--- stderr
thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: FileNotFound', src\libcore\result.rs:999:5
note: Run with `RUST_BACKTRACE=1` environment variable to display a backtrace.

The relevant part of the backtrace (line numbers reference master branch as of today):

  10: core::result::unwrap_failed<zip::result::ZipError>
             at /rustc/3a5d62bd5675ddd406a5b93d630ba1ddced91777\src\libcore\macros.rs:18
  11: core::result::Result<zip::read::ZipFile, zip::result::ZipError>::unwrap<zip::read::ZipFile,zip::result::ZipError>
             at /rustc/3a5d62bd5675ddd406a5b93d630ba1ddced91777\src\libcore\result.rs:800
  12: build_script_build::download_and_install_libsodium
             at .\build.rs:361
  13: build_script_build::main
             at .\build.rs:26
  14: std::rt::lang_start::{{closure}}<()>
             at /rustc/3a5d62bd5675ddd406a5b93d630ba1ddced91777\src\libstd\rt.rs:64
  15: std::rt::lang_start_internal::{{closure}}
             at /rustc/3a5d62bd5675ddd406a5b93d630ba1ddced91777\/src\libstd\rt.rs:49

The offending line is:

#[cfg(target_arch = "x86_64")]
let mut lib = zip
        .by_name("x64/Release/v142/static/libsodium.lib")
        .unwrap();

I forked the repo and fixed this issue. In doing so I encountered another problem: it would only compile if cargo ran in a shell with all the MSVC variables set (the x64 Native Tools Command Prompt for VS 2019). In my fork I fixed this as well by compiling liblz4 with the cc crate instead of invoking cl directly.

I'm new to contributing so I'm not too familiar with opening a pull request, but I'll give it a shot.

Add create_new() to RepoOpener

RepoOpener should have a create_new() function similar to the one on OpenOptions for File: the call fails if a repo already exists at the target location.
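The model for the requested semantics is std::fs::OpenOptions::create_new, which refuses to open an existing file. The demo below uses std's version (the temp-file path is an assumption of the demo); the proposed RepoOpener::create_new would behave analogously for repos.

```rust
use std::fs::OpenOptions;
use std::io::ErrorKind;

fn main() {
    let path = std::env::temp_dir().join("zbox_create_new_demo");
    let _ = std::fs::remove_file(&path);

    // First call succeeds: nothing exists at the path yet.
    OpenOptions::new()
        .write(true)
        .create_new(true)
        .open(&path)
        .unwrap();

    // Second call must fail: create_new refuses an existing target.
    let err = OpenOptions::new()
        .write(true)
        .create_new(true)
        .open(&path)
        .unwrap_err();
    assert_eq!(err.kind(), ErrorKind::AlreadyExists);

    std::fs::remove_file(&path).unwrap();
}
```

The existence check and the creation are atomic in create_new, which is what distinguishes it from checking for the repo first and then calling create(true).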

no std support?

I am really interested in using zbox as the default file system for my OS, but can it be used in no_std environments? I took a look at a couple of pieces of your code and saw no no_std indicator anywhere. Your project claims to be embeddable, but it isn't truly embeddable if it can't be used in those environments. If this is a project GOAL, great, and I would be willing to help to the best of my ability. If not, oh well. Great work so far though! :)

Unable to open Sqlite storage

Hi!

I can create a SQLite backend file, open some files in it, read them, and close it. But I am not able to reopen the same SQLite file: zbox crashes with the error given below, somewhere in a frame below the RepoOpener call.

To reproduce, simply execute the attached Rust program twice.

It seems that I am able to use the file backend but not the sqlite backend: if I replace sqlite://./myRepoSqlite with file://./myRepo it works. 🤔

use std::io::prelude::*;
use std::io::{Seek, SeekFrom};
use zbox::{init_env, OpenOptions, RepoOpener};

fn main() {
    // initialise zbox environment, called first
    init_env();

    // create and open a repository
    let mut repo = RepoOpener::new()
        .create(true)
        .open("sqlite://./myRepoSqlite", "your password")
        .unwrap();

    // List files
    let things = repo.read_dir("/").unwrap();
    println!("{:?}", things);

    {
        // create and open a file for writing
        let mut file = OpenOptions::new()
            .create(true)
            .open(&mut repo, "/my_file.txt")
            .unwrap();
        
        file.seek(SeekFrom::End(0)).unwrap();

        // use std::io::Write trait to write data into it
        file.write_all(b"Hello, world!\n").unwrap();

        // finish writing to make a permanent content version
        file.finish().unwrap();

        // read file content using std::io::Read trait
        let mut content = String::new();
        file.seek(SeekFrom::Start(0)).unwrap();
        file.read_to_string(&mut content).unwrap();

        println!("{}", content);
    }
}

Dependencies:

[dependencies]
zbox = {version="0.8.2", features=["storage-file", "storage-sqlite"]}

Error:

RUST_BACKTRACE=1 cargo run
    Finished dev [unoptimized + debuginfo] target(s) in 0.05s
     Running `target/debug/crypting`
thread 'main' panicked at 'index out of bounds: the len is 0 but the index is 0', /rustc/a53f9df32fbb0b5f4382caaad8f1a46f36ea887c/src/libcore/slice/mod.rs:2695:10
stack backtrace:
   0: std::sys::unix::backtrace::tracing::imp::unwind_backtrace
             at src/libstd/sys/unix/backtrace/tracing/gcc_s.rs:39
   1: std::sys_common::backtrace::_print
             at src/libstd/sys_common/backtrace.rs:71
   2: std::panicking::default_hook::{{closure}}
             at src/libstd/sys_common/backtrace.rs:59
             at src/libstd/panicking.rs:197
   3: std::panicking::default_hook
             at src/libstd/panicking.rs:211
   4: std::panicking::rust_panic_with_hook
             at src/libstd/panicking.rs:474
   5: std::panicking::continue_panic_fmt
             at src/libstd/panicking.rs:381
   6: rust_begin_unwind
             at src/libstd/panicking.rs:308
   7: core::panicking::panic_fmt
             at src/libcore/panicking.rs:85
   8: core::panicking::panic_bounds_check
             at src/libcore/panicking.rs:61
   9: <usize as core::slice::SliceIndex<[T]>>::index
             at /rustc/a53f9df32fbb0b5f4382caaad8f1a46f36ea887c/src/libcore/slice/mod.rs:2695
  10: core::slice::<impl core::ops::index::Index<I> for [T]>::index
             at /rustc/a53f9df32fbb0b5f4382caaad8f1a46f36ea887c/src/libcore/slice/mod.rs:2552
  11: <alloc::vec::Vec<T> as core::ops::index::Index<I>>::index
             at /rustc/a53f9df32fbb0b5f4382caaad8f1a46f36ea887c/src/liballoc/vec.rs:1687
  12: <zbox::volume::storage::sqlite::sqlite::SqliteStorage as zbox::volume::storage::Storable>::get_super_block
             at /home/stefano/.cargo/registry/src/github.com-1ecc6299db9ec823/zbox-0.8.2/src/volume/storage/sqlite/sqlite.rs:329
  13: zbox::volume::storage::storage::Storage::get_super_block
             at /home/stefano/.cargo/registry/src/github.com-1ecc6299db9ec823/zbox-0.8.2/src/volume/storage/storage.rs:196
  14: zbox::volume::super_block::SuperBlk::load_arm
             at /home/stefano/.cargo/registry/src/github.com-1ecc6299db9ec823/zbox-0.8.2/src/volume/super_block.rs:131
  15: zbox::volume::super_block::SuperBlk::load
             at /home/stefano/.cargo/registry/src/github.com-1ecc6299db9ec823/zbox-0.8.2/src/volume/super_block.rs:159
  16: zbox::volume::volume::Volume::open
             at /home/stefano/.cargo/registry/src/github.com-1ecc6299db9ec823/zbox-0.8.2/src/volume/volume.rs:95
  17: zbox::fs::fs::Fs::open
             at /home/stefano/.cargo/registry/src/github.com-1ecc6299db9ec823/zbox-0.8.2/src/fs/fs.rs:167
  18: zbox::repo::Repo::open
             at /home/stefano/.cargo/registry/src/github.com-1ecc6299db9ec823/zbox-0.8.2/src/repo.rs:682
  19: zbox::repo::RepoOpener::open
             at /home/stefano/.cargo/registry/src/github.com-1ecc6299db9ec823/zbox-0.8.2/src/repo.rs:247
  20: crypting::main
             at src/main.rs:10
  21: std::rt::lang_start::{{closure}}
             at /rustc/a53f9df32fbb0b5f4382caaad8f1a46f36ea887c/src/libstd/rt.rs:64
  22: std::panicking::try::do_call
             at src/libstd/rt.rs:49
             at src/libstd/panicking.rs:293
  23: __rust_maybe_catch_panic
             at src/libpanic_unwind/lib.rs:85
  24: std::rt::lang_start_internal
             at src/libstd/panicking.rs:272
             at src/libstd/panic.rs:394
             at src/libstd/rt.rs:48
  25: std::rt::lang_start
             at /rustc/a53f9df32fbb0b5f4382caaad8f1a46f36ea887c/src/libstd/rt.rs:64
  26: main
  27: __libc_start_main
  28: _start

Environment information:

$ rustc -V && cargo version && uname -a && lsb_release -d
rustc 1.36.0 (a53f9df32 2019-07-03)
cargo 1.36.0 (c4fcfb725 2019-05-15)
Linux STEFANO-PC 5.1.15-arch1-1-ARCH #1 SMP PREEMPT Tue Jun 25 04:49:39 UTC 2019 x86_64 GNU/Linux
Description:	Arch Linux

Zipped Sqlite file for inspection:
myRepoSqlite.zip

Add version_limit option for RepoOpener

RepoOpener should have a version_limit option, which provides a global limit on the maximum number of versions for all files in the repo. Individual files can still use OpenOptions::version_limit to override the global setting.

Need a method can copy directory recursively

Currently, if you want to copy an entire directory tree, you have to write your own implementation, which may not be optimal. We need a native method on Repo that can do the work, like this:

repo.copy_dir_all("/foo", "/bar");
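For illustration, here is what such a recursive copy looks like written against std::fs. This is only a sketch of the algorithm users currently have to hand-roll; a native Repo::copy_dir_all would operate on repo paths and could additionally batch the whole copy in one transaction.

```rust
use std::fs;
use std::io;
use std::path::Path;

// Recursively copy a directory tree from `src` to `dst`.
fn copy_dir_all(src: &Path, dst: &Path) -> io::Result<()> {
    fs::create_dir_all(dst)?;
    for entry in fs::read_dir(src)? {
        let entry = entry?;
        let target = dst.join(entry.file_name());
        if entry.file_type()?.is_dir() {
            copy_dir_all(&entry.path(), &target)?;
        } else {
            fs::copy(entry.path(), &target)?;
        }
    }
    Ok(())
}

fn main() -> io::Result<()> {
    let base = std::env::temp_dir().join("copy_dir_all_demo");
    let (src, dst) = (base.join("foo"), base.join("bar"));
    let _ = fs::remove_dir_all(&base);
    fs::create_dir_all(src.join("sub"))?;
    fs::write(src.join("sub/a.txt"), b"hello")?;

    copy_dir_all(&src, &dst)?;
    assert_eq!(fs::read(dst.join("sub/a.txt"))?, b"hello");
    fs::remove_dir_all(&base)?;
    Ok(())
}
```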

File should not be accessible after repo is closed

Currently an opened file is still accessible even after the repo is closed. The reason is that an opened file holds a strong reference to the underlying storage object, which prevents the storage object from being released. A closed repo must not allow any access through open files. This can be achieved by having open files hold a weak reference to the storage object.
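The proposed ownership change can be sketched with std's Arc/Weak pair. The `Storage`, `Repo`, and `File` types here are illustrative stand-ins for zbox's internals: the repo keeps the only strong reference, files must upgrade their weak reference on every operation, and the upgrade fails once the repo is dropped.

```rust
use std::sync::{Arc, Weak};

struct Storage;

// The repo owns the storage (the only strong reference).
struct Repo {
    storage: Arc<Storage>,
}

// An open file only weakly references the storage.
struct File {
    storage: Weak<Storage>,
}

impl File {
    fn read(&self) -> Result<(), &'static str> {
        match self.storage.upgrade() {
            Some(_storage) => Ok(()), // repo still open, storage alive
            None => Err("repo closed"),
        }
    }
}

fn main() {
    let repo = Repo { storage: Arc::new(Storage) };
    let file = File { storage: Arc::downgrade(&repo.storage) };

    assert!(file.read().is_ok());
    drop(repo); // closing the repo drops the last strong reference
    assert_eq!(file.read(), Err("repo closed"));
}
```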

Memory storage should be persistent during process lifetime

The memory storage is currently not identified by its URI; that is, each time repo.open is called, a new memory storage is created regardless of the URI. An in-memory 'persistent' store should be used as the back end of the memory storage, so that a URI-specific memory storage can be opened multiple times during the process lifetime.

For example, the two open calls below should open the same memory storage.

{
  let repo = RepoOpener::new().create(true).open("mem://foo", "pwd");
}
{
  let repo = RepoOpener::new().create(true).open("mem://foo", "pwd");
}
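One way to get process-lifetime persistence is a global registry keyed by URI. A minimal sketch under assumptions: `MemStore` stands in for zbox's in-memory storage, and `open_mem_storage` for whatever the mem:// backend's open path is.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex, OnceLock};

// Stand-in for the in-memory storage backing a mem:// repo.
type MemStore = Mutex<Vec<u8>>;

// Process-lifetime registry: URI -> shared storage.
fn registry() -> &'static Mutex<HashMap<String, Arc<MemStore>>> {
    static REGISTRY: OnceLock<Mutex<HashMap<String, Arc<MemStore>>>> = OnceLock::new();
    REGISTRY.get_or_init(|| Mutex::new(HashMap::new()))
}

// Opening the same URI twice returns the same backing store.
fn open_mem_storage(uri: &str) -> Arc<MemStore> {
    let mut map = registry().lock().unwrap();
    map.entry(uri.to_string())
        .or_insert_with(|| Arc::new(Mutex::new(Vec::new())))
        .clone()
}

fn main() {
    {
        let store = open_mem_storage("mem://foo");
        store.lock().unwrap().extend_from_slice(b"persisted");
    } // this handle is dropped, but the registry keeps the store alive

    let store = open_mem_storage("mem://foo");
    assert_eq!(store.lock().unwrap().as_slice(), b"persisted");
}
```

Because the registry holds an Arc, the data outlives individual repo handles and is only released when the process exits.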

WebAssembly support

Hey there! As far as I can see, ZboxFS has its own ways of using the native FS and other services, which makes me wonder whether it can be successfully compiled to WASM and used as an in-memory secure file system. Are there any plans for this?

Thanks!

zbox is unable to successfully handle a machine failure

As this filesystem aims to provide ACID functionality, I tested whether it can handle a full machine failure while a write is in progress. My test shows that zbox fails to even open the repository after such a failure.

thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: InvalidSuperBlk', libcore/result.rs:1009:5
stack backtrace:
   0: std::sys::unix::backtrace::tracing::imp::unwind_backtrace
             at libstd/sys/unix/backtrace/tracing/gcc_s.rs:49
   1: std::sys_common::backtrace::print
             at libstd/sys_common/backtrace.rs:71
             at libstd/sys_common/backtrace.rs:59
   2: std::panicking::default_hook::{{closure}}
             at libstd/panicking.rs:211
   3: std::panicking::default_hook
             at libstd/panicking.rs:227
   4: std::panicking::rust_panic_with_hook
             at libstd/panicking.rs:476
   5: std::panicking::continue_panic_fmt
             at libstd/panicking.rs:390
   6: rust_begin_unwind
             at libstd/panicking.rs:325
   7: core::panicking::panic_fmt
             at libcore/panicking.rs:77
   8: core::result::unwrap_failed
             at /home/buildozer/aports/community/rust/src/rustc-1.31.1-src/src/libcore/macros.rs:26
   9: <core::result::Result<T, E>>::unwrap
             at /home/buildozer/aports/community/rust/src/rustc-1.31.1-src/src/libcore/result.rs:808
  10: zbox_fail_test::main
             at src/main.rs:27
  11: std::rt::lang_start::{{closure}}
             at /home/buildozer/aports/community/rust/src/rustc-1.31.1-src/src/libstd/rt.rs:74
  12: std::panicking::try::do_call
             at libstd/rt.rs:59
             at libstd/panicking.rs:310
  13: __rust_maybe_catch_panic
             at libpanic_unwind/lib.rs:102
  14: std::rt::lang_start_internal
             at libstd/panicking.rs:289
             at libstd/panic.rs:392
             at libstd/rt.rs:58
  15: std::rt::lang_start
             at /home/buildozer/aports/community/rust/src/rustc-1.31.1-src/src/libstd/rt.rs:74
  16: main

I tried to make the test as reproducible as possible, so you can recreate it on your own machine. The machine failure is simulated by forcefully shutting down a virtual machine while a zbox program is writing.

Is recovery from a full machine failure not supported yet, or am I using the zbox API in the wrong way? (main.rs)

Inconsistent `rename` behaviour

Hi @burmecia,

just a quick hint: Repo.rename is described in the documentation as overwriting the destination if it already exists, while the code actually returns Err(AlreadyExists).

You can test with the following cargo script:

// cargo-deps: zbox="*"
extern crate zbox;

fn main() {
    zbox::init_env();

    let mut repo = zbox::RepoOpener::new()
        .create(true)
        .open("file://./my_repo", "your password")
        .unwrap();

    let mut file = zbox::OpenOptions::new()
        .create(true)
        .open(&mut repo, "/my_file.txt")
        .unwrap();
    file.write_once(b"My file").unwrap();

    let mut file = zbox::OpenOptions::new()
        .create(true)
        .open(&mut repo, "/other_file.txt")
        .unwrap();
    file.write_once(b"Other file").unwrap();


    let res = repo.rename("/my_file.txt", "/other_file.txt");
    println!("{:?}", res)
}

Repo is opened: trivially easy to lock yourself out

fn main() {
   zbox::RepoOpener::new().create(true).open("file://./data", "password").unwrap();
}

cargo watch -x run

Cargo watch will immediately restart due to the data directory changing.

This results in an Error::RepoOpened error on the second run, due to the repo.lock file, without the API offering any way to recover (other than manually deleting the lock file).

The lock file should probably store the process id and delete itself if the recorded process is no longer running.

(Of course, watch should ignore the data directory in real use, but this demonstrates a problem.)
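The suggested PID-based lock can be sketched as below. Heavy caveats apply: `owner_is_alive` is stubbed here (a real implementation would probe the PID, e.g. kill(pid, 0) on Unix), PID reuse makes any such check racy, and `try_lock` is a hypothetical helper, not zbox's API.

```rust
use std::fs;
use std::path::Path;
use std::process;

// Stub liveness check: only "alive" if the PID is this very process.
// A real check would ask the OS whether the recorded PID still runs.
fn owner_is_alive(pid: u32) -> bool {
    pid == process::id()
}

// Acquire the lock, reclaiming it if the recorded owner is dead.
// Returns Ok(true) if we now hold the lock, Ok(false) if it is held.
fn try_lock(path: &Path) -> std::io::Result<bool> {
    if let Ok(contents) = fs::read_to_string(path) {
        match contents.trim().parse::<u32>() {
            Ok(pid) if owner_is_alive(pid) => return Ok(false), // genuinely held
            _ => fs::remove_file(path)?, // stale lock from a dead process
        }
    }
    fs::write(path, process::id().to_string())?;
    Ok(true)
}

fn main() -> std::io::Result<()> {
    let lock = std::env::temp_dir().join("repo_lock_demo");
    let _ = fs::remove_file(&lock);

    // Simulate a crashed process that left a lock behind (PID 0 never runs).
    fs::write(&lock, "0")?;
    assert!(try_lock(&lock)?); // stale lock is reclaimed

    assert!(!try_lock(&lock)?); // now held by a live (this) process
    fs::remove_file(&lock)?;
    Ok(())
}
```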

New versions are merging old and new file contents

I've tested this with 0.7.1, 0.8.1, and current master, and it happens in all three cases.

The problem is that when you write a new version of a file that is shorter than the previous version, you end up with merged content: the beginning of the file is overwritten but the old tail remains.

The following is a simple reproduction case:

use std::io::{Read, Write};
use zbox::{init_env, Error, OpenOptions, RepoOpener};

fn main() -> Result<(), Error> {
    init_env();
    let mut repo = RepoOpener::new()
        .create(true)
        .open("file://./my_repo", "your password")?;

    let mut f = OpenOptions::new()
        .create(true)
        .open(&mut repo, "/myfile.txt")?;
    let mut pdf = std::fs::File::open("mylargefile.txt").unwrap();
    let mut buffer = Vec::new();
    pdf.read_to_end(&mut buffer)?;
    f.write_all(&buffer)?;
    f.finish()?;
    println!(
        "Long: version={} - size={}",
        f.curr_version()?,
        f.metadata()?.content_len()
    );

    let mut f = OpenOptions::new()
        .create(true)
        .open(&mut repo, "/myfile.txt")?;
    let mut pdf = std::fs::File::open("myfile.txt").unwrap();
    let mut buffer = Vec::new();
    pdf.read_to_end(&mut buffer)?;
    f.write_all(&buffer)?;
    f.finish()?;

    println!(
        "Short: version={} - size={}",
        f.curr_version()?,
        f.metadata()?.content_len()
    );

    let mut f = OpenOptions::new()
        .create(false)
        .open(&mut repo, "/myfile.txt")?;
    let mut out = std::fs::File::create("myfile.out").unwrap();
    let mut out_buf = Vec::new();
    f.read_to_end(&mut out_buf)?;
    out.write_all(&out_buf).unwrap();
    println!(
        "Reading: version={} - size={}",
        f.curr_version()?,
        f.metadata()?.content_len()
    );
    Ok(())
}

If you create two files, myfile.txt and mylargefile.txt, where mylargefile.txt has more lines than myfile.txt, you'll see the resulting merged output in myfile.out. (Note: I'm using text files in this example, but I originally saw this when trying to overwrite a PDF file with a text file.)

The output also shows that the size isn't changing between versions:

Long: version=2 - size=37
Short: version=3 - size=37
Reading: version=3 - size=37

Looking through the source I couldn't pinpoint the exact cause, but I suspect the issue is in the merging done by the Writer's finish implementation.

Issue with file reader after using `set_len`

Hi @burmecia,

another issue I stumbled upon (this is the last test case that fails against the pyfilesystem test suite): when a reader already exists for a file, extending the file with set_len and then reading it (for instance with read_to_end) does not take the newly added bytes into account.

Here's a test case showing that file.read_to_end reads the same file content twice, although the call to set_len should have extended the file:

// cargo-deps: zbox="*"
extern crate zbox;

use ::std::io::Read;
use ::std::io::Seek;
use ::std::io::SeekFrom;

fn main() {
    ::zbox::init_env();

    let mut repo = zbox::RepoOpener::new()
        .create(true)
        .open("mem://", "your password")
        .unwrap();

    let mut file = zbox::OpenOptions::new()
        .create(true)
        .write(true)
        .read(true)
        .open(&mut repo, "/my_file.txt")
        .unwrap();
    file.write_once(b"My file").unwrap();
    file.seek(SeekFrom::Start(0)).unwrap();

    let mut res = Vec::new();
    file.read_to_end(&mut res).unwrap();
    println!("{:?}", res);

    file.set_len(15).unwrap();
    file.seek(SeekFrom::Start(0)).unwrap();

    let mut res2 = Vec::new();
    file.read_to_end(&mut res2).unwrap();
    println!("{:?}", res2);
}

segment data shrink doesn't work

When a large number of chunks are deleted from a segment, the segment data needs to be shrunk. But the current segment data shrink doesn't work properly: it always results in empty segment data.

"zero knowledge" does not seem to be the right term

The project's description says

Zbox is a zero-knowledge, privacy-focused embeddable file system.

This seems to be false: there is nothing "zero knowledge" about this (or any) filesystem. Zero knowledge is a property of a proof; informally, it means that the proof reveals nothing except its correctness [1].

I think the term you're looking for is something closer to "semantic security" [2], meaning that the filesystem does not leak anything about the contents except their length.

I am aware that several other projects currently misuse this terminology in the same way you are. It's harmful to your users (in that it misinforms them), to the community (in that it confuses useful pieces of terminology), and to your project (in that it looks unschooled) to follow their example.

[1] https://en.wikipedia.org/wiki/Zero-knowledge_proof
[2] https://en.wikipedia.org/wiki/Semantic_security
