Comments (5)
Hi James,
Would you be able to look at #124 as it might do what you want?
Regards, David
from zbackup.
I'm talking about a new type of backup instruction, one that enables an internal reference in the backup stream to a partial chunk. What you've referenced addresses a range of the expanded backup. They are separate concepts.
from zbackup.
Could you please explain how zbackup would be able to determine whether the current ring buffer data is part of an already saved chunk, without a rolling hash for every possible sub-block of a chunk?
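To make the question concrete, here is a minimal sketch of full-window matching. It is a toy model under stated assumptions, not zbackup's code: the polynomial hash, the 8-byte `CHUNK` size, and the names `full_hash` and `find_full_chunk_matches` are all hypothetical. It shows that an index keyed only on whole-chunk hashes finds a stored chunk when the whole chunk reappears, but has nothing to look up when only part of it does.

```python
# Toy model only: a simple polynomial rolling hash, not zbackup's actual one.
CHUNK = 8                 # hypothetical chunk size in bytes
BASE = 257
MOD = (1 << 61) - 1


def full_hash(block: bytes) -> int:
    """Hash a whole block from scratch."""
    h = 0
    for b in block:
        h = (h * BASE + b) % MOD
    return h


def find_full_chunk_matches(data: bytes, index: dict) -> list:
    """Return offsets where a CHUNK-sized window equals an indexed chunk."""
    if len(data) < CHUNK:
        return []
    top = pow(BASE, CHUNK - 1, MOD)   # weight of the byte leaving the window
    h = full_hash(data[:CHUNK])
    hits = []
    for i in range(len(data) - CHUNK + 1):
        window = data[i:i + CHUNK]
        if index.get(h) == window:    # confirm, since rolling hashes can collide
            hits.append(i)
        if i + CHUNK < len(data):
            # Slide by one byte: drop data[i], append data[i + CHUNK].
            h = ((h - data[i] * top) * BASE + data[i + CHUNK]) % MOD
    return hits


stored = b"ABCDEFGH"
index = {full_hash(stored): stored}

whole = find_full_chunk_matches(b"xxABCDEFGHyy", index)   # whole chunk present
partial = find_full_chunk_matches(b"xxABCDEyy", index)    # only a 5-byte prefix present
print(whole, partial)  # [2] []
```

Matching the 5-byte prefix in the second input would need some extra index over sub-blocks (or over partial chunks that were actually emitted), which is the cost the question is asking about.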
from zbackup.
I see one optimization: in handleMoreData, zbackup could check whether it has a partial chunk (emitted from addChunkIfMatched when there is already data in chunkToSave), instead of only checking for chunks with a matching hash once there is a full chunk in the ring buffer. But this might cause problems if a partial chunk is a prefix of another chunk: zbackup would never check whether the incoming data matches the bigger chunk, because it would always deduplicate the prefix. So I don't know whether this increases or decreases the backup size.
This might also be compatible with the current file format.
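The prefix concern can be illustrated with a small toy encoder. This is only a sketch of the worry described above, not zbackup's matching logic: `greedy_dedup`, the instruction tuples, and the sample chunks are all hypothetical, and the "take the first stored chunk that matches" rule is an assumption chosen to reproduce the problem.

```python
def greedy_dedup(data: bytes, chunks: list) -> list:
    """Encode data as ("ref", chunk) or ("raw", byte) instructions.

    At each offset, take the FIRST stored chunk that matches, in list order.
    """
    out = []
    i = 0
    while i < len(data):
        for c in chunks:
            if data.startswith(c, i):
                out.append(("ref", c))
                i += len(c)
                break
        else:
            # No stored chunk matches here: emit one literal byte.
            out.append(("raw", data[i:i + 1]))
            i += 1
    return out


full = b"ABCDEFGH"
partial = b"ABCD"      # hypothetical partial chunk that is a prefix of `full`
data = b"ABCDEFGH"

prefix_first = greedy_dedup(data, [partial, full])
longest_first = greedy_dedup(data, [full, partial])

print(len(prefix_first))   # 5: one reference plus four raw bytes
print(len(longest_first))  # 1: a single reference to the full chunk
```

In this toy, checking the prefix first shadows the larger chunk and leaves the tail as literal data, while preferring the longer chunk encodes the same input as one reference. Whether the real effect on backup size is positive or negative would depend on the data, as the comment says.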
from zbackup.
It wouldn't be compatible with the current file format, I'm fairly sure, but yes, this is what I'm suggesting. My main reason for suggesting this optimization in zbackup is that when I first mentioned this to @dragonroot, he wasn't keen to have it included in zbackup itself, because zbackup wouldn't make use of it and it would therefore be difficult to test, whereas this optimization would certainly be implementable and testable.
However, I've had experience of zbackup stalling on data (which I later realised contained many, many copies of the same data over and over again). It now seems clear this was because it was repeatedly performing SHA sums on large pieces of data, as it kept finding rolling sum matches. If it could match partial chunks it would (a) not store so many duplicated chunks and (b) be able to optimize more sensibly when it does SHA checks, because it would have a choice of two efficient ways to handle overlapping chunks and would only have to perform an SHA sum once per chunk size (or so) in this case.
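The cost pattern described above can be sketched with a toy counter. This is not a reproduction of zbackup's control flow and not a measurement: the byte-sum weak hash, the 64-byte `CHUNK`, the use of SHA-256, and the function `count_sha_checks` are all assumptions made for illustration. It compares how many strong-hash computations occur on highly repetitive data when every weak-hash hit is verified at every offset, versus when a confirmed match lets the scanner skip ahead by one chunk.

```python
import hashlib

CHUNK = 64  # hypothetical chunk size in bytes


def count_sha_checks(data: bytes, stored: bytes, skip_on_match: bool) -> int:
    """Count strong-hash computations triggered by weak-hash hits.

    The weak hash is just the byte sum of the window (an assumption, and it is
    recomputed per window here rather than rolled, to keep the toy short).
    """
    weak_stored = sum(stored)
    strong_stored = hashlib.sha256(stored).digest()
    checks = 0
    i = 0
    while i + CHUNK <= len(data):
        window = data[i:i + CHUNK]
        if sum(window) == weak_stored:          # weak hash hit
            checks += 1                         # each hit hashes CHUNK bytes
            matched = hashlib.sha256(window).digest() == strong_stored
            if matched and skip_on_match:
                i += CHUNK                      # jump past the matched chunk
                continue
        i += 1
    return checks


data = b"\x00" * 1024       # many copies of the same content
stored = b"\x00" * CHUNK

every_offset = count_sha_checks(data, stored, skip_on_match=False)
once_per_chunk = count_sha_checks(data, stored, skip_on_match=True)
print(every_offset, once_per_chunk)  # 961 16
```

In this toy the per-offset variant performs one strong hash for nearly every byte of input, while the skipping variant performs roughly one per chunk, which is the "once per chunk size (or so)" behaviour the comment is hoping for.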
from zbackup.