Comments (9)
In btrfs, files can be deduplicated with other parts of themselves. So it's generally no problem and even a wanted feature if bees does that. And usually it works without an exception.
So is your question about that fact, or is it about the exception? @Zygo may be able to explain why that exception occurred. There's probably a corner case that the kernel does not support or that bees cannot resolve currently. I wouldn't care too much about such exceptions if you only see them once in a while. Bees can gracefully handle that and your data is not at risk.
from bees.
True, I didn't see the difference in ranges. I am more concerned about the large number of exceptions I see. I do have to admit that this machine is running 3.16.
I will set up a VM soon under a new kernel and a fresh workload (this filesystem had duperemove run on it previously).
Bees won't run well on kernels before 4.11, and kernel 3.16 is even in the list of missing kernel features, so bees cannot work reliably there. If possible, switch to at least 4.11, preferably 4.14.
3.16 has unnecessarily restrictive rules about what dedups are allowed, so I'd expect you will see a lot of EINVAL errors running bees on that kernel--especially if you have VM or disk image files which will contain a lot of duplicate blocks. Most of those rules were relaxed by 4.2. Kernel deadlock bugs were fixed at the same time, so you might want to upgrade.
dedup from one offset in a file to another offset in the same file is allowed on modern kernels. The offset ranges in the file must not overlap, but bees detects this case and avoids it. bees doesn't try to work around the pre-4.2 restrictions--there are enough real bugs in kernels that old that it's not worthwhile to run bees there.
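For anyone reading along, here is a minimal sketch of the non-overlap rule for a same-file dedup. This is not bees' actual code; the function name `ranges_overlap` and its parameters are made up for illustration, and it assumes both ranges have the same length, as they do in a single dedup request.

```python
def ranges_overlap(src_offset: int, dst_offset: int, length: int) -> bool:
    """Return True if [src, src+length) and [dst, dst+length) share any byte.

    Hypothetical helper for illustration only; not taken from bees.
    """
    if length <= 0:
        # An empty range cannot overlap anything.
        return False
    # Two equal-length ranges overlap exactly when their start offsets
    # are closer together than the range length.
    return abs(src_offset - dst_offset) < length


# Two adjacent 4 KiB blocks in the same file: no overlap.
print(ranges_overlap(0, 4096, 4096))
# Destination starts in the middle of the source block: overlap.
print(ranges_overlap(0, 2048, 4096))
```

A caller would skip (or split) any same-file request for which this returns True, and only pass the remaining requests on to the kernel.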
Maybe the pre-4.2 restrictions should be documented in the README, so that future questions like this either don't come up or can be answered by pointing to the README?
@Zygo: Feel free to assign to me, I can prepare a patch then.
README already says:
Minimum Linux kernel version: 4.4.3
Don't bother trying to make Bees work with older kernels. It won't end well: there are too many missing features and bugs to work around.
This could be improved:
- 4.2: FILE_EXTENT_SAME no longer updates mtime, can be used at EOF.
to add a note about not supporting dedup on the same file, the kernel deadlock issue, and whatever other bugs exist that I've forgotten about.
@kakra Let me see what you come up with. GitHub doesn't seem to let me assign it to you (the only choices seem to be me and stroucki?)
I have set up a VM of Devuan Ascii, which runs 4.9. A run of bees on one volume has not thrown any exceptions so far.
@Zygo Docs say "Earlier kernels are usable with bees, but bees can trigger a few performance bugs and hangs in dedup-related functions.". Perhaps extend that with "exceptions on deduplication due to ioctl arguments that kernel version has not yet deemed safe"?
I guess I think an unending stream of exceptions on some but not all dedups is a performance bug.
That sentence ("Earlier kernels are usable...") applies to kernels between 4.4.3 and 4.11. The previous paragraph states that bees on kernels before 4.4.3 should not be attempted at all.
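The version tiers discussed in this thread can be sketched as a small check. This is only an illustration: the thresholds (4.4.3 as the minimum, 4.11 as the point where things run well) are the ones quoted above, while the function names and tier labels are hypothetical and not part of bees.

```python
import re

# Thresholds as quoted in this thread; not read from bees itself.
MINIMUM = (4, 4, 3)       # README: "Minimum Linux kernel version: 4.4.3"
RECOMMENDED = (4, 11, 0)  # comments: "switch to at least 4.11"


def parse_kernel_release(release: str) -> tuple:
    """Turn a release string such as '3.16.0-4-amd64' into (3, 16, 0)."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    # A missing patch level (e.g. '4.11') is treated as 0.
    return tuple(int(part) if part else 0 for part in match.groups())


def classify_kernel(release: str) -> str:
    """Map a kernel release string onto the tiers described in the thread."""
    version = parse_kernel_release(release)
    if version < MINIMUM:
        return "unsupported"
    if version < RECOMMENDED:
        return "usable-with-known-bugs"
    return "recommended"


for release in ("3.16.0-4-amd64", "4.9.0", "4.14.0"):
    print(release, classify_kernel(release))
```

On a live system the release string could be taken from `platform.release()` in the Python standard library.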