Comments (5)
I've implemented the algorithm for generating the fingerprints without any problem. However, using the fingerprints in a repair is more problematic: identifying where a directory has moved to would require fingerprinting every directory in the specified search paths, which could be very expensive.
You might think this would be a preexisting problem when repairing files, but TMSU is able to build a shortlist of candidates by considering only files of identical size, as a file cannot be identical to another if its size is different. The size check (stat) is a relatively cheap operation compared to calculating a fingerprint.
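To illustrate the size-first shortlist described above, here is a minimal Python sketch. It is not TMSU's code: the function names are hypothetical and SHA-256 is only a stand-in for whatever fingerprint algorithm is configured.

```python
import hashlib
import os


def fingerprint(path, chunk_size=1 << 20):
    """SHA-256 of the file contents (a stand-in, not necessarily TMSU's algorithm)."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_moved_file(known_size, known_fingerprint, search_root):
    """Return paths under search_root whose size and fingerprint both match."""
    matches = []
    for dirpath, _dirnames, filenames in os.walk(search_root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                if os.stat(path).st_size != known_size:
                    continue  # cheap rejection: a different size cannot be identical
                if fingerprint(path) == known_fingerprint:
                    matches.append(path)
            except OSError:
                continue  # unreadable or vanished file, skip it
    return matches
```

Only files that pass the stat check are ever read, which is what keeps the file repair affordable.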
An analogue of the file size shortcut might be to consider the number of items within the directory. This would be cheap if the directory is small but could potentially be more expensive than the fingerprint calculation if the directory has millions of items. I might have to cap the number of items counted, just as the directory fingerprint algorithm stops if the directory has too many items.
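A capped count could look roughly like the following Python sketch. The function name and the default cap of 10,000 are assumptions made for illustration, not values taken from TMSU.

```python
import os


def count_entries_capped(directory, cap=10_000):
    """Count directory entries, giving up once more than `cap` have been seen.

    Returns the count, or None if the directory holds more than `cap` entries.
    """
    count = 0
    with os.scandir(directory) as entries:
        for _ in entries:
            count += 1
            if count > cap:
                return None  # too large: stop enumerating early
    return count
```

A directory that returns None could not be shortlisted by count and would have to be handled some other way, for instance by skipping it.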
from tmsu.
If the idea is just to notice when a directory moves somewhere, perhaps you could add a file .tmsuid in that directory containing a unique id plus the device and inode number of that file. This would be the directory identifier.
When the directory is moved somewhere else, the file stays with its inode and device number untouched (if on same filesystem). This can be detected.
When the directory is copied, it is also possible to detect it by noticing that the .tmsuid file has changed device and inode number, and is therefore a copy.
If you don't care about distinguishing copies from renames, you don't need to keep track of the device and inode number.
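A rough Python sketch of this marker-file idea follows. The JSON layout and the function names are assumptions for illustration only; nothing like this exists in TMSU.

```python
import json
import os
import uuid

MARKER = ".tmsuid"  # file name suggested above


def write_marker(directory):
    """Create the marker and record its own device and inode number inside it."""
    path = os.path.join(directory, MARKER)
    with open(path, "w", encoding="utf-8") as f:
        # The file exists once opened, so its inode is already known.
        st = os.fstat(f.fileno())
        record = {"id": uuid.uuid4().hex, "dev": st.st_dev, "ino": st.st_ino}
        json.dump(record, f)
    return record


def classify_marker(directory):
    """Return (id, "original") or (id, "copy"), or None if there is no usable marker."""
    path = os.path.join(directory, MARKER)
    try:
        with open(path, encoding="utf-8") as f:
            record = json.load(f)
            st = os.fstat(f.fileno())
    except (OSError, ValueError):
        return None
    same_file = (st.st_dev, st.st_ino) == (record["dev"], record["ino"])
    return record["id"], ("original" if same_file else "copy")
```

A move across filesystems would be reported as a copy here, since the marker gets a new device and inode number in that case.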
from tmsu.
@mildred Yes, that is one possible solution; however, it is not very user-friendly: one would have to remember to add this metadata to each directory ahead of time. I would prefer to come up with a solution that transparently detects directory moves/renames.
I think the best solution (as in most transparent and requiring no up-front participation from the user) would be to shortlist candidate directories based upon the number of directory entries or their aggregate size. This should be relatively cheap to calculate as it would only require a (perhaps recursive) directory enumeration.
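As a sketch of what such a shortlist might look like in Python (the function names are hypothetical, and the pair of recursive entry count and aggregate file size is just one possible summary):

```python
import os


def summarize_tree(root):
    """Map every directory under root to (recursive entry count, aggregate file size)."""
    summaries = {}
    # Bottom-up walk, so each child's summary is ready before its parent needs it.
    for dirpath, dirnames, filenames in os.walk(root, topdown=False):
        entries = len(dirnames) + len(filenames)
        size = 0
        for name in filenames:
            try:
                size += os.lstat(os.path.join(dirpath, name)).st_size
            except OSError:
                pass  # vanished or unreadable file, ignore it
        for name in dirnames:
            # Symlinked directories are not walked, so they have no summary.
            child = summaries.get(os.path.join(dirpath, name))
            if child is not None:
                entries += child[0]
                size += child[1]
        summaries[dirpath] = (entries, size)
    return summaries


def shortlist_directories(known_summary, search_root):
    """Directories under search_root whose summary equals known_summary."""
    return [path for path, summary in summarize_tree(search_root).items()
            if summary == known_summary]
```

Only the shortlisted directories would then need the full fingerprint calculation.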
from tmsu.
Well, I was suggesting that tmsu would create this file. Perhaps this is a little invasive, but it would record directory identity better than the list of its files (which can change).
Or perhaps, just record directory inode number as a hint.
from tmsu.
I wouldn't want to use inodes as not every type of filesystem uses inodes.
With respect to the list of a directory's files changing: I would consider this no different from the contents of a file changing after the fingerprint has been calculated, i.e. it could be repaired in the same way using the repair subcommand.
from tmsu.