Comments (7)

cboxdoerfer commented on July 19, 2024

Yes, this feature has a high priority for me and it's already on the roadmap for version 0.3.

However, it'll never work as smoothly as Everything Search does on Windows. Everything makes great use of the MFT (Master File Table) and the USN (Update Sequence Number) Journal, and neither of those (nor any similar technology) is available on Linux.

In principle (and afaik), Everything's database works like this: when it is first run, Everything builds its database by reading the MFT. That's a table stored in the file system which lists all files and folders within that file system. Reading it is much quicker than traversing the file system, because the operating system just needs to read a single file instead of touching every file on the disk. Once that is done, Everything monitors the USN Journal, which lists all changes made to the file system, going back to a certain point in the past. When the USN Journal records a change to a file that Everything has indexed, Everything updates its index.

The really great thing about the USN Journal is that your application doesn't need to run all the time, because the USN Journal logs all the changes. So when you launch your application, you just need to look in the USN Journal, and it'll tell you which files have changed since the last time your application was running. That's why Everything doesn't need to update its whole database every time you launch it.
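The catch-up idea described above can be sketched with a toy model. This is only an illustration of the principle, not the real USN Journal API: the `ChangeJournal` and `Index` classes, their method names, and the `"create"`/`"delete"` actions are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ChangeJournal:
    """Toy stand-in for a persistent change journal (NOT the real USN API)."""
    entries: list = field(default_factory=list)  # (sequence number, path, action)
    next_seq: int = 1

    def record(self, path, action):
        # The file system appends an entry for every change, whether or not
        # any indexing application is currently running.
        self.entries.append((self.next_seq, path, action))
        self.next_seq += 1

    def since(self, last_seen):
        # Return only the entries the application has not processed yet.
        return [entry for entry in self.entries if entry[0] > last_seen]


@dataclass
class Index:
    """Toy file index that remembers the last journal entry it processed."""
    paths: set = field(default_factory=set)
    last_seen: int = 0

    def catch_up(self, journal):
        for seq, path, action in journal.since(self.last_seen):
            if action == "create":
                self.paths.add(path)
            elif action == "delete":
                self.paths.discard(path)
            self.last_seen = seq


journal = ChangeJournal()
index = Index()

journal.record("/a.txt", "create")
index.catch_up(journal)

# The "application" is not running here, but the journal keeps logging.
journal.record("/b.txt", "create")
journal.record("/a.txt", "delete")

# On the next launch, only the two missed entries are replayed.
index.catch_up(journal)
print(sorted(index.paths), index.last_seen)
```

The point of the sketch is that `catch_up` only touches the entries recorded after `last_seen`, so the cost of a launch depends on how much changed, not on how many files exist.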

On Linux this is completely different. The best technology I can rely on is called inotify. But it has lots of issues, e.g.:

  • you can only detect changes while your application is running
  • therefore, each time FSearch launches I need to update the whole database first and detect changes manually (this can take quite a while)
  • by default, most distributions allow each user to monitor only 8192 files or folders (I have more than 100,000 folders in my home directory alone); this can be changed by modifying a kernel parameter (fs.inotify.max_user_watches)
  • it requires quite a lot of additional memory: ~100 MB for 100,000 watched folders on an x64 system (for the inotify watches), and on top of that a few MB which I need for efficiently storing the inotify file descriptors and directory names
  • when a folder is moved or renamed, inotify requires me to rescan the whole subtree
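To get a feel for the watch limit and the memory figure mentioned in the list above, here is a minimal sketch. It reads the limit from `/proc/sys/fs/inotify/max_user_watches` and counts the directories under a root, each of which would need its own watch. The function names are made up for illustration, and the value of roughly 1 KiB per watch is an assumption derived from the "~100 MB for 100,000 folders" figure above, not a measured value.

```python
import os
from pathlib import Path

WATCH_LIMIT_FILE = Path("/proc/sys/fs/inotify/max_user_watches")

# Assumption: ~1 KiB per watch, derived from "~100 MB for 100,000 folders".
ASSUMED_BYTES_PER_WATCH = 1024


def read_watch_limit(limit_file=WATCH_LIMIT_FILE):
    """Return the per-user inotify watch limit, or None if it cannot be read."""
    try:
        return int(limit_file.read_text().strip())
    except (OSError, ValueError):
        return None


def count_directories(root):
    """Count the directories under root, including root itself."""
    total = 1  # the root directory needs a watch as well
    for _dirpath, dirnames, _filenames in os.walk(root):
        total += len(dirnames)
    return total


def estimated_watch_memory_mb(n_watches, bytes_per_watch=ASSUMED_BYTES_PER_WATCH):
    """Rough memory estimate (in MB) for n_watches inotify watches."""
    return n_watches * bytes_per_watch / 1_000_000


# Example: the 100,000 folders mentioned above.
print(estimated_watch_memory_mb(100_000))
```

Calling `count_directories(os.path.expanduser("~"))` and comparing the result with `read_watch_limit()` shows whether a home directory would fit within the current limit at all. Note that the walk itself touches every directory, which is the slow full traversal described above.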

That's why monitoring the root directory on Linux with inotify isn't a good idea, and therefore I'm going to make monitoring optional. The user needs to decide which directories are most important and should be monitored. And because of the limited number of inotify watches, I still can't guarantee that it'll really monitor "Everything".

Long story short: The Linux kernel makes it nearly impossible to do file system monitoring like Everything does, and there's very little I can do about that.

from fsearch.

cboxdoerfer commented on July 19, 2024

It may be a naive question, but aren't ext3/ext4 journaling file systems as well? Shouldn't it be possible to use them like USN Journal?

Unfortunately no. The journal in a journaling file system (including NTFS) has a different purpose than the USN Journal and works differently. It's used to prevent data loss after crashes or power outages by first writing transactions to a journal before committing them to the disk. So it isn't a historical log of all changes over a long time period, which could be used to keep an index of files on disk up to date.

cboxdoerfer commented on July 19, 2024

@Anagastes

And sorry for bringing up the old stuff. I just had a problem at the moment where fsearch would have been much more efficient in the background :)

No problem. It's something I'm currently working on anyway.

inotify will (most likely) be used in the upcoming 0.3 release, but only as a fallback. By default, file system monitoring will be done with fanotify. This has been working best in my testing, but it requires kernel >= 5.1 and unfortunately doesn't work with btrfs (and maybe some other file systems as well).
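The fallback rule described above could look roughly like the following sketch. It is not FSearch's actual logic: the function names are hypothetical, and the only conditions encoded are the two stated above (kernel >= 5.1, not btrfs).

```python
import re


def parse_kernel_version(release):
    """Extract (major, minor) from a release string such as '5.15.0-91-generic'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def choose_backend(kernel_release, fs_type):
    """Pick 'fanotify' when the stated requirements are met, else 'inotify'."""
    version = parse_kernel_version(kernel_release)
    if version is not None and version >= (5, 1) and fs_type != "btrfs":
        return "fanotify"
    # Fallback for older kernels, unparsable versions and btrfs.
    return "inotify"


print(choose_backend("5.15.0-91-generic", "ext4"))
print(choose_backend("4.19.0", "ext4"))
print(choose_backend("6.1.0", "btrfs"))
```

On a running Linux system, the release string could come from `platform.release()`. Detecting the file system type of a path is left out here because it depends on how mounts are looked up.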

The database options for 0.3 will look something like this then:

Path            Monitor   Rescan on launch   Archive   One filesystem
/home/user      Yes       Yes*               No        Yes
/mnt/nas        No        No                 No        Yes
/mnt/usbdrive   No        Yes                Yes       Yes

* Rescan on launch is automatically activated when Monitor is active.

But maybe I'll also let the user decide which monitoring backend they want to use (Auto | fanotify | inotify | Periodic scan).
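As an illustration only, the per-path options from the table and the possible backend choice could be modelled like this. The `IndexedPath` and `Backend` names and the field names are hypothetical and do not reflect FSearch's real configuration format. The one rule taken from the table's footnote is that enabling Monitor forces Rescan on launch.

```python
from dataclasses import dataclass
from enum import Enum


class Backend(Enum):
    AUTO = "Auto"
    FANOTIFY = "fanotify"
    INOTIFY = "inotify"
    PERIODIC_SCAN = "Periodic scan"


@dataclass
class IndexedPath:
    """One row of the database options table (hypothetical model)."""
    path: str
    monitor: bool = False
    rescan_on_launch: bool = False
    archive: bool = False
    one_filesystem: bool = True
    backend: Backend = Backend.AUTO

    def __post_init__(self):
        # Footnote rule: Rescan on launch is automatically activated
        # when Monitor is active.
        if self.monitor:
            self.rescan_on_launch = True


entries = [
    IndexedPath("/home/user", monitor=True),
    IndexedPath("/mnt/nas"),
    IndexedPath("/mnt/usbdrive", rescan_on_launch=True, archive=True),
]

for entry in entries:
    print(entry.path, entry.monitor, entry.rescan_on_launch, entry.archive)
```

Enforcing the footnote rule in `__post_init__` keeps the dependency in one place, so a monitored path cannot end up with Rescan on launch disabled at construction time.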

cboxdoerfer commented on July 19, 2024

Oh, and this is a duplicate of #26. You can follow the progress there. Closing.

danielkrajnik commented on July 19, 2024

It may be a naive question, but aren't ext3/ext4 journaling file systems as well? Shouldn't it be possible to use them like the USN Journal? If needed, could they be tweaked so that they behave similarly? For reference, one forensic tool that parses them is The Sleuth Kit.

Sorry for not posting it under #26, but I think that your explanation above is really good.

Anagastes commented on July 19, 2024

It may be a naive question, but aren't ext3/ext4 journaling file systems as well? Shouldn't it be possible to use them like USN Journal?

Unfortunately no. The journal in a journaling file system (including NTFS) has a different purpose than the USN Journal and works differently. It's used to prevent data loss after crashes or power outages by first writing transactions to a journal before committing them to the disk. So it isn't a historical log of all changes over a long time period, which could be used to keep an index of files on disk up to date.

Hm, if I'm not mistaken, inotify would be good for this, wouldn't it? It's probably not as efficient as the USN Journal etc., but the kernel reports changes very sparingly, so it's close to the hardware. :)

And sorry for bringing up the old stuff. I just had a problem at the moment where fsearch would have been much more efficient in the background :)

danielkrajnik commented on July 19, 2024

@Anagastes I've heard that inotify won't "scale" to cover all file system changes and is better suited to a few selected folders. Not great when your home folder is a few hundred gigabytes in size...

I'm hoping though that opensnoop might work?
