Comments (39)
Here's a short update on the progress of adding monitoring support:
This might not sound or look like much, but it was quite some work to get here. So here's the first video demonstrating how FSearch updates the search results as files are being removed with the terminal:
Screencast.from.2023-02-28.18-42-59.webm
In order for that to work the database was rewritten completely in the last couple of weeks and now I'm step by step porting the code from the monitor prototypes to FSearch.
I hope to get the first alpha versions with full monitoring support out by the end of next month. However, note that those will likely not be usable as a daily driver, e.g. some features might still be missing or the database format on disk will likely change a couple of times.
from fsearch.
Yes, I know this is a really important feature to me too, but inotify support (the technology to automatically update the database) is planned for version 0.3, which will be released in a couple of months, depending on how long it takes me to release 0.2.
Edit: Of course help is always welcome :)
from fsearch.
Can't wait to see fsearch become one more of such "diagnostic" tools.
Yeah, me too.
So I just finished the fanotify backend. It was a bit more complicated than anticipated. The fanotify documentation isn't the best, but fortunately there were some great demo implementations out there and with some additional trial and error it's now working quite well.
I think I also found a solution to insert/update/delete database entries much more efficiently. I expect that this will boost the performance from 1,000 updates/sec to at least 100,000 updates/sec.
Currently the bottleneck is the data structure being used for the database: a couple of large arrays (one for each sort type). This means that each time you add or delete a file, FSearch needs to memmove a huge block of memory (up to 8 MB for every million db entries) to the left or right by a few bytes, which is really inefficient. My idea now is to simply split those arrays into much smaller ones (something like 32k elements per array). This way search performance will still be great and the memmove calls will become much faster, because much less memory needs to be moved around. I'm really curious how this turns out and I hope to get this done by the end of the week.
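To make the splitting idea concrete, here is a minimal sketch in Python. It is only a toy model and not FSearch's actual C code: the class name `ChunkedSortedList`, its methods, and the split-in-half policy are all assumptions made for illustration. The point it shows is that an insert or delete only shifts elements inside one small chunk instead of one huge array.

```python
from bisect import bisect_left, insort


class ChunkedSortedList:
    """Sorted collection stored as many small sorted chunks (hypothetical sketch)."""

    def __init__(self, chunk_size=32 * 1024):
        self.chunk_size = chunk_size
        # Invariant: chunks are sorted, ordered relative to each other, and
        # never empty unless the whole collection is empty.
        self.chunks = [[]]

    def _find_chunk(self, item):
        # Binary search for the first chunk whose last element is >= item;
        # falls back to the last chunk when item is larger than everything.
        lo, hi = 0, len(self.chunks) - 1
        while lo < hi:
            mid = (lo + hi) // 2
            if self.chunks[mid][-1] < item:
                lo = mid + 1
            else:
                hi = mid
        return lo

    def add(self, item):
        i = self._find_chunk(item)
        chunk = self.chunks[i]
        insort(chunk, item)  # shifts elements within this one chunk only
        if len(chunk) > self.chunk_size:
            # Split an oversized chunk in half to keep every chunk small.
            half = len(chunk) // 2
            self.chunks[i:i + 1] = [chunk[:half], chunk[half:]]

    def remove(self, item):
        i = self._find_chunk(item)
        chunk = self.chunks[i]
        j = bisect_left(chunk, item)
        if j == len(chunk) or chunk[j] != item:
            raise ValueError(f"{item!r} not found")
        del chunk[j]  # again, only one small chunk is shifted
        if not chunk and len(self.chunks) > 1:
            del self.chunks[i]

    def __iter__(self):
        for chunk in self.chunks:
            yield from chunk

    def __len__(self):
        return sum(len(chunk) for chunk in self.chunks)
```

Lookups stay logarithmic: one binary search over the chunks, then one inside the chosen chunk. The chunk size is a tuning knob; smaller chunks make updates cheaper but add more per-chunk bookkeeping.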
from fsearch.
In the next build, please add the ability for the application to update the database automatically, so that it can add new files by itself. Right now, when I download even a small document, FSearch can't index it until I manually update the database myself.
But this is a good project.
I think I'm gonna go back to study C and contribute. Regards bro.
from fsearch.
I'm not sure if I understand you correctly, but FSearch does index all entries in a database. At launch this database is loaded again.
However, FSearch doesn't automatically detect changes made to the file system and update its index then. This is on the roadmap (it's called inotify support) but it'll never work as smoothly as Everything on Windows, because the Linux kernel isn't particularly good at reporting filesystem changes.
from fsearch.
dear author,
please give yourself time to make inotify your number 1 priority for this project.
without inotify, this app is totally useless as I already can query the mlocate via cli and mlocate is already up to date via automated cron jobs.
thanks.
from fsearch.
The Linux kernel is really the limiting factor here, and there's currently nothing that I can do to bring the smooth and fast experience Everything offers on Windows. inotify is just much slower, it requires much more memory, it's less reliable and harder to use.
Not sure if this would help, but have you seen these recent changes to fanotify?
torvalds/linux@235328d
from fsearch.
Next update: I've been running FSearch now for a few hours while it's monitoring my home folder with 1.2 million entries and it works surprisingly well. No obvious memory leaks, no crashes, no excessive CPU usage most of the time, ...
It's also really interesting to see how many files constantly get changed when you perform certain actions on your system. This is now super easy to spot when you sort by date modified.
Next up I'm going to:
- add the fanotify backend. This won't be much work, because fanotify events can be easily translated to inotify events (and vice versa).
- optimize the performance: I've noticed that there are some applications which often create and immediately delete folder structures with thousands of entries. Ironically, on Windows those same applications don't do that. At the moment the FSearch database can only process around 1000 creations/deletions per second on my system, so this must become way faster.
from fsearch.
@spsf64, yes, that's a good idea. Since Ctrl+R is already used (enable regex mode), I'll probably use Ctrl+Shift+R instead.
In the future I'm adding the ability to configure shortcuts for all actions anyway, then users can choose whatever key combinations they happen to like.
from fsearch.
isn't it possible to even build a script just to update the database that i can run regularly via cron (like the one with angrysearch) ?
from fsearch.
The Linux kernel is really the limiting factor here, and there's currently nothing that I can do to bring the smooth and fast experience Everything offers on Windows. inotify is just much slower, it requires much more memory, it's less reliable and harder to use.
Not sure if this would help, but have you seen these recent changes to fanotify?
torvalds/linux@235328d
this looks promising! And now @cboxdoerfer is tagged too :P
from fsearch.
great news, 60x faster will still make a huge difference :)
Yes, the app feels much more responsive now when lots of stuff happens on the system.
I've also found and fixed another performance bottleneck. When a folder is renamed, FSearch needs to find all its sub-directories and sub-files in the database as well (because their sort order changes too, as they now have a different path). This took about 0.1 seconds up until now with a database of one million entries. When lots of folders get renamed in a short period of time this can quickly add up and make the database busy for a while, so this needed to be improved. Fortunately the fix I came up with wasn't difficult and worked really well; it's now 10x faster (100ms -> 10ms).
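The post doesn't say how the fix works, so the following is not FSearch's method. It is just one generic way to find all descendants of a folder quickly, assuming the entries are available as a list of full paths sorted by plain string comparison (the function name `descendant_range` is made up for this sketch). In such a list, every path that starts with `folder + "/"` sits in one contiguous range, which two binary searches can locate.

```python
from bisect import bisect_left


def descendant_range(sorted_paths, folder):
    """Return (start, end) so that sorted_paths[start:end] lie below folder.

    Assumes sorted_paths is sorted by plain string comparison.
    """
    prefix = folder.rstrip("/") + "/"
    start = bisect_left(sorted_paths, prefix)
    # The character right after "/" gives an upper bound that is larger than
    # every string starting with the prefix.
    upper = prefix[:-1] + chr(ord("/") + 1)
    end = bisect_left(sorted_paths, upper, start)
    return start, end
```

For example, with the sorted list `["/a", "/a b", "/a/x", "/a/y/z", "/ab", "/b"]`, `descendant_range(paths, "/a")` returns `(2, 4)`, which covers `/a/x` and `/a/y/z` but neither `/a b` nor `/ab`.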
All of those performance issues really made me appreciate again how snappy Everything on Windows is. Its developers really put a lot of thought into its design.
Next I'll be working on the preferences dialog, so you can actually enable/disable file system monitoring for individual folders from the GUI.
from fsearch.
I know that the kernel is capable of file system notifications, I've used inotify extensively already. But inotify has lots of limitations, that's the problem. And most solutions (GFileMonitor, FS Event (libuv), ...) are just a nicer frontend for inotify.
Like I said in my other post: The Linux kernel is really the limiting factor here, and there's currently nothing that I can do to bring the smooth and fast experience Everything offers on Windows. inotify is just much slower, it requires much more memory, it's less reliable and harder to use. If you are interested you can read about some of that in the inotify documentation: http://man7.org/linux/man-pages/man7/inotify.7.html#NOTES
But slow and memory hungry notifications are better than none, so of course I'm going to add that one way or another. Just don't expect FSearch to be able to monitor the whole file system (/), because that's going to be really slow, and most certainly the kernel won't even allow it since it reaches the limit of available inotify watches per user.
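As a rough illustration of the watch limit problem: inotify is not recursive, so every directory needs its own watch, and the per-user limit can be read from `/proc/sys/fs/inotify/max_user_watches` on Linux. The sketch below counts directories under a root and compares the count against that limit. The helper names (`read_watch_limit`, `count_directories`, `fits_in_watch_limit`) are made up for this example and are not part of FSearch.

```python
import os

# Real procfs path on Linux; kept as a parameter so other files can be used.
WATCH_LIMIT_FILE = "/proc/sys/fs/inotify/max_user_watches"


def read_watch_limit(path=WATCH_LIMIT_FILE):
    """Return the per-user inotify watch limit as an int."""
    with open(path) as f:
        return int(f.read().strip())


def count_directories(root):
    """Count root plus every directory below it (one inotify watch each).

    Symlinks to directories are counted but not followed.
    """
    total = 1
    for _dirpath, dirnames, _filenames in os.walk(root):
        total += len(dirnames)
    return total


def fits_in_watch_limit(root, limit):
    """True if watching every directory under root stays within limit."""
    return count_directories(root) <= limit
```

A call such as `fits_in_watch_limit("/home/user", read_watch_limit())` gives a quick estimate before setting up any watches. Other processes of the same user consume watches too, so the real headroom is smaller than this suggests.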
from fsearch.
@cboxdoerfer
A bit off topic, but how about an accelerator/shortcut to update database?
Like F5 or Ctrl+R?
Maybe also add a message in the background where it says "Press Ctrl+F and start typing" like:
"Press Ctrl+F and start typing or Ctrl+R to update database"
from fsearch.
@cboxdoerfer
Wow, this one was fast! Just built the new package (using archlinux / aur) and it works perfect.
Thank you!
from fsearch.
@spsf64, no problem ;)
from fsearch.
I think the idea of the script to use with cron is very cool, indeed!
I am currently using it in angrysearch, so the database is automatically updated every 6 hours. I think it may be a good compromise
from fsearch.
I've also come across fswatch, which allows recursive monitoring of directories. Just wanted to let you know.
from fsearch.
@cboxdoerfer Have you looked into the eBPF capabilities yet? This does sound promising.
from fsearch.
Thanks, I'll keep the selection then for now. If this turns out to be controversial I can still add a config option for it.
from fsearch.
I expect that this will boost the performance up from 1000 updates/sec to at least 100,000 updates/sec.
Just finished the first prototype of the new data structure and it seems I was a bit too optimistic here. However, the performance still made a significant jump; it went up from 1,000 updates/sec to around 60,000 updates/sec (so roughly 60 times faster) and fortunately memmove is no longer the bottleneck.
There's still more room for improvements (for example by using multiple threads to apply database updates), but for now I think the performance is fine.
from fsearch.
i'm here on ubuntu 16.10 and there is a software that is called "gamin" and it says here in its description "File and directory monitoring system Gamin is a file and directory monitoring system which allows applications to detect when a file or a directory has been added, removed or modified by somebody else."
i don't know if you can use this but i think it's promising and easy too as you shouldn't implement everything from scratch here. also i think there are other alternatives as well.
from fsearch.
@robert1826, thx, I'll have a look at that. But first impression isn't that good, because Gamin seems to be pretty much dead - there have just been 5 commits in the past 8 years. However, chances are that's because Gamin is feature complete and rock solid. Only way to find out is by trying it.
from fsearch.
@cboxdoerfer OK, but the point is that the Linux kernel actually supports notifying applications about file system changes; that technology is called inotify. Maybe Gamin is the best choice here, but I'm sure there are other alternatives that use inotify. I'll keep searching and will notify you if I find one.
from fsearch.
Hi @cboxdoerfer, I was wondering about a way to implement an incremental database update, at least until someone can figure out how to make a 'proper' folder monitor. The idea: we store the time at which the previous database was built, and while crawling we compare that time with the modification time of each folder. If the folder is older than the database, we skip it; otherwise we recursively crawl that directory, or resume in whatever way you are doing it now. Hope this idea helps or at least inspires someone else to help. Thanks again for this awesome piece of software.
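A minimal sketch of this idea, with one caveat worth stating: a directory's modification time only changes when entries are added, removed or renamed directly inside it, not when something changes deeper in the tree. So the sketch below still descends into every directory and only reports the ones whose own listing changed since the last build. The function name `changed_directories` and its parameters are made up for illustration.

```python
import os


def changed_directories(root, db_build_time):
    """Yield directories whose own entry list changed after db_build_time.

    db_build_time is a Unix timestamp. Unchanged directories are still
    descended into, because changes deeper down do not update the mtime
    of their ancestors.
    """
    stack = [root]
    while stack:
        path = stack.pop()
        try:
            if os.stat(path).st_mtime > db_build_time:
                yield path  # this directory needs to be re-listed
            with os.scandir(path) as entries:
                for entry in entries:
                    if entry.is_dir(follow_symlinks=False):
                        stack.append(entry.path)
        except OSError:
            continue  # directory vanished or is unreadable; skip it
```

This saves the work of re-indexing the files of unchanged directories, but it still has to visit every directory once. It also cannot notice changes to the size or modification time of files that already existed, since those don't touch the parent directory's mtime.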
from fsearch.
Yeah this really doesn't work like Everything if you have to spend several minutes updating the database before each search. :/
from fsearch.
Would eBPF for monitoring/tracing file system changes be worth considering? Example projects that seem to use it to monitor file system changes:
from fsearch.
@danielkrajnik, thanks I've not heard of that before. I'll have a look at it.
from fsearch.
Thanks, I hope that it could be faster than fanotify and substitute what USN Journal provides on NTFS. Here is another interesting project from this area: https://github.com/kanurag94/filemonitor
from fsearch.
@dlong500 yes, I experimented a bit with it. It's incredibly powerful and flexible, but it's also more complex to implement and at least in my demo had a performance overhead compared to fanotify and inotify (but this might be fixable).
So for the next 0.3 release I decided to use fanotify as the default backend (which works really well in my testing) and inotify as a fallback. An eBPF backend, if it turns out to be an improvement compared to the others, can then be added later. This way I'm not unnecessarily delaying the release of 0.3 any further.
from fsearch.
I'm currently adding the file move/rename handling and ran into the following question: What's supposed to happen with the selection when a file gets renamed? Should the file (with the new name) keep the selection state it previously had or should it automatically become un-selected?
from fsearch.
Thanks for asking, I'd keep the previous selection (common operation for me would be renaming a file and then copying/moving it to somewhere else).
from fsearch.
So it turned out that remembering file selection for moved/renamed files is a bit more difficult than anticipated and I've put it on hold for the moment.
The problem is that it is quite difficult to detect true move or rename events with inotify. The general idea of inotify is that whenever you rename or move a file inotify creates two events for you: IN_MOVED_FROM and IN_MOVED_TO.
The first minor problem is that there can be other events in between those two. The fix for that is quite simple: remember all IN_MOVED_FROM events until their matching IN_MOVED_TO event happens.
But the big problem is that inotify doesn't always create proper pairs. Sometimes you only get an IN_MOVED_FROM and never the corresponding IN_MOVED_TO event and vice versa. This happens when files move between un-watched and watched directories. E.g. when you're monitoring /home/user/Downloads and you move one of its files to the un-monitored trash directory, then you'll only get an IN_MOVED_FROM event and never an IN_MOVED_TO event.
There are two ways how this can be fixed, as far as I know:
- Assume that if there's still no matching IN_MOVED_TO event after some time, there won't ever be one; we then interpret the former IN_MOVED_FROM event as a "moved out of our monitored directory" event and simply remove the file from our index. The longer you wait, the more reliable this approach gets, but your index and search results also remain inconsistent with the file system for longer. There's probably some good middle ground for that, but it remains guesswork.
- The most reliable fix I can think of is to simply treat every IN_MOVED_FROM event immediately as a delete event and an IN_MOVED_TO event as a created event. This just works, since there's no guesswork necessary for how long to wait for the next event, but it comes at the cost of being a bit more resource intensive and it's not possible to remember the selection for actually moved/renamed files.
So currently I'm favoring and using the second approach, simply because it's reliable and simplifies the code. But I'll revisit the first approach again. If anyone knows of an alternative solution, let me know.
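The two approaches can be sketched side by side. This is a simplified Python model, not FSearch code: events are plain records rather than real inotify structs, and the names `Event`, `MovePairer` and `as_delete_create` as well as the 0.5 second timeout are assumptions for illustration. The one real inotify detail used is the cookie: the IN_MOVED_FROM and IN_MOVED_TO events of a single rename carry the same cookie value.

```python
from dataclasses import dataclass


@dataclass
class Event:
    kind: str    # "IN_MOVED_FROM" or "IN_MOVED_TO"
    cookie: int  # both halves of one rename share the same cookie
    path: str
    time: float  # arrival time in seconds


class MovePairer:
    """First approach: wait a while for the matching IN_MOVED_TO."""

    def __init__(self, timeout=0.5):
        self.timeout = timeout
        self.pending = {}  # cookie -> unmatched IN_MOVED_FROM event

    def expire(self, now):
        """Turn IN_MOVED_FROM events older than the timeout into deletes."""
        stale = [c for c, e in self.pending.items()
                 if now - e.time >= self.timeout]
        return [("delete", self.pending.pop(c).path) for c in stale]

    def feed(self, ev):
        """Return the index operations triggered by this event."""
        ops = self.expire(ev.time)
        if ev.kind == "IN_MOVED_FROM":
            self.pending[ev.cookie] = ev
        elif ev.kind == "IN_MOVED_TO":
            src = self.pending.pop(ev.cookie, None)
            if src is not None:
                ops.append(("rename", src.path, ev.path))
            else:
                # No partner seen: the file came from an un-watched directory.
                ops.append(("create", ev.path))
        return ops


def as_delete_create(ev):
    """Second approach: never pair, translate each event immediately."""
    if ev.kind == "IN_MOVED_FROM":
        return ("delete", ev.path)
    return ("create", ev.path)
```

With the first approach a true rename surfaces as a single `("rename", old, new)` operation, so per-file state such as the selection could be carried over, at the price of a delay before a move out of the watched tree is recognized as a delete. The second approach needs no timer and no pending table, which matches the reliability and simplicity argument above.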
from fsearch.
Great news, thanks for your hard work. I always find it interesting to see how much code runs on a seemingly idle system. Can't wait to see fsearch become one more of such "diagnostic" tools.
from fsearch.
great news, 60x faster will still make a huge difference :)
from fsearch.
Can anyone think of a good use case for allowing the same directory to be included in the database multiple times (with slightly different settings)? For example a situation like this:
Currently it's still possible, but since it would simplify a few things in the code and it seems pretty pointless, I'm thinking about removing that option.
from fsearch.
Look forward to this feature! You're a hero. Is version 0.3 ready?
from fsearch.
Look forward to this feature! You're a hero. Is version 0.3 ready?
No, unfortunately not yet. I've been quite busy recently (new job etc.). But now that things are slowly going back to normal, I'll be able to spend more time on FSearch again.
from fsearch.