
Comments (3)

matthewmueller commented on May 23, 2024

Thanks for the thorough response @yalue!

I'm going to close this issue, since I've started diverging slightly from merged_fs and, in fact, from the io/fs.FS interface itself. I need to pass a context.Context through for tracing support, so I decided to fork merged_fs anyway.

Hopefully if others need multiple FS support, they'll find this issue and copy & paste that Merge function in or open a PR. I don't want to add code I'm not currently likely to use.

Thanks again!

from merged_fs.

yalue commented on May 23, 2024

I realized there is a low-effort optimization for merging multiple filesystems, compared to the snippet you posted: merge them into a balanced binary tree rather than a fully unbalanced one. (Technically, this may impose a slight overhead when opening files from the top-priority FSs, but it will both reduce the overall number of MergedFS instances to create and reduce the overhead of opening a file from a low-priority FS.)

Anyway, I thought I'd point this out in case you're doing a similar n-way merge in your fork. It's the MergeMultiple function I added in my latest commit (tag v1.2.0). It seems to work without issue in a simple test using 2,048 zip files.


yalue commented on May 23, 2024

Thanks, and I'm glad the library's been working for you! While much of the Open function, as it is, could probably be refactored into a loop over n filesystems rather than a big "if" statement, the added complexity would come from managing the path prefix cache, which is used to determine whether a regular file in any higher-priority FS conflicts with a directory in a lower-priority FS. Ultimately, you'd either need to cut out the caching entirely (which could be slow, especially with more FSs), or maintain a separate path prefix cache for every sub-FS apart from the first (which is essentially what a "tree" of MergedFS instances is doing implicitly already).

Basically, it is certainly possible to refactor the code to behave like this, but it would require a fair amount of work. While I think it's an interesting challenge, I personally don't have time to work on it at the moment.

On top of that, my hunch is that for most reasonable use cases, the overhead of the recursion is not going to be too high. The depth of the recursion is limited by the number of FSs being merged, so I doubt it would be a problem for even a couple dozen filesystems. The amount of additional memory used by each MergedFS struct is not that high, with the largest overhead likely being the path-prefix cache, which, as I mentioned earlier, likely couldn't go away even with an n-way merge. I guess if you had thousands of "zip" archives you're trying to merge, then the recursion may become an issue! Though I will admit that isn't the use case I had in mind when I wrote this library!

If you want, I don't mind adding a function like the Merge code in your comment to the library itself. It can at least serve as a placeholder; that way, if I ever do implement a more efficient n-way merge, downstream code won't need to change. Does that sound useful to you?

