merged_fs's Introduction

Merged FS: Compose Multiple Go Filesystems

The release of version 1.16 of the Go programming language included a standard interface for read-only filesystems, defined in Go's io/fs standard library package. Other standard-library changes came with it: archive/zip now provides a "filesystem" interface for zip files, and net/http can serve files from any filesystem that implements the io/fs interface. Together, this means utilities like the HTTP server can now serve content directly from zip files, without the data needing to be extracted manually.

While that's already pretty cool, wouldn't it be nice if you could, for example, transparently serve data from multiple zip files as if they were a single directory? This library provides the means to do so: it implements the io/fs.FS interface using two underlying filesystems. The underlying filesystems can themselves be MergedFS instances, so an arbitrary number of filesystems can be combined into a single io/fs.FS.

This repository provides roughly the same function as laher/mergefs, with one key distinction: it correctly lists the contents of merged directories that are present in both FS's. This adds quite a bit of complexity, so laher/mergefs will be more performant for filesystems that do not require directory-listing capabilities.
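To make the directory-listing point concrete, the following is a minimal sketch of what merging one directory's listing from two filesystems can look like. It uses only the standard library; `mergedDirNames` and `exampleNames` are hypothetical names made up for this illustration, and this is not the library's actual implementation (for instance, it ignores the case where a regular file in the first FS shadows a directory in the second).

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"sort"
	"testing/fstest"
)

// mergedDirNames returns the union of the entry names found in dir in both
// filesystems. A name present in both is listed once. (Hypothetical helper,
// not part of merged_fs.)
func mergedDirNames(first, second fs.FS, dir string) ([]string, error) {
	seen := make(map[string]bool)
	var names []string
	found := false
	for _, fsys := range []fs.FS{first, second} {
		entries, err := fs.ReadDir(fsys, dir)
		if errors.Is(err, fs.ErrNotExist) {
			continue // The directory may exist in only one of the two.
		}
		if err != nil {
			return nil, err
		}
		found = true
		for _, e := range entries {
			if !seen[e.Name()] {
				seen[e.Name()] = true
				names = append(names, e.Name())
			}
		}
	}
	if !found {
		return nil, &fs.PathError{Op: "readdir", Path: dir, Err: fs.ErrNotExist}
	}
	sort.Strings(names)
	return names, nil
}

// exampleNames lists a "static" directory that exists in two in-memory
// filesystems with one overlapping file name.
func exampleNames() []string {
	a := fstest.MapFS{
		"static/a.css":     {Data: []byte("a")},
		"static/shared.js": {Data: []byte("from a")},
	}
	b := fstest.MapFS{
		"static/b.css":     {Data: []byte("b")},
		"static/shared.js": {Data: []byte("from b")},
	}
	names, err := mergedDirNames(a, b, "static")
	if err != nil {
		return nil
	}
	return names
}

func main() {
	fmt.Println(exampleNames())
}
```

A simple "try the first FS, then the second" wrapper would only ever list the directory from one side, which is the gap this kind of merging closes.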

Usage

Documentation on pkg.go.dev

Simply pass two io/fs.FS instances to merged_fs.NewMergedFS(...) to obtain a new FS serving data from both. See the following example:

import (
    "archive/zip"
    "github.com/yalue/merged_fs"
    "net/http"
)

func main() {
    // ...

    // Assume that zipFile1 and zipFile2 are two zip files that have been
    // opened using os.Open(...).
    zipFS1, _ := zip.NewReader(zipFile1, file1Size)
    zipFS2, _ := zip.NewReader(zipFile2, file2Size)

    // Serve files contained in either zip file.
    mergedFS := merged_fs.NewMergedFS(zipFS1, zipFS2)
    http.Handle("/", http.FileServer(http.FS(mergedFS)))

    // ...
}

Additional notes:

  • Both underlying FS's must support the ReadDirFile interface when opening directories. Without this, we have no way to determine the contents of merged directories.

  • If a file with the same name is present in both FS's given to NewMergedFS, the file in the first FS always overrides the file with the same name in the second FS.

  • Following the prior point, if a directory in the second FS has the same name as a regular file in the first, neither the directory in the second FS nor any of its contents will be present in the merged FS (the regular file will take priority). For example, if FS A contains a regular file named a/b, and FS B contains a regular file c at the path a/b/c (in which a/b is a directory), then a/b/c will not be available in the FS returned by NewMergedFS(A, B), because the directory b is overridden by the regular file b in the first FS.
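The priority rules above can be sketched with a small stand-in type built only on the standard library. The `overlay` type and the `demo` and `read` helpers below are hypothetical and exist only for this illustration; the sketch models file lookups only and is not how merged_fs is implemented.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"path"
	"testing/fstest"
)

// overlay is a simplified stand-in for a two-way merge. It resolves files
// with the first FS taking priority. It does not merge directory listings.
type overlay struct{ first, second fs.FS }

func (o overlay) Open(name string) (fs.File, error) {
	if !fs.ValidPath(name) {
		return nil, &fs.PathError{Op: "open", Path: name, Err: fs.ErrInvalid}
	}
	f, err := o.first.Open(name)
	if err == nil || !errors.Is(err, fs.ErrNotExist) {
		return f, err
	}
	// If any parent of name is a regular file in the first FS, that file
	// shadows the whole directory tree of the same name in the second FS.
	for dir := path.Dir(name); dir != "."; dir = path.Dir(dir) {
		info, statErr := fs.Stat(o.first, dir)
		if statErr == nil && !info.IsDir() {
			return nil, &fs.PathError{Op: "open", Path: name, Err: fs.ErrNotExist}
		}
	}
	return o.second.Open(name)
}

// demo builds the situation described in the notes: FS A has a regular file
// a/b, while FS B has a directory a/b containing the file c.
func demo() fs.FS {
	a := fstest.MapFS{
		"shared.txt": {Data: []byte("from A")},
		"a/b":        {Data: []byte("regular file in A")},
	}
	b := fstest.MapFS{
		"shared.txt": {Data: []byte("from B")},
		"only_b.txt": {Data: []byte("only in B")},
		"a/b/c":      {Data: []byte("hidden")},
	}
	return overlay{first: a, second: b}
}

// read returns the file contents, or "<missing>" if the file cannot be read.
func read(fsys fs.FS, name string) string {
	data, err := fs.ReadFile(fsys, name)
	if err != nil {
		return "<missing>"
	}
	return string(data)
}

func main() {
	merged := demo()
	for _, name := range []string{"shared.txt", "only_b.txt", "a/b", "a/b/c"} {
		fmt.Printf("%s: %s\n", name, read(merged, name))
	}
}
```

In this sketch, shared.txt resolves to the copy in A, only_b.txt falls through to B, and a/b/c is unavailable because a/b is a regular file in A.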

Multi-Way Merging

If you want to merge more than two filesystems, you can use the MergeMultiple function, which takes an arbitrary number of filesystem arguments:

    merged := merged_fs.MergeMultiple(fs_1, fs_2, fs_3, fs_4)

Earlier arguments to MergeMultiple take priority over later ones, in the same way that the first argument to NewMergedFS has priority over the second. For now, the MergeMultiple function is just a convenient wrapper for building a tree of MergedFS instances.
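As an illustration of "building a tree" from a two-way merge, here is a minimal sketch that splits the argument list in half recursively, so earlier filesystems keep priority and the nesting depth grows logarithmically. The shape of the tree that MergeMultiple actually builds is not stated here, so the balanced layout is an assumption; `mergeTree`, `fallbackFS`, and `demoTree` are hypothetical names, and `fallbackFS` is only a stand-in for a real two-way merge.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"testing/fstest"
)

// fallbackFS is a hypothetical stand-in for a two-way merge: it opens a name
// in the first FS and falls back to the second if it does not exist there.
type fallbackFS struct{ first, second fs.FS }

func (f fallbackFS) Open(name string) (fs.File, error) {
	file, err := f.first.Open(name)
	if errors.Is(err, fs.ErrNotExist) {
		return f.second.Open(name)
	}
	return file, err
}

// mergeTree combines any number of filesystems using a two-way merge
// function. The left half always has priority over the right half, so the
// overall priority order matches the argument order.
func mergeTree(merge func(a, b fs.FS) fs.FS, fss ...fs.FS) fs.FS {
	switch len(fss) {
	case 0:
		return fstest.MapFS{} // Empty filesystem when nothing is given.
	case 1:
		return fss[0]
	}
	mid := len(fss) / 2
	return merge(mergeTree(merge, fss[:mid]...), mergeTree(merge, fss[mid:]...))
}

// demoTree merges three in-memory filesystems and reads one file from the
// result, returning "" if the file cannot be read.
func demoTree(name string) string {
	fs1 := fstest.MapFS{"x.txt": {Data: []byte("one")}}
	fs2 := fstest.MapFS{"x.txt": {Data: []byte("two")}, "y.txt": {Data: []byte("two")}}
	fs3 := fstest.MapFS{"y.txt": {Data: []byte("three")}, "z.txt": {Data: []byte("three")}}
	merge := func(a, b fs.FS) fs.FS { return fallbackFS{first: a, second: b} }
	data, err := fs.ReadFile(mergeTree(merge, fs1, fs2, fs3), name)
	if err != nil {
		return ""
	}
	return string(data)
}

func main() {
	for _, name := range []string{"x.txt", "y.txt", "z.txt"} {
		fmt.Printf("%s comes from filesystem %s\n", name, demoTree(name))
	}
}
```

With the real library, the `merge` argument would presumably wrap merged_fs.NewMergedFS; that wiring is left out so the sketch runs with the standard library alone.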

merged_fs's People

Contributors

purpleidea, yalue

merged_fs's Issues

Consider supporting a list of filesystems?

Thanks for this library, it works like a charm!

I was wondering if you'd be open to accepting a list of filesystems, something like:

fs := mergefs.New(a, b, c)

Right now you have to do the following:

// Merge the filesystem together
func Merge(first fs.FS, remaining ...fs.FS) (merged fs.FS) {
	merged = first
	for _, fsys := range remaining {
		merged = merged_fs.NewMergedFS(merged, fsys)
	}
	return merged
}

But I have a feeling it would be a bit more efficient, with fewer layers, if this were possible natively.

Data race reading multiple files at the same time

We're using merged_fs to serve files from disk and fall back to embedded versions if they don't exist.

It works great, but there's a data race which occasionally results in a fatal error:

fatal error: concurrent map writes
fatal error: concurrent map writes

goroutine 67 [running]:
runtime.throw(0xdc0eae, 0x15)
        /usr/lib/go/src/runtime/panic.go:1117 +0x72 fp=0xc00013ed48 sp=0xc00013ed18 pc=0x437f92
runtime.mapassign_faststr(0xcb4b00, 0xc000120e70, 0xc0006981c5, 0x7, 0x0)
        /usr/lib/go/src/runtime/map_faststr.go:291 +0x3d8 fp=0xc00013edb0 sp=0xc00013ed48 pc=0x4160b8
github.com/yalue/merged_fs.(*MergedFS).validatePathPrefix(0xc000120ea0, 0xc0006981c5, 0x27, 0x0, 0x0)
        /home/chris/go/pkg/mod/github.com/yalue/[email protected]/merged_fs.go:347 +0x118 fp=0xc00013eea8 sp=0xc00013edb0 pc=0xc1dbd8
github.com/yalue/merged_fs.(*MergedFS).Open(0xc000120ea0, 0xc0006981c5, 0x27, 0xc00013efb0, 0x4696e5, 0xc000582480, 0x200000003)
        /home/chris/go/pkg/mod/github.com/yalue/[email protected]/merged_fs.go:406 +0x106 fp=0xc00013ef68 sp=0xc00013eea8 pc=0xc1e1c6
net/http.ioFS.Open(0x12fa900, 0xc000120ea0, 0xc0006981c4, 0x28, 0x0, 0x20, 0x0, 0x0)
        /usr/lib/go/src/net/http/fs.go:760 +0x68 fp=0xc00013efc0 sp=0xc00013ef68 pc=0x7acb48
net/http.(*ioFS).Open(0xc0002cf720, 0xc0006981c4, 0x28, 0x0, 0x0, 0x20, 0x0)
        <autogenerated>:1 +0x5d fp=0xc00013f010 sp=0xc00013efc0 pc=0x81abbd
net/http.serveFile(0x1304f50, 0xc00022e288, 0xc00012bd00, 0x12fc520, 0xc0002cf720, 0xc0006981c4, 0x28, 0xc00022e201)
        /usr/lib/go/src/net/http/fs.go:597 +0x84 fp=0xc00013f1d8 sp=0xc00013f010 pc=0x7abcc4
[...]

goroutine 68 [running]:
runtime.throw(0xdc0eae, 0x15)
        /usr/lib/go/src/runtime/panic.go:1117 +0x72 fp=0xc0004c6d48 sp=0xc0004c6d18 pc=0x437f92
runtime.mapassign_faststr(0xcb4b00, 0xc000120e70, 0xc000736c45, 0x7, 0x0)
        /usr/lib/go/src/runtime/map_faststr.go:291 +0x3d8 fp=0xc0004c6db0 sp=0xc0004c6d48 pc=0x4160b8
github.com/yalue/merged_fs.(*MergedFS).validatePathPrefix(0xc000120ea0, 0xc000736c45, 0x26, 0x0, 0x0)
        /home/chris/go/pkg/mod/github.com/yalue/[email protected]/merged_fs.go:347 +0x118 fp=0xc0004c6ea8 sp=0xc0004c6db0 pc=0xc1dbd8
github.com/yalue/merged_fs.(*MergedFS).Open(0xc000120ea0, 0xc000736c45, 0x26, 0xc0004c6fb0, 0x4696e5, 0xc000282a80, 0x200000003)
        /home/chris/go/pkg/mod/github.com/yalue/[email protected]/merged_fs.go:406 +0x106 fp=0xc0004c6f68 sp=0xc0004c6ea8 pc=0xc1e1c6
net/http.ioFS.Open(0x12fa900, 0xc000120ea0, 0xc000736c44, 0x27, 0x0, 0x20, 0x0, 0x0)
        /usr/lib/go/src/net/http/fs.go:760 +0x68 fp=0xc0004c6fc0 sp=0xc0004c6f68 pc=0x7acb48
net/http.(*ioFS).Open(0xc0002cf720, 0xc000736c44, 0x27, 0x0, 0x0, 0x20, 0x0)
        <autogenerated>:1 +0x5d fp=0xc0004c7010 sp=0xc0004c6fc0 pc=0x81abbd
net/http.serveFile(0x1304f50, 0xc0007130f8, 0xc0000b6600, 0x12fc520, 0xc0002cf720, 0xc000736c44, 0x27, 0xc000713001)
        /usr/lib/go/src/net/http/fs.go:597 +0x84 fp=0xc0004c71d8 sp=0xc0004c7010 pc=0x7abcc4
net/http.(*fileHandler).ServeHTTP(0xc0002cf730, 0x1304f50, 0xc0007130f8, 0xc0000b6600)
        /usr/lib/go/src/net/http/fs.go:848 +0x9c fp=0xc0004c7228 sp=0xc0004c71d8 pc=0x7ad39c

Running with the race detector enabled shows the two warnings below: a concurrent read/write in validatePathPrefix, and a concurrent write/write, which is the same race as the fatal error above.

==================
WARNING: DATA RACE
Write at 0x00c000219230 by goroutine 24:
  runtime.mapassign_faststr()
      /usr/lib/go/src/runtime/map_faststr.go:202 +0x0
  github.com/yalue/merged_fs.(*MergedFS).validatePathPrefix()
      /home/chris/go/pkg/mod/github.com/yalue/[email protected]/merged_fs.go:347 +0x184
  github.com/yalue/merged_fs.(*MergedFS).Open()
      /home/chris/go/pkg/mod/github.com/yalue/[email protected]/merged_fs.go:406 +0x13a
  net/http.ioFS.Open()
      /usr/lib/go/src/net/http/fs.go:760 +0x84
  net/http.(*ioFS).Open()
      <autogenerated>:1 +0x8b
  net/http.serveFile()
      /usr/lib/go/src/net/http/fs.go:597 +0xdd
[...]

Previous read at 0x00c000219230 by goroutine 37:
  runtime.mapaccess1_faststr()
      /usr/lib/go/src/runtime/map_faststr.go:12 +0x0
  github.com/yalue/merged_fs.(*MergedFS).validatePathPrefix()
      /home/chris/go/pkg/mod/github.com/yalue/[email protected]/merged_fs.go:316 +0xa4
  github.com/yalue/merged_fs.(*MergedFS).Open()
      /home/chris/go/pkg/mod/github.com/yalue/[email protected]/merged_fs.go:406 +0x13a
  net/http.ioFS.Open()
      /usr/lib/go/src/net/http/fs.go:760 +0x84
  net/http.(*ioFS).Open()
      <autogenerated>:1 +0x8b
  net/http.serveFile()
      /usr/lib/go/src/net/http/fs.go:597 +0xdd
[...]
==================

and

==================
WARNING: DATA RACE
Write at 0x00c0008be128 by goroutine 24:
  github.com/yalue/merged_fs.(*MergedFS).validatePathPrefix()
      /home/chris/go/pkg/mod/github.com/yalue/[email protected]/merged_fs.go:347 +0x19c
  github.com/yalue/merged_fs.(*MergedFS).Open()
      /home/chris/go/pkg/mod/github.com/yalue/[email protected]/merged_fs.go:406 +0x13a
  net/http.ioFS.Open()
      /usr/lib/go/src/net/http/fs.go:760 +0x84
  net/http.(*ioFS).Open()
      <autogenerated>:1 +0x8b
  net/http.serveFile()
      /usr/lib/go/src/net/http/fs.go:597 +0xdd
[...]

Previous write at 0x00c0008be128 by goroutine 37:
  github.com/yalue/merged_fs.(*MergedFS).validatePathPrefix()
      /home/chris/go/pkg/mod/github.com/yalue/[email protected]/merged_fs.go:347 +0x19c
  github.com/yalue/merged_fs.(*MergedFS).Open()
      /home/chris/go/pkg/mod/github.com/yalue/[email protected]/merged_fs.go:406 +0x13a
  net/http.ioFS.Open()
      /usr/lib/go/src/net/http/fs.go:760 +0x84
  net/http.(*ioFS).Open()
      <autogenerated>:1 +0x8b
  net/http.serveFile()
      /usr/lib/go/src/net/http/fs.go:597 +0xdd
[...]
==================
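Both traces point at an unsynchronized map being read and written from concurrent Open calls. One common way to make such a cache safe is to guard it with a sync.RWMutex. The sketch below assumes the map is a simple path-prefix cache; `prefixCache`, its fields, and `hammer` are hypothetical names, and this is not necessarily how the library resolved the issue.

```go
package main

import (
	"fmt"
	"sync"
)

// prefixCache is a hypothetical stand-in for the map written to in
// validatePathPrefix. All access goes through the mutex.
type prefixCache struct {
	mu      sync.RWMutex
	entries map[string]bool
}

func newPrefixCache() *prefixCache {
	return &prefixCache{entries: make(map[string]bool)}
}

// get takes a read lock, so concurrent lookups do not block each other.
func (c *prefixCache) get(prefix string) (valid, found bool) {
	c.mu.RLock()
	defer c.mu.RUnlock()
	valid, found = c.entries[prefix]
	return valid, found
}

// set takes the write lock, so it never overlaps with a read or another
// write.
func (c *prefixCache) set(prefix string, valid bool) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.entries[prefix] = valid
}

// hammer writes and reads distinct keys from several goroutines at once and
// returns the number of entries stored afterwards.
func hammer(c *prefixCache, goroutines, writes int) int {
	var wg sync.WaitGroup
	for g := 0; g < goroutines; g++ {
		wg.Add(1)
		go func(g int) {
			defer wg.Done()
			for i := 0; i < writes; i++ {
				key := fmt.Sprintf("dir%d/sub%d", g, i)
				c.set(key, true)
				c.get(key)
			}
		}(g)
	}
	wg.Wait()
	c.mu.RLock()
	defer c.mu.RUnlock()
	return len(c.entries)
}

func main() {
	fmt.Println("entries stored:", hammer(newPrefixCache(), 8, 100))
}
```

Running a program like this with `go run -race` is a quick way to confirm that the guarded version no longer triggers the detector. A sync.Map would be another option when keys are written once and read many times.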

Add support for GlobFS

Currently the merged FS does not support Glob. This is inconvenient, as some underlying FS's might support it.

It would be cool if the merged FS exposed the Glob interface only when both A and B implement it, but that would require either adding another method or returning the FS interface instead of a concrete type (a breaking change).
