nix-snapshotter's People

Contributors

antoinerg, cameronraysmith, elpdt852, gbpdt, rbpdt, robbiebuxton

nix-snapshotter's Issues

[Feature request] CRI-O support (CRI-O driver)

I’d absolutely love to use nix-snapshotter with podman's Quadlets for local, rootless .container and .kube systemd units. (I don’t know of an equivalent systemd generator for k3s.) This would require nix-snapshotter to implement a CRI-O driver, AFAIK.

The CRI-O container runtime's default overlay storage (graph) driver can be configured with additional image stores and additional layer stores, which are used to configure the overlay driver to use the specified stores for image lookup and layer lookup, respectively. This additional layer store functionality is used by nydus-storage-plugin to support nydus images & stargz-snapshotter's Stargz Store CRI-O plugin to support lazy pulling of eStargz images.

k3s example: error pulling image

First of all, thanks for the great project! I've always wanted K8S to be declarative all the way down to container images. This is awesome!

I tried running the examples on k3s by commenting out ./kubernetes.nix in favor of ./k3s.nix in the following file:

./kubernetes.nix
# ./k3s.nix

then ran nix run ".#vm" to get a VM.

Although it successfully boots a VM with k3s, I hit an error when pulling a Nix image like the preloaded one (e.g. kubectl apply -Rf /etc/kubernetes/redis results in a pod that fails with a pull image error).

Doing the same steps as above but using ./kubernetes.nix works flawlessly.

Rewrite README to guide users through usage and implementation

Let's take a look at some good READMEs around GitHub and write a proper README. I'll add some examples below as I find them.

At a high level we want:

  • One sentence summary of what this project does
  • Badges for code coverage, CI job, godocs
  • Usage with the shortest path to trying the project (likely the NixOS VM)
  • Usage as a NixOS module (to install to host NixOS)
  • Usage as an independent project (flake to build and nix-snapshotter's toml config)
  • Usage via rootless
  • Architecture

We don't have to put everything in a single README.md; a common pattern is to have a docs/ directory and have the README link to it. We'll have to evaluate whether to split it up when we start writing.

Consider changes to Proxy model

Just a heads up, we have a k8s SIG-Node WG that is considering significant changes to the CRI around image services.

Security image access policies, authentication with in-proc local key rings vs. over the RPC, GC cache policies, support for runtime handlers in the image service layer that choose which image to unpack from the image index (Windows platform versions, etc.) and which snapshotter to use, one per runtime handler, ...

For these and a number of other reasons we should chat about other potential ways to hook into the image services api.

Thought:

  • extend the NRI to support image services at the CRI layer (something we've already been considering)
  • some other way to extend sandboxes/the internal core image service tooling to support needed extension points

Provide `homeModules` for integration with home-manager

Rootless nix-snapshotter+containerd would be especially useful with home-manager, as this allows our modules to be used outside of a NixOS system as long as it's systemd-based.

Note that containers still require linux namespaces, thus it still needs a linux kernel. We should note that it won't work elsewhere.

Add module for rootless Kubernetes

Since Kubernetes is complex, writing a NixOS module for rootless Kubernetes seems difficult. Though there is usernetes, I'm not sure what they use underneath.

k3s is a single binary, and much simpler to configure. It is missing plumbing for the kubelet flag --image-service-endpoint, but otherwise has no other known blockers: k3s-io/k3s#8279

Ideally both rootless k3s and rootless containerd modules should be upstreamed into Home-manager and/or nixpkgs.

Create scaffolding for NixOS test based integration testing

As of #23 we have nix-snapshotter running in a NixOS VM. Let's leverage the NixOS test framework to write integration tests.

Initially we should first create the scaffolding to make this possible, but later we'll want to flesh it out with various integration tests for default.nix exported functions like buildImage, copyToRegistry, as well as using containerd+nix-snapshotter with kubernetes.

containerd rootless service fails on Linux Mint with home-manager

I'm trying to run nix-snapshotter using the home-manager setup from the README, but the containerd systemd service doesn't start and gives the following error:

containerd-rootless[316090]: [rootlesskit:parent] error: failed to setup UID/GID map: newuidmap 316098 [0 1000 1 1 100000 65536] failed: : exec: "newuidmap": executable file not found in $PATH

Consider using skopeo for exporting images

Currently we have our own pushing code under pkg/nix2container, but we still lack support for exporting as OCI tarball, directly to containerd, and so forth.

So using skopeo will let us do the following exports: containers-storage (for podman/buildah), dir (non-standardized format), docker (registry), docker-archive, docker-daemon, oci (non-archive), oci-archive.

However, skopeo will not support direct export to containerd: containers/image#1572

Perhaps we should write our own after all. Much of the needed utility exists in nerdctl, but it's a pretty big package that I want to avoid depending on (e.g. nix-snapshotter would transitively depend on stargz-snapshotter).

Create kind image in CI with integration smoke test

Spinning issue off #6.

Another way of making nix-snapshotter easier to consume is providing a kind image. We'll need to build and publish this image in CI.

We'll also want an integration smoke test to attempt running a container in kind in CI.

Add documentation for manual installation

We need to complete docs/manual-install.md, describing how to set up nix-snapshotter manually, with examples of the systemd units and documentation around the TOML config for containerd, nix-snapshotter, kubernetes, and nerdctl.

rootless setup with home-manager fails

Running NixOS 23.11, with home-manager.

The following additions to home-manager work fine:

    imports = [
      nix-snapshotter.homeModules.default
    ];
    nixpkgs.overlays = [ nix-snapshotter.overlays.default ];
    virtualisation.containerd.rootless = {
      enable = true;
      nixSnapshotterIntegration = true;
    };

However, when I attempt the final change (and nixos-rebuild switch):

    services.nix-snapshotter.rootless = {
      enable = true;
    };

I get the error error: nix-snapshotter cannot be found in pkgs

I've tried this both with and without flakes enabled at the nixos level, and the error is the same.

Curiously, removing the problematic final change above, and adding home.packages = with pkgs; [ nix-snapshotter ] does not give an error. So clearly pkgs is being correctly extended with nix-snapshotter, but for some reason it's not appearing for services.nix-snapshotter.rootless.

Any thoughts?

Test failures due to incorrect guess on needing "userxattr"

I'm trying out the Flake instructions. I get this error when building nix-snapshotter:

--- FAIL: TestSnapshotter (0.01s)
    --- FAIL: TestSnapshotter/no_opt (0.01s)
        --- FAIL: TestSnapshotter/no_opt/TestSnapshotterView (0.00s)
            snapshotter_overlay_test.go:389:   []string{
                        "lowerdir=/build/TestSnapshotterno_optTestSnapshotterView20029197"...,
                +       "userxattr",
                        "volatile",
                  }
        --- FAIL: TestSnapshotter/no_opt/TestSnapshotterOverlayMount (0.00s)
            snapshotter_overlay_test.go:389:   []string{
                        "lowerdir=/build/TestSnapshotterno_optTestSnapshotterOverlayMount"...,
                        "upperdir=/build/TestSnapshotterno_optTestSnapshotterOverlayMount"...,
                +       "userxattr",
                        "workdir=/build/TestSnapshotterno_optTestSnapshotterOverlayMount2"...,
                  }
    --- FAIL: TestSnapshotter/AsynchronousRemove (0.00s)
        --- FAIL: TestSnapshotter/AsynchronousRemove/TestSnapshotterView (0.00s)
            snapshotter_overlay_test.go:389:   []string{
                        "lowerdir=/build/TestSnapshotterAsynchronousRemoveTestSnapshotter"...,
                +       "userxattr",
                        "volatile",
                  }
        --- FAIL: TestSnapshotter/AsynchronousRemove/TestSnapshotterOverlayMount (0.00s)
            snapshotter_overlay_test.go:389:   []string{
                        "lowerdir=/build/TestSnapshotterAsynchronousRemoveTestSnapshotter"...,
                        "upperdir=/build/TestSnapshotterAsynchronousRemoveTestSnapshotter"...,
                +       "userxattr",
                        "workdir=/build/TestSnapshotterAsynchronousRemoveTestSnapshotterO"...,
                  }

I'm not sure of the root cause of the error; the machine is running k3s on NixOS, with kernel 5.15.114.

Enroll in GH larger runners to access KVM for NixOS tests

Looks like KVM is not allowed on standard runners and we'll have to upgrade to larger runners to run NixOS tests on GH actions. If we are able to get it, then here's how to enable it.

See this example repo, but it's using a third party GH runner

  • Enable KVM group perms in the GH workflow. See GH blog
      - name: Enable KVM group perms
        run: |
            echo 'KERNEL=="kvm", GROUP="kvm", MODE="0666", OPTIONS+="static_node=kvm"' | sudo tee /etc/udev/rules.d/99-kvm4all.rules
            sudo udevadm control --reload-rules
            sudo udevadm trigger --name-match=kvm
  • Configure cachix/install-nix-action to enable kvm:
      - uses: cachix/install-nix-action@v16
        with:
          extra_nix_config: "system-features = nixos-test benchmark big-parallel kvm"
  • Possibly need to chmod /dev/kvm:
          sudo chmod o+rw /dev/kvm
          nix flake check

Fix `nix flake check` interaction with NixOS VM

Currently nix flake check is failing since nixosConfigurations.vm has no fileSystems."/" or boot.loader.grub.device defined. This is because the default options for the upstream qemu-vm.nix module start the VM without a boot loader to speed up startup. However, there are also assertions that assume these are set, even though nixos-rebuild build-vm accepts the configuration.

See if upstream wants to make overlayfs snapshotter embeddable

So at a high-level, nix-snapshotter is basically overlayfs snapshotter but with additional support for nix layers.

Currently there's a lot of code duplication between pkg/nix/nix.go and the overlayfs snapshotter from containerd: https://github.com/containerd/containerd/blob/main/snapshots/overlay/overlay.go

This is because we use private members like o.ms:

	ctx, t, err := o.ms.TransactionContext(ctx, false)

to handle the metadata layer's bolt transactions.

If we look around the remote snapshotter ecosystem, we see the same code duplicated in the same way:
https://github.com/containerd/stargz-snapshotter/blob/main/snapshot/snapshot.go

We should consider upstreaming a refactor to make it possible to embed it, thus deleting a lot of code from pkg/nix/nix.go.
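
To make the idea of an embeddable snapshotter concrete, here is a minimal sketch using Go struct embedding. Everything in it is hypothetical: `OverlaySnapshotter`, `NixSnapshotter`, `Mount`, and the `nixPaths` field are toy stand-ins invented for illustration and do not mirror containerd's real `snapshots.Snapshotter` interface or the code in pkg/nix/nix.go. The bind-mount handling of Nix store paths is likewise an assumption about what "support for nix layers" could look like.

```go
package main

import "fmt"

// Mount is a toy mount description (not containerd's mount.Mount).
type Mount struct {
	Type    string
	Source  string
	Options []string
}

// OverlaySnapshotter stands in for an upstream snapshotter whose behaviour
// is reachable through exported methods, so other packages can embed it.
type OverlaySnapshotter struct {
	Root string
}

// Mounts returns the overlay mount for a snapshot key.
func (o *OverlaySnapshotter) Mounts(key string) []Mount {
	return []Mount{{
		Type:    "overlay",
		Source:  "overlay",
		Options: []string{"lowerdir=" + o.Root + "/snapshots/" + key + "/fs"},
	}}
}

// NixSnapshotter embeds the overlay snapshotter and only adds Nix handling.
type NixSnapshotter struct {
	*OverlaySnapshotter
	nixPaths map[string][]string // snapshot key -> store paths (hypothetical)
}

// Mounts reuses the embedded implementation, then appends one read-only bind
// mount per store path recorded for the key.
func (n *NixSnapshotter) Mounts(key string) []Mount {
	mounts := n.OverlaySnapshotter.Mounts(key)
	for _, p := range n.nixPaths[key] {
		mounts = append(mounts, Mount{Type: "bind", Source: p, Options: []string{"ro", "rbind"}})
	}
	return mounts
}

func main() {
	s := &NixSnapshotter{
		OverlaySnapshotter: &OverlaySnapshotter{Root: "/var/lib/example"},
		nixPaths:           map[string][]string{"1": {"/nix/store/example-redis"}},
	}
	for _, m := range s.Mounts("1") {
		fmt.Println(m.Type, m.Source, m.Options)
	}
}
```

With this shape the wrapper only carries the Nix-specific delta, which is the kind of code deletion the refactor would be aiming for.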

Rename repository?

This is a super cool project, and I can't wait to play around with it :)) but I believe that discoverability will suffer due to the poor repository name. May I suggest naming it something along the lines of nix-native-oci-images?

Add documentation around rootless setup

The ceremony around setting up rootless containerd and nix-snapshotter running in the same user namespace is a bit complex. We should complete docs/rootless.md with a diagram or two explaining this, as well as the advanced module options around bindMounts and nsenter for extending rootless containerd with other sibling services (like fuse-overlayfs, other snapshotters, etc).

Investigate integration with nomad

Nomad is HashiCorp's container orchestrator. They have a containerd-driver, so it's very likely that nix-snapshotter is usable with Nomad without much effort. If there is a Nomad NixOS module, we should probably try it out and add documentation for it.

unable to initialize unpacker: no unpack platforms defined: invalid argument

Error message:

rpc error: code = InvalidArgument desc = unable to initialize unpacker: no unpack platforms defined: invalid argument

Source of error:

err = client.Transfer(ctx, src, dest)

Environment

  • OS: Ubuntu 20.04.6 LTS x86_64
  • Kernel: 5.15.0-84-generic
  • CPU: AMD Ryzen 7 3700X (16) @ 3.600GHz
  • Kubernetes: microk8s v1.28
$ sudo nano /var/snap/microk8s/current/args/kubelet

# Add this to the end of the file:
# --image-service-endpoint=unix:///run/nix-snapshotter/nix-snapshotter.sock

Starting nix-snapshotter:

$ git clone https://github.com/pdtpartners/nix-snapshotter
$ cd nix-snapshotter
$ sudo "$(nix-build)/bin/nix-snapshotter"

We deploy to microk8s using:

kubectl apply -f "$(nix-build image.nix)"
# image.nix
{ pkgs ? import (builtins.fetchTarball {
    url = "https://github.com/NixOS/nixpkgs/archive/refs/tags/23.05.tar.gz";
    sha256 = "10wn0l08j9lgqcw8177nh2ljrnxdrpri7bp0g7nvrsn9rkawvlbf";
  }) {}
, nix-snapshotter ? import (builtins.fetchTarball {
    url = "https://github.com/pdtpartners/nix-snapshotter/archive/6eb21bd3429535646da4aa396bb0c1f81a9b72c6.tar.gz";
    sha256 = "11sfy3kf046p8kacp7yh8ijjpp6php6q8wxlbya1v5q53h3980v1";
  })
}:
let
  redis-image = nix-snapshotter.default.buildImage {
    name = "abc123-redis";
    tag = "latest";
    config.entrypoint = [ "${pkgs.redis}/bin/redis-server" ];
  };
in
pkgs.writeText "pod.json" (builtins.toJSON rec {
  apiVersion = "v1";
  kind = "Pod";
  metadata.name = "redis";
  metadata.labels.name = metadata.name;
  spec.containers = [{
    inherit (metadata) name;
    args = [ "--protected-mode" "no" ];
    image = "nix:0${redis-image}";
    ports = [{
      name = "client";
      containerPort = 6379;
    }];
  }];
})

Create snapshotter opt `WithNixStoreDir` with NixOS test

Test that we can run nix-snapshotter against a non-conventional /nix/store dir path. We should default to /nix/store and allow overriding it via a WithNixStoreDir snapshotter opt, with an accompanying NixOS test to verify everything works as expected.
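
One possible shape for such an option is sketched below, using the functional option pattern. Only the name `WithNixStoreDir` and the `/nix/store` default come from the issue; the `config` struct, `Opt` type, `newConfig`, and `storePath` helper are assumptions made up for this example.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// defaultNixStoreDir is the conventional Nix store location.
const defaultNixStoreDir = "/nix/store"

// config holds hypothetical snapshotter settings; the real struct may differ.
type config struct {
	nixStoreDir string
}

// Opt mutates the snapshotter config (functional option pattern).
type Opt func(*config)

// WithNixStoreDir overrides the directory treated as the Nix store.
func WithNixStoreDir(dir string) Opt {
	return func(c *config) { c.nixStoreDir = dir }
}

// newConfig starts from the defaults and applies opts in order.
func newConfig(opts ...Opt) config {
	c := config{nixStoreDir: defaultNixStoreDir}
	for _, opt := range opts {
		opt(&c)
	}
	return c
}

// storePath joins a store entry name onto the configured store directory.
func (c config) storePath(name string) string {
	return filepath.Join(c.nixStoreDir, name)
}

func main() {
	fmt.Println(newConfig().storePath("abc-redis"))
	fmt.Println(newConfig(WithNixStoreDir("/custom/store")).storePath("abc-redis"))
}
```

A NixOS test could then start the snapshotter with a non-default directory and check that containers still resolve their store paths.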

Switch examples to run with nerdctl

Currently we are using crictl to pull our images and ctr to run our containers. As ctr is quite low level, it would make more sense to combine these commands into a single nerdctl run command.

x86-64 only ?

Hi, I've tried to run the example but got
error: flake 'github:pdtpartners/nix-snapshotter' does not provide attribute 'apps.aarch64-linux.vm', 'packages.aarch64-linux.vm', 'legacyPackages.aarch64-linux.vm' or 'vm'

I can see systems = [ "x86_64-linux" ]; in flake.nix. Is there any way to get around that?

Add more logs to nix-snapshotter

Currently nix-snapshotter is pretty quiet. We should add some appropriate logs to help debug issues when they arise and to confirm that it's working the way we expect.

Some ideas:

  • In main.go log when we started listening, e.g.
Initialized Nix Snapshotter
Registered GRPC snapshots server
Listening on unix://<addr>
  • Output the cmd and args of nix build --out-link ... for substitution before running the exec.Command
  • Anything else that makes sense, but let's not overdo it.
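
The logging ideas above could look roughly like the following sketch. The helper names `nixBuildCmd` and `describe`, the out-link and store path values, and the exact argument layout of the `nix build` call are assumptions for illustration; only the log wording and the `--out-link` flag come from the list. The command is assembled and logged but never executed here.

```go
package main

import (
	"log"
	"os/exec"
	"strings"
)

// nixBuildCmd assembles (but does not run) a `nix build --out-link` call.
// The argument layout is an assumption, not nix-snapshotter's actual call.
func nixBuildCmd(outLink, storePath string) *exec.Cmd {
	return exec.Command("nix", "build", "--out-link", outLink, storePath)
}

// describe renders the command name and arguments for logging.
func describe(cmd *exec.Cmd) string {
	return strings.Join(cmd.Args, " ")
}

func main() {
	addr := "/run/nix-snapshotter/nix-snapshotter.sock"
	log.Printf("Listening on unix://%s", addr)

	cmd := nixBuildCmd("/tmp/gcroot", "/nix/store/example-path")
	log.Printf("Running substitution: %s", describe(cmd))
	// A real implementation would call cmd.Run() here and log any error.
}
```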

Write unit tests for nix2container.Push

Since remotes.Pusher is an interface, we can implement one that mocks out the networking and use it to help us validate test expectations. This will probably involve adding a variadic option pattern to Push so that one can override which remotes.Pusher it should use.

A table test makes sense in this case, because there are many variants of images to test. For example, vanilla images, nix2container images, hybrid images, empty images, and so on.

Error: failed to create containerd container: failed to mount /var/lib/containerd/tmpmounts/containerd-mount…: no such file or directory

Hello! First, thank you so much for this, I love the idea! And the feedback loop is soooooo much faster!

I'm struggling to find the reason behind this error message, which only appears in this particular setup: docteurklein/nixok@208201b#diff-206b9ce276ab5971a2489d75eb1b12999d4bf3843b7988cbe8d687cfde61dea0R114

When I'm shipping the same config but using nix2container directly, my pod starts, but when I use nix-snapshotter, I get this error message:

Error: failed to create containerd container: failed to mount /var/lib/containerd/tmpmounts/containerd-mount…: no such file or directory

Any idea?

sudo ctr -n k8s.io -a /run/containerd/containerd.sock c info d40b2284de669d166a8a1fac32971cdb88c5e382b1d4d172cab6f5854278fed3
{
    "ID": "d40b2284de669d166a8a1fac32971cdb88c5e382b1d4d172cab6f5854278fed3",
    "Labels": {
        "app": "s1",
        "io.cri-containerd.kind": "sandbox",
        "io.kubernetes.pod.name": "s1-648769894-xs96h",
        "io.kubernetes.pod.namespace": "default",
        "io.kubernetes.pod.uid": "0b8c18e0-87ca-41ac-a0ab-e58df9d79a14",
        "pod-template-hash": "648769894"
    },
    "Image": "docker.io/library/pause:latest",
    "Runtime": {
        "Name": "io.containerd.runc.v2",
        "Options": {
            "type_url": "containerd.runc.v1.Options",
            "value": "SAE="
        }
    },
    "SnapshotKey": "d40b2284de669d166a8a1fac32971cdb88c5e382b1d4d172cab6f5854278fed3",
    "Snapshotter": "nix",
    "CreatedAt": "2023-10-03T13:32:31.850481864Z",
    "UpdatedAt": "2023-10-03T13:32:31.850481864Z",
    "Extensions": {
        "io.cri-containerd.sandbox.metadata": {
            "type_url": "github.com/containerd/cri/pkg/store/sandbox/Metadata",
            "value": "eyJWZXJzaW9uIjoidjEiLCJNZXRhZGF0YSI6eyJJRCI6ImQ0MGIyMjg0ZGU2NjlkMTY2YThhMWZhYzMyOTcxY2RiODhjNWUzODJiMWQ0ZDE3MmNhYjZmNTg1NDI3OGZlZDMiLCJOYW1lIjoiczEtNjQ4NzY5ODk0LXhzOTZoX2RlZmF1bHRfMGI4YzE4ZTAtODdjYS00MWFjLWEwYWItZTU4ZGY5ZDc5YTE0XzAiLCJDb25maWciOnsibWV0YWRhdGEiOnsibmFtZSI6InMxLTY0ODc2OTg5NC14czk2aCIsInVpZCI6IjBiOGMxOGUwLTg3Y2EtNDFhYy1hMGFiLWU1OGRmOWQ3OWExNCIsIm5hbWVzcGFjZSI6ImRlZmF1bHQifSwiaG9zdG5hbWUiOiJzMS02NDg3Njk4OTQteHM5NmgiLCJsb2dfZGlyZWN0b3J5IjoiL3Zhci9sb2cvcG9kcy9kZWZhdWx0X3MxLTY0ODc2OTg5NC14czk2aF8wYjhjMThlMC04N2NhLTQxYWMtYTBhYi1lNThkZjlkNzlhMTQiLCJkbnNfY29uZmlnIjp7InNlcnZlcnMiOlsiMTAuMC4wLjI1NCJdLCJzZWFyY2hlcyI6WyJkZWZhdWx0LnN2Yy5jbHVzdGVyLmxvY2FsIiwic3ZjLmNsdXN0ZXIubG9jYWwiLCJjbHVzdGVyLmxvY2FsIl0sIm9wdGlvbnMiOlsibmRvdHM6NSJdfSwibGFiZWxzIjp7ImFwcCI6InMxIiwiaW8ua3ViZXJuZXRlcy5wb2QubmFtZSI6InMxLTY0ODc2OTg5NC14czk2aCIsImlvLmt1YmVybmV0ZXMucG9kLm5hbWVzcGFjZSI6ImRlZmF1bHQiLCJpby5rdWJlcm5ldGVzLnBvZC51aWQiOiIwYjhjMThlMC04N2NhLTQxYWMtYTBhYi1lNThkZjlkNzlhMTQiLCJwb2QtdGVtcGxhdGUtaGFzaCI6IjY0ODc2OTg5NCJ9LCJhbm5vdGF0aW9ucyI6eyJrdWJlcm5ldGVzLmlvL2NvbmZpZy5zZWVuIjoiMjAyMy0xMC0wM1QxNTozMjozMS41MjI0NDEzNjQrMDI6MDAiLCJrdWJlcm5ldGVzLmlvL2NvbmZpZy5zb3VyY2UiOiJhcGkiLCJwcm9maWxlcy5ncmFmYW5hLmNvbS9jcHUucG9ydF9uYW1lIjoiaHR0cC1tZXRyaWNzIiwicHJvZmlsZXMuZ3JhZmFuYS5jb20vY3B1LnNjcmFwZSI6InRydWUiLCJwcm9maWxlcy5ncmFmYW5hLmNvbS9tZW1vcnkucG9ydF9uYW1lIjoiaHR0cC1tZXRyaWNzIiwicHJvZmlsZXMuZ3JhZmFuYS5jb20vbWVtb3J5LnNjcmFwZSI6InRydWUifSwibGludXgiOnsiY2dyb3VwX3BhcmVudCI6Ii9rdWJlcG9kcy5zbGljZS9rdWJlcG9kcy1iZXN0ZWZmb3J0LnNsaWNlL2t1YmVwb2RzLWJlc3RlZmZvcnQtcG9kMGI4YzE4ZTBfODdjYV80MWFjX2EwYWJfZTU4ZGY5ZDc5YTE0LnNsaWNlIiwic2VjdXJpdHlfY29udGV4dCI6eyJuYW1lc3BhY2Vfb3B0aW9ucyI6eyJwaWQiOjF9LCJzdXBwbGVtZW50YWxfZ3JvdXBzIjpbMTAwMF0sInNlY2NvbXAiOnt9fSwib3ZlcmhlYWQiOnt9LCJyZXNvdXJjZXMiOnsiY3B1X3BlcmlvZCI6MTAwMDAwLCJjcHVfc2hhcmVzIjoyfX19LCJOZXROU1BhdGgiOiIvdmFyL3J1bi9uZXRucy9jbmktNTg3Y2I2ZjQtODE1NS1iNWIyLTEwMmQtZjVjOGI5MDRkMWRlIiwiSVAiOiIxMC4xLjAuMjciLCJBZGRpdGlvbmFsSVBzIjpudWxsLCJSdW50aW1lSG
FuZGxlciI6IiIsIkNOSVJlc3VsdCI6eyJJbnRlcmZhY2VzIjp7ImV0aDAiOnsiSVBDb25maWdzIjpbeyJJUCI6IjEwLjEuMC4yNyIsIkdhdGV3YXkiOiIxMC4xLjAuMSJ9XSwiTWFjIjoiZmU6NmY6ZDI6ZGI6YjU6MTMiLCJTYW5kYm94IjoiL3Zhci9ydW4vbmV0bnMvY25pLTU4N2NiNmY0LTgxNTUtYjViMi0xMDJkLWY1YzhiOTA0ZDFkZSJ9LCJsbyI6eyJJUENvbmZpZ3MiOlt7IklQIjoiMTI3LjAuMC4xIiwiR2F0ZXdheSI6IiJ9LHsiSVAiOiI6OjEiLCJHYXRld2F5IjoiIn1dLCJNYWMiOiIwMDowMDowMDowMDowMDowMCIsIlNhbmRib3giOiIvdmFyL3J1bi9uZXRucy9jbmktNTg3Y2I2ZjQtODE1NS1iNWIyLTEwMmQtZjVjOGI5MDRkMWRlIn0sIm15bmV0Ijp7IklQQ29uZmlncyI6bnVsbCwiTWFjIjoiYTI6OTE6NDY6NTM6MjU6NzciLCJTYW5kYm94IjoiIn0sInZldGg3NmIwY2VkNyI6eyJJUENvbmZpZ3MiOm51bGwsIk1hYyI6IjQyOjFiOjJhOjgzOjFkOjgyIiwiU2FuZGJveCI6IiJ9fSwiRE5TIjpbe30se31dLCJSb3V0ZXMiOlt7ImRzdCI6IjEwLjEuMC4wLzE2In0seyJkc3QiOiIwLjAuMC4wLzAiLCJndyI6IjEwLjEuMC4xIn1dfSwiUHJvY2Vzc0xhYmVsIjoiIn19"
        }
    },
    "SandboxID": "",
    "Spec": {
        "ociVersion": "1.1.0",
        "process": {
            "user": {
                "uid": 0,
                "gid": 0,
                "additionalGids": [
                    1000
                ]
            },
            "args": [
                "/bin/pause"
            ],
            "cwd": "/",
            "capabilities": {
                "bounding": [
                    "CAP_CHOWN",
                    "CAP_DAC_OVERRIDE",
                    "CAP_FSETID",
                    "CAP_FOWNER",
                    "CAP_MKNOD",
                    "CAP_NET_RAW",
                    "CAP_SETGID",
                    "CAP_SETUID",
                    "CAP_SETFCAP",
                    "CAP_SETPCAP",
                    "CAP_NET_BIND_SERVICE",
                    "CAP_SYS_CHROOT",
                    "CAP_KILL",
                    "CAP_AUDIT_WRITE"
                ],
                "effective": [
                    "CAP_CHOWN",
                    "CAP_DAC_OVERRIDE",
                    "CAP_FSETID",
                    "CAP_FOWNER",
                    "CAP_MKNOD",
                    "CAP_NET_RAW",
                    "CAP_SETGID",
                    "CAP_SETUID",
                    "CAP_SETFCAP",
                    "CAP_SETPCAP",
                    "CAP_NET_BIND_SERVICE",
                    "CAP_SYS_CHROOT",
                    "CAP_KILL",
                    "CAP_AUDIT_WRITE"
                ],
                "permitted": [
                    "CAP_CHOWN",
                    "CAP_DAC_OVERRIDE",
                    "CAP_FSETID",
                    "CAP_FOWNER",
                    "CAP_MKNOD",
                    "CAP_NET_RAW",
                    "CAP_SETGID",
                    "CAP_SETUID",
                    "CAP_SETFCAP",
                    "CAP_SETPCAP",
                    "CAP_NET_BIND_SERVICE",
                    "CAP_SYS_CHROOT",
                    "CAP_KILL",
                    "CAP_AUDIT_WRITE"
                ]
            },
            "noNewPrivileges": true,
            "oomScoreAdj": -998
        },
        "root": {
            "path": "rootfs",
            "readonly": true
        },
        "hostname": "s1-648769894-xs96h",
        "mounts": [
            {
                "destination": "/proc",
                "type": "proc",
                "source": "proc",
                "options": [
                    "nosuid",
                    "noexec",
                    "nodev"
                ]
            },
            {
                "destination": "/dev",
                "type": "tmpfs",
                "source": "tmpfs",
                "options": [
                    "nosuid",
                    "strictatime",
                    "mode=755",
                    "size=65536k"
                ]
            },
            {
                "destination": "/dev/pts",
                "type": "devpts",
                "source": "devpts",
                "options": [
                    "nosuid",
                    "noexec",
                    "newinstance",
                    "ptmxmode=0666",
                    "mode=0620",
                    "gid=5"
                ]
            },
            {
                "destination": "/dev/mqueue",
                "type": "mqueue",
                "source": "mqueue",
                "options": [
                    "nosuid",
                    "noexec",
                    "nodev"
                ]
            },
            {
                "destination": "/sys",
                "type": "sysfs",
                "source": "sysfs",
                "options": [
                    "nosuid",
                    "noexec",
                    "nodev",
                    "ro"
                ]
            },
            {
                "destination": "/dev/shm",
                "type": "bind",
                "source": "/run/containerd/io.containerd.grpc.v1.cri/sandboxes/d40b2284de669d166a8a1fac32971cdb88c5e382b1d4d172cab6f5854278fed3/shm",
                "options": [
                    "rbind",
                    "ro",
                    "nosuid",
                    "nodev",
                    "noexec"
                ]
            },
            {
                "destination": "/etc/resolv.conf",
                "type": "bind",
                "source": "/var/lib/containerd/io.containerd.grpc.v1.cri/sandboxes/d40b2284de669d166a8a1fac32971cdb88c5e382b1d4d172cab6f5854278fed3/resolv.conf",
                "options": [
                    "rbind",
                    "ro",
                    "nosuid",
                    "nodev",
                    "noexec"
                ]
            }
        ],
        "annotations": {
            "io.kubernetes.cri.container-type": "sandbox",
            "io.kubernetes.cri.sandbox-cpu-period": "100000",
            "io.kubernetes.cri.sandbox-cpu-quota": "0",
            "io.kubernetes.cri.sandbox-cpu-shares": "2",
            "io.kubernetes.cri.sandbox-id": "d40b2284de669d166a8a1fac32971cdb88c5e382b1d4d172cab6f5854278fed3",
            "io.kubernetes.cri.sandbox-log-directory": "/var/log/pods/default_s1-648769894-xs96h_0b8c18e0-87ca-41ac-a0ab-e58df9d79a14",
            "io.kubernetes.cri.sandbox-memory": "0",
            "io.kubernetes.cri.sandbox-name": "s1-648769894-xs96h",
            "io.kubernetes.cri.sandbox-namespace": "default",
            "io.kubernetes.cri.sandbox-uid": "0b8c18e0-87ca-41ac-a0ab-e58df9d79a14"
        },
        "linux": {
            "resources": {
                "devices": [
                    {
                        "allow": false,
                        "access": "rwm"
                    }
                ],
                "cpu": {
                    "shares": 2
                }
            },
            "cgroupsPath": "kubepods-besteffort-pod0b8c18e0_87ca_41ac_a0ab_e58df9d79a14.slice:cri-containerd:d40b2284de669d166a8a1fac32971cdb88c5e382b1d4d172cab6f5854278fed3",
            "namespaces": [
                {
                    "type": "pid"
                },
                {
                    "type": "ipc"
                },
                {
                    "type": "uts"
                },
                {
                    "type": "mount"
                },
                {
                    "type": "network",
                    "path": "/var/run/netns/cni-587cb6f4-8155-b5b2-102d-f5c8b904d1de"
                }
            ],
            "seccomp": {
                "defaultAction": "SCMP_ACT_ERRNO",
                "architectures": [
                    "SCMP_ARCH_X86_64",
                    "SCMP_ARCH_X86",
                    "SCMP_ARCH_X32"
                ],
                "syscalls": [
                    {
                        "names": [
                            "accept",
                            "accept4",
                            "access",
                            "adjtimex",
                            "alarm",
                            "bind",
                            "brk",
                            "capget",
                            "capset",
                            "chdir",
                            "chmod",
                            "chown",
                            "chown32",
                            "clock_adjtime",
                            "clock_adjtime64",
                            "clock_getres",
                            "clock_getres_time64",
                            "clock_gettime",
                            "clock_gettime64",
                            "clock_nanosleep",
                            "clock_nanosleep_time64",
                            "close",
                            "close_range",
                            "connect",
                            "copy_file_range",
                            "creat",
                            "dup",
                            "dup2",
                            "dup3",
                            "epoll_create",
                            "epoll_create1",
                            "epoll_ctl",
                            "epoll_ctl_old",
                            "epoll_pwait",
                            "epoll_pwait2",
                            "epoll_wait",
                            "epoll_wait_old",
                            "eventfd",
                            "eventfd2",
                            "execve",
                            "execveat",
                            "exit",
                            "exit_group",
                            "faccessat",
                            "faccessat2",
                            "fadvise64",
                            "fadvise64_64",
                            "fallocate",
                            "fanotify_mark",
                            "fchdir",
                            "fchmod",
                            "fchmodat",
                            "fchown",
                            "fchown32",
                            "fchownat",
                            "fcntl",
                            "fcntl64",
                            "fdatasync",
                            "fgetxattr",
                            "flistxattr",
                            "flock",
                            "fork",
                            "fremovexattr",
                            "fsetxattr",
                            "fstat",
                            "fstat64",
                            "fstatat64",
                            "fstatfs",
                            "fstatfs64",
                            "fsync",
                            "ftruncate",
                            "ftruncate64",
                            "futex",
                            "futex_time64",
                            "futex_waitv",
                            "futimesat",
                            "getcpu",
                            "getcwd",
                            "getdents",
                            "getdents64",
                            "getegid",
                            "getegid32",
                            "geteuid",
                            "geteuid32",
                            "getgid",
                            "getgid32",
                            "getgroups",
                            "getgroups32",
                            "getitimer",
                            "getpeername",
                            "getpgid",
                            "getpgrp",
                            "getpid",
                            "getppid",
                            "getpriority",
                            "getrandom",
                            "getresgid",
                            "getresgid32",
                            "getresuid",
                            "getresuid32",
                            "getrlimit",
                            "get_robust_list",
                            "getrusage",
                            "getsid",
                            "getsockname",
                            "getsockopt",
                            "get_thread_area",
                            "gettid",
                            "gettimeofday",
                            "getuid",
                            "getuid32",
                            "getxattr",
                            "inotify_add_watch",
                            "inotify_init",
                            "inotify_init1",
                            "inotify_rm_watch",
                            "io_cancel",
                            "ioctl",
                            "io_destroy",
                            "io_getevents",
                            "io_pgetevents",
                            "io_pgetevents_time64",
                            "ioprio_get",
                            "ioprio_set",
                            "io_setup",
                            "io_submit",
                            "io_uring_enter",
                            "io_uring_register",
                            "io_uring_setup",
                            "ipc",
                            "kill",
                            "landlock_add_rule",
                            "landlock_create_ruleset",
                            "landlock_restrict_self",
                            "lchown",
                            "lchown32",
                            "lgetxattr",
                            "link",
                            "linkat",
                            "listen",
                            "listxattr",
                            "llistxattr",
                            "_llseek",
                            "lremovexattr",
                            "lseek",
                            "lsetxattr",
                            "lstat",
                            "lstat64",
                            "madvise",
                            "membarrier",
                            "memfd_create",
                            "memfd_secret",
                            "mincore",
                            "mkdir",
                            "mkdirat",
                            "mknod",
                            "mknodat",
                            "mlock",
                            "mlock2",
                            "mlockall",
                            "mmap",
                            "mmap2",
                            "mprotect",
                            "mq_getsetattr",
                            "mq_notify",
                            "mq_open",
                            "mq_timedreceive",
                            "mq_timedreceive_time64",
                            "mq_timedsend",
                            "mq_timedsend_time64",
                            "mq_unlink",
                            "mremap",
                            "msgctl",
                            "msgget",
                            "msgrcv",
                            "msgsnd",
                            "msync",
                            "munlock",
                            "munlockall",
                            "munmap",
                            "name_to_handle_at",
                            "nanosleep",
                            "newfstatat",
                            "_newselect",
                            "open",
                            "openat",
                            "openat2",
                            "pause",
                            "pidfd_open",
                            "pidfd_send_signal",
                            "pipe",
                            "pipe2",
                            "pkey_alloc",
                            "pkey_free",
                            "pkey_mprotect",
                            "poll",
                            "ppoll",
                            "ppoll_time64",
                            "prctl",
                            "pread64",
                            "preadv",
                            "preadv2",
                            "prlimit64",
                            "process_mrelease",
                            "pselect6",
                            "pselect6_time64",
                            "pwrite64",
                            "pwritev",
                            "pwritev2",
                            "read",
                            "readahead",
                            "readlink",
                            "readlinkat",
                            "readv",
                            "recv",
                            "recvfrom",
                            "recvmmsg",
                            "recvmmsg_time64",
                            "recvmsg",
                            "remap_file_pages",
                            "removexattr",
                            "rename",
                            "renameat",
                            "renameat2",
                            "restart_syscall",
                            "rmdir",
                            "rseq",
                            "rt_sigaction",
                            "rt_sigpending",
                            "rt_sigprocmask",
                            "rt_sigqueueinfo",
                            "rt_sigreturn",
                            "rt_sigsuspend",
                            "rt_sigtimedwait",
                            "rt_sigtimedwait_time64",
                            "rt_tgsigqueueinfo",
                            "sched_getaffinity",
                            "sched_getattr",
                            "sched_getparam",
                            "sched_get_priority_max",
                            "sched_get_priority_min",
                            "sched_getscheduler",
                            "sched_rr_get_interval",
                            "sched_rr_get_interval_time64",
                            "sched_setaffinity",
                            "sched_setattr",
                            "sched_setparam",
                            "sched_setscheduler",
                            "sched_yield",
                            "seccomp",
                            "select",
                            "semctl",
                            "semget",
                            "semop",
                            "semtimedop",
                            "semtimedop_time64",
                            "send",
                            "sendfile",
                            "sendfile64",
                            "sendmmsg",
                            "sendmsg",
                            "sendto",
                            "setfsgid",
                            "setfsgid32",
                            "setfsuid",
                            "setfsuid32",
                            "setgid",
                            "setgid32",
                            "setgroups",
                            "setgroups32",
                            "setitimer",
                            "setpgid",
                            "setpriority",
                            "setregid",
                            "setregid32",
                            "setresgid",
                            "setresgid32",
                            "setresuid",
                            "setresuid32",
                            "setreuid",
                            "setreuid32",
                            "setrlimit",
                            "set_robust_list",
                            "setsid",
                            "setsockopt",
                            "set_thread_area",
                            "set_tid_address",
                            "setuid",
                            "setuid32",
                            "setxattr",
                            "shmat",
                            "shmctl",
                            "shmdt",
                            "shmget",
                            "shutdown",
                            "sigaltstack",
                            "signalfd",
                            "signalfd4",
                            "sigprocmask",
                            "sigreturn",
                            "socketcall",
                            "socketpair",
                            "splice",
                            "stat",
                            "stat64",
                            "statfs",
                            "statfs64",
                            "statx",
                            "symlink",
                            "symlinkat",
                            "sync",
                            "sync_file_range",
                            "syncfs",
                            "sysinfo",
                            "tee",
                            "tgkill",
                            "time",
                            "timer_create",
                            "timer_delete",
                            "timer_getoverrun",
                            "timer_gettime",
                            "timer_gettime64",
                            "timer_settime",
                            "timer_settime64",
                            "timerfd_create",
                            "timerfd_gettime",
                            "timerfd_gettime64",
                            "timerfd_settime",
                            "timerfd_settime64",
                            "times",
                            "tkill",
                            "truncate",
                            "truncate64",
                            "ugetrlimit",
                            "umask",
                            "uname",
                            "unlink",
                            "unlinkat",
                            "utime",
                            "utimensat",
                            "utimensat_time64",
                            "utimes",
                            "vfork",
                            "vmsplice",
                            "wait4",
                            "waitid",
                            "waitpid",
                            "write",
                            "writev"
                        ],
                        "action": "SCMP_ACT_ALLOW"
                    },
                    {
                        "names": [
                            "socket"
                        ],
                        "action": "SCMP_ACT_ALLOW",
                        "args": [
                            {
                                "index": 0,
                                "value": 40,
                                "op": "SCMP_CMP_NE"
                            }
                        ]
                    },
                    {
                        "names": [
                            "personality"
                        ],
                        "action": "SCMP_ACT_ALLOW",
                        "args": [
                            {
                                "index": 0,
                                "value": 0,
                                "op": "SCMP_CMP_EQ"
                            }
                        ]
                    },
                    {
                        "names": [
                            "personality"
                        ],
                        "action": "SCMP_ACT_ALLOW",
                        "args": [
                            {
                                "index": 0,
                                "value": 8,
                                "op": "SCMP_CMP_EQ"
                            }
                        ]
                    },
                    {
                        "names": [
                            "personality"
                        ],
                        "action": "SCMP_ACT_ALLOW",
                        "args": [
                            {
                                "index": 0,
                                "value": 131072,
                                "op": "SCMP_CMP_EQ"
                            }
                        ]
                    },
                    {
                        "names": [
                            "personality"
                        ],
                        "action": "SCMP_ACT_ALLOW",
                        "args": [
                            {
                                "index": 0,
                                "value": 131080,
                                "op": "SCMP_CMP_EQ"
                            }
                        ]
                    },
                    {
                        "names": [
                            "personality"
                        ],
                        "action": "SCMP_ACT_ALLOW",
                        "args": [
                            {
                                "index": 0,
                                "value": 4294967295,
                                "op": "SCMP_CMP_EQ"
                            }
                        ]
                    },
                    {
                        "names": [
                            "process_vm_readv",
                            "process_vm_writev",
                            "ptrace"
                        ],
                        "action": "SCMP_ACT_ALLOW"
                    },
                    {
                        "names": [
                            "arch_prctl",
                            "modify_ldt"
                        ],
                        "action": "SCMP_ACT_ALLOW"
                    },
                    {
                        "names": [
                            "chroot"
                        ],
                        "action": "SCMP_ACT_ALLOW"
                    },
                    {
                        "names": [
                            "clone"
                        ],
                        "action": "SCMP_ACT_ALLOW",
                        "args": [
                            {
                                "index": 0,
                                "value": 2114060288,
                                "op": "SCMP_CMP_MASKED_EQ"
                            }
                        ]
                    },
                    {
                        "names": [
                            "clone3"
                        ],
                        "action": "SCMP_ACT_ERRNO",
                        "errnoRet": 38
                    }
                ]
            },
            "maskedPaths": [
                "/proc/acpi",
                "/proc/asound",
                "/proc/kcore",
                "/proc/keys",
                "/proc/latency_stats",
                "/proc/timer_list",
                "/proc/timer_stats",
                "/proc/sched_debug",
                "/sys/firmware",
                "/proc/scsi"
            ],
            "readonlyPaths": [
                "/proc/bus",
                "/proc/fs",
                "/proc/irq",
                "/proc/sys",
                "/proc/sysrq-trigger"
            ]
        }
    }
}

Nerdctl regression with kubernetes module

It looks like the kubernetes module takes over the /etc/cni/net.d directory and makes it read-only, since it needs to configure flannel, a layer 3 network fabric for k8s. See: https://github.com/NixOS/nixpkgs/blob/53bbb203e013e8fbbcddd9f205e73674475f129a/nixos/modules/services/cluster/kubernetes/kubelet.nix#L250

This stops nerdctl from working properly, since it wants to write /etc/cni/net.d/nerdctl-bridge.conflist.

Our options:

  1. See if we can make kubernetes not take over the whole directory to configure the flannel config
  2. Disable flannel and not take over the directory at all
  3. Statically create the nerdctl cni config via NixOS module
  4. Flag / config to change nerdctl cni config location

Preparing for public release

We should update this list as we think of more things to do:

  • Static-analysis & linting
  • Test coverage
  • Godocs badge
  • Go report card
  • Unit tests
  • Integration tests
  • Docs on usage with containerd standalone
  • Docs on usage with k8s
  • Package nix-snapshotter in nixpkgs

Write unit tests for the actual snapshotter

Depends on #10

Since the snapshotter is mostly stateless (besides creating directories and metadata), and its outputs are just in-memory Go []mount.Mount structs, we should be able to write good unit tests that don't actually mount anything. (We do want integration tests in the future to check that the mount.Mount structs are valid options for the mount syscall, though.)
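A minimal sketch of what such a test could look like, comparing in-memory structs without ever calling mount. The Mount type below only mirrors the fields of containerd's mount.Mount (Type, Source, Target, Options) so the example compiles on its own; bindMounts and checkMounts are hypothetical stand-ins, not functions from this repository.

```go
package main

import (
	"fmt"
	"reflect"
)

// Mount mirrors the fields of containerd's mount.Mount. It is redeclared
// here only so the sketch builds without the containerd module.
type Mount struct {
	Type    string
	Source  string
	Target  string
	Options []string
}

// bindMounts is a hypothetical stand-in for the snapshotter logic under
// test: it returns one read-only bind mount per nix store path.
func bindMounts(storePaths []string) []Mount {
	mounts := make([]Mount, 0, len(storePaths))
	for _, p := range storePaths {
		mounts = append(mounts, Mount{
			Type:    "bind",
			Source:  p,
			Target:  p,
			Options: []string{"ro", "rbind"},
		})
	}
	return mounts
}

// checkMounts compares two mount lists purely in memory; no mount syscall
// is involved.
func checkMounts(got, want []Mount) error {
	if !reflect.DeepEqual(got, want) {
		return fmt.Errorf("mounts mismatch:\n got: %+v\nwant: %+v", got, want)
	}
	return nil
}

func main() {
	// The store path is a made-up example.
	got := bindMounts([]string{"/nix/store/example-hello"})
	want := []Mount{{
		Type:    "bind",
		Source:  "/nix/store/example-hello",
		Target:  "/nix/store/example-hello",
		Options: []string{"ro", "rbind"},
	}}
	if err := checkMounts(got, want); err != nil {
		fmt.Println("FAIL:", err)
		return
	}
	fmt.Println("ok")
}
```

In a real test the comparison would sit in a table-driven func TestXxx(t *testing.T) and run against the snapshotter's actual output.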

See these snapshotter tests for inspiration:

Add automated testing for examples in README

To make sure the examples in the README are well maintained, we should add automated testing for the examples and installation instructions. There are two approaches we can take:

  1. Generate README with examples in individual nix files that we test
  2. Extract examples out of README for testing

I think it's probably cleaner to do (2), so that the README stays easy to maintain with no generation step. Let's find out what solutions there are for (2) and whether any nix project does something similar.

For testing against home-manager, we likely want a development flake, as we don't want to introduce home-manager as a top-level dependency.

Remove WithNixStoreDir to allow builder to decide how to handle paths

Now that we have refactored out a WithNixBuilder, it's becoming increasingly obvious that we should pass the full nix store path to the builder and let it decide how to handle that path. For example, if you have multiple nix store dirs, you want to run a single nix-snapshotter rather than a separate kubernetes + nix-snapshotter for each nix store dir, or adding a proxy (too complicated).

Instead, the much simpler solution is to pass the nix store path directly to the builder, allowing it to choose a nix binary built for another nix store dir.
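As a rough illustration of the idea, a builder that receives the full store path could derive the store dir from it and pick a matching nix binary. Everything in this sketch is hypothetical: the nixBinaries table, its paths, and nixBinaryFor are invented for the example and are not part of the nix-snapshotter API. It also assumes the path is a top-level store path, not a file nested inside one.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// nixBinaries maps a nix store dir to a nix binary built for that store.
// Both entries are made-up examples.
var nixBinaries = map[string]string{
	"/nix/store":     "/run/current-system/sw/bin/nix",
	"/opt/nix/store": "/opt/nix/bin/nix",
}

// nixBinaryFor takes a full, top-level nix store path and returns the nix
// binary configured for the store dir that contains it.
func nixBinaryFor(nixStorePath string) (string, error) {
	storeDir := filepath.Dir(filepath.Clean(nixStorePath))
	bin, ok := nixBinaries[storeDir]
	if !ok {
		return "", fmt.Errorf("no nix binary configured for store dir %q", storeDir)
	}
	return bin, nil
}

func main() {
	for _, p := range []string{
		"/nix/store/example-hello",
		"/opt/nix/store/example-hello",
		"/srv/store/example-hello",
	} {
		bin, err := nixBinaryFor(p)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%s -> %s\n", p, bin)
	}
}
```

With this shape, a single nix-snapshotter process can serve several store dirs, because the choice of nix binary is made per path rather than per process.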

This will change the image specification as well, so we need to re-push hinshun/hello:nix until we have kubernetes working with loaded images.

Add NixOS vm with nix-snapshotter systemd service

Spinning off a new issue from: #6

Another avenue to make nix-snapshotter easy to consume is to use NixOS's great support for qemu VMs, which will let someone quickly try out a NixOS VM with containerd + nix-snapshotter configured with the root nix store.

We'll need to write a few modules to set up containerd and nix-snapshotter as systemd services, then expose the modules as nixosModules.default as a flake output.

Write unit tests for nix2container.Build

Technically speaking, the code in pkg/nix2container/build.go doesn't necessarily need nix store paths, but just arbitrary paths. This means we can write effective unit tests without involving nix.

Enable rootless containerd with nix-snapshotter

We need to investigate how to make nix-snapshotter easier to try out. Not everyone is familiar with configuring containerd and running services as root, so there should be a better way.

For example, we can investigate rootless containers.

Another option is #19, which exposes nix-snapshotter as NixOS modules usable with either a NixOS VM or NixOS directly.

Provide flake-compat default.nix for non-flake users

Not everyone uses flakes, so it would be nice for non-flake users to be able to use nix-snapshotter. Normally this is done by using flake-compat in default.nix, so we'll need to move the existing default.nix somewhere else.

Originally the derivation was kept in default.nix so that people could at least use the package, but non-flake use cases also cover NixOS modules and home-manager, so we should just move the nix-snapshotter derivation into flake-parts and let flake-compat provide the outputs.

`nerdctl` fails to run a minimal container

Consider the following image:

{ writeShellScript, runtimeShell, nix-snapshotter }:
let
  hello-world = writeShellScript "hello-world"
  ''#!{runtimeShell}
   echo "Hello, world!"
  '';
in
nix-snapshotter.buildImage {
  name = "repro";
  resolvedByNix = true;
  config.entrypoint = [ hello-world ];
}

Built and loaded by this procedure:

$ drv=$(nix eval --raw --apply 'p: let nix-snapshotter = (builtins.getFlake "github:pdtpartners/nix-snapshotter/6eb21bd"); pkgs = import (builtins.getFlake "nixpkgs/f292b49") { overlays = [ nix-snapshotter.overlays.default ]; }; container = (pkgs.callPackage p {}).copyToContainerd {}; in container.drvPath' --file ./repro.nix)
$ nix build "$drv^out"
$ sudo result/bin/copy-to-containerd

When this container is run with nerdctl run nix:0/nix/store/<path>:latest, an error occurs:

FATA[0002] failed to mount {Type:bind Source:/nix/store/c2lyvs0iz8b3l6ijk1f9fz8ma8khxcxm-hello-world Target:/nix/store/c2lyvs0iz8b3l6ijk1f9fz8ma8khxcxm-hello-world Options:[ro rbind]} on "/tmp/initialC595249715": no such file or directory

This issue seems to be related to how the Nix closure layer is generated. Directory mountpoints are generated correctly, but file mountpoints don't appear in the final image.

if !fi.IsDir() {
	relStorePath = filepath.Dir(relStorePath)
}
err = os.MkdirAll(relStorePath, 0o755)

Investigate integration with Docker Engine / Desktop

As of Docker Desktop 4.12.0, the Docker Engine has been slowly replacing its internals with containerd, and now there is experimental support to use containerd snapshotters for image storage. It may be possible to hook up Docker Engine / Desktop with nix-snapshotter so we can docker run --rm ghcr.io/pdtpartners/hello.

Switch flake.nix over to flake-parts

Flake-parts lets you compose your flake outputs using the NixOS module system. This is not to be confused with using modules to define a NixOS system: the module system can be used outside of NixOS, and is useful here for its type checking and composability.

This will help organize our different package outputs, rootless configuration, and nixos modules for the NixOS vm.

Migrate rootless scripts to flake apps

Ideally, we can run a rootless stack with just nix run .#rootless. We should investigate systemd user services to see if we can have a single entrypoint to run several services (rootless-containerd, nix-snapshotter with fuse-overlayfs).

The ideal UX is that pressing Ctrl+C on the process started by nix run .#rootless tears down all the services. All state and root directories should probably live in a gitignored tmp directory in this repository, so that everything is contained within the repository.
