nomad-driver-containerd


We are actively looking for contributors and maintainers for this project. If you have experience with container internals (e.g. cgroups, namespaces), have contributed to open source container projects (e.g. docker, containerd, nerdctl, podman), or have built tooling that deals with container internals, and are interested in contributing to this project, I would love to talk to you! Golang experience is preferred but not required.

Please reach out to me at @_shishir_m, or open an issue in this repository with your contact details, if you are interested in contributing to this project.

Overview

Nomad task driver for launching containers using containerd.

containerd (containerd.io) is a lightweight container daemon for running and managing the container lifecycle.
The Docker daemon itself delegates to containerd:

dockerd (docker daemon) --> containerd --> containerd-shim --> runc

nomad-driver-containerd enables a Nomad client to launch containers directly through containerd, without Docker!
The Docker daemon is not required on the host system.

(Diagram: nomad-driver-containerd architecture)

Requirements

Building nomad-driver-containerd

Make sure your $GOPATH is set up correctly.

$ mkdir -p $GOPATH/src/github.com/Roblox
$ cd $GOPATH/src/github.com/Roblox
$ git clone [email protected]:Roblox/nomad-driver-containerd.git
$ cd nomad-driver-containerd
$ make build    # builds the containerd-driver binary

If you want to compile for arm64, you can run:

make -f Makefile.arm64

Screencast

(asciinema screencast)

Wanna try it out!?

$ vagrant up

or vagrant provision if the vagrant VM is already running.

Once setup (vagrant up OR vagrant provision) is complete and the Nomad server is up and running, you can check the registered task drivers (which will include containerd-driver) using:

$ nomad node status (Note down the <node_id>)
$ nomad node status <node_id> | grep containerd-driver

NOTE: setup.sh is part of the vagrant setup and should not be executed directly.

Run example jobs

There are a few example jobs in the example directory.

$ nomad job run <job_name.nomad>

will launch the job.

More detailed instructions are in the example README.md.

To interact with images and containers directly, you can use nerdctl, a Docker-compatible CLI for containerd. nerdctl is already installed in the vagrant VM at /usr/local/bin.
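For example, to list the containers and images managed under the driver's nomad containerd namespace (a quick sketch using standard nerdctl flags; the namespace name also appears in the ctr workaround later in this document):

$ sudo nerdctl --namespace nomad ps
$ sudo nerdctl --namespace nomad images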

Supported options

Driver Config

| Option | Type | Required | Default | Description |
|---|---|---|---|---|
| enabled | bool | no | true | Enable/disable the task driver. |
| containerd_runtime | string | yes | N/A | Runtime for containerd, e.g. io.containerd.runc.v1 or io.containerd.runc.v2. |
| stats_interval | string | no | 1s | Interval for collecting TaskStats. |
| allow_privileged | bool | no | true | If set to false, the driver will deny running privileged jobs. |
| auth | block | no | N/A | Authentication for a private registry. See Authentication for more details. |
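For reference, a plugin stanza combining the options above might look like this (a sketch; the values are illustrative):

plugin "containerd-driver" {
  config {
    enabled            = true
    containerd_runtime = "io.containerd.runc.v2"
    stats_interval     = "5s"
    allow_privileged   = false
  }
}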

Task Config

| Option | Type | Required | Description |
|---|---|---|---|
| image | string | yes | OCI image (Docker images are also OCI-compatible) for your container. |
| image_pull_timeout | string | no | How long containerd-driver will wait before cancelling an in-progress pull of the image specified in image. Defaults to "5m". |
| command | string | no | Command to override the command defined in the image. |
| args | []string | no | Arguments to the command. |
| entrypoint | []string | no | A string list overriding the image's entrypoint. |
| cwd | string | no | Working directory for the container process. If the directory does not exist, it will be created for you. |
| privileged | bool | no | Run the container in privileged mode. The container will have all Linux capabilities when running in privileged mode. |
| pids_limit | int64 | no | PID limit for the container. Defaults to unlimited. |
| pid_mode | string | no | host or unset (default). Set to host to share the PID namespace with the host. |
| hostname | string | no | Hostname to assign to the container. When launching more than one instance of a task (using count) with this option set, every container the task starts will have the same hostname. |
| host_dns | bool | no | Defaults to true: the container uses the host's /etc/resolv.conf, similar to Docker's behavior. Set host_dns=false to disable. |
| seccomp | bool | no | Enable the default seccomp profile (a list of allowed syscalls). |
| seccomp_profile | string | no | Path to a custom seccomp profile. seccomp must be set to true in order to use seccomp_profile. The default Docker seccomp profile can be used as a reference and modified to create a custom profile. |
| shm_size | string | no | Size of /dev/shm, e.g. "128M" for 128 MB of /dev/shm. |
| sysctl | map[string]string | no | A key-value map of sysctls to set on the containers on start. |
| readonly_rootfs | bool | no | Make the container root filesystem read-only. |
| host_network | bool | no | Enable host networking. Equivalent to --net=host in Docker. |
| extra_hosts | []string | no | A list of hosts, given as host:IP, to be added to /etc/hosts. |
| cap_add | []string | no | Add individual capabilities. |
| cap_drop | []string | no | Drop individual capabilities. |
| devices | []string | no | A list of devices to be exposed to the container. |
| auth | block | no | Authentication for a private registry. See Authentication for more details. |
| mounts | []block | no | A list of mounts for the container. volume, bind and tmpfs mount types are supported, with fstab-style mount options. |
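As a quick illustration, a minimal task stanza combining a few of the options above might look like this (a sketch; the image and values are illustrative):

task "redis" {
  driver = "containerd-driver"

  config {
    image           = "docker.io/library/redis:alpine"
    cwd             = "/home/redis"
    seccomp         = true
    readonly_rootfs = true
    pids_limit      = 256
  }
}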

Mount block
  {
   - type (string) (Optional): Supported values are volume, bind or tmpfs. Default: volume.
   - target (string) (Required): Target path in the container.
   - source (string) (Optional): Source path on the host.
   - options ([]string) (Optional): fstab-style mount options. NOTE: For bind mounts, at least rbind and ro are required.
  }

Bind mount example

mounts = [
           {
                type    = "bind"
                target  = "/target/t1"
                source  = "/src/s1"
                options = ["rbind", "ro"]
           }
        ]

In addition to the mounts option in Task Config, you can also mount volumes into the container using the Nomad volume_mount stanza.

See example job for volume_mount.
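For example, assuming a host volume named s1 is configured on the client (as in the agent.hcl shown later in this document), the wiring might look like this sketch:

group "example-group" {
  volume "s1" {
    type      = "host"
    source    = "s1"
    read_only = false
  }

  task "example-task" {
    driver = "containerd-driver"

    volume_mount {
      volume      = "s1"
      destination = "/data"
      read_only   = false
    }
  }
}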

Custom seccomp profile example

The default docker seccomp profile found here can be downloaded and modified (by removing/adding syscalls) to create a custom seccomp profile.
The custom seccomp profile can then be saved under /opt/seccomp/seccomp.json on the Nomad client nodes.

A nomad job can be launched using this custom seccomp profile.

config {
	seccomp         = true
	seccomp_profile = "/opt/seccomp/seccomp.json"
}

Sysctl example

config {
  sysctl = {
    "net.core.somaxconn"  = "16384"
    "net.ipv4.ip_forward" = "1"
  }
}

Authentication (Private registry)

The auth stanza lets you set credentials for your private registry, e.g. if you want to pull an image from a private repository on Docker Hub.
The auth stanza can be set in Driver Config, in Task Config, or in both.
If set in both places, the Task Config auth takes precedence over the Driver Config auth.

NOTE: In the example below, user and pass are placeholder values that need to be replaced with the actual username and password. The same auth stanza can be used in both Driver Config and Task Config.

auth {
    username = "user"
    password = "pass"
}

Networking

nomad-driver-containerd supports host and bridge networks.

NOTE: host and bridge are mutually exclusive options, and only one of them should be used at a time.

  1. Host network can be enabled by setting host_network to true in task config of the job spec (see under Supported options).

  2. Bridge network can be enabled by setting the network stanza in the task group section of the job spec.

network {
  mode = "bridge"
}

You need to install CNI plugins on Nomad client nodes under /opt/cni/bin before you can use bridge networks.

Instructions for installing CNI plugins.

 $ curl -L -o cni-plugins.tgz https://github.com/containernetworking/plugins/releases/download/v0.8.6/cni-plugins-linux-amd64-v0.8.6.tgz
 $ sudo mkdir -p /opt/cni/bin
 $ sudo tar -C /opt/cni/bin -xzf cni-plugins.tgz

Also, ensure your Linux operating system distribution has been configured to allow container traffic through the bridge network to be routed via iptables. These tunables can be set as follows:

$ echo 1 > /proc/sys/net/bridge/bridge-nf-call-arptables
$ echo 1 > /proc/sys/net/bridge/bridge-nf-call-ip6tables
$ echo 1 > /proc/sys/net/bridge/bridge-nf-call-iptables

To preserve these settings on startup of a Nomad client node, add a file containing the lines below to /etc/sysctl.d/, or remove the file your Linux distribution puts in that directory.

net.bridge.bridge-nf-call-arptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1

Port forwarding

Nomad supports both static and dynamic port mapping.

  1. Static ports

Static port mapping can be added in the network stanza.

network {
  mode = "bridge"
  port "lb" {
    static = 8889
    to     = 8889
  }
}

Here, host port 8889 is mapped to container port 8889.
NOTE: static ports are usually not recommended, except for system or specialized jobs like load balancers.

  2. Dynamic ports

Dynamic port mapping is also enabled in the network stanza.

network {
  mode = "bridge"
  port "http" {
    to = 8080
  }
}

Here, Nomad will allocate a dynamic port on the host and map it to port 8080 in the container.
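Inside the task, the allocated ports are exposed through Nomad's standard environment variables; a quick sketch (variable names follow Nomad's NOMAD_PORT_<label> / NOMAD_HOST_PORT_<label> convention):

$ echo $NOMAD_PORT_http        # the container-side port (8080 here, since "to" is set)
$ echo $NOMAD_HOST_PORT_http   # the dynamically allocated host port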

You can also read more about the network stanza in the official Nomad documentation.

Service discovery

Nomad schedules workloads of various types across a cluster of generic hosts. Because of this, placement is not known in advance and you will need to use service discovery to connect tasks to other services deployed across your cluster. Nomad integrates with Consul to provide service discovery and monitoring.

A service stanza can be added to your job spec to enable service discovery.

The service stanza instructs Nomad to register a service with Consul.
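As a sketch (the names are illustrative), a service stanza for the dynamic port example above might look like:

service {
  name = "http-service"
  port = "http"

  check {
    type     = "tcp"
    interval = "10s"
    timeout  = "2s"
  }
}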

Tests

If you are running the tests locally, use the vagrant VM provided in the repository.

$ vagrant up
$ vagrant ssh containerd-linux
$ sudo make test

NOTE: These are destructive tests and can leave the system in a changed state.
It is highly recommended to run these tests either as part of a CI/CD system, e.g. CircleCI, or on immutable infrastructure, e.g. Vagrant VMs.

You can also run an individual test by specifying the test name, e.g.:

$ cd tests
$ sudo ./run_tests.sh 001-test-redis.sh

Cleanup

make clean

This will delete the containerd-driver binary.

vagrant destroy

This will destroy your vagrant VM.

Currently supported environments

Ubuntu (>= 16.04)

License

Copyright 2020 Roblox Corporation

Licensed under the Apache License, Version 2.0 (the "License"). For more information read the License.


nomad-driver-containerd's Issues

[feature request] Extra hosts in the /etc/hosts

Hi,

Nomad recently fixed a long-standing issue where bridge-network Docker workloads were not able to add extra entries to /etc/hosts.

hashicorp/nomad#10766

I wonder if the containerd driver could support this?

Some applications in our case depend on specific /etc/hosts entries, and we have to do a lot of hacks to work around it.

containerd driver doesn't support passing args without command

In the docker driver, args are optional and can be passed either with or without command:
https://www.nomadproject.io/docs/drivers/docker#args

Without command, they are presumably passed to the entrypoint script.

The containerd driver produces an error if args are passed without command:

rpc error: code = Unknown desc = Error in creating container: Command is empty. Cannot set --args without --command.

This is an incompatibility with the docker driver and could block migration.
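For illustration, a config of this shape (the image is just an example) is what currently triggers the error:

config {
  image = "docker.io/library/redis:alpine"
  # no "command" set
  args  = ["--maxmemory", "64mb"]
}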

Do you think it makes sense to make args behave in the same way as in docker driver?

inline seccomp_profile

It's kind of a nightmare having to first deploy (and redeploy whenever it changes) a seccomp_profile for every container onto the filesystem of each nomad client node.

Considering it's just JSON, it would be amazing if we could supply it inline with the task specification, similar to how capabilities currently work. This is not currently possible, right?

[feature request] plugin configuration level privileged mode

Hello!

Currently, it's possible to set the privileged mode in the Nomad job definition via:

config {
  privileged = true
}

I think it's a security risk. What do you think about making this a plugin-level configuration that will prevent such job configurations?

So it will be:

plugin "containerd-driver" {
  config {
    enabled = true
    containerd_runtime = "io.containerd.runc.v2"
    stats_interval = "5s"
    privileged = false
  }
}

Running with Nomad inside containerd

I'm interested in supporting this driver within the ResinStack distribution that I have developed for a more readily deployable version of the nomad ecosystem. In this environment I have nomad itself running as a containerd task, and I'm trying to work out either what needs to be mounted in, or whether I can change the mount paths. Right now I'm stuck on this error and would appreciate advice:

2022-01-27T22:10:45-06:00  Driver Failure  rpc error: code = Unknown desc = Error in creating container: failed to mount /tmp/containerd-mount2802059906: no such file or directory

/tmp from the host is available to the container, so I'm not really sure what's wrong here.

How to use template stanza

I reconfigured tasks from the docker driver to the containerd driver without any big problems, except one: how do I configure the mount options when using a template stanza for configuration? E.g.:

task "mosquitto" {
  driver = "docker"
  config {
    image = "docker.io/eclipse-mosquitto:2"
    ports = ["mqtt", "wss"]
    volumes = [
     "local/mosquitto.conf:/mosquitto/config/mosquitto.conf",
      "/srv/nomad/mosquitto:/mosquitto/data",
    ]
  }
  template {
      destination = "local/mosquitto.conf"
      data = <<EOF
bind_address 0.0.0.0
allow_anonymous true
persistence true
persistence_location /mosquitto/data/
log_dest stdout
EOF
  }
}
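An untested sketch of a containerd-driver equivalent, using the Mount block documented earlier (whether a relative local/ source resolves depends on the driver version; see the later issue about mount sources relative to the task working directory):

task "mosquitto" {
  driver = "containerd-driver"
  config {
    image = "docker.io/eclipse-mosquitto:2"
    mounts = [
      {
        type    = "bind"
        target  = "/mosquitto/config/mosquitto.conf"
        source  = "local/mosquitto.conf"
        options = ["rbind", "ro"]
      }
    ]
  }
  # template stanza unchanged
}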

Default docker registry is not set

Setting the image config to:

    task "redis" {
      driver = "containerd-driver"
      config {
        image = "redis:3.2"
      }
    }

Works with the docker driver but fails with containerd with the error:

rpc error: code = Unknown desc = Error in pulling image redis:3.2: failed to resolve reference "redis:3.2": parse "dummy://redis:3.2": invalid port ":3.2" after host

So it looks like the default docker registry is not set. Changing the config as follows fixes the issue:

    task "redis" {
      driver = "containerd-driver"
      config {
        image = "docker.io/library/redis:3.2"
      }
    }

But it feels like an inconsistency and will break the Nomad jobs of people migrating from the docker driver to the containerd driver.

What do you think about setting the default docker registry host?

Logging doesn't work

Currently, if I launch a job (nomad job run redis.nomad) using nomad-driver-containerd and try to tail its logs (stdout/stderr) using the nomad alloc logs command, it doesn't show anything.

nomad alloc logs -f -stderr -job redis

Will show nothing.

As a workaround, I have been using ctr (the containerd command-line tool) to check the logs.

$ export CONTAINERD_NAMESPACE=nomad
$ ctr task ls (This will get you the container name, which is prefixed with the allocation ID)
$ ctr task attach <container_name> (This should tail the logs)

[feature request] windows support

We have several clients that are using nomad in combination with docker. We would like to move them from docker to containerd, though most of their servers are Windows machines.

We are already using the awesome nomad-driver-iis and it works like a charm. Is there any way, or any plan, for this driver to support installation on Windows servers?

thank you

Taking over maintainership

Hi, I am interested in moving this project forward.

I have experience with Nomad, nomad-pack, Kubernetes, Docker (though cannot say that I contributed a lot to these projects). I already maintain several FOSS projects (see my GitHub profile page). I have plans to use nomad-driver-containerd for my daily job, but it is not good enough yet in its current form.

Release 0.9.4

Would it be possible to make a new release with support for cgroups v2 included?

It is blocking the upgrade from Ubuntu 20.04 to Ubuntu 22.04, since the newer Ubuntu release comes with cgroups v2 enabled by default.

Cannot launch task: stdout.fifo and stderr.fifo already closed

Hello,

I'm trying to get this driver to work with a sample Go program that just listens on an HTTP port and prints a message. The task won't launch, and looking through the logs I see:

Feb 11 12:06:22 nomad-client nomad[74354]:     2022-02-11T12:06:22.287-0600 [WARN]  client.alloc_runner.task_runner.task_hook.logmon.nomad: failed to read from log fifo: alloc_id=2e96234e-24b1-8a05-77f2-6e6620986232 task=c-hello @module=logmon error="read /opt/nomad/alloc/2e96234e-24b1-8a05-77f2-6e6620986232/alloc/logs/.c-hello.stdout.fifo: file already closed" timestamp=2022-02-11T12:06:22.286-0600

Feb 11 12:06:22 nomad-client nomad[74354]:     2022-02-11T12:06:22.287-0600 [WARN]  client.alloc_runner.task_runner.task_hook.logmon.nomad: failed to read from log fifo: alloc_id=2e96234e-24b1-8a05-77f2-6e6620986232 task=c-hello @module=logmon error="read /opt/nomad/alloc/2e96234e-24b1-8a05-77f2-6e6620986232/alloc/logs/.c-hello.stderr.fifo: file already closed" timestamp=2022-02-11T12:06:22.286-0600

Feb 11 12:06:22 nomad-client nomad[74354]:     2022-02-11T12:06:22.294-0600 [DEBUG] client.alloc_runner.task_runner.task_hook.logmon.stdio: received EOF, stopping recv loop: alloc_id=2e96234e-24b1-8a05-77f2-6e6620986232 task=c-hello err="rpc error: code = Unavailable desc = error reading from server: EOF"
Feb 11 12:06:22 nomad-client nomad[74354]:     2022-02-11T12:06:22.296-0600 [DEBUG] client.alloc_runner.task_runner.task_hook.logmon: plugin process exited: alloc_id=2e96234e-24b1-8a05-77f2-6e6620986232 task=c-hello path=/usr/local/bin/nomad pid=74844

Feb 11 12:06:22 nomad-client nomad[74354]:     2022-02-11T12:06:22.296-0600 [DEBUG] client.alloc_runner.task_runner.task_hook.logmon: plugin exited: alloc_id=2e96234e-24b1-8a05-77f2-6e6620986232 task=c-hello

Feb 11 12:06:22 nomad-client nomad[74354]:     2022-02-11T12:06:22.296-0600 [DEBUG] client.alloc_runner.task_runner: task run loop exiting: alloc_id=2e96234e-24b1-8a05-77f2-6e6620986232 task=c-hello

I have verified that the image runs using nerdctl. It also runs using the Nomad docker and podman task drivers.
I was able to launch the redis example using the driver, so I feel like the driver is generally working. Any help or pointers would be greatly appreciated.

Details:

  • Nomad v1.2.5
  • Version 0.9.3 of this driver
  • Host: Debian 5.10.92-1 arm64 running in a Vagrant box on an Apple silicon Mac

Job File:

job "containerd" {
  datacenters = ["dc1"]

  group "c-service" {
    network {
      port "http" {
        to = 8080
      }
    }
    service {
      name = "c-service"
      tags = ["urlprefix-/"]
      port = "http"

      check {
        type = "http"
        path = "/health"
        interval = "2s"
        timeout  = "2s"
      }
    }


    task "c-hello" {
      driver = "containerd-driver"

      config {
        image = "docker.io/michaelerickson/go-hello-docker:latest"
        host_network = true
        # ports = ["web"]
      }

      resources {
        cpu    = 500
        memory = 256
      }
    }
  }
}

The code for the service I'm trying to launch (Go 1.17):

package main

import (
	"encoding/json"
	"fmt"
	"log"
	"net"
	"net/http"
	"os"

	"github.com/gorilla/mux"
)

// serviceStatus represents the health of our service
type serviceStatus struct {
	Status string
}

// loggingMiddleware logs all requests to our service
func loggingMiddleware(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		log.Printf("%s %s", r.Method, r.RequestURI)
		next.ServeHTTP(w, r)
	})
}

// notAllowedHandler is called for all requests that are not specifically
// handled. It returns HTTP not allowed
func notAllowedHandler(w http.ResponseWriter, r *http.Request) {
	log.Printf("%s %s method not allowed", r.Method, r.RequestURI)
	http.Error(w, "Not Allowed", http.StatusMethodNotAllowed)
}

// healthCheckHandler responds to /health and verifies that the service is up
func healthCheckHandler(w http.ResponseWriter, _ *http.Request) {
	status := serviceStatus{Status: "OK"}
	response, err := json.Marshal(status)
	if err != nil {
		log.Printf("JSON error: %s", err)
		http.Error(w, "JSON error", http.StatusInternalServerError)
		return
	}
	w.Header().Set("Content-Type", "application/json")
	w.WriteHeader(http.StatusOK)
	w.Write(response)
}

// rootHandler responds to /
func rootHandler(w http.ResponseWriter, r *http.Request) {
	ctx := r.Context()
	srvAddr := ctx.Value(http.LocalAddrContextKey).(net.Addr)
	response := fmt.Sprintf("Hello, Docker! from: %s\n", srvAddr)
	w.Write([]byte(response))
}

func main() {
	httpPort := os.Getenv("HTTP_PORT")
	if httpPort == "" {
		httpPort = "8080"
	}

	log.Printf("Starting echo service on %s", httpPort)

	r := mux.NewRouter()

	r.HandleFunc("/health", healthCheckHandler)
	r.HandleFunc("/", rootHandler)
	r.Use(loggingMiddleware)

	log.Fatal(http.ListenAndServe(":"+httpPort, r))
}

The dockerfile that builds the image:

# syntax=docker/dockerfile:1

# Multistage build to generate the smallest possible runtime image.

##
## BUILD
##
FROM golang:1.17.6-bullseye AS build

WORKDIR /app

COPY go.mod ./
COPY go.sum ./

RUN go mod download

COPY *.go ./

# Build for linux-arm64
RUN CGO_ENABLED=0 GOOS=linux GOARCH=arm64 go build -o /docker-gs-ping

##
## Deploy
##
FROM gcr.io/distroless/static

COPY --from=build /docker-gs-ping /docker-gs-ping

EXPOSE 8080

USER nonroot:nonroot

ENTRYPOINT ["/docker-gs-ping"]

Allow mount source to be relative to task working directory

Per hashicorp/nomad#13229 (comment) and #123, we can mount files from ${NOMAD_TASK_DIR}/file with the relative path local/file, but not from ${NOMAD_TASK_DIR}/secrets. I think this should be allowed, both for consistent behavior and as a convenience for containers whose config-lookup behavior is not easy to change.

For reference, here's how the docker driver does this for volume binds and simple binds:
https://github.com/hashicorp/nomad/blob/2697e63ad67c254d0d8f1be02a477807fe40c50a/drivers/docker/driver.go#L678-L689
https://github.com/hashicorp/nomad/blob/2697e63ad67c254d0d8f1be02a477807fe40c50a/drivers/docker/driver.go#L1253-L1263
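To illustrate the asymmetry, a hypothetical mounts config (the paths are examples):

mounts = [
  {
    type    = "bind"
    target  = "/app/config"
    source  = "local/config"     # relative to the task dir: works, per #123
    options = ["rbind", "ro"]
  },
  {
    type    = "bind"
    target  = "/app/secret"
    source  = "secrets/secret"   # currently rejected; this issue asks to allow it
    options = ["rbind", "ro"]
  }
]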

Running with custom containerd snapshotter

So, I am trying to set up a Nomad-in-Docker dev environment with containerd to run Nomad tasks. Because containerd runs on top of Docker here, the default "overlayfs" snapshotter doesn't work.

With some effort, I have been able to get "fuse-overlayfs" working with containerd inside a docker container. So with some config and env variable updates (namely CONTAINERD_SNAPSHOTTER), I can pull images using ctr and run nerdctl without problems.

Now I want to run containers via nomad -> nomad-driver-containerd -> containerd. With some experimentation, the only way I can get this to work is with the following patch:

--- a/containerd/containerd.go
+++ b/containerd/containerd.go
@@ -109,6 +109,7 @@ func (d *Driver) pullImage(imageName, imagePullTimeout string, auth *RegistryAut
 
 	pullOpts := []containerd.RemoteOpt{
 		containerd.WithPullUnpack,
+		containerd.WithPullSnapshotter("fuse-overlayfs"),
 		withResolver(d.parshAuth(auth)),
 	}
 
@@ -339,6 +340,7 @@ func (d *Driver) createContainer(containerConfig *ContainerConfig, config *TaskC
 	return d.client.NewContainer(
 		ctxWithTimeout,
 		containerConfig.ContainerName,
+		containerd.WithSnapshotter("fuse-overlayfs"),
 		containerd.WithRuntime(d.config.ContainerdRuntime, nil),
 		containerd.WithNewSnapshot(containerConfig.ContainerSnapshotName, containerConfig.Image),
 		containerd.WithNewSpec(opts...),

The image is pulled and the container starts without issues if I use the above version.

So, I have two questions:

  1. Am I missing something, or is this the only way to override the snapshotter setting? (I assumed that setting the default snapshotter in the containerd config would work, but it did not.)
  2. If this is the only way, would you accept a PR making the snapshotter configurable?

stdin and stdout of existing processes are lost after a restart of nomad

Issue

Nomad is successfully able to reattach a job using nomad-driver-containerd after a restart but stdin and stdout are lost in the process.

Versions

nomad-driver-containerd v0.9.1
nomad v1.1.4

How to reproduce

  1. Consider the following changes to the Vagrant agent configuration to have Nomad run in standard mode (not dev mode) so that state is persisted:
$ cat example/agent.hcl 
log_level = "DEBUG"
data_dir = "/tmp/nomad"

plugin "containerd-driver" {
  config {
    enabled = true
    containerd_runtime = "io.containerd.runc.v2"
    stats_interval = "5s"
  }
}

server {
  enabled = true
  bootstrap_expect = 1
  default_scheduler_config {
    scheduler_algorithm = "spread"
    memory_oversubscription_enabled = true

    preemption_config {
      batch_scheduler_enabled   = true
      system_scheduler_enabled  = true
      service_scheduler_enabled = true
    }
  }
}

client {
  enabled = true
  host_volume "s1" {
    path = "/tmp/host_volume/s1"
    read_only = false
  }
}
  2. Run the Hello job:
$ nomad job run example/hello.hcl
  3. See that new log lines are appended every 3 seconds:
$ nomad alloc logs -f -job hello
Hello world: sleeping for 3 seconds.
Hello world: sleeping for 3 seconds.
Hello world: sleeping for 3 seconds.
Hello world: sleeping for 3 seconds.
Hello world: sleeping for 3 seconds.
Hello world: sleeping for 3 seconds.
Hello world: sleeping for 3 seconds.
Hello world: sleeping for 3 seconds.
Hello world: sleeping for 3 seconds.
Hello world: sleeping for 3 seconds.
Hello world: sleeping for 3 seconds.
Hello world: sleeping for 3 seconds.
Hello world: sleeping for 3 seconds.
Hello world: sleeping for 3 seconds.
Hello world: sleeping for 3 seconds.
  4. Restart nomad:
$ sudo systemctl restart nomad
  5. See that logs are no longer appended, despite the loop still running:
$ nomad alloc logs -f -job hello
Hello world: sleeping for 3 seconds.
Hello world: sleeping for 3 seconds.

ps faux from the guest

$ nomad alloc exec -job hello /bin/ps faux
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root       221  0.0  0.1  34424  2752 pts/0    Rs+  23:11   0:00 /bin/ps faux
root         1  0.0  0.1  18040  2956 ?        Ss   23:01   0:00 /bin/bash /tmp/
root       220  0.0  0.0   4376   648 ?        S    23:11   0:00 sleep 3

ps faux from the host

root      9112  0.0  0.4 112244  8636 ?        Sl   23:01   0:00 /usr/local/bin/containerd-shim-runc-v2 -namespace nomad -id hello-task-ea92913f-1951-8117-6f4c-b2fcd28e898b -address /run/containerd/containerd.sock
root      9138  0.0  0.1  18040  2956 ?        Ss   23:01   0:00  \_ /bin/bash /tmp/print.sh
root      9632  0.0  0.0   4376   672 ?        S    23:11   0:00      \_ sleep 3

Nomad still knows about the job

vagrant@vagrant:~/go/src/github.com/Roblox/nomad-driver-containerd$ nomad job status hello
ID            = hello
Name          = hello
Submit Date   = 2021-08-30T23:01:06Z
Type          = service
Priority      = 50
Datacenters   = dc1
Namespace     = default
Status        = running
Periodic      = false
Parameterized = false

Summary
Task Group   Queued  Starting  Running  Failed  Complete  Lost
hello-group  0       0         1        0       1         0

Latest Deployment
ID          = 5bfe4965
Status      = successful
Description = Deployment completed successfully

Deployed
Task Group   Desired  Placed  Healthy  Unhealthy  Progress Deadline
hello-group  1        1       1        0          2021-08-30T23:11:18Z

Allocations
ID        Node ID   Task Group   Version  Desired  Status    Created     Modified
ea92913f  0b680b8d  hello-group  2        run      running   11m58s ago  11m47s ago

Observations

Looking at the open file descriptors of the containerd driver, it seems that the driver is not recovering the stdin and stdout named pipes for running jobs.

Before restart

root      9253  0.0  1.8 1339240 38656 ?       Sl   23:02   0:00  \_ /tmp/nomad-driver-containerd/containerd-driver
vagrant@vagrant:~/go/src/github.com/Roblox/nomad-driver-containerd$ sudo lsof -np 9253                                                                                                                     
COMMAND    PID USER   FD      TYPE             DEVICE SIZE/OFF    NODE NAME                                                                                                                                
container 9253 root  cwd       DIR              253,0     4096       2 /                                                                                                                                   
container 9253 root  rtd       DIR              253,0     4096       2 /                                                                                                                                   
container 9253 root  txt       REG              253,0 43373088 3801138 /tmp/nomad-driver-containerd/containerd-driver                                                                                      
container 9253 root  mem       REG              253,0   101168 2097669 /lib/x86_64-linux-gnu/libresolv-2.27.so                                                                                             
container 9253 root  mem       REG              253,0    26936 2097643 /lib/x86_64-linux-gnu/libnss_dns-2.27.so                                                                                            
container 9253 root  mem       REG              253,0    47568 2097645 /lib/x86_64-linux-gnu/libnss_files-2.27.so                                                                                          
container 9253 root  mem       REG              253,0  2030544 2097578 /lib/x86_64-linux-gnu/libc-2.27.so                                                                                                  
container 9253 root  mem       REG              253,0   144976 2097665 /lib/x86_64-linux-gnu/libpthread-2.27.so                                                                                            
container 9253 root  mem       REG              253,0    14560 2097595 /lib/x86_64-linux-gnu/libdl-2.27.so                                                                                                 
container 9253 root  mem       REG              253,0   170960 2097554 /lib/x86_64-linux-gnu/ld-2.27.so                                                                                                    
container 9253 root    0r      CHR                1,3      0t0       6 /dev/null                                                                                                                           
container 9253 root    1w     FIFO               0,12      0t0   54392 pipe                                                                                                                                
container 9253 root    2w     FIFO               0,12      0t0   54393 pipe                                                                                                                                
container 9253 root    3u     unix 0xffff97c5b45a4400      0t0   54398 type=STREAM                                                                                                                         
container 9253 root    4u  a_inode               0,13        0    9576 [eventpoll]                                                                                                                         
container 9253 root    5r     FIFO               0,12      0t0   54397 pipe                                                                                                                                
container 9253 root    6w     FIFO               0,12      0t0   54397 pipe                                                                                                                                
container 9253 root    7u     unix 0xffff97c5b45a6400      0t0   54401 /tmp/plugin735931456 type=STREAM                                                                                                    
container 9253 root    8r     FIFO               0,12      0t0   54402 pipe                          
container 9253 root    9w     FIFO               0,12      0t0   54402 pipe                          
container 9253 root   10r     FIFO               0,12      0t0   54403 pipe                          
container 9253 root   11w     FIFO               0,12      0t0   54403 pipe                          
container 9253 root   12u     unix 0xffff97c5b45a7c00      0t0   54242 /tmp/plugin735931456 type=STREAM
container 9253 root   15u     FIFO              253,0      0t0 3801166 /tmp/nomad/alloc/20294144-56b6-fc55-e971-02028f7c7e2a/alloc/logs/.hello-task.stdout.fifo                                            
container 9253 root   16u     FIFO              253,0      0t0 3801168 /tmp/nomad/alloc/20294144-56b6-fc55-e971-02028f7c7e2a/alloc/logs/.hello-task.stderr.fifo                                            
container 9253 root   17u     FIFO               0,22      0t0     706 /run/containerd/fifo/739129716/hello-task-20294144-56b6-fc55-e971-02028f7c7e2a-stdout
container 9253 root   18u     FIFO               0,22      0t0     710 /run/containerd/fifo/739129716/hello-task-20294144-56b6-fc55-e971-02028f7c7e2a-stderr
container 9253 root   19r     FIFO               0,22      0t0     706 /run/containerd/fifo/739129716/hello-task-20294144-56b6-fc55-e971-02028f7c7e2a-stdout
container 9253 root   20r     FIFO               0,22      0t0     710 /run/containerd/fifo/739129716/hello-task-20294144-56b6-fc55-e971-02028f7c7e2a-stderr

After restart

root     10218  0.3  1.5 1259156 32552 ?       Sl   23:21   0:00  \_ /tmp/nomad-driver-containerd/containerd-driver
vagrant@vagrant:~/go/src/github.com/Roblox/nomad-driver-containerd$ sudo lsof -np 10218
COMMAND     PID USER   FD      TYPE             DEVICE SIZE/OFF    NODE NAME
container 10218 root  cwd       DIR              253,0     4096       2 /
container 10218 root  rtd       DIR              253,0     4096       2 /
container 10218 root  txt       REG              253,0 43373088 3801138 /tmp/nomad-driver-containerd/containerd-driver
container 10218 root  mem       REG              253,0  2030544 2097578 /lib/x86_64-linux-gnu/libc-2.27.so
container 10218 root  mem       REG              253,0   144976 2097665 /lib/x86_64-linux-gnu/libpthread-2.27.so
container 10218 root  mem       REG              253,0    14560 2097595 /lib/x86_64-linux-gnu/libdl-2.27.so
container 10218 root  mem       REG              253,0   170960 2097554 /lib/x86_64-linux-gnu/ld-2.27.so
container 10218 root    0r      CHR                1,3      0t0       6 /dev/null
container 10218 root    1w     FIFO               0,12      0t0   61256 pipe
container 10218 root    2w     FIFO               0,12      0t0   61257 pipe
container 10218 root    3u     unix 0xffff97c5b16a4400      0t0   61273 type=STREAM
container 10218 root    4u  a_inode               0,13        0    9576 [eventpoll]
container 10218 root    5r     FIFO               0,12      0t0   61272 pipe
container 10218 root    6w     FIFO               0,12      0t0   61272 pipe
container 10218 root    7u     unix 0xffff97c5b16a6c00      0t0   61274 /tmp/plugin654816476 type=STREAM
container 10218 root    8r     FIFO               0,12      0t0   61275 pipe
container 10218 root    9w     FIFO               0,12      0t0   61275 pipe
container 10218 root   10r     FIFO               0,12      0t0   61276 pipe
container 10218 root   11w     FIFO               0,12      0t0   61276 pipe
container 10218 root   12u     unix 0xffff97c5b16f6800      0t0   61278 /tmp/plugin654816476 type=STREAM

Running nomad as non-root user with rootless containerd

Hi!

I would like to run nomad locally on my developer machine and connect to a locally running containerd that I have started in rootless mode.

Basically, I would not want to involve the root user, where possible, in running either containerd or nomad.

However, when I run nomad with:

nomad agent -dev -plugin-dir="/usr/lib/nomad/plugins" -config="./local-nomad.hcl"

where /usr/lib/nomad/plugins contains the containerd-driver
and ./local-nomad.hcl looks like:

plugin "containerd-driver" {
	config {
		enabled = true
		containerd_runtime = "io.containerd.runc.v2"
		stats_interval = "5s"
		allow_privileged = false
	}
}

I get the following error:

2022-08-09T17:38:00.462+0300 [ERROR] agent.plugin_loader.containerd-driver: Error in creating containerd client: plugin_dir=/usr/lib/nomad/plugins @module=containerd-driver err="failed to dial \"/run/containerd/containerd.sock\": context deadline exceeded" timestamp="2022-08-09T17:38:00.462+0300"

and nomad dies.

I suppose the containerd-driver is looking for the sock file in the wrong place. Is there a way to configure the containerd address?

hostname not populated in /etc/hosts for containerd tasks

nomad version: v1.1.2
os version: Linux archlinux 5.9.14-arch1-1 #1 SMP PREEMPT Sat, 12 Dec 2020 14:37:12 +0000 x86_64 GNU/Linux

jobspec:

job "python" {
    datacenters = ["dc1"]
    type = "service"

    group "python" {
        count = 1

        network {
            mode = "bridge"
        }

        task "python" {
            driver = "containerd-driver"
            config {
                image = "python:3.7.11-buster"
                command = "sh"
                args = ["-c", "while true; do sleep 300; done"]
            }
        }
    }
}

After running the python task, run nomad exec -task python c81d0472 bash:

root@python-c81d0472-e79f-3656-debd-97afada978d1:/# hostname
python-c81d0472-e79f-3656-debd-97afada978d1

root@python-c81d0472-e79f-3656-debd-97afada978d1:/# cat /etc/hostname
debuerreotype

root@python-c81d0472-e79f-3656-debd-97afada978d1:/# cat /etc/hosts
127.0.0.1	localhost
::1	localhost ip6-localhost ip6-loopback
fe00::0	ip6-localnet
ff00::0	ip6-mcastprefix
ff02::1	ip6-allnodes
ff02::2	ip6-allrouters

root@python-c81d0472-e79f-3656-debd-97afada978d1:/# hostname -I
172.26.64.3

root@python-c81d0472-e79f-3656-debd-97afada978d1:/# hostname -i
hostname: Name or service not known

root@python-c81d0472-e79f-3656-debd-97afada978d1:/# hostname -f
hostname: Name or service not known

I think there are several issues:

  • the hostname python-c81d0472-e79f-3656-debd-97afada978d1 is not written to /etc/hostname
  • should 172.26.64.3 python-c81d0472-e79f-3656-debd-97afada978d1 be added to /etc/hosts?
  • shouldn't hostname -i and hostname -f work?

Changing task.driver to docker makes it work fine.

So I think this is a bug in the containerd driver.

maybe this link is helpful: hashicorp/nomad#10766

Unable to build on clean go install

Hello,

It seems that one of the dependencies is broken, and the build fails on fresh Go installs.

+ make build
/usr/bin/go build -o containerd-driver .
go: github.com/hashicorp/[email protected] requires
        github.com/hashicorp/[email protected] requires
        github.com/tencentcloud/[email protected]+incompatible: reading github.com/tencentcloud/tencentcloud-sdk-go/go.mod at revision v3.0.83: unknown revision v3.0.83

I've also added a comment to a bug report on go-discover (hashicorp/go-discover#172).

Add support for --runtime

Currently containerd-driver only allows setting the runtime at the plugin level.

It doesn't allow you to choose the runtime at the job level, similar to this.

Add a --runtime option to allow selecting the runtime per job (assuming the runtime is installed and available on the Nomad client node).
This will allow users to run containers with multiple runtimes, e.g. runc and runsc (gVisor), on the same node.
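The proposed job-level option might look something like this (hypothetical; runtime is not an existing task config option):

config {
  image   = "docker.io/library/redis:alpine"
  runtime = "io.containerd.runsc.v1"   # hypothetical per-job runtime selection (gVisor)
}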

v0.9.3 reports as v0.9.2

I don't think v0.9.3 was actually built.

$ wget https://github.com/Roblox/nomad-driver-containerd/releases/download/v0.9.3/containerd-driver
$ chmod 755 containerd-driver
$ ./containerd-driver -v
containerd-driver v0.9.2

Can you fix ASAP? Thanks!

cgroups not getting applied on containers launched using nomad-driver-containerd

When I launch a container using nomad-driver-containerd and it exceeds its limits, cgroups are not applied and the container doesn't get OOM killed. To compare the docker driver and nomad-driver-containerd:

stress.nomad

job "stress" {
  datacenters = ["dc1"]

  group "stress-group" {
    task "stress-task" {
      driver = "docker"

      config {
        image = "docker.io/shm32/stress:1.0"
      }

      restart {
        attempts = 5
        delay    = "30s"
      }

      resources {
        cpu    = 500
        memory = 256
        network {
          mbits = 10
        }
      }
    }
  }
}
$ nomad job run stress.nomad

When stress.nomad exceeds 500 MHz of CPU or 256 MB of memory, it is OOM killed.

However, when I launch the same job (stress.nomad) using nomad-driver-containerd, it keeps running and doesn't get OOM killed.

In the case of the docker driver, IIUC, docker manages the cgroups for the container.
The question probably is: how does nomad manage resource constraints (cgroups) on workloads launched by other drivers, e.g. QEMU, Java, exec, etc.?
Does nomad apply/manage cgroups at the orchestration level?

The same image seems to be pulled in parallel causing disk exhaustion

We have about 100 parameterized job definitions that use the same image config:

config {
        image = "username/backend:some_tag"
}

The problem is that disk space is exhausted on Nomad clients, and it looks like the reason is that the image is pulled individually for each job, despite each job specifying the exact same image and tag. With the docker Nomad driver this didn't happen: all jobs shared a single image that was pulled and extracted once.

I might be wrong in this explanation, but this is what I gather from multiple (hundreds of) error messages like:

[ERROR] client.alloc_runner.task_runner: running driver failed: alloc_id=62ab19a7-4e67-c941-cc39-340394800fa1 task=main error="rpc error: code = Unknown desc = Error in pulling image username/backend:some_tag: failed to prepare extraction snapshot \"extract-138110298-tmpn sha256:bf868a0e662ae83512efeacb6deb2e0f0f1694e693fab8f53c110cb503c00b99\": context deadline exceeded"

I.e. it looks like each allocation has its own extraction snapshot? Is it possible to configure the driver (or containerd) so that all jobs share a single image snapshot?

Portmap capabilities + CNI: how to use them in the driver

I just started testing how to use capabilities by passing parameters in the job description using cap_add, and I created a CNI network like the following:

{
  "cniVersion": "0.3.1",
  "name": "mycoolcnichain",
  "plugins": [
     {
        "type": "bridge",
        "isGateway": true,
        "ipMasq": false,
        "bridge": "mybridge",
        "ipam": {
            "type": "host-local",
            "subnet": "10.10.30.0/24",
            "routes": [
                { "dst": "0.0.0.0/0" }
            ],
         "dataDir": "/run/ipam-out-net"
        },
        "dns": {
          "nameservers": [ "8.8.8.8" ]
        }
    },
    {
      "type": "portmap",
      "capabilities": {"portMappings": true},
      "snat": false
    }
  ]
}

I would like to know how to use CAP_ARGS to change the Redis port from the default 6379 to 8888; my question is how to pass the CAP_ARGS to achieve that.

CAP_ARGS should be

CAP_ARGS='{"portMappings":[{"hostPort":8888,"containerPort":6379,"protocol":"tcp","hostIP":"127.0.0.1"}]}'

and the job should be:

job "redis" {
  datacenters = ["dc1"]

  group "redis-group" {
    task "redis-task" {
      driver = "containerd-driver"

      config {
        image   = "docker.io/library/redis:alpine"
        seccomp = true
        cwd     = "/home/redis"
        cap_add         = ["CAP_ARGS"]
      }

      resources {
        cpu    = 500
        memory = 256
      }
    }
  }
}

[question] Is it production ready?

Hello,

I know this kind of question usually gets an answer like "it depends", but I wonder: what is the current state of this module from your point of view?

Also, I see in the Nomad docs that:

filesystem isolation none

Does it mean that allocations run with this driver have full access to the underlying filesystem?

Containers (spec) not getting cleaned up after stopping the jobs.

Seeing this intermittently (not easy to reproduce at this point) when I launch multiple nomad jobs, e.g.:

$ nomad job run redis.nomad
$ nomad job run capabilities.nomad
$ nomad job run privileged.nomad
$ nomad job run stress.nomad

After the containers (tasks) are running, stop them:

$ nomad job stop redis
$ nomad job stop capabilities
$ nomad job stop privileged
$ nomad job stop stress

This stops all the tasks. However, the containers (spec) are left behind and not cleaned up properly.
This was working before, so there is likely a regression.

root@linux:/opt/gopath/src/github.com/nomad-driver-containerd/example# ctr task ls
TASK    PID    STATUS
root@linux:/opt/gopath/src/github.com/nomad-driver-containerd/example# ctr container ls
CONTAINER                     IMAGE                             RUNTIME
32522383_lucid_noyce5         docker.io/library/ubuntu:16.04    io.containerd.runc.v2
63c86855_agitated_cray6       docker.io/library/ubuntu:16.04    io.containerd.runc.v2
ba8f121f_ecstatic_volhard3    docker.io/library/ubuntu:16.04    io.containerd.runc.v2
bcddf7ad_jolly_bartik7        docker.io/library/redis:alpine    io.containerd.runc.v2
fb07ad0a_focused_thompson9    docker.io/library/redis:alpine    io.containerd.runc.v2

Forward Redis port 6379

I'm currently evaluating nomad to replace our lxd stack.

The standard docker driver works well, but I can't figure out what it brings, so I'd like to get rid of it and use containerd directly instead. I followed your instructions and hello.nomad runs without issues.

However, I'm now testing your redis.nomad and I can't find out how to forward ports. Redis runs and listens on its internal port 6379, but I can't forward this port to the host (dynamic or static). I tried many combinations without success; only host_network = true exposes port 6379.

Do you have a working example, please? Or do I really have to install the CNI plugins to get a working bridge, and have a config file in /opt/cni/config (nomad.hcl / client / cni_config_dir)? Where can I find such a config file?

Registry authentication

Hi, thanks for open-sourcing this plugin.
I'm trying to set it up with a private registry (GitLab), but I need to authenticate before pulling images. As far as I can see, the auth config stanza is not supported.
Am I missing something? Is this doable?

There is no IP address on the lo interface

There is no IP address (127.0.0.1/8) on the lo interface, so the app cannot listen on 127.0.0.1:

root@bypass-route:/# ip addr
1: lo: <LOOPBACK> mtu 65536 qdisc noop state DOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
3: eth0@if7: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default 
    link/ether 2e:f4:d0:00:17:83 brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 172.26.64.2/20 brd 172.26.79.255 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::2cf4:d0ff:fe00:1783/64 scope link 
       valid_lft forever preferred_lft forever

When switching to the docker driver, the lo interface is fine:

root@9b68f72d354e:/# ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
3: eth0@if118: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default 
    link/ether 32:2c:db:6a:dc:f0 brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 172.26.64.107/20 brd 172.26.79.255 scope global eth0
       valid_lft forever preferred_lft forever

The job file:

job "test2" {
  datacenters = ["dc1"]

  group "test2" {

    network {
      mode = "bridge"
    }

    task "test2" {
      driver = "containerd-driver"
      config {
        image           = "docker.io/library/ubuntu:20.04"
        command         = "sleep"
        args            = ["600s"]
      }

      resources {
        cpu    = 500
        memory = 256
      }
    }
  }
}

Are there any suggestions here?

thanks.
