enclave-cc's Introduction

Confidential Containers

Welcome to confidential-containers

Confidential Containers is an open source community working to leverage Trusted Execution Environments to protect containers and data and to deliver cloud native confidential computing.

We have a new release every 6 weeks! See Release Notes or Quickstart Guide

Our key considerations are:

  • Allow cloud native application owners to enforce application security requirements
  • Transparent deployment of unmodified containers
  • Support for multiple TEE and hardware platforms
  • A trust model which separates Cloud Service Providers (CSPs) from guest applications
  • Least privilege principles for the Kubernetes cluster administration capabilities which impact delivering Confidential Computing for guest applications or data inside the TEE

Get started quickly...

Further Detail

Contribute...

License


enclave-cc's People

Contributors

bigdata-memory, dcmiddle, dependabot[bot], fidencio, fitzthum, fossabot, hairongchen, haokunx-intel, haosanzi, huliucheng1, jepio, jiazhang0, jiere, katexochen, mythi, piotrpalcz, portersrc, sameo, stevenhorsman, surajssd, wainersm, xynnn007, yangliang3, zhiwei-intel-h

enclave-cc's Issues

Roadmap to support new image format for eaa-kbc

As stated in #108, we introduced a unified format for encrypted images and a new mechanism to identify KBS resources such as decryption keys. The following work is needed to support the new encrypted image format.

For eaa-kbc

image pull failures with multi-layer images

I have debugged a problem where image pull fails for an image with 8 layers.

image-rs creates a pull thread for each layer and, for some reason, Occlum ends up in a lockup state with 8 threads. I tested a custom image-rs version that uses at most 4 threads, but I also needed to give Occlum more resources to finally make it work:

-.resource_limits.kernel_space_heap_size= "600MB" |
+.resource_limits.kernel_space_heap_size= "1024MB" |
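
The thread cap described above can be sketched with standard-library threads. This is only an illustration of bounding the number of concurrent layer pulls, not the actual image-rs change: the function name `pull_layers_bounded` and the `pull_one` callback are hypothetical stand-ins for the real per-layer pull logic.

```rust
use std::thread;

/// Pull `layers` with at most `max_parallel` worker threads alive at once.
/// `pull_one` is a hypothetical stand-in for the real per-layer pull.
fn pull_layers_bounded<F>(
    layers: &[String],
    max_parallel: usize,
    pull_one: F,
) -> Vec<Result<String, String>>
where
    F: Fn(&str) -> Result<String, String> + Sync,
{
    // Treat a limit of 0 as 1 so that `chunks` never panics.
    let batch = max_parallel.max(1);
    let mut results = Vec::with_capacity(layers.len());
    for chunk in layers.chunks(batch) {
        // Scoped threads may borrow `pull_one` and `chunk`; every thread of
        // this batch is joined before the next batch starts.
        let batch_results: Vec<Result<String, String>> = thread::scope(|s| {
            let handles: Vec<_> = chunk
                .iter()
                .map(|layer| {
                    let pull = &pull_one;
                    s.spawn(move || pull(layer.as_str()))
                })
                .collect();
            handles
                .into_iter()
                .map(|h| {
                    h.join()
                        .unwrap_or_else(|_| Err("pull thread panicked".to_string()))
                })
                .collect()
        });
        results.extend(batch_results);
    }
    results
}
```

Calling `pull_layers_bounded(&layers, 4, pull_one)` on an 8-layer image would run two batches of four threads and return the results in layer order. Batching is the simplest way to enforce the cap; a worker pool would keep all four threads busy when layer sizes differ.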

TODO:

adapt enclave-agent to containerd Transfer service

A quote from containerd:
"The transfer service provides a simple interface to transfer artifact objects between any source and destination. This allows for
pull and push operations to be done in containerd whether requested from clients or plugins. It is experimental in this release
to allow for further plugin development and integration into existing plugins.

See the Transfer Docs"

  • consider moving to a per node GRPC service
  • support layer sharing between containers

build: use APT preferences to force SGX PSW and DCAP versions to what Occlum prefers

Something like this (but with an ARG override option):

+++ b/tools/packaging/build/agent-enclave-bundle/Dockerfile
@@ -13,6 +13,7 @@ RUN apt-get update && \
 RUN curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y
 RUN echo "deb [arch=amd64 signed-by=/usr/share/keyrings/intel-sgx.gpg] https://download.01.org/intel-sgx/sgx_repo/ubuntu focal main" | tee -a /etc/apt/sources.list.d/intel-sgx.list \
  && wget -qO - https://download.01.org/intel-sgx/sgx_repo/ubuntu/intel-sgx-deb.key | gpg --dearmor --output /usr/share/keyrings/intel-sgx.gpg \
+ && wget -qO /etc/apt/preferences https://download.01.org/intel-sgx/sgx_repo/ubuntu/apt_preference_files/99sgx_2_17_100_focal_custom_version.cfg \
  && apt-get update \
  && env DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends \
     libsgx-dcap-ql \
@@ -44,7 +45,7 @@ RUN env DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommend
     occlum

integrate cosign signature verification feature into enclave-cc

The cosign image signature verification feature in image-rs was released in CoCo v0.2.0. This issue is to enable the feature in enclave-cc and make sure this signature verification method can be configured and used in the e2e and operator CI.

dependencies and status:

  • dependency 1: rats-tls build into agent (#48)
  • dependency 2: enable image-rs cosign feature for agent (#87)
  • dependency 3: use updated verdictd supporting cosign resources
  • dependency 4: agent should fix the bug of agent reading config (#51 (comment))
  • dependency 5: encrypted and cosigned image for testing purpose: docker.io/eqmcc/helloworld_enc:latest, in CI if kata-cc has same kind of image and resource, we can use that for consistency

Add a CI pipeline as close to the Kata Containers one as possible

After chatting with @fidencio about the confidential-containers first release CI/CD requirements we felt that this would be an essential step for enclave-cc to integrate into the CC release.

Description

In kata-containers/kata-containers#3992 the Kata Containers project is creating a kata-deploy image that can be used in the operator payload. It also already has some Kubernetes CC integration tests as well as a setup to install the prerequisites.

To integrate Enclave-CC into the CC operator, it needs to build its own payload image. To demonstrate that Enclave-CC and 'Kata-CC' offer similar levels of functionality, the Kata CC tests (or tests as close to them as possible) should be run on Enclave-CC.

integrate Gramine into enclave-cc

This issue tracks the tasks and status of integrating Gramine into enclave-cc, as it requires cooperation from other components and contributors.
Gramine integration includes the following tasks:

  • adapt enclave-agent to gramine
  • run gramine using runc
  • build bundle to integrate
  • integrate with ecc features
  • run with operator

Resolve FOSSA Failure

The FOSSA bot is reporting a failure.

github.com/cilium/ebpf   (v0.9.1)  Golang[ policy flag](https://app.fossa.com/projects/git%2Bgithub.com%2Fconfidential-containers%2Fenclave-cc/refs/branch/main/540d6a00bbb0b44e31471265ae03d10fb8f34c3a/issues/licensing/2444275)
Cached by the Go Module Proxy at Tue, 19 Jul 2022 09:51:23 GMT

It might be a false positive on the cilium dependency, or that dependency may need to be removed.

It looks like in order to get into the issue inside fossabot you need to give it invasive access to your github account. Might be worth looking at using Snyk rather than FOSSA.

Install the RATS-TLS library in compile env to fix dependency bugs.

Although PR #48 installs RATS-TLS in the runtime env, building an enclave agent against the latest image-rs, which depends on RATS-TLS, also requires RATS-TLS to be installed in the compile env.

Otherwise, ld complains that it cannot find rats-tls:

 = note: /usr/bin/ld: cannot find -lrats_tls
          collect2: error: ld returned 1 exit status

The build Dockerfile should install rats-tls in the builder stage. It may look like this:

FROM rust:1.63-bullseye as builder


RUN apt-get update && \
    env DEBIAN_FRONTEND=noninteractive apt-get install -y \
    protobuf-compiler

# FIX: install rats-tls
RUN git clone --depth 1 https://github.com/inclavare-containers/rats-tls.git && \
    cd rats-tls && \
    cmake -DRATS_TLS_BUILD_MODE="occlum" -DBUILD_SAMPLES=on -H. -Bbuild && \
    make -C build install

# Build enclave-agent
COPY src/ /enclave-cc/src/
RUN cd /enclave-cc/src/enclave-agent && \
    rustup component add rustfmt && \
    make

# Start preparing enclave-agent "bundle"
FROM ubuntu:20.04
...

improve payload image creation

The current payload image tooling in tools/packaging was drafted fairly quickly before the release, and there are a few areas for improvement:

  • build shim-rune in a container (#68)
  • build agent-enclave by COPYing the code rather than git cloning it (#69)
  • move init maintenance to this repo (#43)
  • build init by COPYing the code rather than git cloning it (#69)

update to Occlum NGO

  • adapt shim-rune to per pod unix-domain-socket (#18)
  • update agent-instance and boot-instance builds to install official Occlum debian packages

Enclave-CC development status for the first CoCo release

Let's track enclave-cc development status here for the first CoCo release. It will record what we have done, what we have left and potential issues.

Components

  • shim-rune
  • enclave-agent
  • image-rs
  • ocicrypt-rs
  • attestation agent
  • operator

Enclave-cc development status

  • rune: rune is ready from IC community.
  • shim-rune: Completed launching agent enclave container during pod sandbox creation and launching app container.
  • enclave-agent: Completed ttrpc communication with shim-rune and as the main service it can combine the image management, ocicrypt, and attestation agent to keep the encrypted image safe and form an encrypted fuse filesystem.
  • image-rs(related to enclave-cc part): Completed pulling encrypted images and also decrypting the image(ocicrypt), unpacking it to bundle, and then writing it out encrypted fuse filesystem.
  • ocicrypt-rs(related to enclave-cc part): Completed image decryption using sample KBC. The code is under review.
  • Attestation agent: Completed co-debugging between EAA-KBC and verdictd. A PR with the code will follow.
  • Occlum needs to release a binary in the inclavare-containers repo that includes all the patches related to enclave-cc
  • operator: will be started.
    • enclave signing
    • contribute scripts to build enclave-cc OCI bundles
    • build shim-rune
    • release enclave-cc artifact image (shim-rune, tar files for enclave-agent and app-enclave OCI bundles)

Design topic collections

This issue collects the topics about the detailed sub-system/module designs for enclave-cc. The high level arch design will add a section to refer to these topics. Please feel free to contribute your topic.

  • new component: agent enclave (formerly stub enclave)
  • shim-rune changes @haosanzi
  • rune and PAL API v4 changes @YangLiang3
  • Local Attestation protocol for FUSE key transmission @bigdata-memory
  • FUSE encryption system

References

  • Development plan: #2
  • High level design doc: #1

RFC: Use runc in the first CoCo release

We have worked on getting the agent enclave and boot instance bundles installed using the operator. These are easy as they just involve copying the pre-built bundles to the host filesystem. Similarly, shim-rune installation is straightforward as it's just a stand-alone Go binary.

However, there is currently no way to install rune (and all of its dependencies) in a distro-agnostic way using the operator. In addition, a simple enclave-cc deployment seems to work with just runc.

Since the enclave-cc arch diagram talks about rune, I thought it would make sense to submit this proposal that we start with just runc.

Any feedback?

$ kubectl logs enclave-cc-pod
["init"]
Hello world!

Hello world!
...
$ kubectl get pod enclave-cc-pod -o json | jq '.spec.containers[0]'
{
  "command": [
    "/run/rune/boot_instance/build/bin/occlum-run",
    "/bin/hello_world"
  ],
  "env": [
    {
      "name": "LD_LIBRARY_PATH",
      "value": "/run/rune/boot_instance/build/lib:/lib/x86_64-linux-gnu/:/usr/lib/x86_64-linux-gnu/"
    }
  ],
  "image": "docker.io/huaijin20191223/scratch-base:v1.8",
  "imagePullPolicy": "IfNotPresent",
  "name": "hello-world",
  "resources": {
    "limits": {
      "sgx.intel.com/enclave": "1"
    },
    "requests": {
      "sgx.intel.com/enclave": "1"
    }
  },
  "securityContext": {
    "capabilities": {
      "add": [
        "IPC_LOCK"
      ]
    }
  },
  "terminationMessagePath": "/dev/termination-log",
  "terminationMessagePolicy": "File",
  "volumeMounts": [
    {
      "mountPath": "/var/run/secrets/kubernetes.io/serviceaccount",
      "name": "kube-api-access-q5dnw",
      "readOnly": true
    }
  ],
  "workingDir": "/run/rune/boot_instance/"
}

[RFC] Development plan

I created this issue to gather the development resources and track the task list. At the moment, the IC team from Ali and the Intel developers covering the container runtime parts will explicitly participate in this project. I'm also glad to add more tasks to detail our work based on feedback.

Components

Milestone 1: Initial PoC

The goal is to enable the enclave-cc arch with a LibOS to launch an unencrypted/unsigned hello-world container image. The first LibOS to support this milestone is Occlum. For Gramine, the PAL API adaptation and decoupling design still need to be discussed.

Milestone 2: initial code finalization

The primary goal is to submit and review the initial PoC code base, and enable container image protections.

  • import shim-rune (#4 ) component with a clean base from IC project. @haosanzi @zhiwei-intel-h
  • integrate ocicrypt-rs to enable the container image decryption support. @intchr @HaokunX-intel @zhiwei-intel-h
  • enable the image signature verification support. @zhiwei-intel-h
  • documentation works (in documentation repo and this repo).
  • add the basic CI/CD for compilation error check.
  • add the CI/CD runtime test on genuine SGX 2.0 machine for enclave-cc project.

Milestone 3: attest agent-enclave through remote attestation

The primary goal is to enable the image protections and E2E demo with remote attestation support.

Milestone 4: initial release

The goal is to accomplish the initial design of enclave-cc, and add operator deployment support for enclave-cc.

Follow-up Story

  • coordinate enclave-cc and kata-cc on Intel CPUs, forming a so-called "small TEE and big TEE" pairing, similar to "small core and big core", and deploy them together on a node.
  • kata-sgx

Reference

update operator flows for NFD and Debug

There's work ongoing on Kata-CC side to improve RuntimeClass creation with capabilities provided by NFD and to configure the installation to be debug enabled.

This issue is to track/follow the work and implement the same functionality for enclave-cc.

enclave-agent build error.

Running "make" under src/enclave-agent produces the following errors:

error[E0425]: cannot find function decode_config in crate base64
--> /home/jie/.cargo/registry/src/github.com-1ecc6299db9ec823/sequoia-openpgp-1.11.0/src/armor/base64_utils.rs:166:19
|
166 | match base64::decode_config(&bytes, base64::STANDARD) {
| ^^^^^^^^^^^^^ not found in base64

error[E0425]: cannot find value STANDARD in crate base64
--> /home/jie/.cargo/registry/src/github.com-1ecc6299db9ec823/sequoia-openpgp-1.11.0/src/armor/base64_utils.rs:166:49
|
166 | match base64::decode_config(&bytes, base64::STANDARD) {
| ^^^^^^^^ not found in base64
|
help: consider importing this constant
|
1 | use base64::alphabet::STANDARD;
|
help: if you import STANDARD, refer to it directly
|
166 - match base64::decode_config(&bytes, base64::STANDARD) {
166 + match base64::decode_config(&bytes, STANDARD) {
|

image pull errors

CoCo quickstart documentation uses bitnami/nginx:1.22.0 image as an example and I gave it a try. I'm seeing different image pull errors:

Failed to pull image "docker.io/bitnami/nginx:1.22.0": rpc error: code = Internal desc = failed to mount "unionfs" to "/run/enclave-cc/containers/nginx_1.22.0/rootfs", with error: EIO: I/O error

To debug this in more detail, I ran enclave-agent (sudo runc run 123) from the bundle with OCCLUM_LOG_LEVEL=debug and used the "async-client" to debug. This time I'm getting a different error:

Green Thread 1 - pull_image -> Err(RpcStatus(code: INTERNAL message: "unpack destination \"/var/lib/image-rs/layers/sha256_3a52f76b4a6462386fe51fabf6cc829dbdef540dc64cdcc809f907bfb6c68195\" already exists")) ended: 5.798667018s

The latter blocks me from investigating the former error in detail.

update boot-instance Occlum to 0.29.7

agent-enclave uses 0.29.5 so it's good for this to follow:

RUN echo "deb [arch=amd64 signed-by=/usr/share/keyrings/occlum.gpg] http://mirrors.openanolis.cn/inclavare-containers/ubuntu20.04 focal main" | tee -a /etc/apt/sources.list.d/occlum.list \
&& wget -qO - http://mirrors.openanolis.cn/inclavare-containers/ubuntu20.04/DEB-GPG-KEY.key | gpg --dearmor --output /usr/share/keyrings/occlum.gpg \
&& apt-get update

  • Change image-rs to use sefs fstype instead of unionfs
  • Change runtime boot struct user_rootfs_config
  • Update image-rs and change boot-instance/agent-instance Dockerfile to use "upstream" 0.29.6 Occlum

create rootfs_key dynamically and seal it

We have been waiting for #20 but in the meantime, let's work on something simpler to get rid of the static rootfs_key.

The proposal is to create rootfs_key dynamically and seal it with the MRSIGNER key from Occlum's getkey ioctl().

Steps:

  • #126
  • #114
  • update image-rs Occlum snapshotter to make the key available under the containerd_id path
  • update runtime boot to get the key from containerd_id path

cc-operator-daemon-install POD keeps crashing in enclave-cc operator-based deployment.

Reproduce steps:

  1. $ kubectl apply -k github.com/confidential-containers/operator/config/release?ref=v0.2.0
  2. $ kubectl apply -f https://raw.githubusercontent.com/confidential-containers/operator/main/config/samples/enclave-cc/base/ccruntime-enclave-cc.yaml
    ccruntime.confidentialcontainers.org/ccruntime-enclave-cc created
  3. $ kubectl get pods -n confidential-containers-system
    NAME READY STATUS RESTARTS AGE
    cc-operator-controller-manager-5bf6d49bb5-94ff4 2/2 Running 0 9h
    cc-operator-daemon-install-6fmdz 0/1 CrashLoopBackOff 116 (3m47s ago) 9h

Use Kata Containers rust crate for container ID verification

Background

As mentioned in #14 (comment), that PR includes some code from the Kata Containers agent (added on kata-containers/kata-containers#1521).

Upcoming Kata changes

Hence, there are now two versions of this code. However, the original version has been moved into a separate rust crate in the Kata 3.x runtime-rs branch:

The plan is to merge the runtime-rs branch into Kata's main branch soon.

CoCo Plan

Once the runtime-rs branch has been merged into Kata's main branch, we should make this repo consume the kata-sys-util crate as a dependency.

Why bother?

The function is only ~18 lines of code, so is it worth doing this? I would say yes for the following reasons:

Unit tests

The original version of the code comes with a set of tests, whereas the version of the PR in this repo does not.

Maintenance issues

If there are multiple copies of the code, who's going to maintain them and ensure they stay in sync, with the latest fixes and improvements?

Rust eschews the golang approach of code copying (aka "vendoring") since crates.io makes that approach unnecessary.

Security

This is the most important reason. The backstory for my raising kata-containers/kata-containers#1521 was that container / sandbox IDs have a habit of ending up as part of path names for a container/sandbox specific data store. In CLI parlance...

$ cid="foo"
$ tree "/run/coco/containers/${cid}/"
/run/coco/containers/foo/
├── bar.json
└── baz.config

At some point, that path will be deleted:

$ sudo rm -rf "/run/coco/containers/${cid}/"

That seems reasonable. How about now?

$ cid='../../../sbin/init'
$ sudo rm -rf "/run/coco/containers/${cid}/"

Ouch. Not so good. This is just an example, but the point is that the function in question provides a basic level of protection against these sorts of abuses (see the unit tests on the original PR for further details).
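
To make the kind of check under discussion concrete, here is a minimal sketch of an ID validator. It is not the Kata Containers implementation: the function names `verify_id` and `container_dir` and the exact character rule (first character ASCII alphanumeric, remaining characters alphanumeric or one of `.`, `-`, `_`) are assumptions chosen for illustration.

```rust
use std::path::{Path, PathBuf};

/// Reject IDs that could escape a per-container directory when joined to a
/// base path. The character rule is an assumption made for this sketch.
fn verify_id(id: &str) -> Result<(), String> {
    let mut chars = id.chars();
    // The first character must be alphanumeric, which rules out "" and "..".
    let first_ok = matches!(chars.next(), Some(c) if c.is_ascii_alphanumeric());
    // No '/' is allowed anywhere, so the ID is always a single path component.
    let rest_ok = chars.all(|c| c.is_ascii_alphanumeric() || matches!(c, '.' | '-' | '_'));
    if first_ok && rest_ok {
        Ok(())
    } else {
        Err(format!("invalid container ID: {:?}", id))
    }
}

/// Build the per-container directory only after the ID has been validated.
fn container_dir(base: &Path, id: &str) -> Result<PathBuf, String> {
    verify_id(id)?;
    Ok(base.join(id))
}
```

With this check in front of every path construction, `container_dir(Path::new("/run/coco/containers"), "../../../sbin/init")` returns an error instead of a path outside the data store.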

/cc @YangLiang3, @quanweiZhou, @bergwolf for thoughts.

In summary, we should try to reuse code as much as possible in CoCo, but avoid copying it if at all possible.

But if we do have to copy code for some unusual situation, ensure that it is trivial for any interested parties to determine the original source location and authorship, aka its provenance.

agent fails to start with "Failed to open Intel SGX device"

When starting the container with runc in the bundle dir, the following error appears:

root@iZ2ze49w79e4zvkn2mcbscZ:/opt/confidential-containers/share/enclave-cc-agent-instance# runc run 1234567
[get_driver_type /home/sgx/jenkins/ubuntuServer2004-release-build-trunk-218/build_target/PROD/label/Builder-UbuntuSrv20/label_exp/ubuntu64/linux-trunk-opensource/psw/urts/linux/edmm_utility.cpp:111] Failed to open Intel SGX device.
[ERROR] occlum-pal: Failed to create enclave with error code 0x2006: Invalid SGX device. Please make sure SGX module is enabled in the BIOS, and install SGX driver afterwards. (line 152, file src/pal_enclave.c)

The device section in config.json of the agent bundle is as follows:
"devices": [
  {
    "path": "/dev/sgx_enclave",
    "type": "c",
    "major": 10,
    "minor": 125,
    "fileMode": 438
  }
]

container ENV variables passing and parsing

For complicated workloads, the tenant will set some specific ENV variables in the config file. To run the workload successfully in the LibOS, these ENV variables must be passed through.
The shim and enclave-agent will work together to parse and combine the environment variables, and eventually pass them to the app enclave.
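
One way to picture the "parse and combine" step is a merge of `KEY=VALUE` lists. The sketch below is not the enclave-cc implementation: the function name `merge_env` is hypothetical, and the precedence rule (values from the pod spec override values from the image configuration) is an assumption.

```rust
/// Merge `KEY=VALUE` entries from two sources. Entries in `pod_env` override
/// entries with the same key in `image_env` (an assumed precedence rule).
/// The first-seen order of keys is preserved.
fn merge_env(image_env: &[&str], pod_env: &[&str]) -> Vec<String> {
    let mut merged: Vec<(String, String)> = Vec::new();
    for entry in image_env.iter().chain(pod_env.iter()) {
        // Entries without '=' are skipped in this sketch.
        if let Some((key, value)) = entry.split_once('=') {
            match merged.iter().position(|(k, _)| k.as_str() == key) {
                Some(i) => merged[i].1 = value.to_string(),
                None => merged.push((key.to_string(), value.to_string())),
            }
        }
    }
    merged
        .into_iter()
        .map(|(k, v)| format!("{}={}", k, v))
        .collect()
}
```

The resulting list has the same `KEY=VALUE` shape as the `env` array of an OCI process configuration, so it could be handed to the app enclave unchanged.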

Roadmap for enclave-cc to support CoCo Key Broker System

Background

Currently, enclave-cc uses eaa-kbc & verdictd as the underlying attestation & confidential resource broker components. At the same time, the CoCo Key Broker System is under development. CoCo Key Broker System includes

Currently, we also use eaa-kbc & verdictd, where

  • eaa-kbc: same functionality as cc-kbc
  • verdictd: same functionality as AS + CC-KBS

What we want

To support the CoCo Key Broker System in enclave-cc, we need the following things. Let's make a simple roadmap for this.

[RFC] Shim-rune Design Proposal

Design

According to Enclave Confidential Containers (Enclave-CC) Design and Architecture, shim-rune should launch the agent enclave container during pod creation. The agent enclave container is deployed in the form of an OCI bundle rather than a container image, so shim-rune can call rune to start it. The agent enclave container has the same life cycle as the pod. Its main job is to receive and then load the actual workload's container image into an app enclave.

image

As is shown in the picture, the binary name of shim-rune is containerd-shim-rune-v2. shim-rune should implement the Containerd Runtime V2 (Shim API) for Enclave-CC. With shim-rune, Kubernetes can launch Pod and OCI-compatible containers with one shim per Pod.

Goals

  • implement the Containerd Runtime V2 (Shim API)
  • use the rune OCI runtime to manage containers
  • launch the agent enclave container during PodSandbox creation
  • complete PullImage via ttrpc communication with the agent enclave container
  • launch the application enclave container with the FUSE encrypted filesystem generated by the agent enclave container
  • kill/clean up agent enclave container resources when stopping the PodSandbox

Workflow

1. RunPodSandbox

Compared with containerd-shim-runc-v2, shim-rune should add the following functions when launching the pod sandbox.

  • In the Create API of shim-rune, shim-rune will create an agent enclave container with the pre-created oci bundle after the pause container is created successfully.
  • In the Start API of shim-rune, shim-rune will start the agent enclave container after the pause container is started successfully.

A pre-created oci bundle is needed for the agent enclave container, rather than a container image, because the agent enclave container is the first component started during pod creation and there is no direct support for pulling and unpacking its image in a TEE. The agent_container_instance field in the shim-rune configuration file (see the Appendix section) gives the host path of the oci bundle for the agent enclave container.

The agent enclave container has the same life cycle as the pod. Its main job is to receive and then load the actual workload's container image into an app enclave. The contents of the bundle file are as follows:

tree /opt/enclave-cc/agent-instance/ -L 1
/opt/enclave-cc/agent-instance/
├── config.json
└── rootfs

Note that oci bundle of the agent enclave container (agent_container_instance field in the shim-rune configuration file) must be read-only because the agent enclave containers of all Pods on this node share this bundle. Once modified, it will affect the behavior of existing or newly created agent enclave containers. Therefore, every time the agent enclave container is created, agent_container_instance is used as the read-only layer and then overlays read and write layers to aggregate the final oci bundle for the agent enclave container.

In addition, the oci bundle of the agent container may have multiple versions, which contain different versions of the untrusted PAL (possibly even supporting different LibOSes) and its dependencies, such as configuration files. shim-rune just needs to call rune and pass the location of the oci bundle to rune.

rune receives the path of the oci bundle passed by shim-rune, parses the oci container configuration file, creates and enters a new mount namespace, and uses bind mount to mount the rootfs directory tree in the oci bundle as the container rootfs. Finally, LibOS is loaded and initialized through the PAL API pal_init() as the app container's No. 1 process runelet. Please refer to rune for details.

2. Pull application Image

  1. The PullImage API of shim-rune will send PullImageReq to the agent enclave container.
  2. The agent enclave program uses image-rs to pull images and uses the encrypted file system capabilities provided by LibOS to create an encrypted unionfs (sefs-based) image.
  3. The agent enclave stores the encrypted unionfs (sefs-based) image on the host. The path used to store the image is discussed in the next subsection (Create app container).

Communication protocol

Below is the Image communication protocol between shim-rune and the agent enclave container. The protocol is modeled on the one used by Kata.

syntax = "proto3";
...
// Image defines the public APIs for managing images.
service Image {
    // PullImage pulls an image with authentication config.
    rpc PullImage(PullImageRequest) returns (PullImageResponse) {}
}

message PullImageRequest {
    // Image name (e.g. docker.io/library/busybox:latest).
    string image = 1;
    // Unique image identifier, used to avoid duplication when unpacking the image layers.
    string container_id = 2;
    // Use USERNAME[:PASSWORD] for accessing the registry
    string source_creds = 3;
}

message PullImageResponse {
    // Reference to the image in use. For most runtimes, this should be an
    // image ID or digest.
    string image_ref = 1;
}

3. Create app container

Ideally, just like kata-agent, the agent enclave container generates the bundle (config.json + rootfs) of the app container based on Occlum or Gramine. In this way, shim-rune can avoid generating the app container bundle, which depends on a specific LibOS (Occlum or Gramine).

But LibOS has limited capabilities such as:

  • For UnionFS supported by Occlum, the lower layer must be a single-layer FUSE encrypted file system, and the app container image to be aggregated by agent enclave may contain multiple layers, that is, the lower layer is not a single layer.
  • Gramine may not be able to easily support the ability to aggregate filesystems.

Possible solutions:

  • In the long run, promote LibOS to realize the aggregation capability of supporting multiple lower layers.
  • In the short to medium term, the enclave-agent program cannot rely on the rootfs aggregation capability of a specific LibOS to construct the oci bundle of the app container. shim-rune is responsible for generating the bundle(config.json + rootfs)of the app container.

Considering these limitations, and after discussing with the LibOS team, the first stage of enclave-cc uses the following methods to run the application enclave container.

Occlum

Refer to the Occlum guide "Runtime boot pre-generated UnionFS image". A pre-created oci bundle for the boot instance is needed on the host. The boot instance is responsible for using the customized init to mount and boot a pre-generated UnionFS image.

Considering how Occlum dynamically mounts images, shim-rune should overlay the boot instance bundle and the encrypted unionfs (sefs-based) image to generate the final bundle of the application container. For example:

mount -t overlay overlay \
  -o lowerdir=<path of boot instance bundle>:<path of encrypted unionfs (sefs-based) image>,upperdir=upper,workdir=work \
  merged
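
The option string passed to the overlay mount must not contain stray separators or spaces, so it can help to build it programmatically. The helper below is a hypothetical sketch, not shim-rune code (shim-rune itself is written in Go): the name `overlay_options` is an assumption, and for simplicity it rejects paths containing `,` or `:` rather than escaping them.

```rust
/// Build the `-o` option string for an overlay mount.
/// `lower_dirs` are listed left to right, topmost layer first, which is how
/// overlayfs interprets `lowerdir`.
fn overlay_options(lower_dirs: &[&str], upper: &str, work: &str) -> Result<String, String> {
    if lower_dirs.is_empty() {
        return Err("at least one lower directory is required".to_string());
    }
    let mut all: Vec<&str> = lower_dirs.to_vec();
    all.push(upper);
    all.push(work);
    // ',' separates options and ':' separates lower directories, so this
    // sketch refuses paths that contain either character.
    if let Some(bad) = all
        .iter()
        .find(|p| p.is_empty() || p.contains(',') || p.contains(':'))
    {
        return Err(format!("unsupported path for overlay options: {:?}", bad));
    }
    Ok(format!(
        "lowerdir={},upperdir={},workdir={}",
        lower_dirs.join(":"),
        upper,
        work
    ))
}
```

For the boot instance bundle and an image directory, the returned string has the form `lowerdir=<boot>:<image>,upperdir=upper,workdir=work`, which can be passed to `mount -t overlay overlay -o ... merged`.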

The question is then where the boot instance bundle and the encrypted unionfs (sefs-based) image are located.

  • Location of boot instance bundle

One host only needs one pre-created Occlum oci bundle for the boot instance. The boot_container_instance field in the shim configuration file gives its location.

The pre-generated boot template instance looks like an oci bundle rootfs on the host side. It contains the occlum boot template instance.

tree /opt/enclave-cc/boot-instance -L 1
/opt/enclave-cc/boot-instance
└── rootfs

1 directory, 1 file

  • Location of encrypted unionfs (sefs-based) image

One pod only has one agent enclave container. The agent container needs to manage multiple images. After the agent container pulls an image, it will convert all layers of this image into an encrypted unionfs (sefs-based) image on the host.

Since one image corresponds to one encrypted unionfs (sefs-based) image, we can follow the implementation in kata-agent: the agent enclave container can generate an identifier based on the image name, and store the encrypted unionfs (sefs-based) image in a directory distinguished by that identifier, such as the <agent_container_bundle_path>/rootfs/run/rune/<cid>/rootfs dir.

When shim-rune launches the application enclave container, it will generate the same identifier based on the image name, and then find the corresponding encrypted unionfs (sefs-based) image. shim-rune then overlays the encrypted unionfs (sefs-based) image and the boot instance bundle to generate the final app container bundle.

Gramine

TODO

Finally, shim-rune calls rune to launch the app enclave container. The process is similar to launching an agent enclave container with rune.

4. Stop sandbox

Compared with containerd-shim-runc-v2, shim-rune additionally kills and cleans up the agent enclave container.

How to integrate with containerd

Because shim-rune supports the containerd shim v2 API, you can add the associated configuration for shim-rune to the containerd config file on your system, e.g., /etc/containerd/config.toml.

        [plugins.cri.containerd]
          ...
          [plugins.cri.containerd.runtimes.rune]
            runtime_type = "io.containerd.rune.v2"

Then restart containerd on your system.

Appendix

shim configuration sample

log_level = "info" # "debug" "info" "warn" "error"

[containerd]
    socket = "/run/containerd/containerd.sock"
    agent_container_instance = "/opt/enclave-cc/agent-instance/"
    boot_container_instance = "/opt/enclave-cc/boot-instance/"
    agent_container_root_dir = "/run/containerd/agentenclave"
    agent_url = "tcp://0.0.0.0:7788"

where:

  • @log_level: the log level for shim-rune.
  • @socket: the containerd socket.
  • @agent_container_instance: the host path of the pre-created OCI bundle for the agent enclave container.
  • @boot_container_instance: the host path of the pre-created OCI bundle for the boot instance.
  • @agent_container_root_dir: the root directory for the running state of the agent enclave container.
  • @agent_url: the listening address of the agent container, through which the shim communicates with the agent container to perform PullImage. Both unix and tcp communication are supported.
    • Note: Occlum 0.27 does not currently support cross_world_uds, so ttrpc communication is temporarily implemented over tcp.
    • After Occlum releases NGO at the end of June, the communication between shim-rune and the agent container will be switched to ttrpc over cross_world_uds.
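As a rough sketch of how a shim could interpret the `agent_url` value, the following Rust snippet maps the URL to a transport. The `AgentEndpoint` enum and `parse_agent_url` function are hypothetical names, and the `unix://<path>` form is an assumption: the sample above only shows the `tcp://` form.

```rust
use std::net::SocketAddr;
use std::path::PathBuf;

/// Transport the shim would use to reach the agent container (hypothetical type).
#[derive(Debug, PartialEq)]
enum AgentEndpoint {
    Tcp(SocketAddr),
    Unix(PathBuf),
}

/// Parse an `agent_url` value such as "tcp://0.0.0.0:7788".
/// The "unix://<path>" form is assumed, not taken from the sample config.
fn parse_agent_url(url: &str) -> Result<AgentEndpoint, String> {
    if let Some(addr) = url.strip_prefix("tcp://") {
        addr.parse::<SocketAddr>()
            .map(AgentEndpoint::Tcp)
            .map_err(|e| format!("invalid tcp address {addr:?}: {e}"))
    } else if let Some(path) = url.strip_prefix("unix://") {
        if path.is_empty() {
            Err("unix socket path is empty".to_string())
        } else {
            Ok(AgentEndpoint::Unix(PathBuf::from(path)))
        }
    } else {
        Err(format!("unsupported agent_url scheme: {url}"))
    }
}
```

Rejecting unknown schemes at startup, rather than when the first PullImage is issued, makes a misconfigured `agent_url` visible immediately.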

Reference

Specification of user defined claims in RA evidence in CC-KBC Attester for SGX

Related to #120

I am working on the Occlum attester in cc-kbc (confidential-containers/attestation-agent#136). The Evidence is currently defined as follows. Please ignore the name, as I think we can use the same Evidence format for Occlum and Gramine.

struct SgxOcclumAttesterEvidence {
    /// Base64 encoded SGX quote.
    quote: String,
}

It currently contains only the base64-encoded SGX quote. We can include more claims in the Evidence by putting the digest of the claims into the report_data field, which binds the claims to the quote.
That is, a claim like

{
    "a": "value a",
    ...
}

could be part of the evidence.
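To make the binding concrete, here is a minimal Rust sketch. It assumes the claims are serialized to a string and hashed by the caller (for example with SHA-256 or SHA-384); the hash itself is left out to keep the snippet dependency-free. `SgxEvidenceWithClaims`, `report_data_from_digest` and `claims_match_quote` are hypothetical names, not part of the attester. The only fixed fact used is that the SGX report_data field is 64 bytes long.

```rust
/// Size of the `report_data` field in an SGX report, in bytes.
const REPORT_DATA_LEN: usize = 64;

/// Hypothetical extended evidence: the quote plus the claims it is bound to.
#[allow(dead_code)]
struct SgxEvidenceWithClaims {
    /// Base64 encoded SGX quote.
    quote: String,
    /// Serialized claims whose digest was placed in report_data.
    claims: String,
}

/// Place a claims digest (e.g. 32 or 48 bytes) at the start of report_data
/// and zero-pad the rest. Returns None if the digest is empty or does not fit.
fn report_data_from_digest(digest: &[u8]) -> Option<[u8; REPORT_DATA_LEN]> {
    if digest.is_empty() || digest.len() > REPORT_DATA_LEN {
        return None;
    }
    let mut report_data = [0u8; REPORT_DATA_LEN];
    report_data[..digest.len()].copy_from_slice(digest);
    Some(report_data)
}

/// Verifier side: given the digest recomputed over the received claims, check it
/// against the report_data extracted from the (already verified) quote.
fn claims_match_quote(claims_digest: &[u8], quote_report_data: &[u8; REPORT_DATA_LEN]) -> bool {
    report_data_from_digest(claims_digest)
        .map_or(false, |expected| &expected == quote_report_data)
}
```

The attester and the verifier must agree on the hash algorithm and on how the claims are serialized, otherwise the digests will not match even for identical claims.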

The question is: what can we include?

Some initial ideas:

  • The verifier gets the raw mr_enclave from the quote, but it does not know which payload was measured, i.e. which payload corresponds to that mr_enclave. We could add the type or name of the payload, for example a key "mrenclave-id" such as "mrenclave-id":"occlumv1.0+enclave-agentv1.0" (?), to tell the verifier which reference value to compare against.
  • mr_signer: as with mr_enclave, do we need to specify the signer of the SGX .so file?

Might we need a public specification of the different keys and their usages?

FUSE key protection scheme

Background

We have discussed FUSE key provisioning for a long time, through the approaches in #11 and #3.

Each approach, however, is tied to a particular scenario. Without that scenario as background, it is hard to decide on next actions.

There are 2 scenarios for deploying enclave-cc, and the scenario heavily affects how the FUSE key is provisioned. I have realized that we don't need to seek one unified approach to cover both scenarios.

Scenario 1: Tenant owning platform

To run confidential containers, the tenant pays for a VM or bare-metal machine as a platform/node on which to set up enclave-cc. The tenant also has to set up K8s and take responsibility for platform maintenance. In this scenario, the tenant wants strong control over the platform.

Here is the workflow for this scenario:
[Image: Screenshot 2022-08-25 12:35:36 PM]

  • App-Enclave and Agent-Enclave are signed by the tenant as signer.
  • Using a sealing key with the MRSIGNER policy is much simpler than the other approaches.
  • The FUSE key is protected by the sealing key, which acts as a wrapping key.
  • The LA (local attestation) protocol can still support this scenario.

Scenario 2: CSP owning platform

In this scenario, a tenant only needs to pay for a Pod to deploy a protected container image to the enclave-cc provided by the CSP. This is the mainstream PaaS usage model: it reduces cost efficiently at the expense of control over the platform.

v1

Here is the workflow for the scenario 2 v1 approach:
[Image: Screenshot 2022-08-24 11:48:24 AM]

  • A sealing key with the MRSIGNER policy is not useful.
  • Even if the enclave binaries used to host App-Enclave and Agent-Enclave are the same, their configuration data or manifests differ, and the difference is reflected in MRENCLAVE, so a sealing key with the MRENCLAVE policy plus other factors is not useful either.
  • The local attestation protocol is launched as described in #11.
  • The Relying Party authenticates the identity of Agent-Enclave as attester, and then provisions App-Enclave's MRENCLAVE to Agent-Enclave as a reference value.
  • Agent-Enclave, this time acting as verifier, can authenticate the identity of App-Enclave using App-Enclave's MRENCLAVE as the reference value.
  • This approach assumes App-Enclave doesn't need to authenticate the identity of Agent-Enclave.
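The decision Agent-Enclave makes at the end of the v1 flow can be sketched in Rust as below. This is a simplified illustration under stated assumptions: `AgentEnclave` and `release_fuse_key` are hypothetical names, and remote attestation, local attestation and report verification are assumed to have already happened elsewhere. MRENCLAVE is a 32-byte measurement.

```rust
/// MRENCLAVE is a 32-byte enclave measurement.
type MrEnclave = [u8; 32];

/// Hypothetical state held by Agent-Enclave after the Relying Party has
/// authenticated it and provisioned the reference value.
struct AgentEnclave {
    /// Reference value for App-Enclave, provisioned by the Relying Party.
    app_reference: MrEnclave,
    /// FUSE key to hand over to an authenticated App-Enclave.
    fuse_key: Vec<u8>,
}

impl AgentEnclave {
    /// `peer_mrenclave` must be taken from the report verified during local
    /// attestation, never from a value the peer merely claims.
    fn release_fuse_key(&self, peer_mrenclave: &MrEnclave) -> Result<&[u8], &'static str> {
        if peer_mrenclave == &self.app_reference {
            Ok(&self.fuse_key)
        } else {
            Err("App-Enclave MRENCLAVE does not match the reference value")
        }
    }
}
```

Note that the check is one-directional, which mirrors the assumption in the last bullet: App-Enclave receives the key without verifying who Agent-Enclave is.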

v2

[Image: Screenshot 2022-08-24 11:49:16 AM]

  • It is more flexible to decouple the configuration data or manifest from the enclave binary in MRENCLAVE. This is what KSS can do. Here is good material describing the details of using KSS.
  • The v2 approach additionally needs to authenticate CONFIGID in remote and local attestation.
  • Occlum can leverage KSS to load runtime-configurable configuration data or a manifest.
  • Still, this approach assumes App-Enclave doesn't need to authenticate the identity of Agent-Enclave.

v3

[Image: Screenshot 2022-08-24 11:49:55 AM]

  • To allow App-Enclave to authenticate the identity of Agent-Enclave, Agent-Enclave's MRENCLAVE is recorded in the configuration data or manifest of App-Enclave.
  • When App-Enclave is launched, its configuration data or manifest is provisioned to it.
  • App-Enclave then uses Agent-Enclave's MRENCLAVE to authenticate the identity of Agent-Enclave.
  • The integrity of App-Enclave's configuration data or manifest, which contains Agent-Enclave's MRENCLAVE, is verified by Agent-Enclave during local attestation.
  • Open: is it really necessary for App-Enclave to authenticate Agent-Enclave? Even if Agent-Enclave is impersonated, the impostor has no ability to spoof App-Enclave in order to retrieve the genuine FUSE key.

Conclusions

  • Implement the sealing key with the MRSIGNER policy as the first supported approach.
  • Add the details of identity authentication to #11 and merge this PR, driving continued development work to support scenario 2.

CI failed because of key not found

https://github.com/confidential-containers/enclave-cc/actions/runs/5418442602/jobs/9850738755

error report

E0630 10:18:18.576510 3425599 remote_image.go:238] "PullImage from image service failed" err="rpc error: code = Internal desc = failed to handle layer: failed to get decrypt key missing private key needed for decryption" image="ghcr.io/confidential-containers/test-container-enclave-cc:encrypted"
time="2023-06-30T10:18:18+08:00" level=fatal msg="creating container: rpc error: code = Internal desc = failed to handle layer: failed to get decrypt key missing private key needed for decryption"
Error: Process completed with exit code 1.

Prepare new test images for image metadata enhancement

We are working on unifying the encrypted images on the Attestation-Agent and image-rs side; the related proposals have been published (confidential-containers/guest-components#218 and confidential-containers/documentation#85). If you are interested, please leave comments :-) The effects on enclave-cc are:

  • When we use the new version of Attestation-Agent for CI, new images must be built, or enclave-agent will fail to decrypt the old ones.
  • Every use of a confidential resource should be indicated in KBS URI format.

I'd like to help with this once the AA and image-rs work is finished.

agent needs to set image-rs and attestation agent config in a proper place

Currently the ENABLE_SECURITY_VALIDATE env var is set in pull_image, but the image-rs client (with its validate-image flag) is initialized before pull_image is called. The client therefore never sees the env var, so signature verification is not enabled or disabled according to the agent's config.

The agent's config should be read right after the agent starts, and then used to set the config for image-rs and the attestation agent.
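The ordering proposed here might look like the following Rust sketch. It is not the enclave-agent code: `AgentConfig` and `ImageClient` are stand-ins (the real image-rs client type is not shown in this issue), and treating the value "true" as "enabled" is an assumption about the env var's format.

```rust
use std::env;

/// Hypothetical subset of the agent configuration relevant to this issue.
#[derive(Debug, Clone, PartialEq)]
struct AgentConfig {
    enable_signature_verification: bool,
}

impl AgentConfig {
    /// Build the config from a lookup function so it is resolved once, at startup.
    /// Assumes the variable holds "true" to enable verification.
    fn load(lookup: impl Fn(&str) -> Option<String>) -> Self {
        let enabled = lookup("ENABLE_SECURITY_VALIDATE")
            .map(|v| v.eq_ignore_ascii_case("true"))
            .unwrap_or(false);
        AgentConfig { enable_signature_verification: enabled }
    }

    /// Read the config from the process environment right after the agent starts.
    fn from_env() -> Self {
        Self::load(|key| env::var(key).ok())
    }
}

/// Stand-in for the image-rs client.
struct ImageClient {
    validate_signatures: bool,
}

impl ImageClient {
    /// The client is created from an already loaded config, so the flag is
    /// fixed before any pull_image call can happen.
    fn new(config: &AgentConfig) -> Self {
        ImageClient { validate_signatures: config.enable_signature_verification }
    }
}
```

With this shape, startup would call `AgentConfig::from_env()` first and pass the result to `ImageClient::new`, so pull_image no longer needs to set or read the env var itself.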

tasks to do

limiting entry points with rootfs_entry

In #109 (in occlum) we opened things up to the root path. We should discuss whether we need limitations elsewhere in enclave-cc to prevent abusing that flexibility, or whether this does not really pose additional security risk.
