Comments (27)

mythi commented on July 16, 2024

@niteeshkd thanks for creating the report!

> I also tried with the workload used in tests/e2e/enclave-cc-pod-sim.yaml in SGX SIM mode, it still reports the same error.

Is it possible you have HW leftovers installed and SIM fails because of that? I just double checked on my side that SIM works:

kind create cluster --image "kindest/node:v1.27.3" -n coco-sgx --config tests/e2e/enclave-cc-kind-config.yaml --wait 120s
kubectl label node coco-sgx-worker node.kubernetes.io/worker=
kubectl apply -k github.com/confidential-containers/operator/config/default
kubectl apply -k github.com/confidential-containers/operator/config/samples/enclave-cc/sim/
kubectl apply -f tests/e2e/enclave-cc-pod-sim.yaml
kubectl logs enclave-cc-pod-sim

I searched for Occlum errors using the log output you shared and found occlum/occlum#1385.
Not sure it's the same, but at least others have reported something similar (recently).

Anyway, two questions:

  1. would it be possible to try SIM with a fresh setup?
  2. can you also share kubectl logs <pod> for the failing pod?

from enclave-cc.

niteeshkd commented on July 16, 2024

@mythi, here is the output of kubectl logs <pod> for the failing pod, for which I used tests/e2e/enclave-cc-pod-sim.yaml in SGX SIM mode.

$ kubectl logs enclave-cc-pod-sim
Error from server (BadRequest): container "hello-world" in pod "enclave-cc-pod-sim" is waiting to start: image can't be pulled

You suggested trying SIM mode with a fresh setup. Do you mean disabling SGX (i.e. in the BIOS) and then trying SIM mode? Or is there an easier way to do it?

niteeshkd commented on July 16, 2024

@mythi, I rebooted the host, recreated the K8s cluster, and reinstalled the operator v0.7.0 and the ccruntime class. Then I tried to deploy the pod using tests/e2e/enclave-cc-pod-sim.yaml (which is in SIM mode). It still shows the same error.

$ kubectl logs enclave-cc-pod-sim
Error from server (BadRequest): container "hello-world" in pod "enclave-cc-pod-sim" is waiting to start: image can't be pulled
# cat /run/containerd/agent-enclave/cid/stderr 
[ERROR] occlum-pal: Failed to create enclave with error code 0x2009: Invalid enclave metadata. (line 152, file src/pal_enclave.c)

mythi commented on July 16, 2024

@niteeshkd the ccruntime payload must also be from the SIM overlay. With SIM, any system works (it does not have to be an SGX system)
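
To check which overlay a ccruntime came from, one option is to look at its name. The sketch below assumes the overlays add the suffixes that show up elsewhere in this thread (`-sgx-mode-sim` in the created resource name, `-sgx-mode-hw` in the hw kustomization); `ccruntime_mode` is a hypothetical helper, not part of the operator.

```shell
#!/usr/bin/env bash
# Classify a ccruntime resource name by its overlay suffix.
# Assumption: the SIM and HW overlays append "-sgx-mode-sim" and
# "-sgx-mode-hw" respectively, as seen in this thread.
ccruntime_mode() {
  case "$1" in
    *-sgx-mode-sim) echo "SIM" ;;
    *-sgx-mode-hw)  echo "HW" ;;
    *)              echo "unknown" ;;
  esac
}

# Example with the resource name reported in this thread.
ccruntime_mode "ccruntime-enclave-cc-sgx-mode-sim"

# On a live cluster the names could come from kubectl (not run here):
#   kubectl get ccruntime -o name | while read -r n; do ccruntime_mode "$n"; done
```

If this reports HW while the pod yaml is the SIM one, the payload and the workload do not match.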

niteeshkd commented on July 16, 2024

Somehow, the ccruntime payload created for SIM does not create the enclave-cc runtimeclass for me. I tried to follow the steps you mentioned above (except that I am creating the K8s cluster using kubeadm instead of kind).

Here is what I observed.

$ kubectl get node
NAME              STATUS   ROLES                  AGE   VERSION
sl-coffeelake01   Ready    control-plane,worker   16m   v1.28.2

$ kubectl apply -k github.com/confidential-containers/operator/config/default

$ kubectl apply -k github.com/confidential-containers/operator/config/samples/enclave-cc/sim/
ccruntime.confidentialcontainers.org/ccruntime-enclave-cc-sgx-mode-sim created

$ kubectl get runtimeclass
No resources found

$ kubectl get pods -n confidential-containers-system
NAME                                              READY   STATUS    RESTARTS   AGE
cc-operator-controller-manager-65985f7975-pwrkq   2/2     Running   0          15m

mythi commented on July 16, 2024

Does the node have the necessary labels to get the payload installation triggered?

niteeshkd commented on July 16, 2024

> Does the node have the necessary labels to get the payload installation triggered?

Yes, it does.

niteeshkd commented on July 16, 2024

@mythi I used the Occlum development docker container occlum/occlum:0.29.7-ubuntu20.04 and ran the sample hello world application. I see an error at the same line when the app is run in HW mode, but not in SIM mode.

root@0026e1b1208e:/tmp# wget https://raw.githubusercontent.com/occlum/occlum/master/demos/hello_c/hello_world.c
root@0026e1b1208e:/tmp# occlum-gcc -o hello_world hello_world.c
root@0026e1b1208e:/tmp# occlum new occlum-instance
root@0026e1b1208e:/tmp# cp hello_world /tmp/occlum-instance/image/bin
root@0026e1b1208e:/tmp# cd occlum-instance

root@0026e1b1208e:/tmp/occlum-instance# occlum build --sgx-mode SIM
root@0026e1b1208e:/tmp/occlum-instance# occlum run /bin/hello_world
Hello World

root@0026e1b1208e:/tmp/occlum-instance# occlum build
root@0026e1b1208e:/tmp/occlum-instance# occlum run /bin/hello_world
[ERROR] occlum-pal: Failed to create enclave with error code 0x1: Unexpected error occurred. (line 152, file src/pal_enclave.c)

niteeshkd commented on July 16, 2024

@mythi, I can also reproduce the problem reported in the issue created against Occlum, using the Occlum development container on my host with SGX. It is the same error I get when creating enclave-cc.

root@0026e1b1208e:~/github/occlum/demos/runtime_boot# ./build_content.sh
...
tcs_num 32, tcs_max_num 4096, tcs_min_pool 32
The required memory is 2127261696B.
The required memory is 0x7ecb7000, 2077404 KB.
Succeed.
Built the Occlum image and enclave successfully
[ERROR] occlum-pal: Failed to create enclave with error code 0x2009: Invalid enclave metadata. (line 152, file src/pal_enclave.c)

mythi commented on July 16, 2024

@niteeshkd we're discussing two problems. Would you prefer we try to solve the SIM case first or move to the HW mode? I'm a bit concerned about the HW setup (coffeelake). Are you using the upstream SGX driver with it, and is SGX correctly detected?

niteeshkd commented on July 16, 2024

@mythi, I think solving the SIM case is more important now. Do you want me to create another issue, or can we discuss it here? Regarding HW, I think the SGX driver is set up properly, as I am able to test enclave creation etc. using some sample applications (e.g. SampleCode/LocalAttestation, SampleCommonLoader placed under sgxsdk/SampleCode) shipped with the Intel SGX SDK.

niteeshkd commented on July 16, 2024

@mythi, I found the reason I could not deploy the runtime class for enclave-cc in SIM mode. After creating the single-node cluster using kubeadm, I was labelling the node with node-role.kubernetes.io/worker= (as suggested under the Prerequisites section on the installation page of the Operator). With this label, I was not able to create the ccruntime class for enclave-cc in SIM mode, although I was able to create it in HW mode with operator v0.7.0 and the ccruntime yaml file from commit #6f241fbc056f0a5d9e1bd2c10b2cedc0782b99ff.

But after labelling the node with node.kubernetes.io/worker=, I am able to deploy the runtime class for enclave-cc in SIM mode too.

I think we should either document how to label the node or add support for both labels.

mythi commented on July 16, 2024

> I think we should either document how to label the node or add support for both labels.

Apologies for the confusion, but good that it's working now. Looks like the pre-requisites need updating. The labels were updated in July (confidential-containers/operator#195) since node-role is not an allowed prefix.
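
A quick way to tell the two labels apart on a node is to test for the exact key. This is only a sketch: `has_label_key` is a hypothetical helper, and the sample label string is made up to mirror the failing setup described above.

```shell
#!/usr/bin/env bash
# Succeed if a comma-separated label list (the format of the LABELS column
# printed by `kubectl get node --show-labels`) contains the given label key.
has_label_key() {
  local labels="$1" key="$2"
  case ",${labels}," in
    *",${key}="*) return 0 ;;
    *)            return 1 ;;
  esac
}

# Hypothetical label set carrying only the old node-role style label.
sample="kubernetes.io/hostname=sl-coffeelake01,node-role.kubernetes.io/worker="

if has_label_key "$sample" "node.kubernetes.io/worker"; then
  echo "payload label present"
else
  echo "payload label missing"
fi
```

The leading comma in the match keeps node-role.kubernetes.io/worker= from being mistaken for node.kubernetes.io/worker=.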

mythi commented on July 16, 2024

> Looks like the pre-requisites need updating

It's up-to-date in main. Is your permanent link pointing to v0.7.0 or some older rev?

mythi commented on July 16, 2024

> Regarding HW, I think the SGX driver is set up properly, as I am able to test enclave creation etc. using some sample applications

I tested the HW flavor in Azure and it works for me there. To get the simplest deployment to use SGX HW, these were my modifications to the yaml you're currently using with SIM:

diff --git a/tests/e2e/enclave-cc-pod-sim.yaml b/tests/e2e/enclave-cc-pod-sim.yaml
index 7749eed..963d0ab 100644
--- a/tests/e2e/enclave-cc-pod-sim.yaml
+++ b/tests/e2e/enclave-cc-pod-sim.yaml
@@ -10,8 +10,11 @@ spec:
     imagePullPolicy: IfNotPresent
     env:
     - name: OCCLUM_RELEASE_ENCLAVE
-      value: "0"
+      value: "1"
     workingDir: "/run/rune/boot_instance/"
     command:
     - /run/rune/boot_instance/build/bin/occlum-run
     - /bin/hello_world
+    resources:
+      limits:
+        sgx.intel.com/epc: 512Mi

niteeshkd commented on July 16, 2024

> > Looks like the pre-requisites need updating

> It's up-to-date in main. Is your permanent link pointing to v0.7.0 or some older rev?

Great! I think the link here, under the section creating-a-sample-coco-workload-using-enclave-cc on the enclave-cc guide page, points to the ccruntime of commit#6f241b, and that led me to the old node label while deploying the operator.
I was using the ccruntime of commit#6f241b after installing the operator with kubectl apply -k "github.com/confidential-containers/operator/config/release?ref=v0.7.0".

niteeshkd commented on July 16, 2024

> I tested the HW flavor in Azure and it works for me there. To get the simplest deployment to use SGX HW, these were my modifications to the yaml you're currently using with SIM:

I did the same. Here is what I see on my host with SGX1.

$ cpuid -1 | grep -i sgx
      SGX: Software Guard Extensions supported = true
      SGX_LC: SGX launch config supported      = true
   Software Guard Extensions (SGX) capability (0x12/0):
      SGX1 supported                           = true
      SGX2 supported                           = false
      SGX ENCLV E*VIRTCHILD, ESETCONTEXT       = false
      SGX ENCLS ETRACKC, ERDINFO, ELDBC, ELDUC = false
   SGX attributes: ECREATE SECS.ATTRIBUTES (0x12/1):
   SGX Enclave Page Cache (EPC) enumeration (0x12/0x2):
   SGX Enclave Page Cache (EPC) enumeration (0x12/0x3):
   
$ kubectl describe node sl-coffeelake01 | grep "sgx.intel.com"
                    nfd.node.kubernetes.io/extended-resources: sgx.intel.com/epc
  sgx.intel.com/enclave:    110
  sgx.intel.com/epc:        98566144
  sgx.intel.com/provision:  110
  sgx.intel.com/enclave:    110
  sgx.intel.com/epc:        98566144
  sgx.intel.com/provision:  110
  sgx.intel.com/enclave    0            0
  sgx.intel.com/epc        0            0
  sgx.intel.com/provision  0            0
  
$ git diff
diff --git a/tests/e2e/enclave-cc-pod-sim.yaml b/tests/e2e/enclave-cc-pod-sim.yaml
index 7749eed..2964f67 100644
--- a/tests/e2e/enclave-cc-pod-sim.yaml
+++ b/tests/e2e/enclave-cc-pod-sim.yaml
@@ -10,8 +10,11 @@ spec:
     imagePullPolicy: IfNotPresent
     env:
     - name: OCCLUM_RELEASE_ENCLAVE
-      value: "0"
+      value: "1"
     workingDir: "/run/rune/boot_instance/"
     command:
     - /run/rune/boot_instance/build/bin/occlum-run
     - /bin/hello_world
+    resources:
+      limits:
+        sgx.intel.com/epc: 64Mi
$ kubectl apply -k github.com/confidential-containers/operator/config/default
$ kubectl apply -k github.com/confidential-containers/operator/config/samples/enclave-cc/sim/
$ kubectl apply -f tests/e2e/enclave-cc-pod-sim.yaml
$ kubectl logs enclave-cc-pod-sim
Hello world!
$ kubectl delete -k github.com/confidential-containers/operator/config/samples/enclave-cc/sim/
$ kubectl apply -k github.com/confidential-containers/operator/config/samples/enclave-cc/hw
error: invalid Kustomization: yaml: line 8: did not find expected key
$ kubectl apply -k github.com/confidential-containers/operator/config/samples/enclave-cc/base
$ kubectl apply -f tests/e2e/enclave-cc-pod-sim.yaml
$ kubectl logs enclave-cc-pod-sim
Error from server (BadRequest): container "hello-world" in pod "enclave-cc-pod-sim" is waiting to start: trying and failing to pull image
$ sudo cat /run/containerd/agent-enclave/cid/stderr
[ERROR] occlum-pal: Failed to create enclave with error code 0x2009: Invalid enclave metadata. (line 152, file src/pal_enclave.c)

mythi commented on July 16, 2024

> $ kubectl apply -k github.com/confidential-containers/operator/config/samples/enclave-cc/hw
> error: invalid Kustomization: yaml: line 8: did not find expected key

Apologies, I forgot to mention this. I submitted a fix for it: confidential-containers/operator#264

Just to double check: Azure with SGX HW worked for you too, and it only fails on your Coffeelake system?

niteeshkd commented on July 16, 2024

@mythi, I tested only on my Coffeelake system.

niteeshkd commented on July 16, 2024

> Apologies, I forgot to mention this. I submitted a fix for it: confidential-containers/operator#264

I used this fix. I still see the same error on my Coffeelake system.

$ git diff config/samples/enclave-cc/hw
diff --git a/config/samples/enclave-cc/hw/kustomization.yaml b/config/samples/enclave-cc/hw/kustomization.yaml
index 56bd669..506ae51 100644
--- a/config/samples/enclave-cc/hw/kustomization.yaml
+++ b/config/samples/enclave-cc/hw/kustomization.yaml
@@ -6,5 +6,6 @@ resources:
 
 nameSuffix: -sgx-mode-hw
 
+images:
 - name: quay.io/confidential-containers/reqs-payload
   newTag: e45d4e84c3ce4ae116f3f4d6c123c4829606026f

$ kubectl apply -k config/samples/enclave-cc/hw
$ kubectl apply -f tests/e2e/enclave-cc-pod-sim.yaml
$ kubectl logs enclave-cc-pod-sim
Error from server (BadRequest): container "hello-world" in pod "enclave-cc-pod-sim" is waiting to start: trying and failing to pull image
$ sudo cat /run/containerd/agent-enclave/fc7fd956873dd9104c97b996079eabb5bfbcf0117e8c753d093064d8586ce219/stderr
[ERROR] occlum-pal: Failed to create enclave with error code 0x2009: Invalid enclave metadata. (line 152, file src/pal_enclave.c)
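
When comparing runs like the ones above, it can help to pull just the error code out of the agent-enclave stderr file. A minimal sketch, assuming the message format shown in this thread; `occlum_error_codes` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Print the error code from each occlum-pal "Failed to create enclave" line
# read on stdin. Assumes the message format seen in the logs in this thread.
occlum_error_codes() {
  sed -n 's/.*Failed to create enclave with error code \(0x[0-9a-fA-F]*\):.*/\1/p'
}

# Example using the log line from this thread.
printf '%s\n' '[ERROR] occlum-pal: Failed to create enclave with error code 0x2009: Invalid enclave metadata. (line 152, file src/pal_enclave.c)' \
  | occlum_error_codes

# Against a real file, with <cid> being the container id directory (not run here):
#   sudo cat /run/containerd/agent-enclave/<cid>/stderr | occlum_error_codes
```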

mythi commented on July 16, 2024

> I used this fix. I still see the same error on my Coffeelake system.

The fix only allows kubectl apply -k config/samples/enclave-cc/hw to work.

> [ERROR] occlum-pal: Failed to create enclave with error code 0x2009: Invalid enclave metadata. (line 152, file src/pal_enclave.c)

I'm currently checking this with the Occlum team. One observation is that your system has < 100MB of EPC and our resource_limits.kernel_space_stack_size (which seems to trigger similar issues for other Occlum users too) is set to 128MB.
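
For reference, the size comparison can be worked out from the node output earlier in the thread. This sketch treats the quoted 128MB as 128 MiB, which is an assumption, and it only illustrates the observation; it does not establish that the EPC size is the cause.

```shell
#!/usr/bin/env bash
# Convert a byte count to whole MiB (integer division).
bytes_to_mib() {
  echo $(( $1 / 1024 / 1024 ))
}

epc_bytes=98566144   # sgx.intel.com/epc as reported by kubectl describe node
configured_mib=128   # size quoted above, assumed here to mean MiB

epc_mib=$(bytes_to_mib "$epc_bytes")
echo "node EPC: ${epc_mib} MiB, configured size: ${configured_mib} MiB"

if [ "$epc_mib" -lt "$configured_mib" ]; then
  echo "node EPC is smaller than the configured size"
fi
```

The node's EPC works out to 94 MiB, which is below the 128 figure but above the 64Mi limit requested in the pod spec.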

mythi commented on July 16, 2024

> > I used this fix. I still see the same error on my Coffeelake system.

> The fix only allows kubectl apply -k config/samples/enclave-cc/hw to work.

> > [ERROR] occlum-pal: Failed to create enclave with error code 0x2009: Invalid enclave metadata. (line 152, file src/pal_enclave.c)

> I'm currently checking this with the Occlum team.

It looks to be an Occlum issue and the fix will be available in their next release (v0.30.0).

niteeshkd commented on July 16, 2024

Thanks a lot for your help!

niteeshkd commented on July 16, 2024

> I tested the HW flavor in Azure and it works for me there.

@mythi what type of processor is installed, and what is the EPC size?

mythi commented on July 16, 2024

I'm using DC2s_v3 that has 8GiB of EPC.

mythi commented on July 16, 2024

> > > I used this fix. I still see the same error on my Coffeelake system.

> > The fix only allows kubectl apply -k config/samples/enclave-cc/hw to work.

> > > [ERROR] occlum-pal: Failed to create enclave with error code 0x2009: Invalid enclave metadata. (line 152, file src/pal_enclave.c)

> > I'm currently checking this with the Occlum team.

> It looks to be an Occlum issue and the fix will be available in their next release (v0.30.0).

@niteeshkd we just moved to v0.30.0, so hopefully this is fixed for you in the CoCo 0.8.0 release this week.

niteeshkd commented on July 16, 2024

Great to hear this! Thanks a lot!
