seele's Issues

Error: unable to join session keyring

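After running normally for a while, the following error appears, and every judge request submitted afterwards fails with the same error: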

Error executing the submission: Execution got following internal error(s):
The runj process failed: time="2024-03-20T11:49:48Z" level=fatal 
msg="Error executing the container" 
error="Error initializing the container process: 
  unable to start container process: 
    error during container init: 
      unable to join session keyring: 
        unable to create session key: 
          disk quota exceeded

I have already raised /proc/sys/kernel/keys/maxkeys to over 200M.
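The "disk quota exceeded" at the end of that error chain is the EDQUOT that the kernel returns when a user's key quota is used up. According to keyrings(7), the quota is tracked per UID and is limited both by a key count (maxkeys) and by a byte count (maxbytes), and current usage is listed in /proc/key-users. As a rough diagnostic sketch, the following parses one line in that format and flags a UID that is close to either limit. The sample line is made up for illustration and the helper names are hypothetical.

```python
from typing import NamedTuple


class KeyUsage(NamedTuple):
    uid: int
    keys_used: int
    keys_max: int
    bytes_used: int
    bytes_max: int


def parse_key_users_line(line: str) -> KeyUsage:
    """Parse one line of /proc/key-users (format as documented in keyrings(7)).

    Layout: "<uid>: <usage> <nkeys>/<nikeys> <qnkeys>/<maxkeys> <qnbytes>/<maxbytes>"
    """
    uid_part, rest = line.split(":", 1)
    fields = rest.split()
    keys_used, keys_max = (int(x) for x in fields[2].split("/"))
    bytes_used, bytes_max = (int(x) for x in fields[3].split("/"))
    return KeyUsage(int(uid_part), keys_used, keys_max, bytes_used, bytes_max)


def near_quota(usage: KeyUsage, threshold: float = 0.9) -> bool:
    """True when either the key-count quota or the byte quota is mostly used."""
    return (usage.keys_used >= threshold * usage.keys_max
            or usage.bytes_used >= threshold * usage.bytes_max)


# Illustrative sample line, not taken from the reporter's machine.
sample = "1001:   205 204/204 200/200 19800/20000"
usage = parse_key_users_line(sample)
print(usage, near_quota(usage))
```

On a live host the same function could be applied to every line of /proc/key-users. If the byte quota rather than the key count turns out to be the one that is exhausted, raising maxkeys alone would not be expected to help.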

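Questions about high concurrency and performance

The judge machine runs nothing but seele, yet we have run into a number of problems that do not seem reasonable:

Judge machine environment

Dual-socket EPYC 9654 (96 cores per CPU), 192 cores / 384 threads in total, 2.24T of memory
Ubuntu 22.04 with kubernetes

1. WALL_TIME limit exceeded although user and kernel time are both very short

Problem description

For example, a simple hello world program in C with three subtasks in total: upload the file, compile, and run.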

# Response from the run step
{'status': 'FAILED',
 'report': {
  'run_at': '2024-04-12T02:39:41.860021413Z', 
  'time_elapsed_ms': 58616, 
  'type': 'run_container', 
  'status': 'WALL_TIME_LIMIT_EXCEEDED', 
  'exit_code': 0, 
  'wall_time_ms': 12045, 
  'cpu_user_time_ms': 65, 
  'cpu_kernel_time_ms': 104, 
  'memory_usage_kib': 21156}, 
'embeds': {'cis_stdout': 'hello,world2\n', 'cis_stderr': ''}}

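As you can see, it already printed hello,world2 correctly. cpu_user plus cpu_kernel adds up to less than 200 ms, yet wall_time is a full 12 s, and on top of that time_elapsed_ms reached 58 s; the step was forcibly terminated for exceeding the time limit.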
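To make the gap concrete, here is a small sketch that splits the numbers from the report above into three buckets. It assumes that time_elapsed_ms covers the whole step and that wall_time_ms covers only the container's run; the report itself does not state this, so treat the labels as a working interpretation rather than Seele's definition.

```python
# Values copied from the report above.
report = {
    "time_elapsed_ms": 58616,
    "wall_time_ms": 12045,
    "cpu_user_time_ms": 65,
    "cpu_kernel_time_ms": 104,
}


def breakdown(r: dict) -> dict:
    """Split a report into CPU time, waiting inside the run, and time outside it."""
    cpu_ms = r["cpu_user_time_ms"] + r["cpu_kernel_time_ms"]
    return {
        "cpu_ms": cpu_ms,
        # Wall-clock time during which the program was not on a CPU.
        "waiting_inside_wall_ms": r["wall_time_ms"] - cpu_ms,
        # Assumed to be container setup/teardown and queueing around the run.
        "outside_wall_ms": r["time_elapsed_ms"] - r["wall_time_ms"],
    }


parts = breakdown(report)
print(parts)
print(f"CPU share of the whole step: {parts['cpu_ms'] / report['time_elapsed_ms']:.2%}")
```

Under that reading, about 46.6 s of the 58.6 s falls outside the measured wall time, and only 169 ms is actual CPU work.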

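Conditions to reproduce

  1. Concurrency above 400 (that is, roughly once it exceeds the number of CPU threads)
  2. Whether there is a compile subtask makes no difference (submit-file-then-execute flows such as python hit it too, and so do compile subtasks with the cache enabled)

Discussion

On our judge machine, at a concurrency of 200 this workload completes at about 30 tasks/s, but for such a simple hello world we would expect 200 tasks/s or more. In every case the cause is an excessively long wall_time.
My impression is that runj itself becomes the bottleneck under high concurrency: starting and exiting containers are both slow.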
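The throughput figures can be turned into a per-task latency with Little's law (in-flight tasks = throughput × mean latency). The sketch below assumes that "concurrency" means the number of submissions in flight at once, which the report does not spell out.

```python
def mean_latency_s(in_flight: int, throughput_per_s: float) -> float:
    """Little's law rearranged: mean latency = in-flight tasks / throughput."""
    return in_flight / throughput_per_s


observed = mean_latency_s(200, 30)    # what the reporter measured
expected = mean_latency_s(200, 200)   # what the reporter hoped for

print(f"observed mean latency: {observed:.2f} s per task")
print(f"latency needed for 200 tasks/s: {expected:.2f} s per task")
```

In other words, each hello world submission currently spends several seconds in the system on average, against a target of about one second, which is consistent with the long wall_time described above.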

2. runj error: cannot start an already running container

Problem description

The same C hello world as in 1, with the steps upload file, compile, ls the compile output, execute. runj fails with:
Error initializing the container process: cannot start an already running container

# Response from the run step
{'id': 'F3oBscsWNXF7M6EQ', 'type': 'ERROR', 
'error': 'Error executing the submission: Execution got following internal error(s):\nThe runj process failed: time="2024-04-12T03:04:23Z" level=fatal msg="Error executing the container" error="Error initializing the container process: cannot start an already running container"\n'}
{'id': 'R7W7UmXlrBmQeWRr', 'type': 'ERROR',
'error': 'Error executing the submission: Execution got following internal error(s):\nThe runj process failed: time="2024-04-12T03:04:23Z" level=fatal msg="Error executing the container" error="Error initializing the container process: cannot start an already running container"\n'}

Conditions to reproduce

Occurs with a probability of about 5% at a concurrency of 600
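When measuring a failure rate like this over many submissions, it can help to reduce each error message to its innermost cause and count the causes. The following is a minimal sketch of that idea; it relies only on the error="..." field visible in the messages above, the root_cause helper is hypothetical, and the second response is a made-up entry added for contrast.

```python
import re
from collections import Counter
from typing import Optional

# runj prints logfmt-style lines; the interesting part is the error="..." field.
ERROR_FIELD = re.compile(r'error="((?:[^"\\]|\\.)*)"')


def root_cause(message: str) -> Optional[str]:
    """Return the innermost cause of a runj error message, or None."""
    match = ERROR_FIELD.search(message)
    if match is None:
        return None
    # Causes are chained as "outer: inner: innermost"; keep the last link.
    return match.group(1).rsplit(": ", 1)[-1]


responses = [
    {"id": "F3oBscsWNXF7M6EQ", "type": "ERROR", "error": (
        'Error executing the submission: Execution got following internal error(s):\n'
        'The runj process failed: time="2024-04-12T03:04:23Z" level=fatal '
        'msg="Error executing the container" error="Error initializing the '
        'container process: cannot start an already running container"\n')},
    {"id": "hypothetical-ok", "type": "COMPLETED"},  # made-up entry for contrast
]

tally = Counter(
    root_cause(r["error"]) for r in responses if r.get("type") == "ERROR"
)
print(tally)
```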

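3. Files saved by the compile step cannot be executed in the run step

Problem description

The same C hello world as in 1, with 4 subtasks in total: upload file -> compile -> ls the compile output -> execute
Both the compile subtask and the "ls the compile output" subtask use the action "seele/run-judge/compile@1":
Compile: source: solution.cpp; saves: solution.cpp, solution; command: g++ solusion.cpp -i solution
ls the compile output: source: solution.cpp, solution; saves: solution.cpp, solution; command: ls -la
Note in particular that the source and saves of the "ls the compile output" subtask are identical.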

# Response from the run step
{'status': 'FAILED', 
'report': {
  'run_at': '2024-04-12T05:33:22.452141283Z', 
  'time_elapsed_ms': 2343, 
  'type': 'run_container', 
  'status': 'RUNTIME_ERROR', 
  'exit_code': 1, 'wall_time_ms': 2190, 'cpu_user_time_ms': 16, 'cpu_kernel_time_ms': 50, 'memory_usage_kib': 21408}, 
  'embeds': {
    'cis_stderr': 'exec ./solution: exec format error\n', 
    'cis_stdout': ''
  }
}

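Conditions to reproduce

Occurs 100% of the time at a concurrency of 200
At a concurrency of 1000 it turns into problem 1 (timeout) and problem 2 (cannot execute) instead
However, if saves in the "ls the compile output" step is changed to contain only solution (previously solution and solution.cpp), the exec format error no longer occurs

Discussion

Problems 1 and 2 above are probably caused by runj's dependence on system resources, which keeps concurrency from scaling. This problem, though, I really do not understand.
What is more, it does not show up at high concurrency, where it just turns into timeouts, which is very puzzling.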
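"exec format error" is the ENOEXEC that execve returns when the kernel does not recognise the file as any executable format. One way to narrow this down would be to inspect the first bytes of the saved solution file before the run step. The sketch below is a hypothetical helper for that, not part of seele; the demonstration uses a temporary file holding C source text to stand in for a saved file that is not a real binary.

```python
import os
import tempfile


def classify_executable(path: str) -> str:
    """Classify a file by its leading magic bytes."""
    with open(path, "rb") as f:
        head = f.read(4)
    if head == b"\x7fELF":
        return "elf"       # a native Linux binary
    if head[:2] == b"#!":
        return "script"    # handed to the interpreter named on the shebang line
    if not head:
        return "empty"
    return "unknown"       # execve on such a file typically fails with ENOEXEC


# Demonstration with a stand-in file that contains source text, not a binary.
with tempfile.NamedTemporaryFile("wb", suffix="-solution", delete=False) as tmp:
    tmp.write(b"#include <stdio.h>\nint main(void) { return 0; }\n")
    demo_path = tmp.name

kind = classify_executable(demo_path)
print(kind)
os.unlink(demo_path)
```

If the saved solution turned out to be "unknown" or "empty" in the failing configuration but "elf" when saves contains only solution, that would point at how the saved files are stored or restored rather than at the compiler.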

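Deployment fails with docker / podman based setups

While trying to deploy the project, I found that with docker / podman it does not work no matter how I configure it: runj panics when a judge task is submitted. Deploying on bare metal, however, works perfectly.

Test environment

A fresh archlinux install inside a KVM virtual machine, kernel version Linux archlinux 6.1.52-1-lts #1 SMP PREEMPT_DYNAMIC Thu, 07 Sep 2023 05:17:41 +0000 x86_64 GNU/Linux, with go, rust, protoc, skopeo, umoci and other packages additionally installed.

seele was tested with the 0.1.1 release.

The users present on the system are:

Name   uid   gid
root   0     0
yzy1   1000  1000
seele  1001  1002

Delegation has been enabled following the guide. Running cat /sys/fs/cgroup/user.slice/user-$(id -u).slice/user@$(id -u).service/cgroup.controllers as both root and yzy1 prints cpuset cpu io memory pids

The subuid and subgid files have also been configured. Both files have the same content, as follows: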

yzy1:100000:65536
seele:165536:65536
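Each line of these files has the form name:start:count and grants that user a block of subordinate IDs. As a sanity check, the following sketch parses the two entries above and confirms that the blocks do not overlap. The function names are hypothetical helpers written for this illustration.

```python
def parse_subid(text: str) -> dict:
    """Parse /etc/subuid or /etc/subgid content into {name: range}."""
    ranges = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, start, count = line.split(":")
        ranges[name] = range(int(start), int(start) + int(count))
    return ranges


def overlaps(a: range, b: range) -> bool:
    """True when two half-open ID ranges share at least one ID."""
    return a.start < b.stop and b.start < a.stop


# The content quoted above.
subid = parse_subid("yzy1:100000:65536\nseele:165536:65536\n")

for name, ids in subid.items():
    print(f"{name}: {ids.start}..{ids.stop - 1}")
print("overlap:", overlaps(subid["yzy1"], subid["seele"]))
```

The seele block starts exactly where the yzy1 block ends, so the two allocations are adjacent but disjoint.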

The config.toml used is as follows:

log_level = "info"
work_mode = "bare" # marker <1>

[exchange.demo]
type = "http"
address = "0.0.0.0"
port = 8080

[worker.action.run_container]
userns_user = "seele"
userns_group = "seele"
userns_uid = 1001
userns_gid = 1002

The task used for testing is as follows:

steps:
  prepare:
    action: "seele/add-file@1"
    files:
      - path: "main.txt"
        plain: |
          Hello

  run:
    action: "seele/run-judge/run@1"
    image: "alpine:3.18.3"
    command: "cat main.txt"
    files: ["main.txt"]
    fd:
      # Redirect the program's standard output to a file
      stdout: "out.txt"
    report:
      embeds:
        # Have Seele save the contents of out.txt
        - path: "out.txt"
          field: output
          truncate_kib: 100

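Control group: bare-metal deployment

Built seele and runj manually and placed them in /usr/local/bin. Logged in as root and ran seele config.toml. After submitting the task, a "type": "COMPLETED" result came back successfully.

Experiment group 1: root docker

Installed the docker package, ran systemctl enable docker, rebooted, and logged in as root.

Changed marker <1> in config.toml to containerized, then ran: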

docker run \
  --security-opt seccomp=unconfined \
  --security-opt apparmor=unconfined \
  --security-opt systempaths=unconfined \
  -v /etc/subuid:/etc/subuid \
  -v /etc/subgid:/etc/subgid \
  -v /sys/fs/cgroup:/sys/fs/cgroup \
  -v `pwd`/config.toml:/etc/seele/config.toml \
  --tmpfs /tmp:exec,mode=777,size=1G \
  --cgroupns host \
  --net host \
  ghcr.io/darkyzhou/seele

An error result is returned, and seele prints one log entry:

2023-09-13T03:35:53.499094Z ERROR handle_submission: seele::composer:143: Error handling the submission: Error executing the submission: Execution got following internal error(s):
The runj process failed: time="2023-09-13T03:35:53Z" level=fatal msg="Error executing the container" error="Error initializing the container process: unable to start container process: error during container init: read init-p: connection reset by peer"

After adding the RUNJ_DEBUG environment variable to the docker command, the log is:

2023-09-13T03:37:10.584983Z ERROR handle_submission: seele::composer:143: Error handling the submission: Error executing the submission: Execution got following internal error(s):
The runj process failed: time="2023-09-13T03:37:10Z" level=debug msg="nsexec[28]: => nsexec container setup"
time="2023-09-13T03:37:10Z" level=debug msg="nsexec-0[28]: ~> nsexec stage-0"
time="2023-09-13T03:37:10Z" level=debug msg="nsexec-0[28]: spawn stage-1"
time="2023-09-13T03:37:10Z" level=debug msg="nsexec-0[28]: -> stage-1 synchronisation loop"
time="2023-09-13T03:37:10Z" level=debug msg="nsexec-1[30]: ~> nsexec stage-1"
time="2023-09-13T03:37:10Z" level=debug msg="nsexec-1[30]: unshare user namespace"
time="2023-09-13T03:37:10Z" level=debug msg="nsexec-1[30]: request stage-0 to map user namespace"
time="2023-09-13T03:37:10Z" level=debug msg="nsexec-1[30]: request stage-0 to map user namespace"
time="2023-09-13T03:37:10Z" level=debug msg="nsexec-0[28]: stage-1 requested userns mappings"
time="2023-09-13T03:37:10Z" level=debug msg="nsexec-0[28]: update /proc/30/uid_map to '0 1001 1\n1 165536 65536\n'"
time="2023-09-13T03:37:10Z" level=debug msg="nsexec-0[28]: update /proc/30/gid_map to '0 1002 1\n1 165536 65536\n'"
time="2023-09 seele.submission.id="Mqb6B6FUQRURW2NA"
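The uid_map and gid_map strings in this log use the format described in user_namespaces(7): each line is "ID-inside ID-outside length". The sketch below translates an ID inside the new namespace to the ID it maps to in the parent namespace, using the exact uid_map from the log. The helper names are hypothetical.

```python
from typing import List, Optional, Tuple

IdMap = List[Tuple[int, int, int]]


def parse_id_map(text: str) -> IdMap:
    """Parse uid_map/gid_map content: one 'inside outside length' triple per line."""
    entries = []
    for line in text.splitlines():
        if line.strip():
            inside, outside, length = (int(x) for x in line.split())
            entries.append((inside, outside, length))
    return entries


def to_parent_id(id_map: IdMap, inside_id: int) -> Optional[int]:
    """Map an ID inside the namespace to the parent namespace, or None if unmapped."""
    for inside, outside, length in id_map:
        if inside <= inside_id < inside + length:
            return outside + (inside_id - inside)
    return None


# The uid_map written by nsexec in the log above.
uid_map = parse_id_map("0 1001 1\n1 165536 65536\n")

print(to_parent_id(uid_map, 0))      # root inside the sandbox
print(to_parent_id(uid_map, 1))      # first subordinate ID
print(to_parent_id(uid_map, 65536))  # last mapped ID
print(to_parent_id(uid_map, 65537))  # outside the mapping
```

So root inside the sandbox is the seele user (1001) outside it, and every other ID comes from seele's subordinate range in /etc/subuid.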

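Experiment group 2: root podman

Replaced docker with podman in experiment group 1; the result is unchanged.

Experiment group 3: rootless podman

Uninstalled docker and installed podman instead.

Logged in as yzy1, changed marker <1> in config.toml to rootless_containerized, then ran the same command as before.

Result without RUNJ_DEBUG: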

2023-09-13T04:10:09.942514Z ERROR handle_submission: seele::composer:143: Error handling the submission: Error executing the submission: Execution got following internal error(s):
The runj process failed: time="2023-09-13T04:10:09Z" level=fatal msg="nsexec-1[38]: failed to unshare remaining namespaces (except cgroupns): Operation not permitted"
time="2023-09-13T04:10:09Z" level=fatal msg="nsexec-0[36]: failed to sync with stage-1: next state: Success"
time="2023-09-13T04:10:09Z" level=fatal msg="Error executing the container" error="Error initializing the container process: unable to start container process: can't get final child's PID from pipe: EOF"
 seele.submission.id="EWgqgAEr7Ogsb3LN" 

Result with RUNJ_DEBUG:

2023-09-13T04:11:28.056095Z ERROR handle_submission: seele::composer:143: Error handling the submission: Error executing the submission: Execution got following internal error(s):
The runj process failed: time="2023-09-13T04:11:28Z" level=debug msg="nsexec[36]: => nsexec container setup"
time="2023-09-13T04:11:28Z" level=debug msg="nsexec-0[36]: ~> nsexec stage-0"
time="2023-09-13T04:11:28Z" level=debug msg="nsexec-0[36]: spawn stage-1"
time="2023-09-13T04:11:28Z" level=debug msg="nsexec-0[36]: -> stage-1 synchronisation loop"
time="2023-09-13T04:11:28Z" level=debug msg="nsexec-1[38]: ~> nsexec stage-1"
time="2023-09-13T04:11:28Z" level=debug msg="nsexec-1[38]: unshare remaining namespace (except cgroupns)"
time="2023-09-13T04:11:28Z" level=fatal msg="nsexec-1[38]: failed to unshare remaining namespaces (except cgroupns): Operation not permitted"
time="2023-09-13T04:11:28Z" level=fatal msg="nsexec-0[36]: failed to sync with stage-1: next state: Success"
time="2023-09-13T04:11:28Z" level=fatal msg="Error executing the container" error="Error initializing the container process: unable to start container process: can't get final child's PID from pipe: EOF"
 seele.submission.id="XeNQmWQAX3QQcUqY" 

Experiment group 4: rootless podman + work_mode=containerized

Logged in as yzy1, changed marker <1> in config.toml to containerized, then ran the same command as before.

Result without RUNJ_DEBUG:

2023-09-13T04:13:04.840493Z ERROR handle_submission: seele::composer:143: Error handling the submission: Error executing the submission: Execution got following internal error(s):
The runj process failed: time="2023-09-13T04:13:04Z" level=fatal msg="nsexec-0[36]: failed to use newuid map on 38: Operation not permitted"
time="2023-09-13T04:13:04Z" level=fatal msg="nsexec-1[38]: failed to sync with parent: read(SYNC_USERMAP_ACK): Success"
time="2023-09-13T04:13:04Z" level=fatal msg="Error executing the container" error="Error initializing the container process: unable to start container process: can't get final child's PID from pipe: EOF"

Result with RUNJ_DEBUG:

2023-09-13T04:13:56.386498Z ERROR handle_submission: seele::composer:143: Error handling the submission: Error executing the submission: Execution got following internal error(s):
The runj process failed: time="2023-09-13T04:13:56Z" level=debug msg="nsexec[36]: => nsexec container setup"
time="2023-09-13T04:13:56Z" level=debug msg="nsexec-0[36]: ~> nsexec stage-0"
time="2023-09-13T04:13:56Z" level=debug msg="nsexec-0[36]: spawn stage-1"
time="2023-09-13T04:13:56Z" level=debug msg="nsexec-0[36]: -> stage-1 synchronisation loop"
time="2023-09-13T04:13:56Z" level=debug msg="nsexec-1[38]: ~> nsexec stage-1"
time="2023-09-13T04:13:56Z" level=debug msg="nsexec-1[38]: unshare user namespace"
time="2023-09-13T04:13:56Z" level=debug msg="nsexec-1[38]: request stage-0 to map user namespace"
time="2023-09-13T04:13:56Z" level=debug msg="nsexec-1[38]: request stage-0 to map user namespace"
time="2023-09-13T04:13:56Z" level=debug msg="nsexec-0[36]: stage-1 requested userns mappings"
time="2023-09-13T04:13:56Z" level=debug msg="nsexec-0[36]: update /proc/38/uid_map to '0 1001 1\n1 165536 65536\n'"
time="2023-09-13T04:13:56Z" level=debug msg="nsexec-0[36]: update /proc/38/uid_map got -EPERM (trying /usr/bin/newuidmap)"
time=" seele.submission.id="bYRYArMo2vYcFWAr" 

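Summary

None of the four docker / podman based approaches managed to deploy seele, so I would like to ask what exactly causes the deployment to fail. Also, while reading the source code I found a rootless_containerized work mode that is not mentioned in the documentation, and I would like to ask what scenario this mode is intended for.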

docs: Base64 Padding

The add_file part of the docs says that the padding of the base64 encoding needs to be removed, but the example in that part (shown below) includes "=", which is a bit confusing to me.

steps:
  prepare:
    action: "seele/add-files@1"
    files:
      - path: "main.h"
        base64: "ZXh0ZXJuIGludCBwb3dlcjs="
      - path: "main.c"
        base64: "I2luY2x1ZGUgPHN0ZGlvLmg+CiNpbmNsdWRlICJtYWluLmgiCgppbnQgcG93ZXIgPSAxMTQ1MTQ7CgppbnQgbWFpbih2b2lkKSB7CiAgcHJpbnRmKCJQb3dlcjogJWRcbiIsIHBvd2VyKTsKICByZXR1cm4gMDsKfQ=="
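For reference, the trailing "=" characters are the padding in question. The sketch below uses Python's standard base64 module to show that the padded and unpadded forms of the first value carry the same bytes. It only illustrates the encoding itself; which form Seele actually accepts is not stated here, and the helper names are made up for the example.

```python
import base64


def strip_padding(encoded: str) -> str:
    """Drop the trailing '=' padding characters from a base64 string."""
    return encoded.rstrip("=")


def decode_any(encoded: str) -> bytes:
    """Decode base64 whether or not the padding is present."""
    # Re-add padding so the length is a multiple of 4, as b64decode expects.
    padded = encoded + "=" * (-len(encoded) % 4)
    return base64.b64decode(padded)


with_padding = "ZXh0ZXJuIGludCBwb3dlcjs="  # main.h from the example above
without_padding = strip_padding(with_padding)

print(without_padding)
print(decode_any(with_padding))
print(decode_any(with_padding) == decode_any(without_padding))
```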

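No log output at all when started with docker on Ubuntu 22.04

Test environment

A fresh ubuntu 22.04.04 install inside a KVM virtual machine, kernel version 5.15.0-97-generic, docker version Docker version 25.0.3, build 4debf41
Started according to the tutorial:
config.toml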

[worker.action.run_container]
userns_user = "seele"
userns_uid = 1000
userns_gid = 1000

docker start command

docker run \
  --security-opt seccomp=unconfined \
  --security-opt apparmor=unconfined \
  --security-opt systempaths=unconfined \
  -v /etc/subuid:/etc/subuid \
  -v /etc/subgid:/etc/subgid \
  -v /sys/fs/cgroup:/sys/fs/cgroup \
  -v `pwd`:/etc/seele \
  --tmpfs /tmp:exec,mode=777,size=1G \
  --cgroupns host \
  --net host \
  ghcr.io/darkyzhou/seele

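After starting, it pulls the image, and then there is no further output at all.
Submitting a judge task to the bound port with curl does return the correct result of executing the code, but docker logs still shows nothing.

Question

Are there any suggestions for how to troubleshoot this?
I have already entered the container with docker exec -it xxx /bin/bash and checked:
there is a /tini -- /usr/local/bin/seele process plus N = cpu + 1 other /usr/local/bin/seele processes/threads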
