darkyzhou / seele

A cloud-native online judge system

Home Page: https://seele.darkyzhou.net

License: MIT License

Rust 76.56% Go 14.18% Makefile 0.34% Dockerfile 0.43% Shell 0.43% JavaScript 6.10% C 1.80% C++ 0.15%

cgroup cloud-native containers linux namespace oci online-judge rust acm

seele's Issues

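Questions about high concurrency and performance

The judge machine runs nothing but seele, yet we have hit a number of unexpected problems:

Judge machine environment

Dual-socket EPYC 9654 (96 cores per CPU), 192 cores / 384 threads in total, 2.24 TB of memory
Ubuntu 22.04 with Kubernetes

1. WALL_TIME limit exceeded even though user and kernel time are both very short

Problem description

Take a simple hello world C program with three subtasks in total: upload the file, compile, and run.

# Output of the run step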
{'status': 'FAILED',
 'report': {
  'run_at': '2024-04-12T02:39:41.860021413Z', 
  'time_elapsed_ms': 58616, 
  'type': 'run_container', 
  'status': 'WALL_TIME_LIMIT_EXCEEDED', 
  'exit_code': 0, 
  'wall_time_ms': 12045, 
  'cpu_user_time_ms': 65, 
  'cpu_kernel_time_ms': 104, 
  'memory_usage_kib': 21156}, 
'embeds': {'cis_stdout': 'hello,world2\n', 'cis_stderr': ''}}

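As the output shows, the program already printed hello,world2 correctly. cpu_user plus cpu_kernel is under 200 ms, yet wall_time is a full 12 s. On top of that, time_elapsed_ms for the whole step reached 58 s, and the step was forcibly terminated for exceeding the time limit.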
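To make the gap concrete, here is a minimal sketch that splits the report above into CPU time, time spent waiting inside the measured wall-clock window, and time spent outside that window. It assumes that time_elapsed_ms covers the whole step and therefore includes wall_time_ms; the report does not confirm this, and the function name time_breakdown is only illustrative.

```python
def time_breakdown(report: dict) -> dict:
    """Split a run_container report into three buckets (all in milliseconds).

    Assumption: time_elapsed_ms spans the whole step and includes wall_time_ms.
    """
    cpu_ms = report["cpu_user_time_ms"] + report["cpu_kernel_time_ms"]
    wall_ms = report["wall_time_ms"]
    return {
        # Time the process actually spent on a CPU
        "cpu_ms": cpu_ms,
        # Wall-clock time inside the measured window that was not CPU time
        "waiting_ms": wall_ms - cpu_ms,
        # Time the step took outside the measured wall-clock window
        "outside_wall_ms": report["time_elapsed_ms"] - wall_ms,
    }


# Values copied from the report above
report = {
    "time_elapsed_ms": 58616,
    "wall_time_ms": 12045,
    "cpu_user_time_ms": 65,
    "cpu_kernel_time_ms": 104,
}

breakdown = time_breakdown(report)
print(breakdown)
```

Under that assumption, only 169 ms of the 58.6 s was CPU time, which is consistent with the bottleneck being container startup and teardown rather than the submitted program.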

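Reproduction conditions

Concurrency above 400 (roughly the point where it exceeds the number of CPU threads)
Whether or not there is a compile subtask makes no difference (Python and other upload-then-run submissions hit it too, and so do compile subtasks with cache enabled)

Discussion

On our judge machine, a concurrency of 200 completes this workload at 30 tasks/s, but for a hello world this simple we would expect a judging rate of 200 tasks/s or more. In every case the cause is an excessively long wall_time.
My feeling is that runj itself becomes the bottleneck under high concurrency: both starting and exiting containers are slow.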
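The throughput figures above can be turned into per-task latency with Little's law (tasks in flight = throughput × mean latency). This is only a back-of-the-envelope sketch; it assumes the concurrency of 200 is the steady-state number of tasks in flight, and the function name is illustrative.

```python
def mean_latency_s(concurrency: int, throughput_per_s: float) -> float:
    """Little's law: mean time a task spends in the system, in seconds."""
    return concurrency / throughput_per_s


# Observed: 200 tasks in flight, 30 tasks/s completed
observed = mean_latency_s(200, 30)
# Expected: the same 200 tasks in flight at 200 tasks/s
required = mean_latency_s(200, 200)

print(f"observed mean latency: {observed:.2f} s")
print(f"latency needed for 200 tasks/s: {required:.2f} s")
```

In other words, each hello world task currently spends several seconds in the system, and reaching 200 tasks/s at the same concurrency would require the whole upload, compile, and run pipeline to finish in about one second per task.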

2. runj error: cannot start an already running container

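Problem description

The same C hello world as in 1, with the steps upload the file, compile, ls the compile output, and run. runj then reports:
Error initializing the container process: cannot start an already running container

# Output of the run step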
{'id': 'F3oBscsWNXF7M6EQ', 'type': 'ERROR', 
'error': 'Error executing the submission: Execution got following internal error(s):\nThe runj process failed: time="2024-04-12T03:04:23Z" level=fatal msg="Error executing the container" error="Error initializing the container process: cannot start an already running container"\n'}
{'id': 'R7W7UmXlrBmQeWRr', 'type': 'ERROR',
'error': 'Error executing the submission: Execution got following internal error(s):\nThe runj process failed: time="2024-04-12T03:04:23Z" level=fatal msg="Error executing the container" error="Error initializing the container process: cannot start an already running container"\n'}

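Reproduction conditions

At a concurrency of 600, it occurs with a probability of roughly 5%

3. Files saved by the compile stage cannot be executed in the run stage

Problem description

The same C hello world as in 1, with 4 subtasks in total: upload the file -> compile -> ls the compile output -> run
Both the compile subtask and the ls subtask use the action "seele/run-judge/compile@1":
Compile: source: solution.cpp; saves: solution.cpp, solution; command: g++ solusion.cpp -i solution
ls the compile output: source: solution.cpp, solution; saves: solution.cpp, solution; command: ls -la
Note in particular that the ls subtask's source and saves are identical

# Output of the run step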
{'status': 'FAILED', 
'report': {
  'run_at': '2024-04-12T05:33:22.452141283Z', 
  'time_elapsed_ms': 2343, 
  'type': 'run_container', 
  'status': 'RUNTIME_ERROR', 
  'exit_code': 1, 'wall_time_ms': 2190, 'cpu_user_time_ms': 16, 'cpu_kernel_time_ms': 50, 'memory_usage_kib': 21408}, 
  'embeds': {
    'cis_stderr': 'exec ./solution: exec format error\n', 
    'cis_stdout': ''
  }
}

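Reproduction conditions

At a concurrency of 200, it occurs 100% of the time
At a concurrency of 1000, it turns into problem 1 (timeout) and problem 2 (cannot execute) instead
However, if the saves of the ls step is changed to contain only solution (previously solution and solution.cpp), the exec format error no longer occurs

Discussion

Problems 1 and 2 above are probably caused by runj's dependence on system resources, which keeps concurrency from scaling. Problem 3, however, I really do not understand,
and under high concurrency it stops appearing and simply turns into timeouts, which is very confusing.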
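When running such a load test, it can help to tally which of the three problems each result falls into. The sketch below classifies results by the markers visible in the outputs above (the WALL_TIME_LIMIT_EXCEEDED status, the runj error message, and the exec format error in cis_stderr). The classify function and the category labels are hypothetical helpers, and the sample results are trimmed copies of the outputs in this report.

```python
from collections import Counter


def classify(result: dict) -> str:
    """Map one step result to one of the three problems described above."""
    if result.get("type") == "ERROR":
        if "cannot start an already running container" in result.get("error", ""):
            return "problem 2: already running container"
        return "other internal error"
    report = result.get("report", {})
    if report.get("status") == "WALL_TIME_LIMIT_EXCEEDED":
        return "problem 1: wall time limit"
    stderr = result.get("embeds", {}).get("cis_stderr", "")
    if "exec format error" in stderr:
        return "problem 3: exec format error"
    return "other failure" if result.get("status") == "FAILED" else "ok"


# Trimmed copies of the three outputs shown in this issue
results = [
    {"status": "FAILED",
     "report": {"status": "WALL_TIME_LIMIT_EXCEEDED", "wall_time_ms": 12045},
     "embeds": {"cis_stdout": "hello,world2\n", "cis_stderr": ""}},
    {"id": "F3oBscsWNXF7M6EQ", "type": "ERROR",
     "error": "Error initializing the container process: "
              "cannot start an already running container"},
    {"status": "FAILED",
     "report": {"status": "RUNTIME_ERROR", "exit_code": 1},
     "embeds": {"cis_stderr": "exec ./solution: exec format error\n",
                "cis_stdout": ""}},
]

counts = Counter(classify(r) for r in results)
for category, n in counts.items():
    print(f"{category}: {n}")
```

Dividing each count by the number of submissions gives the per-problem failure rate at a given concurrency, which makes figures such as "roughly 5% at 600" easier to reproduce.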

Error: unable to join session keyring

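After running normally for a while, the following error appears, and every judging request submitted afterwards fails with the same error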

Error executing the submission: Execution got following internal error(s):
The runj process failed: time="2024-03-20T11:49:48Z" level=fatal 
msg="Error executing the container" 
error="Error initializing the container process: 
  unable to start container process: 
    error during container init: 
      unable to join session keyring: 
        unable to create session key: 
          disk quota exceeded

I have already raised /proc/sys/kernel/keys/maxkeys and /proc/sys/kernel/keys/maxkeys to more than 200M
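"disk quota exceeded" is the text of EDQUOT, which the kernel returns when a user's key quota is exhausted. The kernel exposes per-UID usage in /proc/key-users, where each line has the form `<UID>: <usage> <nkeys>/<nikeys> <qnkeys>/<maxkeys> <qnbytes>/<maxbytes>`. Below is a minimal sketch that parses one such line and flags a UID that has reached its quota. The sample line is invented for illustration and is not taken from the affected machine. Note also that UID 0 is governed by the separate root_maxkeys and root_maxbytes limits, which may matter if runj creates its keyrings as root.

```python
def parse_key_users_line(line: str) -> dict:
    """Parse one line of /proc/key-users into named integer fields."""
    uid_part, rest = line.split(":", 1)
    usage, keys, key_quota, byte_quota = rest.split()
    nkeys, nikeys = (int(x) for x in keys.split("/"))
    qnkeys, maxkeys = (int(x) for x in key_quota.split("/"))
    qnbytes, maxbytes = (int(x) for x in byte_quota.split("/"))
    return {
        "uid": int(uid_part),
        "usage": int(usage),
        "nkeys": nkeys,
        "nikeys": nikeys,
        "qnkeys": qnkeys,
        "maxkeys": maxkeys,
        "qnbytes": qnbytes,
        "maxbytes": maxbytes,
    }


def quota_exhausted(entry: dict) -> bool:
    """True when the UID has used up its key-count or key-bytes quota."""
    return entry["qnkeys"] >= entry["maxkeys"] or entry["qnbytes"] >= entry["maxbytes"]


# Hypothetical sample line; real lines come from /proc/key-users
sample = " 1001:   201 200/200 200/200 4200/20000"
entry = parse_key_users_line(sample)
print(entry["uid"], "quota exhausted:", quota_exhausted(entry))
```

If the count for the relevant UID keeps growing between submissions, that would point to session keyrings not being released after containers exit rather than to limits that are too low.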

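Deployment with docker / podman fails

While trying to deploy the project, I found that deployment with docker / podman never works, no matter how it is configured: runj panics when a judging task is submitted. Bare-metal deployment, however, works perfectly.

Test environment

A fresh archlinux install inside a KVM virtual machine, kernel version Linux archlinux 6.1.52-1-lts #1 SMP PREEMPT_DYNAMIC Thu, 07 Sep 2023 05:17:41 +0000 x86_64 GNU/Linux, with packages such as go, rust, protoc, skopeo, and umoci additionally installed.

seele is tested with the 0.1.1 release.

Users present on the system:

Name	uid	gid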
root	0	0
yzy1	1000	1000
seele	1001	1002

Delegation has been enabled as described in the guide. Running cat /sys/fs/cgroup/user.slice/user-$(id -u).slice/user@$(id -u).service/cgroup.controllers as either root or yzy1 outputs cpuset cpu io memory pids.

The subuid and subgid files have also been configured. Both files have the same content:

yzy1:100000:65536
seele:165536:65536
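For reference, these subuid/subgid entries are where the ID mappings in the RUNJ_DEBUG logs of this report come from: the log lines `update /proc/30/uid_map to '0 1001 1\n1 165536 65536\n'` and the matching gid_map line combine the seele user's own uid/gid with its subordinate range. The sketch below rebuilds those strings from the file content; subid_range and build_id_map are illustrative helper names, not part of seele or runj.

```python
SUBID_FILE = """\
yzy1:100000:65536
seele:165536:65536
"""


def subid_range(subid_text: str, user: str) -> tuple:
    """Return (start, count) of the subordinate ID range for `user`."""
    for line in subid_text.splitlines():
        if not line.strip():
            continue
        name, start, count = line.strip().split(":")
        if name == user:
            return int(start), int(count)
    raise KeyError(f"no subordinate ID range for {user!r}")


def build_id_map(host_id: int, start: int, count: int) -> str:
    """Map container ID 0 to host_id, and IDs 1..count to the subordinate range."""
    return f"0 {host_id} 1\n1 {start} {count}\n"


start, count = subid_range(SUBID_FILE, "seele")
uid_map = build_id_map(1001, start, count)  # seele's uid
gid_map = build_id_map(1002, start, count)  # seele's gid

print(repr(uid_map))
print(repr(gid_map))
```

A mapping with more than one line like this cannot be written by an unprivileged process directly; it needs CAP_SETUID/CAP_SETGID in the parent user namespace or the setuid newuidmap/newgidmap helpers, which is relevant to the -EPERM seen in the rootless experiments.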

The config.toml in use:

log_level = "info"
work_mode = "bare" # marker <1>

[exchange.demo]
type = "http"
address = "0.0.0.0"
port = 8080

[worker.action.run_container]
userns_user = "seele"
userns_group = "seele"
userns_uid = 1001
userns_gid = 1002

The task used for testing:

steps:
  prepare:
    action: "seele/add-file@1"
    files:
      - path: "main.txt"
        plain: |
          Hello

  run:
    action: "seele/run-judge/run@1"
    image: "alpine:3.18.3"
    command: "cat main.txt"
    files: ["main.txt"]
    fd:
      # Redirect the program's standard output to a file
      stdout: "out.txt"
    report:
      embeds:
        # Have Seele save the contents of out.txt
        - path: "out.txt"
          field: output
          truncate_kib: 100

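Control group: bare-metal deployment

Build seele and runj manually and place them in /usr/local/bin. Log in as root and run seele config.toml. After the task is submitted, a "type": "COMPLETED" result is returned.

Experiment 1: root docker

Install the docker package, run systemctl enable docker, then reboot and log in as root.

Change marker <1> in config.toml to containerized, then run: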

docker run \
  --security-opt seccomp=unconfined \
  --security-opt apparmor=unconfined \
  --security-opt systempaths=unconfined \
  -v /etc/subuid:/etc/subuid \
  -v /etc/subgid:/etc/subgid \
  -v /sys/fs/cgroup:/sys/fs/cgroup \
  -v `pwd`/config.toml:/etc/seele/config.toml \
  --tmpfs /tmp:exec,mode=777,size=1G \
  --cgroupns host \
  --net host \
  ghcr.io/darkyzhou/seele

An error result is returned, and seele prints one log entry:

2023-09-13T03:35:53.499094Z ERROR handle_submission: seele::composer:143: Error handling the submission: Error executing the submission: Execution got following internal error(s):
The runj process failed: time="2023-09-13T03:35:53Z" level=fatal msg="Error executing the container" error="Error initializing the container process: unable to start container process: error during container init: read init-p: connection reset by peer"

After adding the RUNJ_DEBUG environment variable to the docker command, the log is:

2023-09-13T03:37:10.584983Z ERROR handle_submission: seele::composer:143: Error handling the submission: Error executing the submission: Execution got following internal error(s):
The runj process failed: time="2023-09-13T03:37:10Z" level=debug msg="nsexec[28]: => nsexec container setup"
time="2023-09-13T03:37:10Z" level=debug msg="nsexec-0[28]: ~> nsexec stage-0"
time="2023-09-13T03:37:10Z" level=debug msg="nsexec-0[28]: spawn stage-1"
time="2023-09-13T03:37:10Z" level=debug msg="nsexec-0[28]: -> stage-1 synchronisation loop"
time="2023-09-13T03:37:10Z" level=debug msg="nsexec-1[30]: ~> nsexec stage-1"
time="2023-09-13T03:37:10Z" level=debug msg="nsexec-1[30]: unshare user namespace"
time="2023-09-13T03:37:10Z" level=debug msg="nsexec-1[30]: request stage-0 to map user namespace"
time="2023-09-13T03:37:10Z" level=debug msg="nsexec-1[30]: request stage-0 to map user namespace"
time="2023-09-13T03:37:10Z" level=debug msg="nsexec-0[28]: stage-1 requested userns mappings"
time="2023-09-13T03:37:10Z" level=debug msg="nsexec-0[28]: update /proc/30/uid_map to '0 1001 1\n1 165536 65536\n'"
time="2023-09-13T03:37:10Z" level=debug msg="nsexec-0[28]: update /proc/30/gid_map to '0 1002 1\n1 165536 65536\n'"
time="2023-09 seele.submission.id="Mqb6B6FUQRURW2NA"

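Experiment 2: root podman

Replacing docker with podman in experiment 1 gives the same result.

Experiment 3: rootless podman

Uninstall docker and switch to podman.

Log in as yzy1, change marker <1> in config.toml to rootless_containerized, then run the same command as before.

Result without RUNJ_DEBUG: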

2023-09-13T04:10:09.942514Z ERROR handle_submission: seele::composer:143: Error handling the submission: Error executing the submission: Execution got following internal error(s):
The runj process failed: time="2023-09-13T04:10:09Z" level=fatal msg="nsexec-1[38]: failed to unshare remaining namespaces (except cgroupns): Operation not permitted"
time="2023-09-13T04:10:09Z" level=fatal msg="nsexec-0[36]: failed to sync with stage-1: next state: Success"
time="2023-09-13T04:10:09Z" level=fatal msg="Error executing the container" error="Error initializing the container process: unable to start container process: can't get final child's PID from pipe: EOF"
 seele.submission.id="EWgqgAEr7Ogsb3LN"

Result with RUNJ_DEBUG:

2023-09-13T04:11:28.056095Z ERROR handle_submission: seele::composer:143: Error handling the submission: Error executing the submission: Execution got following internal error(s):
The runj process failed: time="2023-09-13T04:11:28Z" level=debug msg="nsexec[36]: => nsexec container setup"
time="2023-09-13T04:11:28Z" level=debug msg="nsexec-0[36]: ~> nsexec stage-0"
time="2023-09-13T04:11:28Z" level=debug msg="nsexec-0[36]: spawn stage-1"
time="2023-09-13T04:11:28Z" level=debug msg="nsexec-0[36]: -> stage-1 synchronisation loop"
time="2023-09-13T04:11:28Z" level=debug msg="nsexec-1[38]: ~> nsexec stage-1"
time="2023-09-13T04:11:28Z" level=debug msg="nsexec-1[38]: unshare remaining namespace (except cgroupns)"
time="2023-09-13T04:11:28Z" level=fatal msg="nsexec-1[38]: failed to unshare remaining namespaces (except cgroupns): Operation not permitted"
time="2023-09-13T04:11:28Z" level=fatal msg="nsexec-0[36]: failed to sync with stage-1: next state: Success"
time="2023-09-13T04:11:28Z" level=fatal msg="Error executing the container" error="Error initializing the container process: unable to start container process: can't get final child's PID from pipe: EOF"
 seele.submission.id="XeNQmWQAX3QQcUqY"

Experiment 4: rootless podman + `work_mode=containerized`

Log in as yzy1, change marker <1> in config.toml to containerized, then run the same command as before.

Result without RUNJ_DEBUG:

2023-09-13T04:13:04.840493Z ERROR handle_submission: seele::composer:143: Error handling the submission: Error executing the submission: Execution got following internal error(s):
The runj process failed: time="2023-09-13T04:13:04Z" level=fatal msg="nsexec-0[36]: failed to use newuid map on 38: Operation not permitted"
time="2023-09-13T04:13:04Z" level=fatal msg="nsexec-1[38]: failed to sync with parent: read(SYNC_USERMAP_ACK): Success"
time="2023-09-13T04:13:04Z" level=fatal msg="Error executing the container" error="Error initializing the container process: unable to start container process: can't get final child's PID from pipe: EOF"

Result with RUNJ_DEBUG:

2023-09-13T04:13:56.386498Z ERROR handle_submission: seele::composer:143: Error handling the submission: Error executing the submission: Execution got following internal error(s):
The runj process failed: time="2023-09-13T04:13:56Z" level=debug msg="nsexec[36]: => nsexec container setup"
time="2023-09-13T04:13:56Z" level=debug msg="nsexec-0[36]: ~> nsexec stage-0"
time="2023-09-13T04:13:56Z" level=debug msg="nsexec-0[36]: spawn stage-1"
time="2023-09-13T04:13:56Z" level=debug msg="nsexec-0[36]: -> stage-1 synchronisation loop"
time="2023-09-13T04:13:56Z" level=debug msg="nsexec-1[38]: ~> nsexec stage-1"
time="2023-09-13T04:13:56Z" level=debug msg="nsexec-1[38]: unshare user namespace"
time="2023-09-13T04:13:56Z" level=debug msg="nsexec-1[38]: request stage-0 to map user namespace"
time="2023-09-13T04:13:56Z" level=debug msg="nsexec-1[38]: request stage-0 to map user namespace"
time="2023-09-13T04:13:56Z" level=debug msg="nsexec-0[36]: stage-1 requested userns mappings"
time="2023-09-13T04:13:56Z" level=debug msg="nsexec-0[36]: update /proc/38/uid_map to '0 1001 1\n1 165536 65536\n'"
time="2023-09-13T04:13:56Z" level=debug msg="nsexec-0[36]: update /proc/38/uid_map got -EPERM (trying /usr/bin/newuidmap)"
time=" seele.submission.id="bYRYArMo2vYcFWAr"

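Summary

None of the four docker / podman based approaches deploys seele successfully, and I would like to know what specifically causes the failures. In addition, while reading the source code I found a work mode, rootless_containerized, that does not appear in the documentation. What scenario is this mode intended for?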

docs: Base64 Padding

The add_file part of the docs says that the padding of the base64 encoding needs to be removed, but the example in that part (shown below) includes "=", which is a bit confusing to me.

steps:
  prepare:
    action: "seele/add-files@1"
    files:
      - path: "main.h"
        base64: "ZXh0ZXJuIGludCBwb3dlcjs="
      - path: "main.c"
        base64: "I2luY2x1ZGUgPHN0ZGlvLmg+CiNpbmNsdWRlICJtYWluLmgiCgppbnQgcG93ZXIgPSAxMTQ1MTQ7CgppbnQgbWFpbih2b2lkKSB7CiAgcHJpbnRmKCJQb3dlcjogJWRcbiIsIHBvd2VyKTsKICByZXR1cm4gMDsKfQ=="
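To illustrate the difference in question, the sketch below decodes the main.h value from the example both with and without its trailing "=". It uses only Python's standard base64 module and says nothing about which form Seele itself accepts; decode_unpadded is an illustrative helper name.

```python
import base64


def decode_unpadded(data: str) -> bytes:
    """Decode base64 text whether or not its '=' padding has been stripped."""
    # Base64 text length must be a multiple of 4; restore any missing padding
    padded = data + "=" * (-len(data) % 4)
    return base64.b64decode(padded)


with_padding = "ZXh0ZXJuIGludCBwb3dlcjs="       # as written in the docs example
without_padding = with_padding.rstrip("=")       # what "padding removed" would mean

print(decode_unpadded(with_padding))
print(decode_unpadded(without_padding))
```

Both forms carry the same bytes (the C declaration `extern int power;`), so the padding is redundant information; the open question for the docs is only which form the add-files action expects.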

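Typo in the README

Shouldn't the "即" here be changed to "既"?

No log output at all when started with docker on Ubuntu 22.04

Test environment

A fresh Ubuntu 22.04.04 install inside a KVM virtual machine, kernel version 5.15.0-97-generic, Docker version 25.0.3, build 4debf41
Started by following the tutorial:
config.toml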

[worker.action.run_container]
userns_user = "seele"
userns_uid = 1000
userns_gid = 1000

docker launch command

docker run \
  --security-opt seccomp=unconfined \
  --security-opt apparmor=unconfined \
  --security-opt systempaths=unconfined \
  -v /etc/subuid:/etc/subuid \
  -v /etc/subgid:/etc/subgid \
  -v /sys/fs/cgroup:/sys/fs/cgroup \
  -v `pwd`:/etc/seele \
  --tmpfs /tmp:exec,mode=777,size=1G \
  --cgroupns host \
  --net host \
  ghcr.io/darkyzhou/seele

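After launch the image is pulled, and then there is no output at all
However, submitting a judging task to the bound port with curl returns the correct result of executing the code, yet docker logs still shows nothing

Question

Are there any ideas for troubleshooting this situation?
I have already entered the container with docker exec -it xxx /bin/bash to take a look;
there is a /tini -- /usr/local/bin/seele process plus N = cpu + 1 other /usr/local/bin/seele processes/threads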
