
podmanxrootd's Introduction


Bockjoo Kim, August 28, 2023

[1] Introduction

This documentation describes a procedure for a podman-based XRootD deployment at RC for the Florida CMS Tier2.

[2] Preparing the Machine for the XRootD podman Needs Privileged Actions: Puppetization

[2-1] Hardware and OS

cmsio machines can be used as the host machines.
RHEL8 or RHEL9 host needs to be prepared.
For this documentation, two VMs are used to evaluate the podman solution.

[2-2] Package Installation

A. Explanation
podman, podman-compose, buildah, and fuse-overlayfs are the primary packages needed.
The others are either automatically installed dependencies or convenience packages.
B. In Puppet, something like the following is needed:
yum -y install podman btrfs-progs-devel conmon containernetworking-plugins  \
       containers-common crun device-mapper-devel git glib2-devel glibc-devel \
       glibc-static go golang-github-cpuguy83-md2man gpgme-devel iptables \
       libassuan-devel libgpg-error-devel libseccomp-devel libselinux-devel \
       make pkgconfig
yum install -y slirp4netns
yum install -y fuse-overlayfs
yum install -y netavark
yum install -y podman-compose buildah

[2-3] User Namespace

A. A process of configuring user namespaces 
At this stage, podman ps would result in:
-bash-4.2$ podman ps
cannot clone: Invalid argument
user namespaces are not enabled in /proc/sys/user/max_user_namespaces
Error: could not get runtime: cannot re-exec process
-bash-4.2$ cat /proc/sys/user/max_user_namespaces
[~]# echo 10000 > /proc/sys/user/max_user_namespaces

-bash-4.2$ podman ps
ERRO[0000] cannot find mappings for user bockjoo: No subuid ranges found for user "bockjoo" in /etc/subuid 
ERRO[0000] cannot find mappings for user bockjoo: No subuid ranges found for user "bockjoo" in /etc/subuid 

Add the podman user to /etc/subuid
For example, 
echo bockjoo:100000:65536 >> /etc/subuid
Add the podman user to /etc/subgid as well
For example,
echo bockjoo:100000:65536 >> /etc/subgid

[ ~]# echo bockjoo:100000:65536 >> /etc/subuid
[ ~]# echo bockjoo:100000:65536 >> /etc/subgid

-bash-4.2$ podman ps
-bash-4.2$ echo $?

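Alternatively, the usermod command can update subuid and subgid. Note that usermod takes FIRST-LAST ranges, whereas the files use start:count, so 65536 ids starting at 100000 end at 165535:
usermod --add-subuids 100000-165535 --add-subgids 100000-165535 bockjoo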

B. In Puppet, do something like: 
echo 15000 > /proc/sys/user/max_user_namespaces
usermod --add-subuids 100000-165535 --add-subgids 100000-165535 bockjoo
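
The two notations are easy to mix up: /etc/subuid and /etc/subgid store user:start:count, while usermod --add-subuids and --add-subgids expect FIRST-LAST. The following is a minimal sketch of the conversion; the helper name subid_range is hypothetical and not part of any tool.

```bash
#!/bin/bash
# subid_range START COUNT
# Prints the FIRST-LAST range that corresponds to a START:COUNT entry
# in /etc/subuid or /etc/subgid. (Hypothetical helper, for illustration.)
subid_range() {
    local start=$1 count=$2
    echo "${start}-$(( start + count - 1 ))"
}

# 65536 ids starting at 100000
subid_range 100000 65536   # prints 100000-165535
```

The result could then be passed on, for example as usermod --add-subuids "$(subid_range 100000 65536)" bockjoo.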

[2-4] Configure cgroup V2 support

A. A description
cgroup v2 allows the user to limit the amount of resources a rootless container can use.

B. In Puppet, one can do something like :
grep -q systemd.unified_cgroup_hierarchy=1 /proc/cmdline || \
grubby --update-kernel=ALL --args="systemd.unified_cgroup_hierarchy=1"
systemctl reboot

[2-5] Container Storage Configuration

A. A long explanation
If the main podman user's home directory is mounted over NFS, it cannot be used for overlayfs.
There are multiple ways to prepare local container storage for images, etc.
One is configuring the graphroot parameter in ~/.config/containers/storage.conf
Another is symlinking ~/.local/share/containers to a disk that is local to the machine.
Here's the second method assuming the main podman user is bockjoo:
mkdir -p /opt/cms
chown -R bockjoo:avery /opt/cms
if [ -d ~/.local/share/containers ] ; then
   if [ -L ~/.local/share/containers ] ;  then
      ls -al ~/.local/share/containers
      echo Warning ~/.local/share/containers is already a symlink
   else
      # Move the existing storage to the local disk and leave a symlink behind
      mkdir -p /opt/cms/.local/share
      cp -pR ~/.local/share/containers /opt/cms/.local/share
      rm -rf ~/.local/share/containers
      cd ~/.local/share
      ln -s /opt/cms/.local/share/containers 
   fi
else
   mkdir -p /opt/cms/.local/share/containers
   chown -R bockjoo:avery /opt/cms
   su - bockjoo -c "mkdir -p ~/.local/share ; cd ~/.local/share ; ln -s /opt/cms/.local/share/containers ;"   
fi

If the machine is just prepared, one will just need to execute:

mkdir -p /opt/cms/.local/share/containers
chown -R bockjoo:avery /opt/cms
su - bockjoo -c "mkdir -p ~/.local/share ; cd ~/.local/share ; ln -s /opt/cms/.local/share/containers ;"   
B. In Puppet, one can do something like :
mkdir /opt/cms ; chown bockjoo:avery /opt/cms
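
For the first method (graphroot), a user-level storage.conf could be generated along the following lines. This is only a sketch under assumptions: the helper name write_storage_conf is hypothetical, the overlay driver is assumed, and the graphroot path in the usage note is an example, not a value taken from the Florida setup.

```bash
#!/bin/bash
# write_storage_conf GRAPHROOT [CONF_FILE]
# Writes a minimal containers storage.conf that points graphroot at a local disk.
# CONF_FILE defaults to the per-user location read by rootless podman.
# (Hypothetical helper; driver "overlay" is an assumption.)
write_storage_conf() {
    local graphroot=$1
    local conf=${2:-$HOME/.config/containers/storage.conf}
    mkdir -p "$(dirname "$conf")" || return 1
    cat > "$conf" <<EOF
[storage]
driver = "overlay"
graphroot = "$graphroot"
EOF
}
```

As the podman user one might run write_storage_conf /opt/cms/.local/share/containers/storage and then inspect the effective setting with podman info.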

[2-6] Information on Users who read or write through XRootD

A. Who has access to /cmsuf through XRootD
These XRootD users are defined in /etc/xrootd/voms-mapfile and /etc/grid-security/grid-mapfile.
Corresponding users in the user namespace need to be created to access the Lustre filesystem.
B. For Puppet, if needed, you might want to refer to

[2-7] Preparing the podman accounts in the user namespace

 A. Explanation 
It appears the users in the podman user namespace need to exist for the users in the
podman XRootD container to read and write files in Lustre, /cmsuf
 B. In Puppet, one can do something like:
See /cmsuf/t2/operations/

For the whole account choreographing, see /cmsuf/t2/operations/

[2-8] The XRootD backend storage directory ownership and permission

A. Explanation 
/cmsuf/data/store has various subdirectories owned by the users accessing the XRootD.
We will use /cmsuf/podman/data/store instead for the podman XRootD as they need the podman user ownership.
So, /cmsuf/podman needs to be owned by the user who is running podman container (bockjoo:avery)
B. One time operation 
mkdir -p /cmsuf/podman
chown bockjoo:avery /cmsuf/podman

[2-9] Additional Ports for the transition from the regular XRootD to the podman XRootD:

A. Explanation
During the transition, both the regular XRootD and the podman XRootD need to coexist.
B. Puppet 
ports 1095 and 3121 need to be open

[2-10] Persistent User to prevent the issue [6-2]

A. Explanation
After the user logs out of the machine, Podman containers are stopped for some users. To prevent that, enable lingering for the users running containers.
B. Puppet 
loginctl enable-linger bockjoo

[3] T2 Actions: Unprivileged

[3-1] Checking the podman requirements

[3-1-1] OS > rhel7 ?
uname -a | grep "el8\|el9"

[3-1-2] Do podman packages exist?
rpm_qa=$(rpm -qa)
for p in podman podman-compose buildah fuse-overlayfs ; do
    printf '%s\n' "$rpm_qa" | grep -- "$p"
done

[3-1-3] It should support cgroup2fs
stat -fc %T /sys/fs/cgroup/

[3-1-4] The user namespace limit should be non-zero
cat /proc/sys/user/max_user_namespaces

[3-1-5] podman user is added to /etc/subuid and /etc/subgid for the user namespace
grep bockjoo /etc/subuid
grep bockjoo /etc/subgid

[3-1-6] Does /opt/cms exist and is it owned by bockjoo:avery?
stat -c %U:%G /opt/cms

[3-1-7] Do podman users/groups exist?

for u in $(cat /cmsuf/t2/operations/ | grep useradd | awk '{print $NF}') ; do
   getent passwd $u
done

for g in $(cat /cmsuf/t2/operations/ | grep groupadd | cut -d\# -f1 | awk '{print $NF}') ; do
   getent group $g
done

[3-1-8] Is /cmsuf/podman owned by bockjoo:avery?
stat -c %U:%G /cmsuf/podman
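
The individual checks in [3-1] can be bundled into one preflight function. The sketch below is an illustration only: all function names are hypothetical, and the user, group, and paths are simply the ones used in this document.

```bash
#!/bin/bash
# check DESCRIPTION COMMAND [ARGS...]
# Runs COMMAND quietly and reports OK or FAIL for DESCRIPTION.
check() {
    local desc=$1; shift
    if "$@" >/dev/null 2>&1; then
        echo "OK:   $desc"
    else
        echo "FAIL: $desc"
        return 1
    fi
}

os_is_el8_or_el9() { uname -r | grep -q 'el[89]'; }
cgroup_is_v2()     { [ "$(stat -fc %T /sys/fs/cgroup/)" = "cgroup2fs" ]; }
userns_enabled()   { [ "$(cat /proc/sys/user/max_user_namespaces)" -gt 0 ]; }
has_subids()       { grep -q "^$1:" /etc/subuid && grep -q "^$1:" /etc/subgid; }
owner_is()         { [ "$(stat -c %U:%G "$1")" = "$2" ]; }

# preflight USER GROUP
# Returns non-zero if any requirement from [3-1] is not met.
preflight() {
    local user=$1 group=$2 rc=0 p
    check "OS is RHEL8 or RHEL9" os_is_el8_or_el9 || rc=1
    for p in podman podman-compose buildah fuse-overlayfs ; do
        check "package $p is installed" rpm -q "$p" || rc=1
    done
    check "cgroup v2 is mounted" cgroup_is_v2 || rc=1
    check "user namespaces are enabled" userns_enabled || rc=1
    check "$user has subuid and subgid ranges" has_subids "$user" || rc=1
    check "/opt/cms is owned by $user:$group" owner_is /opt/cms "$user:$group" || rc=1
    check "/cmsuf/podman is owned by $user:$group" owner_is /cmsuf/podman "$user:$group" || rc=1
    return $rc
}
```

On a podman machine this would be invoked as preflight bockjoo avery. The account checks in [3-1-7] are left out because they depend on the site-specific account list.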

[3-2] Building the XRootD Images

cd /cmsuf/t2/operations/podman/xrootd/server
# Cleanup image: 
podman image rm $(podman images | grep xrootd_server | awk '{print $3}')
# Building
buildah bud -f Dockerfile.Systemd -t xrootd_server

[3-3] Starting the XRootD Server in the Podman Container

[3-3-1] One time setup in each podman machine

# XRootD configurations
if [ ! -d /opt/cms/etc/xrootd ] ; then
   mkdir -p /opt/cms/etc/xrootd
   xrootd_configs="xrootd-clustered.cfg Authfile ban-robots.txt macaroon-secret robots.txt"
   xrootd_configs="$xrootd_configs scitokens.cfg scitokens_mapfile_wlcg.json scitokens-map.json voms-mapfile" 
   for f in $xrootd_configs ; do
       /bin/cp /cmsuf/t2/operations/podman/xrootd/etc/xrootd/$f /opt/cms/etc/xrootd/
   done
   # Hosts with a "2" in the short hostname get the redirector configuration
   if /bin/hostname -s | grep -q 2 ; then
      /bin/cp /cmsuf/t2/operations/podman/xrootd/etc/xrootd/xrootd-clustered.cfg.redirector /opt/cms/etc/xrootd/xrootd-clustered.cfg
      if [ "$(grep ^all.manager /opt/cms/etc/xrootd/xrootd-clustered.cfg | awk '{print $NF}' | cut -d. -f1)" == "$(/bin/hostname -s)" ] ; then
         echo OK redirector
      else
         echo ERROR the redirector needs to be reconfigured.
         vi /opt/cms/etc/xrootd/xrootd-clustered.cfg
      fi
   fi
fi

if [ ! -d /opt/cms/etc/grid-security ] ; then
   mkdir -p /opt/cms/etc/grid-security/xrd
   mapfiles="ban-mapfile grid-mapfile voms-mapfile voms-mapfile2"
   for f in $mapfiles ; do
       /bin/cp /cmsuf/t2/operations/podman/xrootd/etc/grid-security/$f /opt/cms/etc/grid-security/
   done
   certs="hostcert_$(/bin/hostname -s).pem hostkey_$(/bin/hostname -s).pem"
   for f in $certs ; do
       /bin/cp /cmsuf/t2/operations/podman/xrootd/etc/grid-security/$f /opt/cms/etc/grid-security/
   done
fi

# For the FQHN
if [ ! -d /opt/cms/etc/sysconfig ] ; then
   mkdir -p /opt/cms/etc/sysconfig
   echo HOSTNAME=$(/bin/hostname -s) > /opt/cms/etc/sysconfig/xrootd
fi

# For the FQHN
if [ ! -d /opt/cms/etc/systemd/system/systemd-hostnamed.service.d ] ; then
   mkdir -p /opt/cms/etc/systemd/system/systemd-hostnamed.service.d
   for f in [email protected] [email protected] ; do
       /bin/cp /cmsuf/t2/operations/podman/xrootd/etc/systemd/system/$f /opt/cms/etc/systemd/system
   done
   /bin/cp /cmsuf/t2/operations/podman/xrootd/etc/systemd/system/systemd-hostnamed.service.d/override.conf /opt/cms/etc/systemd/system/systemd-hostnamed.service.d
fi

# For the FQHN
if [ ! -f /opt/cms/etc/hosts ] ; then
   /bin/cp /cmsuf/t2/operations/podman/xrootd/etc/hosts /opt/cms/etc/hosts
fi

# For the FQHN
if [ ! -f /opt/cms/etc/hostname ] ; then
   echo $(/bin/hostname -s) > /opt/cms/etc/hostname
fi

# selinux
if [ ! -f /opt/cms/etc/selinux/config ] ; then
   # The target directory must exist before the copy
   mkdir -p /opt/cms/etc/selinux
   /bin/cp /cmsuf/t2/operations/podman/xrootd/etc/selinux/config /opt/cms/etc/selinux/config
fi

# File mapping /cmsuf/podman/data/store -> /store ( storage_podman.xml and storage.xml )
# To switch the mapping for /cmsuf/data/store -> /store, use storage_regular.xml
if [ ! -d /opt/cms/etc/SITECONF/local/PhEDEx ] ; then
   mkdir -p /opt/cms/etc/SITECONF/local/PhEDEx
   storage_maps="storage_opt.xml storage_regular.xml storage_podman.xml storage.xml"
   for f in $storage_maps ; do
      /bin/cp /cmsuf/t2/operations/podman/xrootd/etc/SITECONF/local/PhEDEx/$f /opt/cms/etc/SITECONF/local/PhEDEx/
   done
fi

container_id=$(podman run -d --rm --name xrootd_server \
               --cgroup-manager=cgroupfs --tmpfs /tmp \
               --tmpfs /run \
               -v /sys/fs/cgroup:/sys/fs/cgroup:ro \
               -v /opt/cms/etc/xrootd/:/etc/xrootd/:ro \
               -v /opt/cms/etc/grid-security/grid-mapfile:/etc/grid-security/grid-mapfile:ro \
               -v /opt/cms/etc/grid-security/ban-mapfile:/etc/grid-security/ban-mapfile:ro \
               -v /opt/cms/etc/grid-security/voms-mapfile:/etc/grid-security/voms-mapfile:ro \
               -v /opt/cms/etc/grid-security/hostcert.pem:/etc/grid-security/hostcert.pem:ro \
               -v /opt/cms/etc/grid-security/hostkey.pem:/etc/grid-security/hostkey.pem:ro \
               -v /opt/cms/etc/grid-security/xrd/:/etc/grid-security/xrd:rw \
               -v /opt/cms/etc/sysconfig/xrootd:/etc/sysconfig/xrootd:ro \
               -v /opt/cms/etc/systemd/system/[email protected]:/etc/systemd/system/[email protected]:ro \
               -v /opt/cms/etc/systemd/system/[email protected]:/etc/systemd/system/[email protected]:ro \
               -v /opt/cms/etc/hosts:/etc/hosts:rw \
               -v /opt/cms/etc/hostname:/etc/hostname:rw \
               -v /opt/cms/etc/systemd/system/systemd-hostnamed.service.d/:/etc/systemd/system/systemd-hostnamed.service.d/:ro \
               -v /opt/cms/etc/selinux/config:/etc/selinux/config:rw \
               -v /opt/cms/podman/:/opt/cms/podman/:rw \
               -v /opt/cms/store/:/opt/cms/store/:rw \
               -v /opt/cms/etc/SITECONF/:/etc/SITECONF/:ro \
               -v /cmsuf/:/cmsuf/:rw \
               --systemd=true --network=host --cgroup-manager=systemd localhost/xrootd_server:latest)
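
Because the run command above bind-mounts many host paths, and podman typically refuses to start a container when the host side of a bind mount does not exist, it may help to check the sources first. The following is a minimal sketch; the helper name missing_sources is hypothetical, and it takes the same SRC:DST[:MODE] strings that are given to -v.

```bash
#!/bin/bash
# missing_sources SPEC...
# Each SPEC has the form SRC:DST[:MODE], as passed to "podman run -v".
# Prints every host-side SRC that does not exist and returns non-zero
# if at least one is missing. (Hypothetical helper, for illustration.)
missing_sources() {
    local spec src rc=0
    for spec in "$@" ; do
        src=${spec%%:*}          # keep only the host path before the first colon
        if [ ! -e "$src" ] ; then
            echo "$src"
            rc=1
        fi
    done
    return $rc
}
```

For example, missing_sources /opt/cms/etc/xrootd/:/etc/xrootd/:ro /opt/cms/etc/hosts:/etc/hosts:rw prints nothing and returns 0 when both host paths are in place.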

# xrd cert/key pair preparation
podman exec -it ${container_id} cp /etc/grid-security/hostcert.pem /etc/grid-security/xrd/xrdcert.pem
podman exec -it ${container_id} cp /etc/grid-security/hostkey.pem /etc/grid-security/xrd/xrdkey.pem
podman exec -it ${container_id} chown xrootd:xrootd /etc/grid-security/xrd/xrdcert.pem
podman exec -it ${container_id} chown xrootd:xrootd /etc/grid-security/xrd/xrdkey.pem

# xrootd/cmsd servers need to be restarted after xrd cert/key pair preparation
podman exec -it ${container_id} systemctl restart [email protected]
podman exec -it ${container_id} systemctl restart [email protected]
# /cmsuf/t2/operations/ This creates the following directories
# with the proper mode and the proper ownership inside the container
# The script will fail if the accounts in the Section [2-7] do not exist

[4] Tests with Possible Use Cases

Inspections with the container

podman exec -it ${container_id} ps auxwww # To check the xrootd/cmsd processes are running
podman exec -it ${container_id} /bin/bash # To inspect the container interactively

[4-1] Check the xrdmapc and xrdfs behavior, a.k.a. AAA in CMS

xrdmapc --list all 
xrdmapc --list all 

xrdfs query config version
xrdfs query config version

xrdfs ls -l /store
xrdfs locate /store/mc/SAM/GenericTTbar/AODSIM/CMSSW_9_2_6_91X_mcRun1_realistic_v2-v1/00000/A64CCCF2-5C76-E711-B359-0CC47A78A3F8.root

[4-2] Test for the /store/user read/write (user bockjoo)

xrdcp upload

# For some reason, there was the permission denied error initially ( Morning of Aug 21, but it became successful Afternoon of Aug 21 )

[bockjoo@cms ~]$ xrdcp -d 1 -f file://`pwd`/sitedb.list root://
[2023-08-21 15:03:36.307318 -0400][Info   ][AsyncSock         ] [] TLS hand-shake done.
[2023-08-21 15:03:38.734174 -0400][Info   ][AsyncSock         ] [] TLS hand-shake done.
[bockjoo@cms ~]$ ls /cmsuf/podman/data/store/user/bockjoo/
sitedb.list  test_21AUG2023.txt
[bockjoo@cms ~]$ xrdcp -d 1 -f file://`pwd`/sitedb.list root://
[2023-08-21 15:06:20.251453 -0400][Info   ][AsyncSock         ] [] TLS hand-shake done.
[2023-08-21 15:06:22.567718 -0400][Info   ][AsyncSock         ] [] TLS hand-shake done.
[bockjoo@cmspodman1 ~]$ ls -al /cmsuf/podman/data/store/user/bockjoo/subdir/sitedb.list 
-rw-r--r-- 1 podman podman 27150 Aug 21 15:06 /cmsuf/podman/data/store/user/bockjoo/subdir/sitedb.list

xrdcp download

[bockjoo@cms ~]$ ls -al  /cmsuf/podman/data/store/user/bockjoo/subdir1/sitedb.list 
-rw-r--r-- 1 563326 563751 27150 Aug 21 19:19 /cmsuf/podman/data/store/user/bockjoo/subdir1/sitedb.list
[bockjoo@cms ~]$ xrdcp -d 1 -f root:// ./
[2023-08-21 19:20:53.262015 -0400][Info   ][AsyncSock         ] [] TLS hand-shake done.
[2023-08-21 19:20:55.666187 -0400][Info   ][AsyncSock         ] [] TLS hand-shake done.
[bockjoo@cms ~]$ ls -al sitedb.list 
-rwxr-xr-x 1 bockjoo avery 27150 Aug 21 19:20 sitedb.list

gfal-copy upload and download

[bockjoo@cms ~]$ /usr/bin/python3 $(which gfal-copy) -f file://`pwd`/sitedb.list davs://
Copying file:///home/bockjoo/sitedb.list   [DONE]  after 0s                                                  
[bockjoo@cms ~]$ ls -al /cmsuf/podman/data/store/user/bockjoo/gfal_copy/sitedb.list 
-rw-r--r-- 1 563326 563751 27150 Aug 21 19:49 /cmsuf/podman/data/store/user/bockjoo/gfal_copy/sitedb.list

[bockjoo@cms ~]$ /usr/bin/python3 $(which gfal-copy) -f davs:// ./
Copying davs://   [DONE]  after 0s       
[bockjoo@cms ~]$ ls -al sitedb.list 
-rwxr-xr-x 1 bockjoo avery 27150 Aug 21 20:02 sitedb.list

Token Handling
 eval `oidc-agent`
 oidc-add -l
 oidc-add bockjoo_xrd
 export BEARER_TOKEN=$(oidc-token --scope=offline_access --time=3600 bockjoo_xrd)
XRootD Token Configuration
# add the following lines to appropriate places in /opt/cms/etc/xrootd/xrootd-clustered.cfg
xrd.tls /etc/grid-security/xrd/xrdcert.pem /etc/grid-security/xrd/xrdkey.pem
xrd.tlsca certdir /etc/grid-security/certificates
xrootd.tls capable all
sec.protocol /usr/lib64 ztn
[bockjoo@cms ~]$ export XrdSecPROTOCOL="ztn,unix" ; xrdmapc --list all
      Srv cmspodman1.ufhpc:1094
[bockjoo@cms ~]$ export XrdSecPROTOCOL="ztn,unix" ; xrdmapc --list all

xrdcp download

mdYHMS=$(date +%b%d%Y+%H+%M+%S)
xrdcp -d 1 -f file://`pwd`/sitedb.list root://$mdYHMS
(
   eval `oidc-agent`
   oidc-add bockjoo_xrd
   export BEARER_TOKEN=$(oidc-token --scope=offline_access --time=3600 bockjoo_xrd)
   export XrdSecPROTOCOL="ztn,unix"
   xrdcp -d 1 -f root://$mdYHMS?authz=Bearer%20$BEARER_TOKEN ./
)
ls -al sitedb.list_$mdYHMS
/usr/bin/python3 $(which gfal-ls) davs://$mdYHMS
/usr/bin/python3 $(which gfal-rm) davs://$mdYHMS
rm -f sitedb.list_$mdYHMS

gfal-copy download (upload does not work as I can not get the storage.write:/ scope)
mdYHMS=$(date +%b%d%Y+%H+%M+%S)
xrdcp -d 1 -f file://`pwd`/sitedb.list root://$mdYHMS
   eval `oidc-agent`
   oidc-add bockjoo_xrd
   export BEARER_TOKEN=$(oidc-token --scope=offline_access --time=3600 bockjoo_xrd)
   export XrdSecPROTOCOL="ztn,unix"
   /usr/bin/python3 $(which gfal-copy) -f davs://$mdYHMS ./
ls -al sitedb.list_$mdYHMS

/usr/bin/python3 $(which gfal-ls) davs://$mdYHMS
/usr/bin/python3 $(which gfal-rm) davs://$mdYHMS
rm -f sitedb.list_$mdYHMS 

[4-3] Role Based Tests (cmsprod)

Change the grid-mapfile for the role

sed -i 's|"/DC=ch/DC=cern/OU=Organic Units/OU=Users/CN=bockjoo/CN=556538/CN=Bockjoo Kim" bockjoo|#"/DC=ch/DC=cern/OU=Organic Units/OU=Users/CN=bockjoo/CN=556538/CN=Bockjoo Kim" bockjoo|' /opt/cms/etc/grid-security/grid-mapfile
grep Bockjoo /opt/cms/etc/grid-security/grid-mapfile
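
Since the tests in this section repeatedly comment out and restore the same DN in the grid-mapfile, the sed commands could be wrapped in two small helpers. This is a sketch only; comment_dn and uncomment_dn are hypothetical names. They match the quoted DN literally at the start of a line and write back in place, so the file keeps its inode.

```bash
#!/bin/bash
# _rewrite_mapfile FILE PREFIX_TO_MATCH MODE
# MODE "comment" puts '#' in front of matching lines,
# MODE "uncomment" strips the first character of matching lines.
_rewrite_mapfile() {
    local file=$1 rc=0 tmp
    tmp=$(mktemp) || return 1
    MATCH=$2 MODE=$3 awk '
        index($0, ENVIRON["MATCH"]) == 1 {
            if (ENVIRON["MODE"] == "comment") print "#" $0
            else print substr($0, 2)
            next
        }
        { print }' "$file" > "$tmp" && cat "$tmp" > "$file" || rc=1
    rm -f "$tmp"
    return $rc
}

# comment_dn FILE DN: disable the mapping for DN
comment_dn()   { _rewrite_mapfile "$1" "\"$2\"" comment ; }
# uncomment_dn FILE DN: re-enable a mapping disabled by comment_dn
uncomment_dn() { _rewrite_mapfile "$1" "#\"$2\"" uncomment ; }
```

A call would look like comment_dn /opt/cms/etc/grid-security/grid-mapfile "/DC=ch/DC=cern/OU=Organic Units/OU=Users/CN=bockjoo/CN=556538/CN=Bockjoo Kim", followed by the container restart described in this section.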

Restart the Container, Thus XRootD

podman stop $container_id
container_id=$(podman run -d --rm --name xrootd_server \
               --cgroup-manager=cgroupfs --tmpfs /tmp \
               --tmpfs /run \
               -v /sys/fs/cgroup:/sys/fs/cgroup:ro \
               -v /opt/cms/etc/xrootd/:/etc/xrootd/:ro \
               -v /opt/cms/etc/grid-security/grid-mapfile:/etc/grid-security/grid-mapfile:ro \
               -v /opt/cms/etc/grid-security/ban-mapfile:/etc/grid-security/ban-mapfile:ro \
               -v /opt/cms/etc/grid-security/voms-mapfile:/etc/grid-security/voms-mapfile:ro \
               -v /opt/cms/etc/grid-security/hostcert.pem:/etc/grid-security/hostcert.pem:ro \
               -v /opt/cms/etc/grid-security/hostkey.pem:/etc/grid-security/hostkey.pem:ro \
               -v /opt/cms/etc/grid-security/xrd/:/etc/grid-security/xrd:rw \
               -v /opt/cms/etc/sysconfig/xrootd:/etc/sysconfig/xrootd:ro \
               -v /opt/cms/etc/systemd/system/[email protected]:/etc/systemd/system/[email protected]:ro \
               -v /opt/cms/etc/systemd/system/[email protected]:/etc/systemd/system/[email protected]:ro \
               -v /opt/cms/etc/hosts:/etc/hosts:rw \
               -v /opt/cms/etc/hostname:/etc/hostname:rw \
               -v /opt/cms/etc/systemd/system/systemd-hostnamed.service.d/:/etc/systemd/system/systemd-hostnamed.service.d/:ro \
               -v /opt/cms/etc/selinux/config:/etc/selinux/config:rw \
               -v /opt/cms/podman/:/opt/cms/podman/:rw \
               -v /opt/cms/store/:/opt/cms/store/:rw \
               -v /opt/cms/etc/SITECONF/:/etc/SITECONF/:ro \
               -v /cmsuf/:/cmsuf/:rw \
               --systemd=true --network=host --cgroup-manager=systemd localhost/xrootd_server:latest)
#### Check the proxy has phedex or production role for cmsprod
source /cvmfs/$(source /cvmfs/ ; cmsos| cut -d_ -f1 | sed 's#[a-z]\|[A-Z]##g')-x86_64/
export X509_USER_PROXY=/home/bockjoo/.cmssoft/phedex_proxy
voms-proxy-info -fqan

xrdmapc and xrdfs with the role

   export X509_USER_PROXY=/home/bockjoo/.cmssoft/phedex_proxy
   xrdfs query config version
   xrdfs query config version
   xrdfs query config version
   xrdfs query config version
   xrdfs ls /store
# on the podman machine with the xrootd server
podman exec -it $container_id tail -50 /var/log/xrootd/clustered/xrootd.log | grep bockjoo | grep "login as"
       export X509_USER_PROXY=/home/bockjoo/.cmssoft/phedex_proxy ; 
       xrdfs locate  /store/mc/SAM/GenericTTbar/AODSIM/CMSSW_9_2_6_91X_mcRun1_realistic_v2-v1/00000/A64CCCF2-5C76-E711-B359-0CC47A78A3F8.root ; 

xrdcp upload and remove the file

/home/bockjoo/.cmssoft/phedex_proxy is mapped to cmsprod account in the container
# on the client
   mdYHMS=$(date +%b%d%Y+%H+%M+%S)
   export X509_USER_PROXY=/home/bockjoo/.cmssoft/phedex_proxy
   xrdcp -d 1 -f file://`pwd`/sitedb.list root://$mdYHMS
   ls -al /cmsuf/podman/data/store/user/rucio/bockjoo/sitedb.list_$mdYHMS
   /usr/bin/python3 $(which gfal-ls) davs://$mdYHMS

# on the podman machine
podman exec -it $container_id ls -al /cmsuf/podman/data/store/user/rucio/bockjoo/sitedb.list_$mdYHMS

# on the client
   export X509_USER_PROXY=/home/bockjoo/.cmssoft/phedex_proxy
   /usr/bin/python3 $(which gfal-rm) davs://$mdYHMS


[bockjoo@cms ~]$ # on the client
[bockjoo@cms ~]$ (
>    mdYHMS=$(date +%b%d%Y+%H+%M+%S)
>    export X509_USER_PROXY=/home/bockjoo/.cmssoft/phedex_proxy
>    xrdcp -d 1 -f file://`pwd`/sitedb.list root://$mdYHMS
>    ls -al /cmsuf/podman/data/store/user/rucio/bockjoo/sitedb.list_$mdYHMS
>    /usr/bin/python3 $(which gfal-ls) davs://$mdYHMS
> )
[2023-08-24 21:18:37.408742 -0400][Info   ][AsyncSock         ] [] TLS hand-shake done.
[2023-08-24 21:18:39.835221 -0400][Info   ][AsyncSock         ] [] TLS hand-shake done.
-rw-r--r-- 1 559474 563752 27150 Aug 24 21:18 /cmsuf/podman/data/store/user/rucio/bockjoo/sitedb.list_Aug242023+21+18+37

[bockjoo@cmspodman1 ~]$ # on the podman machine
[bockjoo@cmspodman1 ~]$ mdYHMS=Aug242023+21+18+01
[bockjoo@cmspodman1 ~]$ podman exec -it $container_id ls -al /cmsuf/podman/data/store/user/rucio/bockjoo/sitedb.list_$mdYHMS
-rw-r--r-- 1 cmsprod cmsdata 27150 Aug 25 01:18 /cmsuf/podman/data/store/user/rucio/bockjoo/sitedb.list_Aug242023+21+18+01

[bockjoo@cms ~]$ # on the client
[bockjoo@cms ~]$ (
>    mdYHMS=Aug242023+21+18+01
>    export X509_USER_PROXY=/home/bockjoo/.cmssoft/phedex_proxy
>    /usr/bin/python3 $(which gfal-rm) davs://$mdYHMS
> )
davs://      DELETED

[4-4] Analysis Scenario

Change the grid-mapfile for the role cms0001 on cmspodman1 and cmspodman2

grep -q ^"#\"/DC=ch/DC=cern/OU=Organic Units/OU=Users/CN=bockjoo/CN=556538/CN=Bockjoo Kim\"" /opt/cms/etc/grid-security/grid-mapfile
if [ $? -eq 0 ] ; then
   echo INFO Bockjoo Kim DN is commented out in the grid-mapfile
else
   sed -i 's|"/DC=ch/DC=cern/OU=Organic Units/OU=Users/CN=bockjoo/CN=556538/CN=Bockjoo Kim" bockjoo|#"/DC=ch/DC=cern/OU=Organic Units/OU=Users/CN=bockjoo/CN=556538/CN=Bockjoo Kim" bockjoo|' /opt/cms/etc/grid-security/grid-mapfile
fi
grep Bockjoo /opt/cms/etc/grid-security/grid-mapfile

Restart the Container, Thus XRootD

podman stop $container_id
container_id=$(podman run -d --rm --name xrootd_server \
               --cgroup-manager=cgroupfs --tmpfs /tmp \
               --tmpfs /run \
               -v /sys/fs/cgroup:/sys/fs/cgroup:ro \
               -v /opt/cms/etc/xrootd/:/etc/xrootd/:ro \
               -v /opt/cms/etc/grid-security/grid-mapfile:/etc/grid-security/grid-mapfile:ro \
               -v /opt/cms/etc/grid-security/ban-mapfile:/etc/grid-security/ban-mapfile:ro \
               -v /opt/cms/etc/grid-security/voms-mapfile:/etc/grid-security/voms-mapfile:ro \
               -v /opt/cms/etc/grid-security/hostcert.pem:/etc/grid-security/hostcert.pem:ro \
               -v /opt/cms/etc/grid-security/hostkey.pem:/etc/grid-security/hostkey.pem:ro \
               -v /opt/cms/etc/grid-security/xrd/:/etc/grid-security/xrd:rw \
               -v /opt/cms/etc/sysconfig/xrootd:/etc/sysconfig/xrootd:ro \
               -v /opt/cms/etc/systemd/system/[email protected]:/etc/systemd/system/[email protected]:ro \
               -v /opt/cms/etc/systemd/system/[email protected]:/etc/systemd/system/[email protected]:ro \
               -v /opt/cms/etc/hosts:/etc/hosts:rw \
               -v /opt/cms/etc/hostname:/etc/hostname:rw \
               -v /opt/cms/etc/systemd/system/systemd-hostnamed.service.d/:/etc/systemd/system/systemd-hostnamed.service.d/:ro \
               -v /opt/cms/etc/selinux/config:/etc/selinux/config:rw \
               -v /opt/cms/podman/:/opt/cms/podman/:rw \
               -v /opt/cms/store/:/opt/cms/store/:rw \
               -v /opt/cms/etc/SITECONF/:/etc/SITECONF/:ro \
               -v /cmsuf/:/cmsuf/:rw \
               --systemd=true --network=host --cgroup-manager=systemd localhost/xrootd_server:latest)

Check the role cms0001

# on the client
xrdfs ls /store/user/bockjoo

# on the podman machine with the xrootd server
podman exec -it $container_id tail -50 /var/log/xrootd/clustered/xrootd.log | grep bockjoo | grep "login as"

[bockjoo@cms ~]$ xrdfs ls /store/user/bockjoo

[bockjoo@cmspodman1 ~]$ podman exec -it $container_id tail -50 /var/log/xrootd/clustered/xrootd.log | grep bockjoo | grep "login as"
230825 02:46:44 062 XrootdXeq: bockjoo.4066878:39@cms-data pvt IPv4 TLSv1.3 login as cms0001

xrdcp upload and remove the file

# client
mdYHMS=$(date +%b%d%Y+%H+%M+%S)
xrdcp -d 1 -f file://`pwd`/sitedb.list root://$mdYHMS/sitedb.list_$mdYHMS
ls -al /cmsuf/podman/data/store/temp/user/cms0001.$mdYHMS/sitedb.list_$mdYHMS

# podman
podman exec -it $container_id  ls -al /cmsuf/podman/data/store/temp/user/cms0001.${mdYHMS}/sitedb.list_${mdYHMS}

/usr/bin/python3 $(which gfal-ls) davs://$mdYHMS/sitedb.list_$mdYHMS
/usr/bin/python3 $(which gfal-rm) davs://$mdYHMS/sitedb.list_$mdYHMS

[4-5] Production Scenario

[4-5-1] SAM WebDAV Tests

Change the grid-mapfile for the role
sed -i 's|#"/DC=ch/DC=cern/OU=Organic Units/OU=Users/CN=bockjoo/CN=556538/CN=Bockjoo Kim" bockjoo|"/DC=ch/DC=cern/OU=Organic Units/OU=Users/CN=bockjoo/CN=556538/CN=Bockjoo Kim" bockjoo|' /opt/cms/etc/grid-security/grid-mapfile
grep Bockjoo /opt/cms/etc/grid-security/grid-mapfile
Restart the Container, Thus XRootD
podman stop $container_id
container_id=$(podman run -d --rm --name xrootd_server \
               --cgroup-manager=cgroupfs --tmpfs /tmp \
               --tmpfs /run \
               -v /sys/fs/cgroup:/sys/fs/cgroup:ro \
               -v /opt/cms/etc/xrootd/:/etc/xrootd/:ro \
               -v /opt/cms/etc/grid-security/grid-mapfile:/etc/grid-security/grid-mapfile:ro \
               -v /opt/cms/etc/grid-security/ban-mapfile:/etc/grid-security/ban-mapfile:ro \
               -v /opt/cms/etc/grid-security/voms-mapfile:/etc/grid-security/voms-mapfile:ro \
               -v /opt/cms/etc/grid-security/hostcert.pem:/etc/grid-security/hostcert.pem:ro \
               -v /opt/cms/etc/grid-security/hostkey.pem:/etc/grid-security/hostkey.pem:ro \
               -v /opt/cms/etc/grid-security/xrd/:/etc/grid-security/xrd:rw \
               -v /opt/cms/etc/sysconfig/xrootd:/etc/sysconfig/xrootd:ro \
               -v /opt/cms/etc/systemd/system/[email protected]:/etc/systemd/system/[email protected]:ro \
               -v /opt/cms/etc/systemd/system/[email protected]:/etc/systemd/system/[email protected]:ro \
               -v /opt/cms/etc/hosts:/etc/hosts:rw \
               -v /opt/cms/etc/hostname:/etc/hostname:rw \
               -v /opt/cms/etc/systemd/system/systemd-hostnamed.service.d/:/etc/systemd/system/systemd-hostnamed.service.d/:ro \
               -v /opt/cms/etc/selinux/config:/etc/selinux/config:rw \
               -v /opt/cms/podman/:/opt/cms/podman/:rw \
               -v /opt/cms/store/:/opt/cms/store/:rw \
               -v /opt/cms/etc/SITECONF/:/etc/SITECONF/:ro \
               -v /cmsuf/:/cmsuf/:rw \
               --systemd=true --network=host --cgroup-manager=systemd localhost/xrootd_server:latest)

Check bockjoo login as bockjoo
# on client
#   export X509_USER_PROXY=/home/bockjoo/.cmssoft/phedex_proxy
   xrdfs ls /store/user/bockjoo

# on the podman machine with the xrootd server
podman exec -it $container_id tail -50 /var/log/xrootd/clustered/xrootd.log | grep bockjoo | grep "login as"
export X509_CERT_DIR=/cvmfs/
export X509_USER_PROXY=/home/bockjoo/.cmsuser.proxy
export X509_USER_PROXY_NONCMS=/home/bockjoo/.griduser.proxy
export SAME_SENSOR_HOME=$HOME/cmssam/SiteTests/testjob
# The reference test
   cd $SAME_SENSOR_HOME/../SRMv2/tests/nap   
   $SAME_SENSOR_HOME/../SE/ -H ${host} -E ${host}:1094 -X $X509_USER_PROXY -N $X509_USER_PROXY_NONCMS -T RD3PCP /store/mc/SAM/  -T WRDEL3PCP /store/user/bockjoo -C /dev/null  > /opt/cms/services/T2/ops/webdav/runSAMWebDAV.$(echo $host | cut -d. -f1).out 2>&1 

# The WebDAV SAM test on the podman
   cd $SAME_SENSOR_HOME/../SRMv2/tests/nap   
   $SAME_SENSOR_HOME/../SE/ -H ${host} -E ${host}:1094 -X $X509_USER_PROXY -N $X509_USER_PROXY_NONCMS -T RD3PCP /store/mc/SAM/  -T WRDEL3PCP /store/user/bockjoo -C /dev/null  > /opt/cms/services/T2/ops/webdav/runSAMWebDAV.$(echo $host | cut -d. -f1).out 2>&1 

[4-5-2] SAM XRootD Tests

Preparation for the xrootd python (maybe unnecessary)
source /cvmfs/
cd ~/bin
ln -sf $(which cmake) cmake3
pip install --upgrade xrootd
Setting up latest SAM test package
cd /opt/cms/services/T2/ops/
git clone
cd cmssam/
cd SiteTests/SRMv2/tests/
git clone
Running the latest WebDAV SAM tests
export X509_CERT_DIR=/cvmfs/
export X509_USER_PROXY=/home/bockjoo/.cmsuser.proxy
export X509_USER_PROXY_NONCMS=/home/bockjoo/.griduser.proxy
export SAME_SENSOR_HOME=/opt/cms/services/T2/ops/cmssam/SiteTests/testjob

# Reference Test with a cmsio machine
    cd $SAME_SENSOR_HOME/../SRMv2/tests/nap
    $SAME_SENSOR_HOME/../SE/ -d --print-all -H ${host} -E ${host}:1094 -X $X509_USER_PROXY -N $X509_USER_PROXY_NONCMS -T RD3PCP /store/mc/SAM/  -T WRDEL3PCP /store/user/bockjoo// -C /dev/null  > /opt/cms/services/T2/ops/cmssam/se_webdav.$(echo $host | cut -d. -f1).out ; 

# Podman WebDAV test with a podman 
    cd $SAME_SENSOR_HOME/../SRMv2/tests/nap
    $SAME_SENSOR_HOME/../SE/ -d --print-all -H ${host} -E ${host}:1094 -X $X509_USER_PROXY -N $X509_USER_PROXY_NONCMS -T RD3PCP /store/mc/SAM/  -T WRDEL3PCP /store/user/bockjoo// -C /dev/null  > /opt/cms/services/T2/ops/cmssam/se_webdav.$(echo $host | cut -d. -f1).out ; 
Running the latest WebDAV SAM tests with token
CMS Token Twiki
Setting up the Token
eval `oidc-agent`
oidc-add bockjoo_xrd
#export BEARER_TOKEN=$(oidc-token --scope=offline_access --scope=storage.modify:/store/temp/user --time=3600 bockjoo_xrd)
export BEARER_TOKEN=$(oidc-token --scope=offline_access --time=3600 bockjoo_xrd)
SAM WebDAV Test: this will not work until I have a token with the write scope
# Reference
     cd $SAME_SENSOR_HOME/../SRMv2/tests/nap
     $SAME_SENSOR_HOME/../SE/ -d --print-all -H ${host} -E ${host}:1094 -X $X509_USER_PROXY -N $X509_USER_PROXY_NONCMS -T RD3PCP /store/mc/SAM/  -T WRDEL3PCP /store/temp/user/bockjoo_sam// -I $BEARER_TOKEN -C /dev/null > /opt/cms/services/T2/ops/cmssam/se_webdav.$(echo $host | cut -d. -f1).out

# Podman WebDAV test with a podman 
    cd $SAME_SENSOR_HOME/../SRMv2/tests/nap
    $SAME_SENSOR_HOME/../SE/ -d --print-all -H ${host} -E ${host}:1094 -X $X509_USER_PROXY -N $X509_USER_PROXY_NONCMS -T RD3PCP /store/mc/SAM/  -T WRDEL3PCP /store/user/bockjoo// -I $BEARER_TOKEN -C /dev/null  > /opt/cms/services/T2/ops/cmssam/se_webdav.$(echo $host | cut -d. -f1).out ; 

# This runs fine.
Running the latest XRootD SAM tests: this fails, but I think it is due to running the 5.5.5 server with the 5.6.1 client
export X509_CERT_DIR=/cvmfs/
export X509_USER_PROXY=/home/bockjoo/.cmsuser.proxy
export X509_USER_PROXY_NONCMS=/home/bockjoo/.griduser.proxy
export SAME_SENSOR_HOME=/opt/cms/services/T2/ops/cmssam/SiteTests/testjob
export SAME_SENSOR_HOME=/cmsuf/t2/operations/opt/cms/services/T2/ops/cmssam/SiteTests/testjob
# The reference test
   cd $SAME_SENSOR_HOME/../SRMv2/tests/nap   
   $SAME_SENSOR_HOME/../SE/ -H ${host} -P 1094 -S T2_US_Florida -4 -C /dev/null -d --print-all > /opt/cms/services/T2/ops/cmssam/cmssam_xrootd_endpnt_uf.$(echo $host | cut -d. -f1).out

# The WebDAV SAM test on the podman
   cd $SAME_SENSOR_HOME/../SRMv2/tests/nap   
   $SAME_SENSOR_HOME/../SE/ -H ${host} -P 1094 -S T2_US_Florida -4 -C /dev/null -d --print-all > /opt/cms/services/T2/ops/cmssam/cmssam_xrootd_endpnt_uf.$(echo $host | cut -d. -f1).out

# This does not run with the xrootd server TLS and errs:
# XRootDStatus.code=110 "[FATAL] TLS error: resource temporarily unavailable"

[5] A Plan for the Migration from /cmsuf/data(accessible through the regular xrootd) to /cmsuf/podman/data(accessible through the contained xrootd)

Within Florida:
1 Turn the regular XRootD into a read-only XRootD
2 Configure the cmsio machines for the XRootD podman containerization
3 The two XRootD instances need to coexist until we migrate everything to the podman XRootD
4 Ports 1095 and 3121 need to be open on the cmsio machines for the read-write podman XRootD
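
Whether ports 1095 and 3121 are reachable can be checked from a client machine once the container is running. The following is a sketch only: the helper name port_state is hypothetical, it assumes bash and coreutils timeout are installed, and "closed" can mean either that a firewall blocks the port or that nothing is listening on it.

```shell
#!/bin/sh
# port_state HOST PORT prints "open" if a TCP connection succeeds within
# 3 seconds and "closed" otherwise. It relies on bash's /dev/tcp
# pseudo-device, so bash is invoked explicitly.
port_state() {
    if timeout 3 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null; then
        echo open
    else
        echo closed
    fi
}

# localhost is only a placeholder default; set host to a cmsio machine.
for port in 1095 3121; do
    echo "port $port: $(port_state "${host:-localhost}" "$port")"
done
```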

The overall picture of the migration will follow these steps:
1) Disable writes to the RSE/Lustre (T2_US_Florida) storage (a CMS thing)
2) Copy all the data from /cmsuf/data/store to /cmsuf/podman/data/store
3) Update storage.json (PhEDEx/storage.xml and JobConfig/site-local-config.xml) with the new /cmsuf/podman/data/store
4) Re-enable writes to the RSE/Lustre (T2_US_Florida) storage
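
Before the storage configuration is switched to the new path, it may help to confirm that the copy is complete. The sketch below compares only the number of regular files in the two trees; the helper names are hypothetical, the copy tool itself is not specified here, and a matching count is a coarse sanity check rather than proof of an identical copy.

```shell
#!/bin/sh
# count_files DIR prints the number of regular files below DIR.
count_files() {
    find "$1" -type f | wc -l | tr -d ' '
}

# same_file_count SRC DST succeeds when both trees hold the same
# number of regular files.
same_file_count() {
    [ "$(count_files "$1")" -eq "$(count_files "$2")" ]
}

src=/cmsuf/data/store
dst=/cmsuf/podman/data/store
if [ -d "$src" ] && [ -d "$dst" ]; then
    if same_file_count "$src" "$dst"; then
        echo "file counts match"
    else
        echo "file counts differ: $(count_files "$src") vs $(count_files "$dst")"
    fi
fi
```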

[6] Troubleshooting

[6-1] Users in the user namespace (high UID/GID users) must exist on the host to read/write files in the /cmsuf/podman/data/store/? area

The user, bockjoo, inside the container is
       [bockjoo@cmspodman1 ~]$ podman exec -it ${container_id} id bockjoo
       uid=4575(bockjoo) gid=5000(avery) groups=5000(avery)
The starting uid/gid in the user namespace is
       [bockjoo@cmspodman1 ~]$ grep bockjoo /etc/subuid | cut -d: -f2
       [bockjoo@cmspodman1 ~]$ grep bockjoo /etc/subgid | cut -d: -f2

The host uid/gid for the contained user bockjoo would be:
       563326 (= 558752 + 4575 - 1) / 563751 (= 558752 + 5000 - 1)
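
The arithmetic above can be written as a small helper. This is a sketch with a hypothetical function name, host_id; it follows the START + ID - 1 rule used in this section, which applies to container IDs of 1 and above (in a rootless container, ID 0 is normally mapped to the invoking user's own ID instead).

```shell
#!/bin/sh
# host_id START ID prints the host-side ID for a container-side ID,
# using the rule from this section: START + ID - 1 (for ID >= 1).
host_id() {
    echo $(( $1 + $2 - 1 ))
}

start=558752                           # first subordinate ID, as used in the text
echo "uid: $(host_id "$start" 4575)"   # container uid of bockjoo
echo "gid: $(host_id "$start" 5000)"   # container gid of avery
```

With the values from the text this prints uid: 563326 and gid: 563751.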

The contained user, bockjoo, can see the CMS storage directory /cmsuf/podman/data/store:
       [bockjoo@cmspodman1 ~]$  podman exec -it ${container_id} su - bockjoo -c "ls /cmsuf/podman/data/store"
       backfill  local  temp  user
       [bockjoo@cmspodman1 ~]$ stat -c %U:%G /cmsuf/podman/data/store/user/bockjoo
The corresponding user, podman, exists in the host:
       [bockjoo@cmspodman1 ~]$ id podman
       uid=563326(podman) gid=563751(podman) groups=563751(podman)

On the other hand, the user, cmsprod, inside the container is
       [bockjoo@cmspodman1 ~]$ podman exec -it ${container_id} id cmsprod
       uid=723(cmsprod) gid=5001(cmsdata) groups=5001(cmsdata)
This user fails to see /cmsuf/podman/data/store:
       [bockjoo@cmspodman1 ~]$ podman exec -it ${container_id} su - cmsprod -c "ls /cmsuf/podman/data/store"
       ls: cannot access '/cmsuf/podman/data/store': Permission denied
This is because the host has no user corresponding to the contained user, cmsprod.
The uid/gid of the contained user, cmsprod, in the host would be:
       559474 (= 558752 + 723 - 1) / 563752 (= 558752 + 5001 - 1)
but the uid does not exist in the host:
       [bockjoo@cmspodman1 ~]$ getent passwd 559474
       [bockjoo@cmspodman1 ~]$ echo $?

We need the "mirror image" users, that is, the host-side counterparts of the container users.
Hence the need for [2-7] Preparing the podman accounts in the user namespace.
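
The check performed by hand above can be repeated for any container uid. The sketch below uses a hypothetical helper, mirror_status, which computes the host-side uid with the same START + uid - 1 rule and asks getent whether a passwd entry exists for it. The result depends on the host it runs on.

```shell
#!/bin/sh
# mirror_status START CONTAINER_UID prints the host-side uid followed by
# "present" if the host has a passwd entry for it, or "missing" if not.
mirror_status() {
    _host_uid=$(( $1 + $2 - 1 ))
    if getent passwd "$_host_uid" >/dev/null 2>&1; then
        echo "$_host_uid present"
    else
        echo "$_host_uid missing"
    fi
}

start=558752                   # first subordinate ID, as used in the text
for cuid in 4575 723; do       # bockjoo and cmsprod inside the container
    mirror_status "$start" "$cuid"
done
```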

[6-2] Container Stops after Logout

Tried --restart always and nohup, but neither worked.
Does this issue exist on a non-VM host? The container on a physical machine did not show this issue.
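
One thing that may be worth checking, although it is not confirmed as the cause here: systemd-logind can terminate a user's processes at logout unless lingering is enabled for that user. The sketch below assumes a systemd host. The helper names run and keep_user_session are hypothetical, and the script only prints the loginctl commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Dry-run by default: commands are printed, not executed.
# Set DRY_RUN=0 to execute them (enable-linger may need root rights).
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# keep_user_session USER shows the current Linger setting for USER,
# then enables lingering so the user's processes survive logout.
keep_user_session() {
    run loginctl show-user "$1" --property=Linger
    run loginctl enable-linger "$1"
}

keep_user_session "${USER:-bockjoo}"
```

If the VM reports Linger=no and the physical machine reports Linger=yes, that would be consistent with the difference observed between the two hosts.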

[6-3] lsof issue [Feb. 23, 2024]

lsof does not show the open files of the xrootd process.

[6-4] Shoveler on the podman XRootD [Feb. 23, 2024]

This is something to test. I hope it's trivial.

[7] Testing on a production-like host

       All of the roadblocks above should be cleared first.


[17] How to auto-start rootless pods using systemd :
