
rke2-ansible's People

Contributors

aceeric, adamacosta, aleiner, bgulla, copypastefail, daemonslayer2048, danazag, dgvigil, fathiq, fimmicon, houstondad, hubvu, jcox10, jroeber, jthapliya, laszlojau, mddamato, mlflr, mmikitka, robnk23, ron1, testinproduction0, tuckcodes

rke2-ansible's Issues

Tarball install w/SELinux enforcing fails

Environmental Info
RKE2 Version: v1.20.10+rke2r1
[root@ip-10-11-12-13 rke2]# rke2 -v
rke2 version v1.20.10+rke2r1 (a4f6020)
go version go1.16.6b7

Node(s) CPU architecture, OS, and Version
Just one server node
[root@ip-10-11-12-13 rke2]# uname -a
Linux ip-10-11-12-13.donut.org 3.10.0-1160.42.2.el7.x86_64 #1 SMP Tue Sep 7 14:49:57 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux

[root@ip-10-11-12-13 rke2]# cat /etc/redhat-release
CentOS Linux release 7.9.2009 (Core)

Cluster Configuration
One server node

Describe the bug
rke2-server v1.20.10 tarball install on CentOS 7.9 w/SELinux enforcing fails to start w/flag --selinux

See issue rancher/rke2#1865 for more information including reproduction details.

Question: Advertise node-ip?

Hi,

First, thanks for this playbook. Since k3s has no support for GlusterFS, this is a very good alternative to k3sup.

Now to my question, which may be a stupid one: I have a special setup where all hosts are on a public network.
The requirement is therefore to use WireGuard for all traffic between the hosts. A layer 4 proxy (HAProxy) should also be used for the API server. So I've created the following test setup:

server 1

  • public ip: 10.10.10.140
  • wireguard_ip: 192.168.0.1
  • additional hostnames: api.rke2.lb.local loadbalancer.rke2.lb.local
  • os: ubuntu 20.04.2 LTS
  • software: haproxy

server 2

  • public ip: 10.10.10.141
  • wireguard_ip: 192.168.0.2
  • additional hostnames: server-01.rke2.lb.local
  • os: ubuntu 20.04.2 LTS

server 3

  • public ip: 10.10.10.142
  • wireguard_ip: 192.168.0.3
  • additional hostnames: server-02.rke2.lb.local
  • os: ubuntu 20.04.2 LTS

server 4

  • public ip: 10.10.10.143
  • wireguard_ip: 192.168.0.4
  • additional hostnames: server-03.rke2.lb.local
  • os: ubuntu 20.04.2 LTS

Every server knows the additional hostnames of the other servers.

The HAProxy config looks like this:

global
    log /dev/log	local0
    log /dev/log	local1 notice
    chroot /var/lib/haproxy
    stats socket /run/haproxy/admin.sock mode 660 level admin expose-fd listeners
    stats timeout 30s
    user haproxy
    group haproxy
    daemon

    # Default SSL material locations
    ca-base /etc/ssl/certs
    crt-base /etc/ssl/private

    # See: https://ssl-config.mozilla.org/#server=haproxy&server-version=2.0.3&config=intermediate
    ssl-default-bind-ciphers ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-CHACHA20-POLY1305:ECDHE-RSA-CHACHA20-POLY1305:DHE-RSA-AES128-GCM-SHA256:DHE-RSA-AES256-GCM-SHA384
    ssl-default-bind-ciphersuites TLS_AES_128_GCM_SHA256:TLS_AES_256_GCM_SHA384:TLS_CHACHA20_POLY1305_SHA256
    ssl-default-bind-options ssl-min-ver TLSv1.2 no-tls-tickets

defaults
    log	global
    mode	tcp
    option	dontlognull
    timeout connect 5000
    timeout client  50000
    timeout server  50000
    errorfile 400 /etc/haproxy/errors/400.http
    errorfile 403 /etc/haproxy/errors/403.http
    errorfile 408 /etc/haproxy/errors/408.http
    errorfile 500 /etc/haproxy/errors/500.http
    errorfile 502 /etc/haproxy/errors/502.http
    errorfile 503 /etc/haproxy/errors/503.http
    errorfile 504 /etc/haproxy/errors/504.http

frontend rke2_api
    bind :6443
    default_backend rke2_api_servers

backend rke2_api_servers
    server server-01 192.168.0.2:6443
    #server server-02 192.168.0.3:6443
    #server server-03 192.168.0.4:6443

frontend rke2_join
    bind :9345
    default_backend rke2_join_servers

backend rke2_join_servers
    server server-01 192.168.0.2:9345
    #server server-02 192.168.0.3:9345
    #server server-03 192.168.0.4:9345

server-02 and server-03 are disabled until the setup is done.

Now I've added the file inventory/my-cluster/group_vars/all.yml with the following content:

kubernetes_api_server_host: api.rke2.lb.local

So the registration will use the load balancer.

My hosts.ini has the following content:

[rke2_servers]
[email protected]
[email protected]
[email protected]

[rke2_agents]

[rke2_cluster:children]
rke2_servers
rke2_agents

After running the playbook successfully, the output of kubectl get nodes -o wide shows the following content:

NAME                STATUS   ROLES                       AGE   VERSION          INTERNAL-IP    EXTERNAL-IP   OS-IMAGE             KERNEL-VERSION     CONTAINER-RUNTIME
rke2-lb-server-01   Ready    control-plane,etcd,master   50m   v1.21.2+rke2r1   10.10.10.141   <none>        Ubuntu 20.04.2 LTS   5.4.0-77-generic   containerd://1.4.4-k3s2
rke2-lb-server-02   Ready    control-plane,etcd,master   46m   v1.21.2+rke2r1   10.10.10.142   <none>        Ubuntu 20.04.2 LTS   5.4.0-77-generic   containerd://1.4.4-k3s2
rke2-lb-server-03   Ready    control-plane,etcd,master   47m   v1.21.2+rke2r1   10.10.10.143   <none>        Ubuntu 20.04.2 LTS   5.4.0-77-generic   containerd://1.4.4-k3s2

So during registration the public IP was used instead of the WireGuard IP. The load balancer log also shows no traffic on port 6443. Is it possible to change this? Is this what the --node-ip option is for at startup? Could it be made configurable in the playbook?

My current setup has no agents yet, but does the same problem apply to them?
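
For reference, the --node-ip flag asked about above has a config-file equivalent in RKE2. The following is only a minimal sketch of /etc/rancher/rke2/config.yaml for server 2, assuming the WireGuard addresses from the test setup above. Whether the playbook exposes these keys is not stated in this issue, and the tls-san entry is an assumption added so that the API certificate covers the load balancer name.

```yaml
# /etc/rancher/rke2/config.yaml on server 2 (sketch; values taken from the test setup above)
node-ip: 192.168.0.2            # address the node should register as INTERNAL-IP
advertise-address: 192.168.0.2  # address the API server advertises to other cluster members
tls-san:
  - api.rke2.lb.local           # assumption: allows clients to reach the API via the load balancer name
```

The same node-ip key applies to agents, so the question about agents would likely be handled the same way.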

Add to ansible-galaxy

Hi guys,

I've been playing around with this project a bit and am impressed with it. It works well in isolation, but because of the structure of this repository, to get the roles integrated into an existing ansible repo I had to manually copy the roles/ out and paste them into our repo.
I would like to open some discussion around restructuring the repo so that it can be easily imported into ansible-galaxy, for easier integration into customers' existing infrastructure repositories.

Would something like that be possible? Hope what I'm asking for makes sense, would be happy to clarify.

Latest refactor PR breaks Ubuntu builds

Customer reported that the latest refactor breaks Ubuntu builds due to a couple of breaking changes. Customer reported the following recommendations to remediate:

  1. Move tarball code from RPM -> Tarball ansible file
  2. In tarball ansible file remove all delegate_to 127.0.0.1

Add profile cis-1.5 after initial rollout

Hi,

I set up one cluster before the major rework and a second one after it, but I didn't notice that profile: cis-1.5 was no longer the default. Is there any documentation on how to enable it after the initial setup?

Thanks!

Failed: this task 'ansible.builtin.command' has extra param

Greetings,
Testing out some of the latest changes and ran into an issue that I thought I would report.

$ git log -1
commit 1fc0e3694c3ae4749d29d0a314ce1a65507b27e6 (HEAD -> main, origin/main, origin/HEAD)
Author: Mike D'Amato <[email protected]>
Date:   Fri Aug 6 13:31:40 2021 -0400

    Tarball needs agent service installation (#66)

Deploying from a fully updated Ubuntu 20.04 system to three fully updated Rocky 8.4 VMs (one server, two agents).

$ ansible-playbook --version
ansible-playbook 2.9.6
  config file = ~/Code/Github_RancherFederal_rke2-ansible/ansible.cfg
  configured module search path = ['~/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
  ansible python module location = /usr/lib/python3/dist-packages/ansible
  executable location = /usr/bin/ansible-playbook
  python version = 3.8.10 (default, Jun  2 2021, 10:49:15) [GCC 9.4.0]
$ ansible-playbook site.yml -i inventory/my-cluster/hosts.ini -K
[snip]
TASK [rke2_server : Setup initial server] **************************************************************************************************
fatal: [192.168.1.54]: FAILED! => {"reason": "this task 'ansible.builtin.command' has extra params, which is only allowed in the following modules: add_host, meta, shell, include, script, win_command, import_tasks, set_fact, include_role, win_shell, include_vars, import_role, include_tasks, command, group_by, raw\n\nThe error appears to be in '~/Code/Github_RancherFederal_rke2-ansible/roles/rke2_server/tasks/first_server.yml': line 22, column 3, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n\n- name: Wait for kubelet process to be present on host\n  ^ here\n"}

There is no wait that I can tell. It gets to this spot and immediately errors out.

Thoughts?
Thanks!

channel downgrade not respected on yum-based systems

I tried downgrading to channel=v1.19 and ran into a problem that I believe is a result of this line:

state: latest # noqa package-latest

Because I had already run the play with channel=stable, the more recent .repo files already existed in /etc/yum.repos.d. So although Ansible reported that the specified major/minor version would be installed, the latest version was in fact installed.

After manually removing the newer .repo files, the play deployed the correct version.
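
The manual cleanup described above could be sketched as a pair of tasks. This is only an illustration: the file name pattern is hypothetical (the actual .repo file names are not given in this issue), and deleting and recreating the repo files on every run would make the play report a change each time.

```yaml
- name: Find RKE2 repo files left over from earlier runs
  ansible.builtin.find:
    paths: /etc/yum.repos.d
    patterns: "rancher-rke2*.repo"   # hypothetical pattern; adjust to the real file names
  register: stale_rke2_repos

- name: Remove stale RKE2 repo files before configuring the requested channel
  ansible.builtin.file:
    path: "{{ item.path }}"
    state: absent
  loop: "{{ stale_rke2_repos.files }}"
```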

"Failed: this task 'ansible.builtin.command' has extra params" on HEAD w/ansible-playbook 2.10.3

Running HEAD w/ansible-playbook 2.10.3 fails with message: "ERROR! this task 'ansible.builtin.command' has extra params". This seems to be caused by Ansible issue "Fix missing ansible.builtin FQCNs in hardcoded action names" (ansible/ansible#71824). The following task files each currently use the problematic 'ansible.builtin.command' FQCN twice:

  • roles/rke2_agent/tasks/main.yml
  • roles/rke2_server/tasks/other_servers.yml

This regression was introduced in the recent "Begin Idempotency" commit: https://github.com/rancherfederal/rke2-ansible/pull/85A. The use of 'ansible.builtin.command' in this commit is also inconsistent with the commit associated with the recently closed issue "Failed: this task 'ansible.builtin.command' has extra param" (#69).

The simple fix for this issue is to replace the use of 'ansible.builtin.command' with 'command' in these two tasks.
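
As an illustration of the proposed fix, a task could change as follows. The task name and the command being run are hypothetical, since the actual tasks are not quoted in this issue; only the module name change reflects the proposal.

```yaml
# Before (rejected by the affected Ansible releases when arguments are given in free form):
# - name: Example task
#   ansible.builtin.command: systemctl is-active rke2-server

# After: the short module name accepts free-form arguments
- name: Example task
  command: systemctl is-active rke2-server
  register: example_result
  changed_when: false   # read-only check, never reports a change
  failed_when: false    # a non-zero exit code is inspected later instead of failing here
```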

Optionally use INSTALL_RKE2_ARTIFACT_PATH for rke2.linux with "Rancher RKE2 Common" yum repo for rke2-selinux

Based on the directions taken by the Rancher system-upgrade-controller and the forthcoming Rancher 2.6 capi-based system-agent-installer-rke2, it seems the Rancher preference is to manage the rke2.linux self-extracting binary via a version-specific container image rather than a "Rancher RKE2 versioned" yum repo. However, on RHEL-based systems, it still seems to make sense to manage the rke2-selinux rpm via a "Rancher RKE2 Common" yum repo.

So, consider enabling use of the "Rancher RKE2 Common" yum repo separate from the "Rancher RKE2 versioned" yum repo. Also, handle configuration of selinux including yum installation of rke2-selinux package and enabling of selinux for containerd in config.yaml.

[Channels] Do you provide other RKE2 channels?

Hello,

As mentioned in the title, do you provide other RKE2 channels (v1.22, v1.21, ...) as specified in the vars file of the rke_common role?
I would like to install the Rancher web UI to manage the cluster, but at the moment that seems impossible because the Rancher Helm chart only supports Kubernetes < v1.22.0 (as mentioned here: rancher/rancher#34060 (comment)).

So I tested v1.19, but it seems the env file is needed, and I can't find any information about it.

Another issue:
One of the tarball links in the docs is dead; maybe you should remove it.

Thanks for the reply.

Add more RKE2 config host var parameters

Add the ability to add individual hostvars for these specific rke2 config parameters.

See rke2-ansible/roles/rke2_common/tasks/config.yml for reference.

node_ip
node_name
bind_address
advertise_address
node_taints=[]
node_labels=[]
node_external_ip
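
One way these per-host parameters could be supplied is a host_vars file next to the inventory. The sketch below uses the variable names listed above; the file path, host name, addresses, and label are hypothetical, and the file name would have to match the host's inventory name.

```yaml
# inventory/my-cluster/host_vars/server-01.yml (hypothetical file and values)
node_ip: 192.168.0.2
node_name: server-01
bind_address: 192.168.0.2
advertise_address: 192.168.0.2
node_external_ip: 10.10.10.141
node_taints: []
node_labels:
  - "example.com/role=control-plane"   # illustrative label, not taken from the issue
```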

Support an empty rke2_agents inventory group for RKE2 standalone deployments

I have been able to deploy an RKE2 standalone server by adding a single host to [rke2_servers] and NO hosts to [rke2_agents]. Due to the nature of the Ansible inventory that I am working with, I would like to remove the assumption that the rke2_agents inventory group is defined.

In particular, wherever we have when clauses like:

- inventory_hostname in groups['rke2_agents']

I would like to replace it with

- inventory_hostname in groups['rke2_agents'] | default(false)

This would not be required if it was possible to create an empty group at runtime, similar to add_host, but that does not seem to be possible.

A PR with the proposed change is forthcoming.
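
For illustration, a variant of the proposed guard is sketched below. It assumes default([]) rather than default(false), so that the right-hand operand of the membership test remains a list when the group is undefined; the task itself is a placeholder.

```yaml
- name: Example task that only runs on agents
  ansible.builtin.debug:
    msg: "{{ inventory_hostname }} is an RKE2 agent"
  # The parentheses make the precedence explicit: the filter applies to the group lookup.
  when: inventory_hostname in (groups['rke2_agents'] | default([]))
```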

Idempotency

An operation is idempotent if the result of performing it once is exactly the same as the result of performing it repeatedly without any intervening actions.

One should be able to run this playbook repeatedly, and if no variables or inventory change, everything should be left as is. There should be no concern about "did I run the playbook with this host/variable yet?" because if nothing has changed, the ansible-playbook run should not change anything.
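
Two common ways to keep command-based tasks idempotent are sketched below. The tar invocation mirrors the one quoted elsewhere on this page, but the variable rke2_tarball_path is hypothetical and these tasks are not taken from the playbook.

```yaml
- name: Unpack the RKE2 tarball only if the binary is not already present
  ansible.builtin.command:
    argv:
      - tar
      - "-xf"
      - "{{ rke2_tarball_path }}"   # hypothetical variable: path to rke2.linux-amd64.tar.gz on the host
      - "-C"
      - /usr/local
    creates: /usr/local/bin/rke2    # the task is skipped once this file exists

- name: Read the installed RKE2 version without reporting a change
  ansible.builtin.command:
    argv:
      - /usr/local/bin/rke2
      - "-v"
  register: rke2_version_output
  changed_when: false               # a read-only command never changes the host
```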

Provide ability to test locally on laptop using something like Ansible Molecule

It would be nice to just run molecule test -s rhel-7.8 or molecule test -s ubuntu-20.04 to run an end-to-end simulation (linting, multi-OS, converge, idempotence) of the CI/CD workflow. That would avoid having to push changes, wait for machines to be built, and only discover breakage after some period of time. It would also save some $$ on AWS costs, and allow folks like me who like to code and test offline while on trains, planes and automobiles :)
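
A scenario for molecule test -s ubuntu-20.04 might start from something like the sketch below. Everything in it is an assumption: the file path, the Docker driver (which needs the Molecule Docker plugin), and the image. Because RKE2 runs as a systemd service, a systemd-capable image or a VM-based driver would probably be needed in practice, and a converge.yml applying the roles is still required.

```yaml
# molecule/ubuntu-20.04/molecule.yml (hypothetical path and contents)
driver:
  name: docker
platforms:
  - name: rke2-ubuntu-2004
    image: ubuntu:20.04      # assumption: a systemd-enabled image may be required instead
provisioner:
  name: ansible
verifier:
  name: ansible
scenario:
  test_sequence:
    - destroy
    - syntax
    - create
    - converge
    - idempotence            # reruns converge and fails if any task reports a change
    - destroy
```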

Node specific config fails to be written to config.yml

Ansible version: 2.9.27
ansible.utils version: 2.4.3

Example inventory

[rke2_servers]
123.123.123.123 ansible_ssh_user="root" node_ip="10.0.0.1" bind_address="10.0.0.1" advertise_adress="10.0.0.1" node_external_ip="123.123.123.133"

[rke2_agents]

[rke2_cluster:children]
rke2_servers
rke2_agents

Running the config tasks of rke_common results in the following debug result:

[...]
TASK [rke2_common : Debug config] *************************************************************************************
ok: [123.123.123.123] => {
    "rke2_config": {
        "cni": "cilium",
        "debug": true,
        "node-label": [],
        "node-taint": [],
        "profile": "cis-1.6",
        "selinux": true
    }
}
Friday 21 January 2022  14:08:08 +0100 (0:00:00.030)       0:00:18.701 ******** 

TASK [rke2_common : Add node-ip to rke2_config] ***********************************************************************
ok: [123.123.123.123]
Friday 21 January 2022  14:08:08 +0100 (0:00:00.032)       0:00:18.733 ******** 

TASK [rke2_common : Debug changes] ************************************************************************************
ok: [123.123.123.123] => {
    "updated_rke2_config": {
        "changed": false,
        "failed": false,
        "rke2_config": {
            "cni": "cilium",
            "debug": true,
            "node-ip": "10.0.0.4",
            "node-label": [],
            "node-taint": [],
            "profile": "cis-1.6",
            "selinux": true
        }
    }
}
Friday 21 January 2022  14:08:08 +0100 (0:00:00.029)       0:00:18.763 ******** 
Friday 21 January 2022  14:08:08 +0100 (0:00:00.027)       0:00:18.790 ******** 

TASK [rke2_common : Debug config] *************************************************************************************
ok: [123.123.123.123] => {
    "rke2_config": {
        "cni": "cilium",
        "debug": true,
        "node-label": [],
        "node-taint": [],
        "profile": "cis-1.6",
        "selinux": true
    }
}
[...]

As you can see, updated_rke2_config.changed is false even though the config did actually change, and the following Debug config task shows rke2_config without node-ip again. If I remove changed_when: false from the Add node-ip to rke2_config task, the update works as expected.

Allow providing generic RKE2 configuration

Allow providing generic RKE2 configuration parameters without needing to write ansible logic.

Config file parameters should be used as source of truth and ansible should be able to parse for key parameters like profile: cis-1.x
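
One possible shape for this, sketched under assumptions: the user supplies a mapping (named rke2_config here, after the variable visible in the debug output elsewhere on this page) and a single task writes it out verbatim. The file mode is an assumption, and this is not the playbook's current implementation.

```yaml
- name: Write RKE2 config from a user-supplied mapping
  ansible.builtin.copy:
    # default({}) avoids a templating error when the variable is not set
    content: "{{ rke2_config | default({}) | to_nice_yaml }}"
    dest: /etc/rancher/rke2/config.yaml
    mode: "0600"

- name: Example of reading a key parameter back from the same mapping
  ansible.builtin.debug:
    msg: "CIS profile requested: {{ (rke2_config | default({})).get('profile', 'none') }}"
```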

Ironbank configuration

Add ability to switch on or add configuration required to use ironbank versions of RKE2 images

Using the tarball installation method on Centos8 causes "Unable to watch for tunnel endpoints..." error

Description
I am deploying a single-node RKE2 cluster using this ansible playbook. When using the air-gapped tarball installation method, I get the following error when running kubectl get pods -A :

The connection to the server was refused - did you specify the right host or port?

and when checking the system logs for rke2-server.service, I repeatedly see:

level=warning msg="Unable to watch for tunnel endpoints: Get \"https://127.0.0.1:6443/api/v1/namespaces/default/endpoints?fieldSelector=metadata.name%3Dkubernetes&resourceVersion=0&watch=true\": dial tcp 127.0.0.1:6443: connect: connection refused"

When deploying without the tarballs, I do not see this error.

Steps to Reproduce

  1. Clone the rke2-ansible repo onto what we'll call the "deployment VM".
  2. Install and deploy a base Centos8 VM from any of the available mirrors: http://isoredirect.centos.org/centos/8/isos/x86_64/. We'll call this the "target VM".
  3. Make sure the target VM is reachable at a known IP from the deployment VM.
  4. Configure passwordless SSH from your deployment VM onto the target VM.
  5. Configure passwordless sudo on the Centos8 VM for a user you control. We'll call them "test".
  6. Set the following values in inventory/sample/hosts.ini:
[rke2-servers]
{Insert IP of target VM} ansible_user=test

[rke2_cluster:children]
rke2_servers
  7. Download v1.20.7+rke2r2 tarballs (rke2-images.linux-amd64.tar.zst and rke2.linux-amd64.tar.gz) from https://github.com/rancher/rke2/releases/tag/v1.20.7%2Brke2r2 and place them in tarball_install
  8. Run the ansible playbook:
ansible-playbook site.yaml -i inventory/sample/hosts.ini
  9. Copy the kubeconfig file from the target VM to ~/.kube/config on your deployment VM:
ssh test@{insert target VM IP} "sudo cp /etc/rancher/rke2/rke2.yaml ~/.kube/config && sudo chown \$USER:\$USER ~/.kube/config"
scp test@{insert target VM IP}:/~/.kube/config ~/.kube/config
  10. Replace the loopback address in ~/.kube/config with the target VM IP.
  11. Run kubectl get pods -A. You should get no response or the response shown in the description.
  12. Access the target VM and run sudo systemctl status rke2-server.service to see the "Unable to watch..." message.

Additional Detail
RKE2 Version: 1.20.7+rke2r2
Possibly related to this issue from the RKE2 repo.

Support CIS 1.6

cis-1.5 is supported by the playbooks but since RKE2 supports CIS 1.6 it would be great if the ansible playbooks support this as well.

EL8 minimal install doesn't include tar

When testing on a minimal install of EL8 (RHEL and CentOS), tar isn't included which means I ended up with an error:

fatal: [192.168.1.155]: FAILED! => {"changed": false, "cmd": "tar -xf /tmp/ansible.gwik8e25rke2-install.XXXXXXXXXX/rke2.linux-amd64.tar.gz -C /usr/local", "msg": "[Errno 2] No such file or directory: b'tar': b'tar'", "rc": 2}

It would be nice if tar was on the list for ansible to check/install or have it on the pre-requirements list.

Thanks!
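
A minimal sketch of the requested check, assuming it would run before the tarball is unpacked; the task is illustrative and not part of the playbook as quoted here.

```yaml
- name: Ensure tar is installed before unpacking the RKE2 tarball
  ansible.builtin.package:
    name: tar
    state: present
  become: true
```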

Stagger start RKE2 server nodes

To avoid race conditions and floods of error logs from servers that cannot start until the first server is healthy and learner promotions are complete, we should block starting or restarting further servers until the one just started reports the Ready state.

A possible solution is polling something like this on the host in question until True is observed.

/var/lib/rancher/rke2/bin/kubectl --kubeconfig /etc/rancher/rke2/rke2.yaml \
--server https://127.0.0.1:6443 get no {{ inventory_hostname }} \
-o jsonpath='{.status.conditions[?(@.type=="Ready")].status}'
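
The polling idea above could be expressed as an Ansible task roughly as follows. The retry count and delay are arbitrary assumptions, and, like the command above, the sketch assumes the node name equals inventory_hostname.

```yaml
- name: Wait for this server to report Ready before starting the next one
  ansible.builtin.command:
    argv:
      - /var/lib/rancher/rke2/bin/kubectl
      - "--kubeconfig"
      - /etc/rancher/rke2/rke2.yaml
      - "--server"
      - https://127.0.0.1:6443
      - get
      - node
      - "{{ inventory_hostname }}"
      - "-o"
      - 'jsonpath={.status.conditions[?(@.type=="Ready")].status}'
  register: node_ready
  until: node_ready.stdout == "True"   # retried while kubectl fails or reports another status
  retries: 30                          # assumption: roughly five minutes in total
  delay: 10
  changed_when: false
```

Running the server play with serial: 1 would be one way to apply this wait to one server at a time.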

Tarball install does not update contents when install path is changed

Environmental Info:
RKE2 Version:

rke2 version v1.21.6+rke2r1 (b915fc986e84582458af7131fe7f4e686f2af493)
go version go1.16.6b7

Node(s) CPU architecture, OS, and Version:

OpenSuse 15.3 (SLES 15 SP3)

Describe the bug:

When the tarball install path is changed to /opt/rke2, or any custom path is used instead of /usr/local, installation fails because the change of base directory is not reflected in the tarball contents.

Steps To Reproduce:

Use OpenSuse/SLES servers as targets.

Expected behavior:
I expected the same behavior as the https://get.rke2.io/ script. There is a section that handles this:

# unpack_tarball extracts the tarball, correcting paths and moving systemd units as necessary
unpack_tarball() {
    info "unpacking tarball file to ${INSTALL_RKE2_TAR_PREFIX}"
    mkdir -p ${INSTALL_RKE2_TAR_PREFIX}
    tar xzf "${TMP_TARBALL}" -C "${INSTALL_RKE2_TAR_PREFIX}"
    if [ "${INSTALL_RKE2_TAR_PREFIX}" != "${DEFAULT_TAR_PREFIX}" ]; then
        info "updating tarball contents to reflect install path"
        sed -i "s|${DEFAULT_TAR_PREFIX}|${INSTALL_RKE2_TAR_PREFIX}|" ${INSTALL_RKE2_TAR_PREFIX}/lib/systemd/system/rke2-*.service ${INSTALL_RKE2_TAR_PREFIX}/bin/rke2-uninstall.sh
        info "moving systemd units to /etc/systemd/system"
        mv -f ${INSTALL_RKE2_TAR_PREFIX}/lib/systemd/system/rke2-*.service /etc/systemd/system/
        info "install complete; you may want to run:  export PATH=\$PATH:${INSTALL_RKE2_TAR_PREFIX}/bin"
    fi
}

Actual behavior:
Tarball install fails in two ways:

  1. The playbook tries to move systemd units from /usr/local/... to /etc/systemd..., but the source files are in tarball_dir (/opt/bin).
  2. After fixing the previous issue, systemctl start rke2-server still fails: the original rke2-server.service file points to the wrong binary path (see ExecStart=/usr/local/bin/rke2 server). The rke2 agent service and the uninstall script are affected in the same way.
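
The path correction done by the install script's unpack_tarball could be approximated in Ansible as sketched below. The variable tarball_dir is taken from the description above; the explicit file list is an assumption standing in for the script's rke2-*.service glob, and copy is used here where the script uses mv.

```yaml
- name: Point unit files and the uninstall script at the custom install prefix
  ansible.builtin.replace:
    path: "{{ item }}"
    regexp: '/usr/local'
    replace: "{{ tarball_dir }}"   # note: replaces every match, whereas the script's sed replaces the first per line
  loop:
    - "{{ tarball_dir }}/lib/systemd/system/rke2-server.service"
    - "{{ tarball_dir }}/lib/systemd/system/rke2-agent.service"
    - "{{ tarball_dir }}/bin/rke2-uninstall.sh"
  when: tarball_dir != '/usr/local'

- name: Install the corrected unit files into /etc/systemd/system
  ansible.builtin.copy:
    src: "{{ tarball_dir }}/lib/systemd/system/{{ item }}"
    dest: "/etc/systemd/system/{{ item }}"
    remote_src: true               # source files already live on the target host
    mode: "0644"
  loop:
    - rke2-server.service
    - rke2-agent.service
  when: tarball_dir != '/usr/local'
```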

Additional context / logs:

Allow generic manifest configurations

RKE2 allows the user to provide manifest files for default helm chart configs as well as manifest files to be deployed once the cluster is healthy.

User should be able to provide a directory or key/common manifest configurations for ansible to apply.
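
A minimal sketch of the directory-based option, assuming a hypothetical variable rke2_manifest_dir that points at a local directory of manifest files. The destination is the directory RKE2 servers watch for manifests to deploy; the file modes are assumptions.

```yaml
- name: Ensure the RKE2 manifests directory exists
  ansible.builtin.file:
    path: /var/lib/rancher/rke2/server/manifests
    state: directory
    mode: "0700"
  when: rke2_manifest_dir is defined

- name: Copy user-supplied manifests into the RKE2 manifests directory
  ansible.builtin.copy:
    src: "{{ rke2_manifest_dir }}/"   # trailing slash copies the directory contents, not the directory itself
    dest: /var/lib/rancher/rke2/server/manifests/
    mode: "0600"
  when: rke2_manifest_dir is defined
```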

Config file created incorrectly when using multiple values in rke2_kubelet_args

If I consume the playbook and set the following vars:

vars:
    rke2_kubelet_args:
      - "feature-gates=DynamicKubeletConfig=false"
      - "image-gc-high-threshold=100"
      - "image-gc-low-threshold=99"

Then the config file that is generated has multiple instances of the kubelet-arg: variable set, instead of 1 single instance set with an array of values.

Expected output:

kubelet-arg: 
 - feature-gates=DynamicKubeletConfig=false
 - image-gc-high-threshold=100
 - image-gc-low-threshold=99

Actual:

kubelet-arg: feature-gates=DynamicKubeletConfig=false
kubelet-arg: image-gc-high-threshold=100
kubelet-arg: image-gc-low-threshold=99

In my specific case, kubelet then fails to load as it only knows about image-gc-low-threshold and not image-gc-high-threshold so I think this could've easily been missed in other clusters.

Offending task:

- name: Add rke2_kubelet_args
  lineinfile:
    path: /etc/rancher/rke2/config.yaml
    line: "kubelet-arg: {{ item }}"
  with_items:
    - "{{ rke2_kubelet_args | default([]) }}"

EDIT:

I also assume this is the case for the other array based vars:

  • kube-apiserver-arg
  • kube-scheduler-arg
  • kube-controller-manager-arg
  • kubelet-arg
  • node-label
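
One way to emit a single key with a list of values is sketched below. It assumes the offending lineinfile task is removed, and the marker text is arbitrary; the same pattern could be repeated for the other array-based keys.

```yaml
- name: Render kubelet-arg as a single YAML list
  ansible.builtin.blockinfile:
    path: /etc/rancher/rke2/config.yaml
    marker: "# {mark} kubelet-arg (managed by Ansible)"
    # Dumping a one-key mapping yields "kubelet-arg:" followed by one list item per argument.
    block: "{{ {'kubelet-arg': rke2_kubelet_args} | to_nice_yaml }}"
  when: rke2_kubelet_args | default([]) | length > 0
```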

Wrong extension for images

Dear rancherfederal team.

According to the documentation, rke2-images.linux-amd64.tar.zst should be used, but it seems a typo crept into the Ansible tasks, which reference .tar.gz instead.

Images Install

If the rke2-images.linux-amd64.tar.zst file is found in the tarball_install/ directory then this playbook will use those images and not docker.io or a private registry.

src: "{{ playbook_dir }}/tarball_install/rke2-images.linux-amd64.tar.gz"

path: "{{ playbook_dir }}/tarball_install/rke2-images.linux-amd64.tar.gz"

systemctl start fails causing fail on ansible run

In my manual attempts to deploy rke2, I would often get failure return codes from systemctl start .... Because of the way the restart logic is written, the service appears to fail and restart several times during initialization. While that logic is beyond the scope of this project, I wonder if perhaps failures could be ignored during the start tasks in the role(s)?
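
As an alternative to ignoring failures outright, the start task could be retried. The sketch below is only an illustration; the retry count and delay are arbitrary assumptions, and the task is not taken from the roles.

```yaml
- name: Start rke2-server, retrying while the service settles
  ansible.builtin.systemd:
    name: rke2-server.service
    state: started
    enabled: true
  register: rke2_start
  until: rke2_start is succeeded   # retry only when systemctl reported a failure
  retries: 5
  delay: 15
```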

Enable RPM Installs for Air-Gapped environments

Based on the feedback received regarding issue #86, I am attempting to move my air-gapped, selinux-enabled, rke2-ansible-based installation from using the tarball method to using the rpm method. In doing so, I encountered problems with task roles/rke2_common/tasks/rpm_install.yml due to its dependencies on the following internet URLs not typically available in air-gapped environments:

  1. URL https://update.rke2.io/v1-release/channels - enables determination of the rke2 version to be installed
  2. URL https://rpm.rancher.io/rke2 - defines yum repository rke2 baseurl prefix
  3. URL https://rpm.rancher.io/public.key - specifies public key to access rke2 yum repositories

Proposed enhancement: Integrate additional, optional parameters rke_version and repo_baseurl_prefix into task rpm_install.yml.

WDYT of this proposed enhancement to the RPM installation method?

Adding Flux CD & SOPS

Heyho,

Awesome repo. I'm currently using it to set up a small cluster at work and will probably switch to it from my Terraformed RKE cluster.

I am a huge GitOps fan, so I added a Flux CD role to a fork of this repo (and I will add another one for injecting a secret for SOPS support in Flux).

I was wondering if you are interested in upstreaming this as an optional feature. Since this repo is a full playbook and not a role, this would also greatly help us/me maintain our fork :P

// Robert

Undefined object in "Add primary configuration items" task

I'm having trouble running the playbook on three Ubuntu 20.04 machines from an Ubuntu 20.04 host running Ansible 2.10.

The message I see, after running ansible-playbook -i inventory/my-cluster/hosts.ini site.yml -vvv, is:

TASK [rke2_common : Add primary configuration items] **************************************************************************************************************************************************************
task path: /home/jon/rke2-ansible/roles/rke2_common/tasks/config.yml:17
The full traceback is:
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/ansible/executor/task_executor.py", line 585, in _execute
    self._task.post_validate(templar=templar)
  File "/usr/lib/python3/dist-packages/ansible/playbook/task.py", line 307, in post_validate
    super(Task, self).post_validate(templar)
  File "/usr/lib/python3/dist-packages/ansible/playbook/base.py", line 431, in post_validate
    value = templar.template(getattr(self, name))
  File "/usr/lib/python3/dist-packages/ansible/template/__init__.py", line 844, in template
    d[k] = self.template(
  File "/usr/lib/python3/dist-packages/ansible/template/__init__.py", line 798, in template
    result = self.do_template(
  File "/usr/lib/python3/dist-packages/ansible/template/__init__.py", line 1066, in do_template
    res = j2_concat(rf)
  File "<template>", line 12, in root
  File "/usr/lib/python3/dist-packages/ansible/template/__init__.py", line 264, in wrapper
    ret = func(*args, **kwargs)
  File "/usr/lib/python3/dist-packages/ansible/plugins/filter/core.py", line 69, in to_nice_yaml
    transformed = yaml.dump(a, Dumper=AnsibleDumper, indent=indent, allow_unicode=True, default_flow_style=False, **kw)
  File "/usr/lib/python3/dist-packages/yaml/__init__.py", line 290, in dump
    return dump_all([data], stream, Dumper=Dumper, **kwds)
  File "/usr/lib/python3/dist-packages/yaml/__init__.py", line 278, in dump_all
    dumper.represent(data)
  File "/usr/lib/python3/dist-packages/yaml/representer.py", line 27, in represent
    node = self.represent_data(data)
  File "/usr/lib/python3/dist-packages/yaml/representer.py", line 58, in represent_data
    node = self.yaml_representers[None](self, data)
  File "/usr/lib/python3/dist-packages/yaml/representer.py", line 231, in represent_undefined
    raise RepresenterError("cannot represent an object", data)
yaml.representer.RepresenterError: ('cannot represent an object', AnsibleUndefined)
fatal: [head]: FAILED! => {
    "changed": false
}

It seems to refer to line 17 of roles/rke2_common/tasks/config.yml, specifically it's looking for an rke2_config variable. A search on the repository doesn't show this defined anywhere. Am I missing something? Thanks!
