
kairos's People

Contributors

3pings, antongisli, areitz86, c0ffee, christianprim, ci-robbot, dramich, ionitacatalina, itxaka, jimmykarily, kpiyush17, ludea, mauromorales, mudler, ognian, omahs, oz123, paynejacob, princesso, renovate[bot], robarnold, saamalik, santhoshdaivajna, scuzhanglei, sdwilsh, stelb, venkatnsrinivasan, vfiftyfive, vipsharm, xiaoxianboy


kairos's Issues

HA support

Currently Kairos supports creating clusters automatically with a single master node and multiple workers, but it does not automatically set up HA. HA can be configured manually with the k3s and k3s-agent blocks, but that leaves all of the configuration to the user.

Requirements to enable automatic HA:

Regarding the hybrid approach, it likely depends on #72.
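As a sketch of what the manual setup looks like today, the first server node could be configured roughly like this. This is illustrative only: the block shape follows the k3s examples elsewhere on this page, and the flags are upstream k3s HA flags, not a documented c3os recipe.

```yaml
# Illustrative sketch, not an official example.
# First server node: initialize the embedded etcd cluster.
k3s:
  args:
    - --cluster-init

# Additional server nodes would instead join the first one, e.g. with
#   --server https://<first-server-ip>:6443
# plus a shared K3S_TOKEN, which is what currently has to be wired by hand.
```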

Documentation: bundles

Bundles have already been introduced and are available on the master branch, but the docs have not been updated to reflect the new feature set.

Action items

  • Update docs to include references to c3os bundles
  • Add examples on how to use bundles

ubuntu flavor

Currently the image definition is there, but it lacks a few things in order to boot properly. This card is to make the ubuntu flavor functional and included in releases.

Prepare intro video demo for README page

It would be super helpful to have a short video, or a GIF, on the front page that shows how c3os works and how quickly a cluster can be deployed with it.

It would be sufficient to go over the simplest workflow described in the docs to get a functional Kubernetes cluster.

The video should show at least:

  • Download c3os and turn on a machine (a VM should be enough)
  • Create a simple config and push it to the machine
  • Show the cluster booting up
  • Install something on it to show that it is functional

Network token rotation

Network token rotation should be documented, although a few steps could also be eased with the CLI.
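As a rough sketch of what rotation involves today (based on the cloud-config examples elsewhere on this page; the path and the token value are placeholders):

```yaml
# Hypothetical: rotating the token means regenerating it and replacing
# this value in the node config (e.g. /oem/99_custom.yaml) on every node,
# then restarting the agent.
c3os:
  network_token: "<newly-generated-token>"
```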

Split c3os images

Is your feature request related to a problem? Please describe.
Currently images come with p2p mesh support out of the box, which is good, but we also want to offer images without it.

Describe the solution you'd like
Additional c3os light and core variants for each flavor:

light: Will contain only the c3os-agent and the c3os-cli
core: Will contain only the c3os-agent

Besides, we can push the framework images to enable bring-your-own-image scenarios.

The light/core variants will differ from the current releases, as they don't need to embed the k3s binary. We could also think of an additional flavor that keeps the k3s binary and a slimmed c3os-agent-provider without mesh, but that's room for a later iteration.

Describe alternatives you've considered
N/A

Additional context
This seems a natural next step to me. I've seen both use cases so far, and a slimmed-down image would be more appealing for the simpler ones. User reports point the same way: there is a tendency to prefer Alpine, which is almost half the size of the openSUSE image.

Change release version numbering

Currently our release version schema is composed of the k8s version plus the c3OS version, separated by a dash.

This card is about changing the versioning so that the c3OS version comes first, followed by the k3s/k8s version as metadata, for instance:

v(semver)+k8sversion

This would also ease core/light image releases.

Documentation: customize image

Documentation around image customization is lacking core aspects (e.g. changing the kernel, setting hardcoded boot options, etc.).

merge k3s and k3s-agent config blocks

Currently k3s and k3s-agent are two separate blocks, which makes the configuration cumbersome and less intuitive.

The proposal is to move away from two separate blocks and streamline them into a single block that can be used for both.

For example, consider the current status quo:

k3s:
  args: ..
k3s-agent:
  args: ...

we could switch to something simpler, such as:

k3s:
  args: ..
  server: true

So, instead of laying down both systemd settings, we could also generate them at runtime with the appropriate service, allowing the user full customization of how the k3s binaries should run.

Action items

  • Merge k3s and k3s-agent blocks into k3s
  • Add an additional field to optionally discriminate between server and agent; otherwise fall back to the config provided by the user
  • Optional: generate the service files at runtime. This might not be necessary at all, so it is at the developer's discretion
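Under the merged-block proposal, an agent node might then look like this (hypothetical; the server field name comes from the example above and is not final):

```yaml
# Hypothetical agent configuration under the merged proposal
k3s:
  args:
    - --node-label=node-type=worker
  server: false  # agent mode; true would select server mode
```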

SLSA

In order to provide a secure supply chain, the packages repository should try to comply as much as possible with the SLSA standards.

Action Items

To comply with the SLSA standard, the package repository should at least:

  • sign all the produced package images with cosign
  • use only source images signed with cosign
  • avoid keyless signing if possible
  • ensure packages have checksums, signatures, etc. so they can be verified during the build phase
  • use mhash to verify runtime images, and push the mhash signature with cosign for later verification

Open questions

  • Can we secure the build environment? We should dogfood #347

c3os release 0.57

🗺 What's left for release

🔦 Highlights

  • c3os bundles support
  • Initial split of c3os images to core and providers
  • docs to reflect new features

✅ Release Checklist

  • Stage 0 - Finishing Touches
    • Check c3os/packages for any needed updates
    • Make sure CI tests are passing.
  • Stage 1 - Infrastructure Testing
    • How: using the testing version, make sure that manual and k8s upgrades work from the latest release, and that the docs are still aligned
    • Where:
      • Two c3os nodes
        • Deploy latest release with automatic node setup
        • Upgrade to testing release
        • Analyze
          • Create deployments
          • Keep cluster running overnight
          • Run upgrades and verify workload is still running
          • Keep cluster running overnight
  • Stage 3 - Release
    • Tag the release on master
      • c3os
      • provider-c3os
  • Stage 4 - Update Upstream
    • Update the examples to the final release
    • Update the upstream testing branches to the final release and create PRs.
  • Make required changes to the release process.

Kairos release 1.0

🗺 What's left for release

🔦 Highlights

  • Support for manually configured HA
  • Configure boot options, such as kernel boot parameters after install
  • Automatically detect installation device with 'auto'
  • Preliminary support to bundles
  • Netboot support
  • Secure Boot support
  • Repository changed from quay.io/c3os to quay.io/kairos
  • Kairos now publishes core images only. Provider-kairos ships k3s.

✅ Release Checklist

  • Stage 0 - Finishing Touches
    • Check c3os/packages for any needed updates
    • Make sure CI tests are passing.
  • Stage 1 - Infrastructure Testing
    • How: using the testing version, make sure that manual and k8s upgrades work from the latest release, and that the docs are still aligned
    • Where:
      • Two c3os nodes
        • Deploy latest release with automatic node setup
        • Upgrade to testing release
        • Analyze
          • Create deployments
          • Keep cluster running overnight
          • Run upgrades and verify workload is still running
          • Keep cluster running overnight
  • Stage 3 - Release
    • Tag the release on master.
      • Run NO_PUSH=true go run ./.github/tag.go <tag> to check that the correct tag will be created
      • Run go run ./.github/tag.go <tag> to tag a new release
  • Stage 4 - Update Upstream
    • Update the examples to the final release
    • Update the upstream testing branches to the final release and create PRs.
  • Make required changes to the release process.

:penguin: Add support for NVIDIA Jetson Nano

Is your feature request related to a problem? Please describe.
Support the NVIDIA Jetson Nano device. More and more enterprises are building AI/inference apps on Jetson appliances.

Describe the solution you'd like
N/A

Describe alternatives you've considered
N/A

Additional context
N/A

Cloud config log files not available

C3OS version:
Alpine, latest

CPU architecture, OS, and Version:

Describe the bug
There are no logs for either cos-setup-boot or cos-setup-network.
Ideally, those should also be covered by logrotate (see #75).

To Reproduce

Expected behavior
Have logs in /var/log, similarly to cos-reconcile

Logs

Additional context
Reported by @deekue in chat

c3os-agent segfault on RPI4

Here after is the journald log

Feb 03 12:01:47 monitor systemd[1]: c3os-agent.service: Scheduled restart job, restart counter is at 1.
Feb 03 12:01:47 monitor systemd[1]: Stopped c3os agent.
Feb 03 12:01:47 monitor systemd[1]: Started c3os agent.
Feb 03 12:01:47 monitor c3os[3034]: panic: runtime error: invalid memory address or nil pointer dereference
Feb 03 12:01:47 monitor c3os[3034]: [signal SIGSEGV: segmentation violation code=0x1 addr=0x60 pc=0xb83bc4]
Feb 03 12:01:47 monitor c3os[3034]: goroutine 1 [running]:
Feb 03 12:01:47 monitor c3os[3034]: main.agent({0xf9042f, 0x15}, {0x40009018d8, 0x2, 0x2}, 0x0)
Feb 03 12:01:47 monitor c3os[3034]:         /work/cli/agent.go:42 +0x94
Feb 03 12:01:47 monitor c3os[3034]: main.main.func6(0x40002be160)
Feb 03 12:01:47 monitor c3os[3034]:         /work/cli/main.go:165 +0x10c
Feb 03 12:01:47 monitor c3os[3034]: github.com/urfave/cli.HandleAction({0xc4fde0, 0x1195e48}, 0x40002be160)
Feb 03 12:01:47 monitor c3os[3034]:         /go/pkg/mod/github.com/urfave/[email protected]/app.go:524 +0xfc
Feb 03 12:01:47 monitor c3os[3034]: github.com/urfave/cli.Command.Run({{0xe3b142, 0x5}, {0x0, 0x0}, {0x4000936070, 0x1>
Feb 03 12:01:47 monitor c3os[3034]:         /go/pkg/mod/github.com/urfave/[email protected]/command.go:173 +0x5fc
Feb 03 12:01:47 monitor c3os[3034]: github.com/urfave/cli.(*App).Run(0x400025a000, {0x40000b2000, 0x2, 0x2})
Feb 03 12:01:47 monitor c3os[3034]:         /go/pkg/mod/github.com/urfave/[email protected]/app.go:277 +0x618
Feb 03 12:01:47 monitor c3os[3034]: main.main()
Feb 03 12:01:47 monitor c3os[3034]:         /work/cli/main.go:338 +0x1458
Feb 03 12:01:47 monitor systemd[1]: c3os-agent.service: Main process exited, code=exited, status=2/INVALIDARGUMENT
Feb 03 12:01:47 monitor systemd[1]: c3os-agent.service: Failed with result 'exit-code'.

CNI

It would be interesting to explore the idea of building a CNI around edgevpn. In this case we could natively handle the networking from the stack below. This is mostly exploratory for now; just an idea off the top of my head.

Add ubuntu and fedora flavor

Those can be easily tweaked from the openSUSE ones, and should support pairing and automated setup just like the openSUSE flavor.

🌱 SBOM

In order to keep track of, and be transparent about, what is shipped in each release, it would be preferable to have an automated process that collects SBOM information in the c3os context.

Action items

On releases, we should attach, among the artifacts:

Open questions

  • How to collect k3s in SBOM?
  • How to collect bundles SBOM?

We use

  • K3s
  • Base distro packages (kernel, systemd, grub, etc)
  • Our packages
  • Elemental-cli (fork)
  • EdgeVPN (uses libp2p)
  • Luet
  • KubeVIP

Deliverables

(these might already exist; worth checking 👀)

  • tool to create SPDX files from OS package information
  • tool to create SPDX files from luet-installed packages

Already existing tools

Action Items

  • SBOM attached to releases for kairos-io/kairos
  • SBOM attached to releases for kairos-io/provider-kairos

Accept kubernetes config style as input

Cloud config and c3os blocks should ideally be configurable as standard Kubernetes resources. To follow the same API style, a configuration similar to the Kubernetes API is desired for consistency. An example taken from kind:

kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
  - role: control-plane
    image: kindest/node:$KUBE_VERSION
    kubeadmConfigPatches:
      - |
        kind: InitConfiguration
        nodeRegistration:
          kubeletExtraArgs:
            node-labels: "ingress-ready=true"
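A c3os equivalent could hypothetically look like the following; the kind, apiVersion, and field names are invented here purely for illustration:

```yaml
# Purely illustrative sketch; no such API exists yet
kind: CloudConfig
apiVersion: c3os.io/v1alpha1
spec:
  networkToken: "..."
  k3s:
    args:
      - --disable=traefik
```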

Alpine: c3os-install keeps looping after install with netboot

Description

It seems that after installing with netboot, the c3os-install service keeps looping, so it tries to install again.

Context

This was originally reported on the Matrix chat.

Notes

We should check whether the nodepair.enable boot argument is still present after the automated install; that is what triggers the c3os-install component to kick in.

Action items

  • Check if the problem is reproducible
  • Document the netboot process; it might also be that this is caused by retrying to boot with netinstall after a successful installation

๐Ÿง rockylinux flavor

Currently the image definition is there, but it lacks a few things in order to boot properly. This card is to make the rockylinux flavor functional and included in releases.

Allow to scope roles

Currently roles are dynamically assigned, but they can't be scoped to single nodes.

  • Allow scoping node roles via configuration (done in bd63acd)
  • Role scoping via hostname
  • Allow setting node roles (controlplane, worker, etc. as k3s options)
  • Set roles manually from the CLI (done in bd63acd)

๐Ÿง tumbleweed flavor

Currently the image definition is there, but it lacks a few things in order to boot properly. This card is to make the tumbleweed flavor functional and included in releases.

  • opensuse should then track tumbleweed
  • a new flavor "opensuse-leap" should track openSUSE Leap
  • update the docs; this is a breaking change in our repos (opensuse now tracks tumbleweed, and Leap users are redirected to the new images)

edgevpn operator

A c3os operator would be nice, to represent EdgeVPN resources and cloud configs natively in the Kubernetes cluster. Ideally the operator would connect over the network and act as a shim for a Kubernetes API CRD.

Kairos release 1.1.1

🗺 What's left for release

Deliverables, as they currently stand:

🔦 Highlights

This release is focused on bringing support for more base images.

✅ Release Checklist

  • Stage 0 - Finishing Touches
    • Check c3os/packages for any needed updates
    • Make sure CI tests are passing.
  • Stage 1 - Infrastructure Testing
    • How: using the testing version, make sure that manual and k8s upgrades work from the latest release, and that the docs are still aligned
    • Where:
      • Two c3os nodes
        • Deploy latest release with automatic node setup
        • Upgrade to testing release
        • Analyze
          • Create deployments
          • Keep cluster running overnight
          • Run upgrades and verify workload is still running
          • Keep cluster running overnight
  • Stage 3 - Release
    • Tag the release on master.
      • Run NO_PUSH=true go run ./.github/tag.go <tag> to check that the correct tag will be created
      • Run go run ./.github/tag.go <tag> to tag a new release
  • Stage 4 - Update Upstream
    • Update the examples to the final release
    • Update the upstream testing branches to the final release and create PRs.
  • Make required changes to the release process.

doc: netbooting

Currently, even though netbooting is supported, there are no specific docs for it. This card is to expand the docs to cover that scenario and provide some inline examples. Ideally this should be tested by the CI, but that might be non-obvious, so manual QA tests would do for now.

Alpine: test fails "Call to flock failed"

C3OS version:
Current master, not released

CPU architecture, OS, and Version:

Describe the bug
Test failure: https://github.com/c3os-io/c3os/runs/7554332112?check_suite_focus=true

To Reproduce
N/A

Expected behavior

Logs

      <string>: {"level":"info","ts":1658989504.254293,"caller":"provider/bootstrap.go:96","msg":"Configuring VPN"}
      {"level":"info","ts":1658989504.3057752,"caller":"provider/bootstrap.go:96","msg":"Configuring VPN"}
      Provider c3os at /usr/bin/agent-provider-c3os had an error: Failed setup VPN: could not start svc: failed starting service: edgevpn.  * Call to flock failed: Resource temporarily unavailable
       * WARNING: edgevpn is already starting
       (exit status 1)

Additional context
Maybe a /run vs. /var/run issue?

:seedling: p2p api: communication via socket

EdgeVPN supports communication via socket, and currently kairos sets up an API which is accessible from localhost.

This card is about changing the default edgevpn service to bind to a socket instead of a local port:

  • Replace the HTTP listening API with a socket
  • Set up services and all the necessary parts to scope the socket to a specific system user
  • Configure the kairos CLI to talk to the socket instead of the API

Publish framework images as part of releases

Is your feature request related to a problem? Please describe.
Currently framework images are pushed only from master builds.

Describe the solution you'd like
Framework images should also be published as part of the release process.

Describe alternatives you've considered

Additional context

Split CLI

The current CLI has both control and agent logic embedded. It makes sense to extract them into two separate CLIs: c3os, which is user-facing, and c3os-agent, which runs inside the nodes.

  • Refactor the current CLI code into an agent-generic framework library
  • Build two binaries instead of one, always from the same Dockerfile

ntpd fails to start on alpine ARM image

The logs say:

Jan  6 12:00:20 monitor daemon.warn openrc[1809]: WARNING: clock skew detected!
Jan  6 12:00:21 monitor daemon.err acpid[1830]: /dev/input/event0: No such file or directory
Jan  6 12:00:21 monitor daemon.err /etc/init.d/ntpd[1855]: start-stop-daemon: failed to start `/usr/sbin/ntpd'
Jan  6 12:00:21 monitor daemon.err /etc/init.d/ntpd[1837]: ERROR: ntpd failed to start
…
Jan  6 12:00:30 monitor auth.info sshd[1887]: User c3os not allowed because account is locked
Jan  6 12:00:32 monitor auth.err sshd[1887]: error: Could not get shadow information for NOUSER
Jan  6 12:00:32 monitor auth.info sshd[1887]: Failed password for invalid user c3os from 192.168.77.138 port 50054 ssh2
Jan  6 12:00:33 monitor auth.info sshd[1887]: Connection closed by invalid user c3os 192.168.77.138 port 50054 [preauth]

Is there some configuration to add to cloud-init to enable ntp?
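A possible workaround, sketched with the cloud-init stage syntax used in other reports on this page; the stage name and OpenRC service names are assumptions and untested:

```yaml
# Hypothetical cloud-init snippet to enable ntpd on Alpine (OpenRC)
stages:
  boot:
    - name: "Enable ntpd"
      commands:
        - rc-update add ntpd default
        - rc-service ntpd restart
```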

Split c3os-provider to its own repo

Is your feature request related to a problem? Please describe.

With the c3os split, the c3os provider should sit in its own repository, with its own releases, versioning, and tests.

Describe the solution you'd like

  • A separate repo with the c3os provider code
  • Adapt the c3os agent to the split; the interactive installer depends on a few provider parts, and new events need to be emitted
  • Move CLI code parts inside the provider. This makes sense, as most of the mesh connectivity is associated with the provider.
  • Migrate specific tests and releases over to the new repo
  • Update docs

Describe alternatives you've considered

Additional context

fedora flavor

Currently the image definition is there, but it lacks a few things in order to boot properly. This card is to make the fedora flavor functional and included in releases.

Split c3os in variants

Is your feature request related to a problem? Please describe.
p2p mesh support is nice, but it would be even nicer to have a variant without it, so people can choose what to pick.

Describe the solution you'd like
a c3os variant with the mesh support/c3os provider stripped out

Describe alternatives you've considered

Additional context

systemctl can't start k3s

Hey there!
I really have been enjoying getting started with c3os!
I think it will serve as a great replacement for my 1U SuperMicro Server that was running K3OS.

I was following the Manual Installation portion of the docs once I had booted from the USB into GRUB2.

I shelled into the IP and SCP'd over a cloud_init file that looked like this:

c3os:
  network_token: "...."
  role: "master"
vpn:
  # EdgeVPN environment options
  DHCP: "false"
  ADDRESS: "10.1.0.2/24"
  
stages:
   initramfs:
     - name: "Set user and password"
       users:
        c3os:
          passwd: "c3os"
   network:
     - if: '[ ! -f "/run/cos/recovery_mode" ]'
       name: "Setup k3s"
       environment_file: "/etc/sysconfig/k3s"
       environment:
         K3S_TOKEN: "..."
       systemctl:
         start: 
         - k3s

Then I just ran sudo elemental install /dev/sda --cloud-init ./cloud_init.yaml.
The install went great!
Once it was done, I rebooted, popped the USB drive out, shelled into the box and grabbed the kubeconfig from sudo cat /etc/rancher/k3s/k3s.yaml.
From my workstation, I was then able to use kubectl with that --kubeconfig file.
I installed cert-manager and rancher.
It all worked like a charm.
Got into the dashboard and everything.
Reset the self-generated password.
Powered it down for the night.

When I booted it up this morning, I couldn't get kubectl to interact with the node.
I hopped on and took a peek at the journalctl logs:

Apr 27 15:54:21 c3os kernel: Bridge firewalling registered
Apr 27 15:54:23 c3os k3s[1852]: time="2022-04-27T15:54:23.449018811Z" level=info msg="Starting k3s v1.21.10+k3s1 (471f5eb3)"
Apr 27 15:54:23 c3os k3s[1852]: time="2022-04-27T15:54:23.478135963Z" level=info msg="Configuring sqlite3 database connection pooling: maxIdleConns=2, maxOpenConns=0, connMaxLifetime=0s"
Apr 27 15:54:23 c3os k3s[1852]: time="2022-04-27T15:54:23.478294563Z" level=info msg="Configuring database table schema and indexes, this may take a moment..."
Apr 27 15:54:23 c3os k3s[1852]: time="2022-04-27T15:54:23.498595746Z" level=info msg="Database tables and indexes are up to date"
Apr 27 15:54:23 c3os k3s[1852]: time="2022-04-27T15:54:23.799893800Z" level=info msg="Kine listening on unix://kine.sock"
Apr 27 15:54:23 c3os k3s[1852]: time="2022-04-27T15:54:23.824414503Z" level=fatal msg="starting kubernetes: preparing server: bootstrap data already found and encrypted with different token"
Apr 27 15:54:23 c3os systemd[1]: k3s.service: Main process exited, code=exited, status=1/FAILURE
Apr 27 15:54:23 c3os systemd[1]: k3s.service: Failed with result 'exit-code'.
Apr 27 15:54:23 c3os systemd[1]: Failed to start Lightweight Kubernetes.
Apr 27 15:54:23 c3os elemental[1731]: ERRO[2022-04-27T15:54:23Z] Job for k3s.service failed because the control process exited with error code.
Apr 27 15:54:23 c3os elemental[1731]: See "systemctl status k3s.service" and "journalctl -xe" for details.
Apr 27 15:54:23 c3os elemental[1731]: ERRO[2022-04-27T15:54:23Z] failed to run systemctl start k3s: exit status 1
Apr 27 15:54:23 c3os elemental[1731]: ERRO[2022-04-27T15:54:23Z] 1 error occurred:
Apr 27 15:54:23 c3os elemental[1731]:         * failed to run systemctl start k3s: exit status 1
Apr 27 15:54:23 c3os elemental[1731]:  
Apr 27 15:54:23 c3os elemental[1731]: INFO[2022-04-27T15:54:23Z] Stage 'network'. Defined stages: 1. Errors: true
Apr 27 15:54:23 c3os elemental[1731]: INFO[2022-04-27T15:54:23Z] Done executing stage 'network'
Apr 27 15:54:23 c3os elemental[1731]: INFO[2022-04-27T15:54:23Z] Running stage: network.after
Apr 27 15:54:23 c3os elemental[1731]: INFO[2022-04-27T15:54:23Z] Executing /system/oem/00_datasource.yaml
Apr 27 15:54:23 c3os elemental[1731]: INFO[2022-04-27T15:54:23Z] Executing /system/oem/00_rootfs.yaml
Apr 27 15:54:23 c3os elemental[1731]: INFO[2022-04-27T15:54:23Z] Executing /system/oem/02_agent.yaml
Apr 27 15:54:23 c3os elemental[1731]: INFO[2022-04-27T15:54:23Z] Executing /system/oem/03_branding.yaml
Apr 27 15:54:23 c3os elemental[1731]: INFO[2022-04-27T15:54:23Z] Executing /system/oem/04_installer.yaml
Apr 27 15:54:23 c3os elemental[1731]: INFO[2022-04-27T15:54:23Z] Executing /system/oem/05_network.yaml
Apr 27 15:54:23 c3os elemental[1731]: INFO[2022-04-27T15:54:23Z] Executing /system/oem/06_recovery.yaml
Apr 27 15:54:23 c3os elemental[1731]: INFO[2022-04-27T15:54:23Z] Executing /system/oem/07_live.yaml
Apr 27 15:54:23 c3os elemental[1731]: INFO[2022-04-27T15:54:23Z] Executing /system/oem/10_accounting.yaml
Apr 27 15:54:23 c3os elemental[1731]: INFO[2022-04-27T15:54:23Z] Executing /system/oem/11_persistency.yaml
Apr 27 15:54:23 c3os elemental[1731]: INFO[2022-04-27T15:54:23Z] Executing /system/oem/20_recovery_mode.yaml
Apr 27 15:54:23 c3os elemental[1731]: INFO[2022-04-27T15:54:23Z] Executing /system/oem/21_grub.yaml
Apr 27 15:54:23 c3os elemental[1731]: INFO[2022-04-27T15:54:23Z] Executing /oem/99_custom.yaml
Apr 27 15:54:23 c3os elemental[1731]: INFO[2022-04-27T15:54:23Z] Done executing stage 'network.after'
Apr 27 15:54:23 c3os elemental[1731]: INFO[2022-04-27T15:54:23Z] Running stage: network.before
Apr 27 15:54:23 c3os elemental[1731]: INFO[2022-04-27T15:54:23Z] Executing BOOT_IMAGE=(loop0)/boot/vmlinuz console=tty1 console=ttyS0 root=LABEL=COS_ACTIVE cos-img/filename=/cOS/active.img pani>
Apr 27 15:54:23 c3os elemental[1731]: INFO[2022-04-27T15:54:23Z] Done executing stage 'network.before'
Apr 27 15:54:23 c3os elemental[1731]: INFO[2022-04-27T15:54:23Z] Running stage: network
Apr 27 15:54:23 c3os elemental[1731]: INFO[2022-04-27T15:54:23Z] Executing BOOT_IMAGE=(loop0)/boot/vmlinuz console=tty1 console=ttyS0 root=LABEL=COS_ACTIVE cos-img/filename=/cOS/active.img pani>
Apr 27 15:54:23 c3os elemental[1731]: INFO[2022-04-27T15:54:23Z] Done executing stage 'network'
Apr 27 15:54:23 c3os elemental[1731]: INFO[2022-04-27T15:54:23Z] Running stage: network.after
Apr 27 15:54:23 c3os elemental[1731]: INFO[2022-04-27T15:54:23Z] Executing BOOT_IMAGE=(loop0)/boot/vmlinuz console=tty1 console=ttyS0 root=LABEL=COS_ACTIVE cos-img/filename=/cOS/active.img pani>
Apr 27 15:54:23 c3os elemental[1731]: INFO[2022-04-27T15:54:23Z] Done executing stage 'network.after'
Apr 27 15:54:23 c3os elemental[1731]: INFO[2022-04-27T15:54:23Z] Some errors found but were ignored. Enable --strict mode to fail on those or --debug to see them in the log
Apr 27 15:54:23 c3os elemental[1731]: WARN[2022-04-27T15:54:23Z] 2 errors occurred:
Apr 27 15:54:23 c3os elemental[1731]:         * No metadata/userdata found. Bye
Apr 27 15:54:23 c3os elemental[1731]:         * failed to run systemctl start k3s: exit status 1
Apr 27 15:54:23 c3os elemental[1731]:  
Apr 27 15:54:23 c3os systemd[1]: cos-setup-network.service: Succeeded.
Apr 27 15:54:23 c3os systemd[1]: Finished cOS setup after network.
Apr 27 15:54:23 c3os systemd[1]: Started c3os agent.
Apr 27 15:54:23 c3os elemental[1339]: INFO[2022-04-27T15:54:23Z] Command output:

###
### Further Down In the Journalctl Logs, the below repeats several times
###

Apr 27 16:44:54 c3os systemd[1]: k3s.service: Failed with result 'exit-code'.
Apr 27 16:44:54 c3os systemd[1]: Failed to start Lightweight Kubernetes.
Apr 27 16:44:59 c3os c3os[1866]: 2022-04-27T16:44:59.704Z        INFO        c3os        service/node.go:300        Applying role 'auto'
Apr 27 16:44:59 c3os c3os[1866]: 2022-04-27T16:44:59.704Z        INFO        c3os        service/role.go:115        Role loaded. Applying auto
Apr 27 16:44:59 c3os c3os[1866]: 2022-04-27T16:44:59.705Z        INFO        c3os        role/auto.go:23        Active nodes:[]
Apr 27 16:44:59 c3os c3os[1866]: 2022-04-27T16:44:59.706Z        INFO        c3os        role/auto.go:24        Advertizing nodes:[]
Apr 27 16:44:59 c3os c3os[1866]: 2022-04-27T16:44:59.706Z        INFO        c3os        role/auto.go:27        Not enough nodes
Apr 27 16:44:59 c3os c3os[1866]: 2022-04-27T16:44:59.706Z        INFO        c3os        service/node.go:300        Applying role 'master'
Apr 27 16:44:59 c3os c3os[1866]: 2022-04-27T16:44:59.706Z        INFO        c3os        service/role.go:115        Role loaded. Applying master
Apr 27 16:44:59 c3os c3os[1866]: 2022-04-27T16:44:59.706Z        WARN        c3os        [email protected]/log.go:175        Failed applying rolemasternode doesn't have an ip yet
Apr 27 16:44:59 c3os systemd[1]: k3s.service: Scheduled restart job, restart counter is at 449.
Apr 27 16:44:59 c3os systemd[1]: Stopped Lightweight Kubernetes.
Apr 27 16:44:59 c3os systemd[1]: Starting Lightweight Kubernetes...
Apr 27 16:44:59 c3os sh[13919]: + /usr/bin/systemctl is-enabled --quiet nm-cloud-setup.service
Apr 27 16:44:59 c3os sh[13924]: Failed to get unit file state for nm-cloud-setup.service: No such file or directory
Apr 27 16:45:01 c3os k3s[13930]: time="2022-04-27T16:45:01.230256008Z" level=info msg="Starting k3s v1.21.10+k3s1 (471f5eb3)"
Apr 27 16:45:01 c3os k3s[13930]: time="2022-04-27T16:45:01.234393722Z" level=info msg="Configuring sqlite3 database connection pooling: maxIdleConns=2, maxOpenConns=0, connMaxLifetime=0s"
Apr 27 16:45:01 c3os k3s[13930]: time="2022-04-27T16:45:01.234616000Z" level=info msg="Configuring database table schema and indexes, this may take a moment..."
Apr 27 16:45:01 c3os k3s[13930]: time="2022-04-27T16:45:01.235844848Z" level=info msg="Database tables and indexes are up to date"
Apr 27 16:45:01 c3os k3s[13930]: time="2022-04-27T16:45:01.268516607Z" level=info msg="Kine listening on unix://kine.sock"
Apr 27 16:45:01 c3os k3s[13930]: time="2022-04-27T16:45:01.316561140Z" level=fatal msg="starting kubernetes: preparing server: bootstrap data already found and encrypted with different token"
Apr 27 16:45:01 c3os systemd[1]: k3s.service: Main process exited, code=exited, status=1/FAILURE
Apr 27 16:45:01 c3os systemd[1]: k3s.service: Failed with result 'exit-code'.

I'm thinking I may just have configured the cloud-init incorrectly?
Should I have given an actual alphanumeric token instead of just the ellipses "..."/"...."?
Is it failing to start because of the "No metadata/userdata found. Bye" error?
Would this be something I could correct on the box somewhere under /oem, or would it be easiest to blow it away and re-install with a better cloud-init YAML config?

Thanks again for all the hard work on this project!
It's awesome!
I definitely appreciate any info/feedback!

Check if streamed events exists while notifying

Small nit: we should consider failing here if we pass an event which is not exposed in the bus. That way we can signal the user earlier if no event is registered in the SDK's bus; otherwise the event is silently ignored. No need to fix it right now, but something to keep in mind.

Originally posted by @mudler in #55 (comment)

Internal DNS

Internal DNS allows propagating custom domains to the cluster nodes. The feature is currently available but needs to be configured manually; this issue is about setting it up automatically:

  • Have a config stanza to enable/disable the embedded DNS
  • Automatically wire nodes to use the embedded DNS
  • Expose optional DNS configuration (forwarding, cache size, etc.)
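The config stanza from the first bullet might look something like this (field names and defaults are hypothetical):

```yaml
# Hypothetical stanza; names and defaults are not final
c3os:
  dns:
    enabled: true
    forwarders:
      - 8.8.8.8
    cache_size: 100
```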

Consume the SDK in the kairos-agent-provider

Is your feature request related to a problem? Please describe.
Currently the kairos-agent-provider is not consistent with what is provided by the SDK.

Describe the solution you'd like
The kairos-agent-provider should satisfy the SDK contract.

Describe alternatives you've considered
Keep things as they are

Additional context

c3os upgrade

A wrapper around cos-upgrade for manual upgrades, which gets the latest available version (from GitHub) and compares it to the system to see if an upgrade is required.

It should also be capable of listing the available images and selecting the latest one depending on the flavor.

macOS cli binary

First of all, thanks for the very nice project.

Currently there is no darwin binary for macOS users. Could this be added to the build process?

Ubuntu kernel drivers too old!

C3OS version:
1.23.6-53

CPU architecture, OS, and Version:
x86, Ubuntu 20.04 LTS

Describe the bug
The Lenovo P350 Tiny appliances have an embedded I219-LM Ethernet card. The Ubuntu 20.04 LTS variant does have the e1000e kernel driver for Intel network cards, but the version is 3.2.6-k, over 8 years old (from ~2015). Unfortunately, the Ethernet card is not claimed by the older driver.

To Reproduce
N/A

Expected behavior
The network card should be detected

Logs
N/A

Additional context
For an installer system like C3OS it's not feasible to download and compile drivers for all hardware, but to mitigate a poor experience, perhaps it's best to switch to linux-image-generic-hwe-20.04, which includes working drivers for this hardware. If you agree, I'm happy to submit a PR to switch the build to the newer kernel and drivers.
