microcloud's Introduction

MicroCloud

MicroCloud allows you to deploy your own fully functional cloud in minutes.

It’s a snap package that can automatically configure LXD, Ceph, and OVN across a set of servers. It relies on mDNS to automatically detect other servers on the network, making it possible to set up a complete cluster by running a single command on one of the machines.

MicroCloud creates a small footprint cluster of compute nodes with distributed storage and secure networking, optimized for repeatable, reliable remote deployments. MicroCloud is aimed at edge computing, and anyone in need of a small-scale private cloud.

Requirements

MicroCloud requires a minimum of three machines. It supports up to 50 machines.

To use local storage, each machine requires a local disk. To use distributed storage, at least three additional disks (not only partitions) for use by Ceph are required, and these disks must be on at least three different machines.

Once the simple initialisation is complete, users can launch, run and manage their workloads using system containers or VMs, and otherwise utilise regular LXD functionality.

How to get started

To get started, install the LXD, MicroCeph, MicroOVN and MicroCloud snaps. You can install them all at once with the following command:

snap install lxd microceph microovn microcloud

Then start the bootstrapping process with the following command:

microcloud init

Following the simple CLI prompts, a working MicroCloud will be ready within minutes.

The MicroCloud snap drives three other snaps (LXD, MicroCeph, and MicroOVN), enabling automated deployment of a highly available LXD cluster for compute with Ceph as the storage driver and OVN as the managed network.

During initialisation, MicroCloud detects the other servers and then prompts you to add disks to Ceph and configure the networking setup.

At the end of this, you’ll have an OVN cluster, a Ceph cluster, and a LXD cluster. LXD itself will have been configured with both networking and storage suitable for use in a cluster.

What about networking?

By default, MicroCloud uses MicroOVN for networking, which is a minimal wrapper around OVN (Open Virtual Network). If you decide to not use MicroOVN, MicroCloud falls back on the Ubuntu fan for basic networking.

What's next?

This is just the beginning of MicroCloud. We’re very excited about what’s coming up next!

microcloud's People

Contributors

camglegg, dependabot[bot], gabrielmougard, markylaing, masnax, mggmuggins, mseralessandri, musicdin, roosterfish, ru-fu, simondeziel, stgraber, tomponline, wizardbit

microcloud's Issues

OVN uplink network setup

We should introduce the bare minimum set of questions needed to set up a functional OVN network. To do this, we should:

  • Update microcloud init to ask
    • Setup distributed networking? [default=YES] (no keeps things with lxdfan as today)
    • Please select an uplink interface for each server: (show list of possible interfaces on all servers, make sure exactly one is selected per server)
    • IPv4 gateway on the uplink network: (CIDR, empty to skip IPv4)
    • IPv4 address range for use by LXD (required if IPv4 enabled)
    • IPv6 gateway on the uplink network
    • IPv6 address range for use by LXD (required if IPv6 enabled)
  • If OVN is being set up:
    • Define the uplink network as UPLINK in LXD (type physical, set parent for each server, set ipv4.gateway, ipv4.ovn.ranges, ipv6.gateway, ipv6.ovn.ranges on global entry)
    • Define a new network called default of type ovn
    • Make the default profile use the default network
  • If OVN isn't being set up:
    • Create lxdfan0 as an Ubuntu Fan network (same as today)
    • Make the default profile use the lxdfan0 network

There's a lot more we can do around this to get external addresses routed to it, BGP, ... but the above is the bare minimum and shouldn't be too hard to add.

The list of valid interfaces to use as uplink can be found using:

package main

import (
	"fmt"
	"os"
	"strings"

	"github.com/lxc/lxd/client"
	"github.com/lxc/lxd/shared"
)

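func main() {
	// Connect to the local LXD daemon over the snap's unix socket.
	d, err := lxd.ConnectLXDUnix("/var/snap/lxd/common/lxd/unix.socket", nil)
	if err != nil {
		fmt.Fprintf(os.Stderr, "Failed to connect to LXD: %v\n", err)
		os.Exit(1)
	}

	// List every network interface known to LXD, managed or not.
	networks, err := d.GetNetworks()
	if err != nil {
		fmt.Fprintf(os.Stderr, "Failed to list networks: %v\n", err)
		os.Exit(1)
	}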

	uplinks := []string{}
	for _, network := range networks {
		// Skip managed networks.
		if network.Managed {
			continue
		}

		// OpenVswitch only supports physical ethernet or VLAN interfaces, LXD also supports plugging in bridges.
		if !shared.StringInSlice(network.Type, []string{"physical", "bridge", "vlan"}) {
			continue
		}

		state, err := d.GetNetworkState(network.Name)
		if err != nil {
			continue
		}

		// OpenVswitch only works with full L2 devices.
		if state.Type != "broadcast" {
			continue
		}

		// Can't use interfaces that aren't up.
		if state.State != "up" {
			continue
		}

		// Make sure the interface isn't in use by ensuring there's no global addresses on it.
		addresses := []string{}
		for _, address := range state.Addresses {
			if address.Scope != "global" {
				continue
			}

			addresses = append(addresses, address.Address)
		}

		if len(addresses) > 0 {
			continue
		}

		uplinks = append(uplinks, network.Name)
	}

	fmt.Printf("Candidate: %s\n", strings.Join(uplinks, ", "))
}

The manual version of what's described above is:

lxc network create --type physical --target micro01 UPLINK parent=eth0.200
lxc network create --type physical --target micro02 UPLINK parent=eth0.200
lxc network create --type physical --target micro03 UPLINK parent=eth0.200
lxc network create --type physical UPLINK ipv4.gateway=172.30.200.1/24 ipv4.ovn.ranges=172.30.200.100-172.30.200.200 ipv6.gateway=2602:fc62:d:200::1/64 ipv6.ovn.ranges=2602:fc62:d:200::100-2602:fc62:d:200::200
lxc network create default network=UPLINK --type ovn
lxc profile device add default eth0 nic network=default name=eth0

Error when running microcloud init

After all questions of microcloud init are answered and the cluster starts to initialize, I get the following error:

Initializing a new cluster
 Local MicroCloud is ready
 Local LXD is ready
 Local MicroCeph is ready
Error: Failed to bootstrap local MicroOVN: Failed to generate the daemon configuration: Remote couldn't be found for "metal1"

Error: MicroCeph service cluster does not match MicroCloud

$ snap list microcloud lxd microceph
Name        Version        Rev    Tracking       Publisher   Notes
lxd         5.0.2-838e1b2  24322  5.0/stable/…   canonical✓  -
microceph   0+git.6208776  220    latest/stable  canonical✓  -
microcloud  0+git.09caf0c  174    latest/stable  canonical✓  -
$ sudo microcloud init
Please choose the address MicroCloud will be listening on [default=192.168.122.85]: 
Scanning for eligible servers...
Press enter to end scanning for servers
 Found "microcloud-3" at "192.168.122.44"
 Found "microcloud-2" at "192.168.122.200"

Ending scan
Initializing a new cluster
 Local MicroCloud is ready
 Local MicroCeph is ready
 Local LXD is ready
Awaiting cluster formation...
 Peer "microcloud-3" has joined the cluster
 Peer "microcloud-2" has joined the cluster
Cluster initialization is complete
Error: MicroCeph service cluster does not match MicroCloud

In the init phase, the seed host and 2 additional servers are found and asked to form a cluster. However, the init command returns an error.

Three servers are listed by microcloud cluster list, but microceph cluster list only recognizes one.

ubuntu@microcloud-1:~$ sudo microcloud cluster list 
+--------------+----------------------+-------+------------------------------------------------------------------+--------+
|     NAME     |       ADDRESS        | ROLE  |                           FINGERPRINT                            | STATUS |
+--------------+----------------------+-------+------------------------------------------------------------------+--------+
| microcloud-1 | 192.168.122.85:9443  | voter | fe51bf48dd761fde03ac9882850ee34a54098a09dbacaeca914bc18031d03267 | ONLINE |
+--------------+----------------------+-------+------------------------------------------------------------------+--------+
| microcloud-2 | 192.168.122.200:9443 | voter | 3d11c0a44325a4d1882ae219ff0052323af0f837c518d13f26f956d6d616344f | ONLINE |
+--------------+----------------------+-------+------------------------------------------------------------------+--------+
| microcloud-3 | 192.168.122.44:9443  | voter | 32b892456f2c0a2ecb739d4f71441682531470db288e4610d6f100c4a47b3e0f | ONLINE |
+--------------+----------------------+-------+------------------------------------------------------------------+--------+
ubuntu@microcloud-1:~$ sudo microceph cluster list
+--------------+---------------------+-------+------------------------------------------------------------------+--------+
|     NAME     |       ADDRESS       | ROLE  |                           FINGERPRINT                            | STATUS |
+--------------+---------------------+-------+------------------------------------------------------------------+--------+
| microcloud-1 | 192.168.122.85:7443 | voter | 38d5529c5723d8ee18e523c61528acd8068caf601b774fbcc2a13db6e62857ab | ONLINE |
+--------------+---------------------+-------+------------------------------------------------------------------+--------+
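
The mismatch between the two listings can be spotted mechanically. The following is a minimal sketch that assumes the ASCII table layout shown above, with the member name in the first column. The function names are hypothetical and this is not how MicroCloud itself compares the clusters.

```python
def parse_member_names(table_text):
    """Return the member names from the ASCII table printed by `cluster list`."""
    names = []
    for line in table_text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip borders ("+---+"), prompts, and blank lines
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if cells and cells[0] and cells[0] != "NAME":
            names.append(cells[0])
    return names


def missing_members(reference_table, service_table):
    """Members listed in the reference table but absent from the service table."""
    service = set(parse_member_names(service_table))
    return [name for name in parse_member_names(reference_table)
            if name not in service]
```

Fed the `microcloud cluster list` output as the reference and the `microceph cluster list` output as the service table, `missing_members` would report the members that never joined MicroCeph.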

Add option to specify mdns interface

microcloud init fails to initialize the cluster if bridge interfaces are configured as below:

# microcloud init
Using address "192.168.92.81" for MicroCloud
Limit search for other MicroCloud servers to 192.168.92.81/23? (yes/no) [default=yes]:
Scanning for eligible servers ...
Error: Failed lookup: write udp6 [::]:43320->[ff02::fb]:5353: sendto: cannot assign requested address

I'm trying to evaluate a MicroCloud deployment before migrating an LXD cluster to a MicroCloud cluster.

Minimal configuration for reproduction is:
3 nodes (I use Ubuntu 22.04 VMs in LXD cluster on top of RPI4 physical nodes)
Network config as below (netplan config files are attached):

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: enp5s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master br-ext0 state UP group default qlen 1000
    link/ether 00:16:3e:2f:f7:d3 brd ff:ff:ff:ff:ff:ff
3: br-ext0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 0e:15:18:3c:86:c1 brd ff:ff:ff:ff:ff:ff
    inet 192.168.92.81/23 brd 192.168.93.255 scope global br-ext0
       valid_lft forever preferred_lft forever
    inet6 fe80::c15:18ff:fe3c:86c1/64 scope link
       valid_lft forever preferred_lft forever

My suspicion is that microcloud init tries to bind the socket to the unconfigured physical interface instead of the bridge interface.

Steps to reproduce:

  • Deploy 3 (or more) nodes with Ubuntu 22.04 instances
  • Configure network bridge interface on each instance
  • Install microcloud, microovn, microceph and lxd snap packages on each
  • Run the microcloud init command and follow the wizard

Result:
The wizard fails with a message like: Error: Failed lookup: write udp6 [::]:43320->[ff02::fb]:5353: sendto: cannot assign requested address

Expected Result:
The wizard should find the unconfigured nodes and continue execution.

microcloud init works as expected if the IP address is configured on the "physical" interface.
node-3.conf.txt
node-1.conf.txt
node-2.conf.txt

Setup the `remote` storage pool at the end of bootstrap

If possible, we should set up a remote storage pool.

I say "if possible" because we should only do so if:

  • We have 3 or more disks added
  • Those disks are spread over 3 or more systems

If not, we should print a warning and not add it as it would otherwise hang due to lack of replication availability in Ceph.
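
The two conditions above can be expressed as a small check. This is a sketch under stated assumptions: the function name, the dictionary shape (member name mapped to the list of disks selected for Ceph), and the default thresholds of 3 are illustrative, taken from the bullet points rather than from MicroCloud's code.

```python
def can_create_remote_pool(disks_by_member, min_disks=3, min_members=3):
    """Return True when enough disks on enough distinct members were added.

    disks_by_member maps a member name to the list of disk paths selected
    for distributed storage on that member (hypothetical data shape).
    """
    total_disks = sum(len(disks) for disks in disks_by_member.values())
    members_with_disks = sum(1 for disks in disks_by_member.values() if disks)
    return total_disks >= min_disks and members_with_disks >= min_members


if __name__ == "__main__":
    selection = {"micro01": ["/dev/sdb"], "micro02": ["/dev/sdb"], "micro03": []}
    if not can_create_remote_pool(selection):
        print("Warning: not enough disks or members, skipping the remote pool")
```

Three disks on a single machine fail the check, which matches the replication concern described above.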

`microcloud init` can consume "at least 3 additional disks" by default but cannot reach to "for use by Ceph"

microcloud 0+git.445d39a

In the doc,

https://github.com/canonical/microcloud/blob/main/README.md?plain=1#L19

A minimum of 3 systems and at least 3 additional disks for use by Ceph are required.

My understanding is that one additional disk per system (three additional disks in total) is the expected minimum.

When we run microcloud init as follows and accept the default suggested options, it can consume all three additional disks for local storage before reaching the Ceph stage, so no disk is left for Ceph.

This first user journey could be improved by updating the description of questions in the init phase.

Would you like to setup local storage? (yes/no) [default=yes]:
Select exactly one disk from each cluster member:

Would you like to setup distributed storage? (yes/no) [default=yes]:
Found no available disks

$ uvt-kvm ssh microcloud-1 -- -t -- sudo microcloud init
Please choose the address MicroCloud will be listening on [default=192.168.122.230]: 
Scanning for eligible servers...
Press enter to end scanning for servers
 Found "microcloud-3" at "192.168.122.28"
 Found "microcloud-2" at "192.168.122.237"

Ending scan
Would you like to setup local storage? (yes/no) [default=yes]: 
Select exactly one disk from each cluster member:
Space to select; Enter to confirm; Esc to exit; Type to filter results.
Up/Down to move; Right to select all; Left to select none.
       +--------------+---------------+----------+------+-------------------------------------------------+
       |   LOCATION   |     MODEL     | CAPACITY | TYPE |                      PATH                       |
       +--------------+---------------+----------+------+-------------------------------------------------+
> [x]  | microcloud-1 | QEMU HARDDISK | 32.00GiB | scsi | /dev/disk/by-id/scsi-SATA_QEMU_HARDDISK_QM00001 |
  [x]  | microcloud-2 | QEMU HARDDISK | 32.00GiB | scsi | /dev/disk/by-id/scsi-SATA_QEMU_HARDDISK_QM00001 |
  [x]  | microcloud-3 | QEMU HARDDISK | 32.00GiB | scsi | /dev/disk/by-id/scsi-SATA_QEMU_HARDDISK_QM00001 |
       +--------------+---------------+----------+------+-------------------------------------------------+

Would you like to setup distributed storage? (yes/no) [default=yes]: 
Found no available disks
Retry selecting disks? (yes/no) [default=yes]: 
Found no available disks
Retry selecting disks? (yes/no) [default=yes]: no
Initializing a new cluster
 Local MicroCloud is ready
 Local MicroOVN is ready
 Local LXD is ready
 Local MicroCeph is ready
Awaiting cluster formation...
 Peer "microcloud-3" has joined the cluster
 Peer "microcloud-2" has joined the cluster
Cluster initialization is complete
Error: Pool not pending on any node (use --target <node> first)

charms: synchronized multi-units scale-up support

As of now, the charm can reliably tell the leader unit to wait for the newly added unit to install its dependencies before calling microcloud add --auto. This approach won't scale well when the user asks for a scale-up with more than one unit:

# For example, a three-unit scale-up
juju add-unit microcloud -n 3 --to 4,5,6

Idea:

There could be a waiting mechanism that tells the leader when to execute, based on the newly provided set of units. There would be only one call to self.microcloud_add() (basically a subprocess call to microcloud add --auto) that would add the new members all at once.

def _on_cluster_relation_joined(self, event: RelationJoinedEvent) -> None:
    """Add a new set of nodes to an existing Microcloud cluster"""
    if (
        self.unit.is_leader()
        and self.get_peer_data_str(self.unit, "clustered") == "True"
        and event.unit
        != self.unit  # Don't add the leader to the cluster as it is already there
    ):
        previous_num_clustered_units = int(
            self.get_peer_data_str(self.app, "num_clustered_units")
        )
        peers_clustered = sum(
            1 for unit in self.peers.units if self.peers.data[unit].get("clustered") == "True"
        )
        num_new_nodes = sum(
            1 for unit in self.peers.units if self.peers.data[unit].get("new_node") == "True"
        )
        if self.app.planned_units() != peers_clustered + 1 and num_new_nodes == (
            self.app.planned_units() - previous_num_clustered_units
        ):
            # If not all units are clustered yet and the number of new nodes equals
            # the number of planned units minus the number of previously clustered units,
            # we can try to add these new nodes all at once.
            try:
                self.microcloud_add()
                logger.info("New Microcloud node successfully added")
                # TODO: update num_clustered_units with `lxc cluster list -f csv`
                self.unit_active("Successfully added new nodes")
                return
            except RuntimeError:
                logger.error("Failed to add new Microcloud nodes")
                self.unit_active("Failed to add new Microcloud nodes")
        
        event.defer() 
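
The gating idea in the handler above can be isolated into a pure function, which makes it easy to reason about and to test without a Juju model. This is a simplified sketch, not the charm's code: `ready_to_add` and its parameters are hypothetical, and it keeps only the "all expected new nodes have reported in" part of the condition.

```python
def ready_to_add(planned_units, previously_clustered, new_nodes_ready):
    """Decide whether the leader may run a single `microcloud add --auto`.

    planned_units:        total number of units the application should have
    previously_clustered: units that were already clustered before the scale-up
    new_nodes_ready:      new units that have flagged themselves as ready
    """
    expected_new = planned_units - previously_clustered
    # Nothing to add, or a scale-down: never trigger an add.
    if expected_new <= 0:
        return False
    # Wait until every expected new unit has reported in, then add them at once.
    return new_nodes_ready == expected_new
```

With 3 clustered units and a three-unit scale-up, the function stays False for the first two joined events and turns True on the third, so the handler would defer twice and add once.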

Add validation stage after init

Once microcloud init is done, we should offer a validation phase that would:

  • Spawn an instance on each machine and storage pool
  • Validate communication at max MTU across instances
  • Validate communication to uplink gateway from the instances
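
For the max-MTU check, the largest ping payload that fits in one unfragmented packet is the MTU minus the IP and ICMP headers. The sketch below assumes the Linux iputils `ping` (its `-M do` option prohibits fragmentation) and an IPv4 header without options. The helper names are hypothetical.

```python
import ipaddress

ICMP_HEADER = 8  # bytes; the echo header is 8 bytes for both ICMP and ICMPv6


def max_ping_payload(mtu, address):
    """Largest echo payload that fits in a single packet of the given MTU."""
    version = ipaddress.ip_address(address).version
    ip_header = 20 if version == 4 else 40  # IPv4 without options / fixed IPv6 header
    return mtu - ip_header - ICMP_HEADER


def mtu_ping_command(mtu, address):
    """Build a one-shot ping that fails if the path cannot carry a full-MTU packet."""
    family = "-4" if ipaddress.ip_address(address).version == 4 else "-6"
    size = max_ping_payload(mtu, address)
    return ["ping", family, "-c", "1", "-M", "do", "-s", str(size), address]
```

For a 1500-byte MTU this gives a payload of 1472 bytes over IPv4 and 1452 bytes over IPv6.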

Allow running `microcloud` with `microceph` or `microovn` missing

We should be tolerant of microceph and/or microovn being missing.
In such a case, we should only cluster the services that are available.
This would allow the deployment of MicroCloud with just LXD and OVN, using a pre-existing external Ceph for example.

To do this, we should detect the available services on the machine running microcloud init, show a prominent warning, and ask for confirmation to move on without all services. All other servers are then expected to have at least the same services as the bootstrap server; servers that are missing some of those services should not be selected during the mDNS stage.

This will most likely require the servers to broadcast what services they are running during the mdns stage, or offer an unauthenticated API allowing the bootstrap server to check what they have.
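
The selection rule described above amounts to a subset check. A minimal sketch, assuming each server somehow reports the set of services it runs (how that is broadcast is exactly the open question above); the function names and the data shape are hypothetical.

```python
OPTIONAL_SERVICES = {"microceph", "microovn"}


def missing_optional_services(bootstrap_services):
    """Optional services absent on the bootstrap server (to warn about)."""
    return sorted(OPTIONAL_SERVICES - set(bootstrap_services))


def eligible_peers(bootstrap_services, peer_services):
    """Peers that run at least every service the bootstrap server runs.

    peer_services maps a peer name to the services it reported.
    """
    required = set(bootstrap_services)
    return sorted(name for name, services in peer_services.items()
                  if required <= set(services))
```

A peer with extra services still qualifies; a peer lacking any service present on the bootstrap server is left out of the selection.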

empty path of /dev/disk/by-id/ for virtio drives

Moved from:
canonical/microceph#113

$ snap list microcloud microceph
Name        Version        Rev  Tracking     Publisher   Notes
microceph   0+git.38a6bb6  289  latest/edge  canonical✓  -
microcloud  0+git.445d39a  264  latest/edge  canonical✓  -

When a system has virtio drives, the following table will be shown in microcloud init.

$ sudo microcloud init
Please choose the address MicroCloud will be listening on [default=192.168.122.68]: 
Scanning for eligible servers...
Press enter to end scanning for servers
 Found "microcloud-2" at "192.168.122.143"
 Found "microcloud-3" at "192.168.122.145"

Ending scan
Would you like to setup local storage? (yes/no) [default=yes]: 
Select exactly one disk from each cluster member:
Space to select; Enter to confirm; Esc to exit; Type to filter results.
Up/Down to move; Right to select all; Left to select none.
       +--------------+---------------+-----------+--------+-------------------------------------------------+
       |   LOCATION   |     MODEL     | CAPACITY  |  TYPE  |                      PATH                       |
       +--------------+---------------+-----------+--------+-------------------------------------------------+
> [ ]  | microcloud-1 | QEMU HARDDISK | 32.00GiB  | scsi   | /dev/disk/by-id/scsi-SATA_QEMU_HARDDISK_QM00001 |
  [ ]  | microcloud-1 |               | 32.00GiB  | virtio | /dev/disk/by-id/                                |
  [ ]  | microcloud-1 |               | 372.00KiB | virtio | /dev/disk/by-id/                                |
  [ ]  | microcloud-2 | QEMU HARDDISK | 32.00GiB  | scsi   | /dev/disk/by-id/scsi-SATA_QEMU_HARDDISK_QM00001 |
  [ ]  | microcloud-2 |               | 32.00GiB  | virtio | /dev/disk/by-id/                                |
  [ ]  | microcloud-2 |               | 372.00KiB | virtio | /dev/disk/by-id/                                |
  [ ]  | microcloud-3 | QEMU HARDDISK | 32.00GiB  | scsi   | /dev/disk/by-id/scsi-SATA_QEMU_HARDDISK_QM00001 |
  [ ]  | microcloud-3 |               | 32.00GiB  | virtio | /dev/disk/by-id/                                |
  [ ]  | microcloud-3 |               | 372.00KiB | virtio | /dev/disk/by-id/                                |
       +--------------+---------------+-----------+--------+-------------------------------------------------+


Would you like to setup distributed storage? (yes/no) [default=yes]: 
Select from the available unpartitioned disks:

Space to select; Enter to confirm; Esc to exit; Type to filter results.
Up/Down to move; Right to select all; Left to select none.
       +--------------+-------+-----------+--------+------------------+
       |   LOCATION   | MODEL | CAPACITY  |  TYPE  |       PATH       |
       +--------------+-------+-----------+--------+------------------+
  [x]  | microcloud-1 |       | 32.00GiB  | virtio | /dev/disk/by-id/ |
  [ ]  | microcloud-1 |       | 372.00KiB | virtio | /dev/disk/by-id/ |
  [x]  | microcloud-2 |       | 32.00GiB  | virtio | /dev/disk/by-id/ |
  [ ]  | microcloud-2 |       | 372.00KiB | virtio | /dev/disk/by-id/ |
> [x]  | microcloud-3 |       | 32.00GiB  | virtio | /dev/disk/by-id/ |
  [ ]  | microcloud-3 |       | 372.00KiB | virtio | /dev/disk/by-id/ |
       +--------------+-------+-----------+--------+------------------+

And by selecting those drives, it fails with
Error: Failed adding new disk: Invalid disk path: /dev/disk/by-id/
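
One possible guard, shown here only as a sketch, is to filter out rows whose path is just the bare `/dev/disk/by-id/` directory before offering them for selection. This is not the fix adopted by MicroCloud or MicroCeph; the function names and the row shape are hypothetical.

```python
BY_ID_PREFIX = "/dev/disk/by-id/"


def has_stable_id(path):
    """True when the path names an actual by-id entry, not just the directory."""
    return path.startswith(BY_ID_PREFIX) and len(path) > len(BY_ID_PREFIX)


def selectable_disks(rows):
    """Keep only the disks that can be referenced by a stable by-id path.

    rows is a list of dicts with at least a "path" key (hypothetical shape).
    """
    return [row for row in rows if has_stable_id(row["path"])]
```

Applied to the table above, only the scsi disks would remain, so a real fix would more likely need a fallback path for virtio devices rather than hiding them.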

OVS connection to OVN southbound database via `ovn-remote` not set correctly on all members

I would have logged this on https://github.com/canonical/microovn/ but it has Issues disabled.

I have reproduced an issue (that looks like another race condition) where the OVS ovn-remote setting is not set correctly by microovn during MicroCloud bootstrap.

Using this setup script:

#!/bin/sh
# Cleanup
if [ "${1:-""}" = "reset" ]; then
    set -eux
    lxc project switch microcloud
    set +e

    lxc delete -f micro01
    lxc delete -f micro02
    lxc delete -f micro03

    lxc storage volume delete default micro01-disk1
    lxc storage volume delete default micro01-disk2
    lxc storage volume delete default micro02-disk1
    lxc storage volume delete default micro02-disk2
    lxc storage volume delete default micro03-disk1
    lxc storage volume delete default micro03-disk2

    lxc profile device remove default eth1
    lxc profile device remove default eth0
    lxc profile device remove default root

    lxc network delete microbr0

    for fp in $(lxc image list -cf -fcsv); do
        lxc image delete "${fp}"
    done

    lxc project switch default
    lxc project delete microcloud
    exit 0
fi

set -eux

# Create project
lxc project create microcloud
lxc project switch microcloud

# Setup default profile
lxc profile device add default root disk pool=default path=/
lxc profile device add default eth0 nic network=lxdbr0 name=eth0

# Create uplink network
lxc network create microbr0 \
    ipv4.address=10.123.123.1/24 ipv4.dhcp=false ipv4.nat=true \
    ipv6.address=fd42:1234:1234:1234::1/64 ipv6.nat=true
lxc profile device add default eth1 nic network=microbr0 name=eth1

# Create extra disks
lxc storage volume create default micro01-disk1 size=30GiB --type=block
lxc storage volume create default micro01-disk2 size=50GiB --type=block

lxc storage volume create default micro02-disk1 size=30GiB --type=block
lxc storage volume create default micro02-disk2 size=50GiB --type=block

lxc storage volume create default micro03-disk1 size=30GiB --type=block
lxc storage volume create default micro03-disk2 size=50GiB --type=block

# Create instances
lxc init ubuntu:22.04 micro01 --vm -c limits.cpu=2 -c limits.memory=3GiB
lxc config device add micro01 disk1 disk pool=default source=micro01-disk1
lxc config device add micro01 disk2 disk pool=default source=micro01-disk2
lxc start micro01

lxc init ubuntu:22.04 micro02 --vm -c limits.cpu=2 -c limits.memory=3GiB
lxc config device add micro02 disk1 disk pool=default source=micro02-disk1
lxc config device add micro02 disk2 disk pool=default source=micro02-disk2
lxc start micro02

lxc init ubuntu:22.04 micro03 --vm -c limits.cpu=2 -c limits.memory=3GiB
lxc config device add micro03 disk1 disk pool=default source=micro03-disk1
lxc config device add micro03 disk2 disk pool=default source=micro03-disk2
lxc start micro03

# Wait for things to boot
sleep 1m

# Bring enp6s0 up but disable IPv6 (should do through netplan)
lxc exec micro01 -- ip link set enp6s0 up
lxc exec micro02 -- ip link set enp6s0 up
lxc exec micro03 -- ip link set enp6s0 up
lxc exec micro01 -- sh -c "echo 1 > /proc/sys/net/ipv6/conf/enp6s0/disable_ipv6"
lxc exec micro02 -- sh -c "echo 1 > /proc/sys/net/ipv6/conf/enp6s0/disable_ipv6"
lxc exec micro03 -- sh -c "echo 1 > /proc/sys/net/ipv6/conf/enp6s0/disable_ipv6"

# Install the snaps
lxc exec micro01 -- snap refresh lxd --channel="latest/${1:-stable}"
lxc exec micro02 -- snap refresh lxd --channel="latest/${1:-stable}"
lxc exec micro03 -- snap refresh lxd --channel="latest/${1:-stable}"
lxc exec micro01 -- snap install microceph --channel="${1:-stable}"
lxc exec micro01 -- snap install microovn --channel="${1:-stable}"
lxc exec micro01 -- snap install microcloud --channel="${1:-stable}"
lxc exec micro02 -- snap install microceph --channel="${1:-stable}"
lxc exec micro02 -- snap install microovn --channel="${1:-stable}"
lxc exec micro02 -- snap install microcloud --channel="${1:-stable}"
lxc exec micro03 -- snap install microceph --channel="${1:-stable}"
lxc exec micro03 -- snap install microovn --channel="${1:-stable}"
lxc exec micro03 -- snap install microcloud --channel="${1:-stable}"

# Show settings
set +x
echo ""
echo "Disks:"
echo "  Local storage: Use 30GiB disks"
echo "  Remote storage: Use 50GiB disks"
echo "OVN:"
echo "  IPv4 subnet: 10.123.123.1/24"
echo "  IPv4 start:  10.123.123.100"
echo "  IPv4 end:    10.123.123.254"
echo "  IPv6 subnet: fd42:1234:1234:1234::1/64"
echo ""

# Run microcloud init
lxc exec micro01 -- microcloud init

lxc ls
+---------+---------+----------------------+-------------------------------------------------+-----------------+-----------+
|  NAME   |  STATE  |         IPV4         |                      IPV6                       |      TYPE       | SNAPSHOTS |
+---------+---------+----------------------+-------------------------------------------------+-----------------+-----------+
| micro01 | RUNNING | 10.21.203.9 (enp5s0) | fd42:ffdb:caff:baf7:216:3eff:fee3:8e0e (enp5s0) | VIRTUAL-MACHINE | 0         |
+---------+---------+----------------------+-------------------------------------------------+-----------------+-----------+
| micro02 | RUNNING | 10.21.203.5 (enp5s0) | fd42:ffdb:caff:baf7:216:3eff:fe2b:bf3f (enp5s0) | VIRTUAL-MACHINE | 0         |
+---------+---------+----------------------+-------------------------------------------------+-----------------+-----------+
| micro03 | RUNNING | 10.21.203.6 (enp5s0) | fd42:ffdb:caff:baf7:216:3eff:fea4:cc00 (enp5s0) | VIRTUAL-MACHINE | 0         |
+---------+---------+----------------------+-------------------------------------------------+-----------------+-----------+

After the cluster has been configured, enter each member and check the config of ovn-remote in OVS:

lxc exec micro01 -- microovn.ovs-vsctl get open_vswitch . external_ids:ovn-remote
"tcp:10.21.203.9:6642,tcp:10.21.203.6:6642,tcp:10.21.203.5:6642"

lxc exec micro02 -- microovn.ovs-vsctl get open_vswitch . external_ids:ovn-remote
"tcp:10.21.203.9:6642,tcp:10.21.203.6:6642,tcp:10.21.203.5:6642"

lxc exec micro03 -- microovn.ovs-vsctl get open_vswitch . external_ids:ovn-remote
"tcp:10.21.203.9:6642,tcp:10.21.203.6:6642,"

So not only is micro03's setting an invalid list (trailing commas cause an issue with OVN), it's also missing the entry for micro02.
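
Both symptoms (an empty list entry and a missing member) can be detected from the value printed by `ovs-vsctl`. The following is a minimal sketch that assumes IPv4 member addresses and the `tcp:<address>:6642` form shown above; `check_ovn_remote` is a hypothetical helper, not part of MicroOVN.

```python
def check_ovn_remote(raw_value, expected_addresses, port=6642):
    """Inspect an ovn-remote value as printed by `ovs-vsctl get`.

    Returns a dict with:
      malformed: True if the list contains an empty entry (e.g. a trailing comma)
      missing:   expected "tcp:<addr>:<port>" entries that are not in the list
    """
    value = raw_value.strip().strip('"')
    entries = value.split(",")
    present = {entry for entry in entries if entry}
    expected = {f"tcp:{address}:{port}" for address in expected_addresses}
    return {
        "malformed": any(entry == "" for entry in entries),
        "missing": sorted(expected - present),
    }


if __name__ == "__main__":
    members = ["10.21.203.9", "10.21.203.6", "10.21.203.5"]
    print(check_ovn_remote('"tcp:10.21.203.9:6642,tcp:10.21.203.6:6642,"', members))
```

Running the check on every member after bootstrap would flag micro03 in the output above while leaving micro01 and micro02 clean.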

charms: Support `preseed` bootstrapping

Using the initial preseed work #144, we could introduce a new preseed charm configuration key during the deploy phase like:

juju deploy ./microcloud_ubuntu-22.04-amd64.charm -n 3 --to 0,1,2 --config preseed=<PATH_YAML_PRESEED_FILE>

This would deploy the microcloud units according to the preseed rules.

Support installation on VMs running in a cloud provider (where mDNS does not work)

The documentation warns that mDNS will fail on a cloud provider:

MicroCloud uses mDNS to automatically detect other servers on the network.
This method works in physical networks, but it is usually not supported in a cloud environment.
https://canonical-microcloud.readthedocs-hosted.com/en/latest/explanation/initialisation/#explanation-initialisation

Is it possible to specify cluster node addresses by hand? According to #35, this feature was considered. It does not seem to be implemented, though, at least not in microcloud 0.1.

ubuntu@djobik:~$ sudo microcloud init
Select an address for MicroCloud's internal traffic:

 Using address "10.0.169.48" for MicroCloud

Limit search for other MicroCloud servers to 10.0.169.48/24? (yes/no) [default=yes]: 
Scanning for eligible servers ...

MicroOVN also seems to rely on mDNS, so joining nodes on a cloud provider does not seem to work there either.

charms: Introduce an integration testing suite with `scenario`

Using the https://juju.is/docs/sdk/scenario testing framework, we will introduce the testing of some use cases to assess the resilience of the charm against chaotic conditions:

  • Three nodes random cluster initialization
  • Four nodes random cluster initialization
  • Three node initialized + one-unit scale-up
  • Four node initialized + one-unit scale-down
  • Three node initialized + multiple-units scale-up (ranging from 2 to N. Could we have N=47? We'd then reach a total cluster size of 50, which is the upper theoretical limit)
  • More than five node cluster + multiple-units scale-down

This should be a very good start.

No suggestion to configure an uplink for OVN

Follow-up of #87

microcloud 0+git.136aeb1

Even if one NIC is left unconfigured so that OVN can consume it, microcloud skips the OVN step with the message "No dedicated uplink interfaces detected, skipping distributed networking". The expectation is that users are guided through configuring OVN with an uplink.

ubuntu@mc-1:~$ ip -br a
lo               UNKNOWN        127.0.0.1/8 ::1/128 
enp1s0           UP             192.168.122.172/24 metric 100 fe80::5054:ff:fe56:3c7d/64 
enp7s0           DOWN           

^^^ enp7s0 is unconfigured at that point.
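
The candidate-listing program in the OVN uplink issue above skips interfaces that are not up and interfaces that carry a global address. If MicroCloud's detection follows the same rules (an assumption; nothing here confirms it), the DOWN state of enp7s0 would be enough to exclude it. The sketch below applies those two rules to `ip -br a` output; `classify_interfaces` is a hypothetical helper.

```python
import ipaddress


def classify_interfaces(ip_brief_output):
    """Map each interface in `ip -br a` output to "candidate" or a rejection reason."""
    result = {}
    for line in ip_brief_output.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue
        name, state = fields[0], fields[1]

        # Rule 1: interfaces that are not up cannot be used.
        if state != "UP":
            result[name] = "not up"
            continue

        # Rule 2: a global address means the interface is already in use.
        has_global = False
        for token in fields[2:]:
            if "/" not in token:
                continue  # skip extras such as "metric 100"
            ip = ipaddress.ip_interface(token).ip
            if not (ip.is_link_local or ip.is_loopback):
                has_global = True
        result[name] = "has a global address" if has_global else "candidate"
    return result
```

Under these assumed rules, none of the three interfaces listed above qualifies: lo and enp7s0 are not up, and enp1s0 holds a global IPv4 address.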

ubuntu@mc-1:~$ sudo microcloud init
Using address "192.168.122.172" for MicroCloud
Limit search for other MicroCloud servers to 192.168.122.172/24? (yes/no) [default=yes]: 
Scanning for eligible servers...

 Selected "mc-2" at "192.168.122.105"
 Selected "mc-3" at "192.168.122.219"

Would you like to setup local storage? (yes/no) [default=yes]: 
Select exactly one disk from each cluster member:

 Using "/dev/disk/by-id/scsi-SATA_QEMU_HARDDISK_QM00001" on "mc-3" for local storage pool
 Using "/dev/disk/by-id/scsi-SATA_QEMU_HARDDISK_QM00001" on "mc-1" for local storage pool
 Using "/dev/disk/by-id/scsi-SATA_QEMU_HARDDISK_QM00001" on "mc-2" for local storage pool

Would you like to setup distributed storage? (yes/no) [default=yes]: 
Select from the available unpartitioned disks:

Select which disks to wipe:

 Using 1 disk(s) on "mc-1" for remote storage pool
 Using 1 disk(s) on "mc-2" for remote storage pool
 Using 1 disk(s) on "mc-3" for remote storage pool

No dedicated uplink interfaces detected, skipping distributed networking
Initializing a new cluster
 Local MicroCloud is ready
 Local LXD is ready
 Local MicroOVN is ready
 Local MicroCeph is ready
Awaiting cluster formation...
 Peer "mc-3" has joined the cluster
 Peer "mc-2" has joined the cluster
Cluster initialization is complete
MicroCloud is ready
$ lxc network list
+---------+----------+---------+------+------+-------------+---------+---------+
|  NAME   |   TYPE   | MANAGED | IPV4 | IPV6 | DESCRIPTION | USED BY |  STATE  |
+---------+----------+---------+------+------+-------------+---------+---------+
| enp1s0  | physical | NO      |      |      |             | 0       |         |
+---------+----------+---------+------+------+-------------+---------+---------+
| enp7s0  | physical | NO      |      |      |             | 0       |         |
+---------+----------+---------+------+------+-------------+---------+---------+
| lxdfan0 | bridge   | YES     |      |      |             | 1       | CREATED |
+---------+----------+---------+------+------+-------------+---------+---------+
$ lxc profile show default
config: {}
description: ""
devices:
  eth0:
    name: eth0
    network: lxdfan0
    type: nic
  root:
    path: /
    pool: remote
    type: disk
name: default
used_by: []

Add option to disable ipv6 protocol in microcloud init command.

Problem description:

microcloud init requires both the IPv4 and the IPv6 stack.
microcloud calls mdns.DefaultParams(service), which instructs the mdns Query function call to use both IPv4 and IPv6 sockets.
If IPv6 is disabled, microcloud init fails with the message Error: Failed lookup: write udp6 [::]:<SRC_PORT>->[ff02::fb]:5353: sendto: cannot assign requested address.

Prerequisites

  • A single instance with a single network interface is enough for reproduction
  • Install microcloud via snap: snap install microcloud (regardless of which channel is used; all are impacted)
  • Disable the IPv6 stack (for example, using the shell command: sudo sysctl -w net.ipv6.conf.all.disable_ipv6=1; sudo sysctl -w net.ipv6.conf.default.disable_ipv6=1)

Steps to reproduce:

  • Run sudo microcloud init --auto
    Execution stops after the Scanning for eligible servers ... step with the message Error: Failed lookup: write udp6 [::]:<SRC_PORT>->[ff02::fb]:5353: sendto: cannot assign requested address

Recommendations to fix
Add an option to the microcloud init command to disable the IPv6 protocol.
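
A minimal sketch of the kind of check such an option could be paired with, written in Python for illustration only (MicroCloud itself is written in Go). It assumes Linux exposes the sysctl net.ipv6.conf.all.disable_ipv6 under /proc/sys; the function names and the disable_v6_flag parameter are hypothetical and do not correspond to an existing MicroCloud flag.

```python
from pathlib import Path

# Linux exposes the sysctl net.ipv6.conf.all.disable_ipv6 at this path.
DISABLE_IPV6_SYSCTL = Path("/proc/sys/net/ipv6/conf/all/disable_ipv6")


def ipv6_disabled(sysctl_file: Path = DISABLE_IPV6_SYSCTL) -> bool:
    """Return True when the sysctl file reports IPv6 as disabled.

    A missing file is treated as "disabled" (assumption: the kernel has
    no IPv6 support loaded in that case).
    """
    try:
        return sysctl_file.read_text().strip() == "1"
    except FileNotFoundError:
        return True


def lookup_families(disable_v6_flag: bool,
                    sysctl_file: Path = DISABLE_IPV6_SYSCTL) -> list[str]:
    """Pick the address families a (hypothetical) mDNS lookup should use."""
    families = ["ipv4"]
    # Only add IPv6 when the user did not opt out and the stack is enabled.
    if not disable_v6_flag and not ipv6_disabled(sysctl_file):
        families.append("ipv6")
    return families
```

With a check like this, the lookup could fall back to IPv4 automatically instead of failing on the udp6 write, even when the user does not pass an explicit option.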

No OVN networking after completing `init`

microcloud 0+git.36ab378

init completed without any error for the distributed networking part, but there is no OVN network defined after that.

I suppose at least one additional network interface is necessary besides the main NIC that the host OS uses, but init had no step to specify an interface and did not warn about the missing additional interface.

ubuntu@mc-1:~$ lxc network list
+---------+----------+---------+------+------+-------------+---------+---------+
|  NAME   |   TYPE   | MANAGED | IPV4 | IPV6 | DESCRIPTION | USED BY |  STATE  |
+---------+----------+---------+------+------+-------------+---------+---------+
| enp1s0  | physical | NO      |      |      |             | 0       |         |
+---------+----------+---------+------+------+-------------+---------+---------+
| lxdfan0 | bridge   | YES     |      |      |             | 1       | CREATED |
+---------+----------+---------+------+------+-------------+---------+---------+

ubuntu@mc-1:~$ lxc profile show default
config: {}
description: ""
devices:
  eth0:
    name: eth0
    network: lxdfan0
    type: nic
  root:
    path: /
    pool: remote
    type: disk
name: default
used_by: []
$ sudo microcloud init
Using address "192.168.122.110" for MicroCloud
Limit search for other MicroCloud servers to 192.168.122.110/24? (yes/no) [default=yes]: 
Scanning for eligible servers...

 Selected "mc-2" at "192.168.122.154"
 Selected "mc-3" at "192.168.122.156"

Would you like to setup local storage? (yes/no) [default=yes]: 
Select exactly one disk from each cluster member:

 Using "/dev/disk/by-id/scsi-SATA_QEMU_HARDDISK_QM00001" on "mc-1" for local storage pool
 Using "/dev/disk/by-id/scsi-SATA_QEMU_HARDDISK_QM00001" on "mc-2" for local storage pool
 Using "/dev/disk/by-id/scsi-SATA_QEMU_HARDDISK_QM00001" on "mc-3" for local storage pool

Would you like to setup distributed storage? (yes/no) [default=yes]: 
Select from the available unpartitioned disks:

Select which disks to wipe:

 Using 1 disk(s) on "mc-2" for remote storage pool
 Using 1 disk(s) on "mc-3" for remote storage pool
 Using 1 disk(s) on "mc-1" for remote storage pool

Configure distributed networking? (yes/no) [default=yes]: 
Select the IPv4 gateway (CIDR) on the uplink network (empty to skip IPv4): 192.168.122.1/24
Select the first IPv4 address in the range to use with LXD: 192.168.122.11
Select the last IPv4 address in the range to use with LXD: 192.168.122.99
Select the IPv6 gateway (CIDR) on the uplink network (empty to skip IPv6): 
Initializing a new cluster
 Local MicroCloud is ready
 Local LXD is ready
 Local MicroOVN is ready
 Local MicroCeph is ready
Awaiting cluster formation...
 Peer "mc-3" has joined the cluster
 Peer "mc-2" has joined the cluster
Cluster initialization is complete
MicroCloud is ready

CI tests

We should basically copy what we're doing in microceph and add testing for microcloud in the same way.

Add benchmark stage

Similar to #103 we should also add an optional benchmarking stage which would run at the same time as validation and would also measure disk performance on all systems, disk performance on Ceph, network throughput between instances, ...

This would allow getting a reasonable idea of the performance of the MicroCloud before it's put in production.

Add support for local storage

When initializing a MicroCloud, local storage should be offered prior to setting up remote storage.

The updated microcloud init process should therefore be:

  • Bootstrap
  • Detect and join other servers
  • Ask whether the user wants to set up local storage; if yes:
    • Check that we have at least one available disk per server
    • Ask the user to select exactly one disk per server
  • Ask whether the user wants to set up remote storage; if yes:
    • Check that we have at least three disks spread across at least three servers
    • Ask the user to select which disks to include in Ceph
  • If local storage was chosen, use it for the default profile; otherwise use remote (if it was set up)
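
The checks in this flow can be sketched as follows. This is an illustrative Python sketch, not MicroCloud's implementation: the function names, the disks_by_server mapping, and the pool name "local" are assumptions made for the example (the pool name "remote" appears in the transcripts elsewhere on this page).

```python
def can_use_local_storage(disks_by_server: dict[str, list[str]]) -> bool:
    """Local storage needs at least one available disk on every server."""
    return bool(disks_by_server) and all(
        len(disks) >= 1 for disks in disks_by_server.values()
    )


def can_use_remote_storage(disks_by_server: dict[str, list[str]]) -> bool:
    """Remote (Ceph) storage needs disks on at least three different servers.

    Three servers with one disk each already satisfy "at least three disks".
    """
    servers_with_disks = [s for s, disks in disks_by_server.items() if disks]
    return len(servers_with_disks) >= 3


def default_profile_pool(local_chosen: bool, remote_chosen: bool):
    """Prefer local storage for the default profile, else remote, else None."""
    if local_chosen:
        return "local"   # hypothetical pool name
    if remote_chosen:
        return "remote"
    return None
```

Note that a disk selected for local storage would no longer be available for Ceph, so a real implementation would have to run the remote check against the disks that remain after the local selection.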

Removing a Microcloud cluster member does not remove the underlying LXD cluster member

Given a simple three-node cluster configuration like so:

root@v3:~# microcloud cluster list
+------+-------------------+-------+------------------------------------------------------------------+--------+
| NAME |      ADDRESS      | ROLE  |                           FINGERPRINT                            | STATUS |
+------+-------------------+-------+------------------------------------------------------------------+--------+
| v1   | 10.10.10.67:9443  | voter | 3d4140ec40d677b2a9a4870511b144f795578f0007d32cdef962a177cf152286 | ONLINE |
+------+-------------------+-------+------------------------------------------------------------------+--------+
| v2   | 10.10.10.217:9443 | voter | 621fe0a5e252b80764fc0528e269046ff583d4e52ac17f980fdbf71a177890e6 | ONLINE |
+------+-------------------+-------+------------------------------------------------------------------+--------+
| v3   | 10.10.10.86:9443  | voter | 0967c4417e555d1bf79f345ffaa6c6c1eb1b0e8ddd73b682980860f689f998e4 | ONLINE |
+------+-------------------+-------+------------------------------------------------------------------+--------+

When I remove a microcloud node, for example with microcloud cluster remove v3, this works as expected (for example, I go to v2 and list the microcloud members):

root@v2:~# microcloud cluster list
+------+-------------------+-------+------------------------------------------------------------------+--------+
| NAME |      ADDRESS      | ROLE  |                           FINGERPRINT                            | STATUS |
+------+-------------------+-------+------------------------------------------------------------------+--------+
| v1   | 10.10.10.67:9443  | voter | 3d4140ec40d677b2a9a4870511b144f795578f0007d32cdef962a177cf152286 | ONLINE |
+------+-------------------+-------+------------------------------------------------------------------+--------+
| v2   | 10.10.10.217:9443 | spare | 621fe0a5e252b80764fc0528e269046ff583d4e52ac17f980fdbf71a177890e6 | ONLINE |
+------+-------------------+-------+------------------------------------------------------------------+--------+

But on every node, if I do a lxc cluster list, I see all the members:

root@v3:~# lxc cluster list
+------+---------------------------+-----------------+--------------+----------------+-------------+--------+-------------------+
| NAME |            URL            |      ROLES      | ARCHITECTURE | FAILURE DOMAIN | DESCRIPTION | STATE  |      MESSAGE      |
+------+---------------------------+-----------------+--------------+----------------+-------------+--------+-------------------+
| v1   | https://10.10.10.67:8443  | database        | x86_64       | default        |             | ONLINE | Fully operational |
+------+---------------------------+-----------------+--------------+----------------+-------------+--------+-------------------+
| v2   | https://10.10.10.217:8443 | database        | x86_64       | default        |             | ONLINE | Fully operational |
+------+---------------------------+-----------------+--------------+----------------+-------------+--------+-------------------+
| v3   | https://10.10.10.86:8443  | database-leader | x86_64       | default        |             | ONLINE | Fully operational |
|      |                           | database        |              |                |             |        |                   |
+------+---------------------------+-----------------+--------------+----------------+-------------+--------+-------------------+

This behaviour is not very 'symmetric' with microcloud init, which creates the underlying LXD cluster members. I would expect microcloud cluster remove <node_name> to remove the underlying LXD cluster member (the one listed with lxc cluster list) as well.

I'm also curious to know how it behaves with microceph/microovn: does microcloud cluster remove <node_name> trigger an automatic microceph cluster remove <node_name> / microovn cluster remove <node_name> as well? I don't know what the expected behaviour is here, but I'd say that if we remove a microcloud node, we would also like to remove its associated node from the microceph / microovn cluster, as they are meant to work together.

Allow removing machines from a MicroCloud

We should introduce microcloud add and microcloud remove commands to add servers to and remove servers from the cluster.
microcloud add would be a scriptable version of what we do through microcloud init.

microcloud remove is the trickier one as it will have to validate that:

  • LXD is empty on the system
  • Ceph isn't dependent on that server (can operate without it)

If that's the case, it then performs the removal from LXD, Ceph and OVN before removing the server from MicroCloud.
The expectation is that the server then goes back to a pristine state, so it should multicast over mDNS again and it should be possible to add it back to the MicroCloud.
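
The validation and ordering described above could look roughly like this. It is a Python sketch for illustration only: the names are hypothetical, and "LXD is empty" is interpreted here as "the server hosts no instances", which is an assumption.

```python
class RemovalBlocked(Exception):
    """Raised when a server cannot be removed safely."""


# Order taken from the text above: LXD, Ceph and OVN first, MicroCloud last.
REMOVAL_ORDER = ("lxd", "ceph", "ovn", "microcloud")


def plan_removal(instance_count: int, ceph_ok_without_server: bool) -> tuple:
    """Validate the two preconditions, then return the removal order."""
    if instance_count > 0:
        raise RemovalBlocked(f"LXD still hosts {instance_count} instance(s)")
    if not ceph_ok_without_server:
        raise RemovalBlocked("Ceph cannot operate without this server")
    return REMOVAL_ORDER
```

Validating both conditions before touching any service keeps the removal all-or-nothing from the user's point of view.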

charms: scale-down of units is not properly handled

As of now, when attempting a scale-down with juju remove-unit microcloud/<UNIT_ID_#0> microcloud/<UNIT_ID_#0> ..., the juju units are indeed removed, but the underlying clustered LXD member is seen as offline by the other members (lxc cluster list). The same happens with the microceph member, which is displayed as unreachable but is still present when it should not be (microceph cluster list).

See here: https://asciinema.org/a/p5fYO9BvGeBPXOtDaYwVaSFpY

Add support for MicroOVN

microovn is now available in the edge channel and, when combined with the edge version of snapd, has a working interface for use between microcloud and microovn.

We now need microcloud to also bootstrap the microovn cluster. This is pretty straightforward, as all that needs to be done is the bootstrap+join dance. There are no disks or anything else to set up at this stage.

Handle MicroCeph already installed

I installed MicroCeph independently, before MicroCloud, so that it uses a dedicated network. When asked during the MicroCloud install whether I want distributed storage, I say "no", but there seems to be an internal assumption that prevents the setup from completing.

Would you like to set up distributed storage? (yes/no) [default=yes]: no

Configure distributed networking? (yes/no) [default=yes]: 
Select exactly one network interface from each cluster member:

 Using "enp6s0" on "n2" for OVN uplink
 Using "enp6s0" on "n1" for OVN uplink
 Using "enp6s0" on "n3" for OVN uplink

Specify the IPv4 gateway (CIDR) on the uplink network (empty to skip IPv4): 192.168.10.1/24
Specify the first IPv4 address in the range to use with LXD: 192.168.10.100
Specify the last IPv4 address in the range to use with LXD: 192.168.10.200
Specify the IPv6 gateway (CIDR) on the uplink network (empty to skip IPv6): 
Initializing a new cluster
 Local MicroCloud is ready
 Local LXD is ready
 Local MicroOVN is ready
Error: Failed to bootstrap local MicroCeph: Failed to initialize local remote entry: A remote with name "n1" already exists

Rework UX to ask all questions at the beginning

To make things easier to handle, we need the flow to be:

  • Select servers
  • Setup storage
    • Local disks
    • Remote disks

After which we get the cluster to bootstrap on the first server, then join the other servers, ...
Basically, we want all user interaction to happen as early as possible, with no question delayed because servers are being configured.

dial unix /var/snap/microcloud/common/state/control.socket: connect: permission denied

I suppose sudo is necessary for microcloud init.

microcloud init

ubuntu@microcloud-1:~$ snap list microcloud
Name        Version        Rev  Tracking       Publisher   Notes
microcloud  0+git.09caf0c  174  latest/stable  canonical✓  -
ubuntu@microcloud-1:~$ microcloud init 
Please choose the address MicroCloud will be listening on [default=192.168.122.166]: 
Scanning for eligible servers...
Press enter to end scanning for servers
 Found "microcloud-2" at "192.168.122.155"
 Found "microcloud-3" at "192.168.122.98"

Ending scan
Initializing a new cluster
Error: Failed to bootstrap local MicroCloud: Post "http://control.socket/cluster/control": dial unix /var/snap/microcloud/common/state/control.socket: connect: permission denied


ubuntu@microcloud-1:~$ ll /var/snap/microcloud/common/state/control.socket
srw-rw---- 1 root root 0 Mar 27 01:21 /var/snap/microcloud/common/state/control.socket=

Configure OVN access in LXD

When MicroOVN is set up, we should have MicroCloud do:

  • Pull list of services from MicroOVN API
  • Identify what servers run the central service
  • From that compute the northbound connection string and set it as network.ovn.northbound_connection
  • Whenever a server is added or removed, go through that process again

The connection string looks something like: tcp:[2602:fc62:a:101::100]:6641,tcp:[2602:fc62:a:101::101]:6641,tcp:[2602:fc62:a:101::102]:6641

So basically a comma-separated list of tcp:<IP>:6641 entries, one for each of the central servers.
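
As a minimal sketch of that computation (in Python for illustration, not the MicroCloud implementation), the string can be built from the list of central server addresses. The function name is hypothetical; the bracketing of IPv6 addresses follows the example string above.

```python
import ipaddress

# Northbound port as shown in the example connection string.
OVN_NORTHBOUND_PORT = 6641


def northbound_connection(central_ips: list[str],
                          port: int = OVN_NORTHBOUND_PORT) -> str:
    """Build a comma-separated tcp:<IP>:<port> string for the central servers."""
    parts = []
    for raw in central_ips:
        ip = ipaddress.ip_address(raw)  # raises ValueError on invalid input
        # IPv6 addresses are wrapped in brackets so the port stays unambiguous.
        host = f"[{ip}]" if ip.version == 6 else str(ip)
        parts.append(f"tcp:{host}:{port}")
    return ",".join(parts)
```

The result would then be written to the network.ovn.northbound_connection server setting and recomputed whenever a server is added or removed.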

Race getting OVN southbound connection during microcloud init

Sometimes when running microcloud init inside VMs I get this error:

Error: failed to notify peer 10.21.203.2:8443: Failed to get OVN client: Failed to get OVN southbound connection string: Failed to run: ovs-vsctl get open_vswitch . external_ids:ovn-remote: exit status 1 (ovs-vsctl: unix:/var/run/openvswitch/db.sock: database connection failed (No such file or directory))

Using https://gist.github.com/stgraber/73d1904c666dbdd061e3efcabf8b7acc

But it doesn't happen every time, suggesting it is a race when trying to access OVS before it has started.
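
One common way to avoid this kind of race is to poll for the database socket, with a timeout, before running the command. The following is a generic Python sketch of that idea and not a description of how LXD or MicroCloud handle it; the function name and default values are hypothetical.

```python
import os
import time


def wait_for_path(path: str, timeout: float = 30.0, interval: float = 0.5,
                  exists=os.path.exists, sleep=time.sleep,
                  clock=time.monotonic) -> bool:
    """Poll until `path` exists; return False once `timeout` seconds pass.

    `exists`, `sleep` and `clock` are injectable so the loop can be tested
    without touching the filesystem or actually sleeping.
    """
    deadline = clock() + timeout
    while True:
        if exists(path):
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

A caller would wait on the OVS database socket path and only invoke ovs-vsctl once the function returns True, reporting a clear timeout error otherwise.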

Full steps:

+ lxc project create microcloud
Project microcloud created
+ lxc project switch microcloud
+ lxc profile device add default root disk pool=default path=/
Device root added to default
+ lxc profile device add default eth0 nic network=lxdbr0 name=eth0
Device eth0 added to default
+ lxc network create microbr0 ipv4.address=10.123.123.1/24 ipv4.dhcp=false ipv4.nat=true ipv6.address=fd42:1234:1234:1234::1/64 ipv6.nat=true
Network microbr0 created
+ lxc profile device add default eth1 nic network=microbr0 name=eth1
Device eth1 added to default
+ lxc storage volume create default micro01-disk1 size=30GiB --type=block
Storage volume micro01-disk1 created
+ lxc storage volume create default micro01-disk2 size=50GiB --type=block
Storage volume micro01-disk2 created
+ lxc storage volume create default micro02-disk1 size=30GiB --type=block
Storage volume micro02-disk1 created
+ lxc storage volume create default micro02-disk2 size=50GiB --type=block
Storage volume micro02-disk2 created
+ lxc storage volume create default micro03-disk1 size=30GiB --type=block
Storage volume micro03-disk1 created
+ lxc storage volume create default micro03-disk2 size=50GiB --type=block
Storage volume micro03-disk2 created
+ lxc init ubuntu:22.04 micro01 --vm -c limits.cpu=2 -c limits.memory=4GiB
Creating micro01
+ lxc config device add micro01 disk1 disk pool=default source=micro01-disk1
Device disk1 added to micro01
+ lxc config device add micro01 disk2 disk pool=default source=micro01-disk2
Device disk2 added to micro01
+ lxc start micro01
+ lxc init ubuntu:22.04 micro02 --vm -c limits.cpu=2 -c limits.memory=4GiB
Creating micro02
+ lxc config device add micro02 disk1 disk pool=default source=micro02-disk1
Device disk1 added to micro02
+ lxc config device add micro02 disk2 disk pool=default source=micro02-disk2
Device disk2 added to micro02
+ lxc start micro02
+ lxc init ubuntu:22.04 micro03 --vm -c limits.cpu=2 -c limits.memory=4GiB
Creating micro03
+ lxc config device add micro03 disk1 disk pool=default source=micro03-disk1
Device disk1 added to micro03
+ lxc config device add micro03 disk2 disk pool=default source=micro03-disk2
Device disk2 added to micro03
+ lxc start micro03
+ sleep 1m
+ lxc exec micro01 -- ip link set enp6s0 up
+ lxc exec micro02 -- ip link set enp6s0 up
+ lxc exec micro03 -- ip link set enp6s0 up
+ lxc exec micro01 -- sh -c echo 1 > /proc/sys/net/ipv6/conf/enp6s0/disable_ipv6
+ lxc exec micro02 -- sh -c echo 1 > /proc/sys/net/ipv6/conf/enp6s0/disable_ipv6
+ lxc exec micro03 -- sh -c echo 1 > /proc/sys/net/ipv6/conf/enp6s0/disable_ipv6
+ lxc exec micro01 -- snap refresh lxd --channel=latest/stable
lxd 5.13-8e2d7eb from Canonical✓ refreshed
+ lxc exec micro02 -- snap refresh lxd --channel=latest/stable
lxd 5.13-8e2d7eb from Canonical✓ refreshed
+ lxc exec micro03 -- snap refresh lxd --channel=latest/stable
lxd 5.13-8e2d7eb from Canonical✓ refreshed
+ lxc exec micro01 -- snap install microovn microceph microcloud
microceph 0+git.fdf6d5e from Canonical✓ installed
microcloud 0+git.9cb7ccd from Canonical✓ installed
microovn 0+git.f8a4497 from Canonical✓ installed
+ lxc exec micro02 -- snap install microovn microceph microcloud
2023-05-11T07:44:31Z INFO Waiting for conflicting change in progress: conflicting slot snap snapd, task
"connect"
microcloud 0+git.9cb7ccd from Canonical✓ installed
microovn 0+git.f8a4497 from Canonical✓ installed
microceph 0+git.fdf6d5e from Canonical✓ installed
+ lxc exec micro03 -- snap install microovn microceph microcloud
2023-05-11T07:44:59Z INFO Waiting for conflicting change in progress: conflicting snap microceph with
task "setup-profiles"
2023-05-11T07:44:59Z INFO Waiting for conflicting change in progress: conflicting slot snap snapd, task
"connect"
2023-05-11T07:45:05Z INFO Waiting for conflicting change in progress: conflicting snap microovn with task
"setup-profiles"
microovn 0+git.f8a4497 from Canonical✓ installed
microceph 0+git.fdf6d5e from Canonical✓ installed
microcloud 0+git.9cb7ccd from Canonical✓ installed
+ set +x

Disks:
  Local storage: Use 30GiB disks
  Remote storage: Use 50GiB disks
OVN:
  IPv4 subnet: 10.123.123.1/24
  IPv4 start:  10.123.123.100
  IPv4 end:    10.123.123.254
  IPv6 subnet: fd42:1234:1234:1234::1/64

Select an address for MicroCloud's internal traffic:

You must select exactly one address
Retry selecting an address? (yes/no) [default=yes]: 
Select an address for MicroCloud's internal traffic:

 Using address "10.21.203.4" for MicroCloud

Limit search for other MicroCloud servers to 10.21.203.4/24? (yes/no) [default=yes]: 
Scanning for eligible servers ...

 Selected "micro03" at "10.21.203.3"
 Selected "micro02" at "10.21.203.2"

Would you like to set up local storage? (yes/no) [default=yes]: 
Select exactly one disk from each cluster member:

Select which disks to wipe:

 Using "/dev/disk/by-id/scsi-SQEMU_QEMU_HARDDISK_lxd_disk1" on "micro02" for local storage pool
 Using "/dev/disk/by-id/scsi-SQEMU_QEMU_HARDDISK_lxd_disk1" on "micro03" for local storage pool
 Using "/dev/disk/by-id/scsi-SQEMU_QEMU_HARDDISK_lxd_disk1" on "micro01" for local storage pool

Would you like to set up distributed storage? (yes/no) [default=yes]: 
Select from the available unpartitioned disks:

Select which disks to wipe:

 Using 1 disk(s) on "micro03" for remote storage pool
 Using 1 disk(s) on "micro01" for remote storage pool
 Using 1 disk(s) on "micro02" for remote storage pool

Configure distributed networking? (yes/no) [default=yes]: 

 Using "enp6s0" on "micro02" for OVN uplink
 Using "enp6s0" on "micro01" for OVN uplink
 Using "enp6s0" on "micro03" for OVN uplink

Specify the IPv4 gateway (CIDR) on the uplink network (empty to skip IPv4): 10.123.123.1/24
Specify the first IPv4 address in the range to use with LXD: 10.123.123.100
Specify the last IPv4 address in the range to use with LXD: 10.123.123.254
Specify the IPv6 gateway (CIDR) on the uplink network (empty to skip IPv6): fd42:1234:1234:1234::1/64
Initializing a new cluster
 Local MicroCloud is ready
 Local LXD is ready
 Local MicroOVN is ready
 Local MicroCeph is ready
Awaiting cluster formation ...
 Peer "micro03" has joined the cluster
 Peer "micro02" has joined the cluster
Cluster initialization is complete
Error: failed to notify peer 10.21.203.2:8443: Failed to get OVN client: Failed to get OVN southbound connection string: Failed to run: ovs-vsctl get open_vswitch . external_ids:ovn-remote: exit status 1 (ovs-vsctl: unix:/var/run/openvswitch/db.sock: database connection failed (No such file or directory))

Failed to init microcloud: Error: Failed to get system resources of peer "node07":

Hello,

I installed microcloud on a fresh Ubuntu 22.04 image, using latest/edge, but it fails early during initialization. microcloud can find the other two nodes on the network, and I have general connectivity to the other nodes via SSH. I have two interfaces to select from. However, a few seconds after selecting the nodes, microcloud reports an "invalid Host header".

root@node07:/home/ubuntu# microcloud init
Select an address for MicroCloud's internal traffic:

 Using address "172.27.81.183" for MicroCloud

Limit search for other MicroCloud servers to 172.27.81.183/23? (yes/no) [default=yes]: yes
Scanning for eligible servers ...

 Selected "node09" at "172.27.81.179"
 Selected "node08" at "172.27.81.190"

Error: Failed to get system resources of peer "node09": Get "http://%2Fvar%2Fsnap%2Flxd%2Fcommon%2Flxd%2Funix.socket/1.0/resources": http: invalid Host header
root@node07:/home/ubuntu# 

snaps

root@node07:/home/ubuntu# sudo snap list
Name        Version        Rev    Tracking       Publisher   Notes
core20      20230308       1852   latest/stable  canonical✓  base
core22      20230703       817    latest/stable  canonical✓  base
lxd         5.0.2-838e1b2  24322  5.0/stable/…   canonical✓  -
microceph   0+git.dd2d8b7  520    latest/edge    canonical✓  -
microcloud  0+git.bbbb812  535    latest/edge    canonical✓  -
microovn    0+git.eaf5200  179    latest/edge    canonical✓  -
snapd       2.59.5         19457  latest/stable  canonical✓  snapd
root@node07:/home/ubuntu# 

Very little that is relevant in the snap logs

root@node07:/home/ubuntu# sudo snap logs microcloud
2023-07-20T10:34:27Z systemd[1]: Started Service for snap application microcloud.daemon.
2023-07-20T10:34:28Z microcloud.daemon[4627]: time="2023-07-20T10:34:28Z" level=warning msg="Failed to parse new remotes from truststore"
2023-07-20T10:34:28Z microcloud.daemon[4627]: time="2023-07-20T10:34:28Z" level=warning msg="microcluster database is uninitialized"
2023-07-20T10:34:28Z microcloud.daemon[4627]: time="2023-07-20T10:34:28Z" level=warning msg="Failed to parse new remotes from truststore"

Can provide more logs if needed.

User cannot abort init process if any service is missing

If any services, such as MicroCeph or MicroOVN, are missing when MicroCloud is being initialized, the user cannot abort the process.

Regardless of the user's response, the process always continues normally:

Limit search for other MicroCloud servers to 10.45.149.211/24? (yes/no) [default=yes]: yes 
MicroCeph,MicroOVN not found. Continue anyway? (yes/no) [default=yes]: yes
Scanning for eligible servers ...

Limit search for other MicroCloud servers to 10.45.149.211/24? (yes/no) [default=yes]: yes 
MicroCeph,MicroOVN not found. Continue anyway? (yes/no) [default=yes]: no
Scanning for eligible servers ...
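
For comparison, the expected handling of that prompt can be sketched as below. This is an illustrative Python sketch with hypothetical function names, not MicroCloud's code; it only accepts the literal answers shown in the prompt.

```python
def parse_yes_no(answer: str, default: bool = True) -> bool:
    """Interpret a (yes/no) answer; an empty answer selects the default."""
    text = answer.strip().lower()
    if text == "":
        return default
    if text == "yes":
        return True
    if text == "no":
        return False
    raise ValueError(f"invalid input: {answer!r}")


def should_continue(missing_services: list[str], answer: str) -> bool:
    """Decide whether init proceeds when some services are not installed."""
    if not missing_services:
        return True  # nothing is missing, so no question is needed
    # The answer must be honoured: "no" aborts instead of scanning.
    return parse_yes_no(answer, default=True)
```

In the second transcript above, should_continue(["MicroCeph", "MicroOVN"], "no") would return False, and the scan for eligible servers would not start.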

Docs: List extra requirements for use with Raspberry Pi

When running microcloud on a Raspberry Pi (without microovn) the following error message will appear after a cluster is successfully created with microcloud init.

Error: Failed adding link: Failed to run: ip link add name lxdfan0-fan type vxlan id 15728640 dev eth0 local 192.168.0.43 dstport 0 fan-map 240.0.0.0/8:192.168.0.0/24: exit status 2 (Error: Unknown device type.)

The error appears because the newer Ubuntu RPi kernels don't include the vxlan module (launchpad issue here)

The kernel module needs to be installed and enabled for networking in LXD cluster containers to work properly.

To enable the vxlan module:

  • apt install linux-modules-extra-raspi
  • reboot
  • modprobe vxlan

This should be documented somewhere in the microcloud documentation.

Allow microcloud to output tables in the CLI as json and csv

Just as lxc can output information in different formats with the -f flag, the microcloud CLI needs this as well.
The motivation comes from the microcloud charm development. Some actions, such as a node removal when a charm is scaled down, need to check the node ROLE before calling microcloud cluster remove <node_name>. A removal while the role is still PENDING leads to an error (e.g., Failed to run "['microcloud', 'cluster', 'remove', 'metal-3']": Error: remove /var/snap/microcloud/common/state/truststore/metal-1.yaml: file does not exist), probably because the certificate was not generated yet.

Having a JSON output format would allow the charm developer to easily track these ROLE states and, if a state such as PENDING is detected, enqueue the action for later.
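
Until such a format exists, a charm has to parse the human-oriented table. The sketch below shows a possible stopgap in Python; the function names are hypothetical, and it assumes the single-line-per-row table layout that microcloud cluster list prints elsewhere on this page. It is brittle by nature, which is the argument for a JSON format.

```python
def parse_ascii_table(text: str) -> list[dict[str, str]]:
    """Turn a '+---+' / '| a | b |' style table into a list of row dicts."""
    header = None
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip border lines and blank lines
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if header is None:
            header = cells  # the first '|' line holds the column names
        else:
            rows.append(dict(zip(header, cells)))
    return rows


def members_with_role(rows: list[dict[str, str]], role: str) -> list[str]:
    """Return the NAME of every member whose ROLE matches, ignoring case."""
    return [r["NAME"] for r in rows if r["ROLE"].lower() == role.lower()]
```

A charm could call members_with_role(rows, "PENDING") and defer the removal while the result is non-empty. Rows that wrap over several lines, as in lxc cluster list, are not handled by this sketch.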

Rework the detection stage

Based on user feedback, there are a few things we should be doing to make the detection stage easier:

  • Auto-detect for 5s
  • Show a list of detected systems (table), allowing for the user to pick which ones they want to cluster
  • Add a hotkey to trigger a clean search (interruptible by the user rather than default 5s)
  • Add a hotkey to manually add a machine by its IP address

The table should show hostname, IP and detected services (once we add that to the mix).

Improve disk handling during `microcloud bootstrap`

Right now, we have a loop which scans for disks on particular systems, then asks what disk the user wants to use, then goes onto the next system.

This gets annoying pretty quickly when dealing with a large-ish number of systems.

Instead we should show a table of available disks across all members and allow the user to select which one they want from that and whether they want them wiped or not.

Ideally, we'd use some Go module supporting navigation within a table so we could go line by line and tick which ones we want and which ones we'd like to wipe. But short of being able to do that, we should just do a table with a row number as the first column, ask the user which one they want to add, prompt for wiping and then ask for more entries.
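
The fallback described above (a table with a row number as the first column, followed by a prompt for row numbers) can be sketched as follows. This is Python for illustration only; the function names, the (server, disk, size) tuple layout, and the comma-separated answer format are assumptions.

```python
def format_disk_table(disks: list[tuple[str, str, str]]) -> str:
    """Render (server, disk, size) tuples as a numbered plain-text table."""
    header = ("#", "SERVER", "DISK", "SIZE")
    rows = [header] + [(str(i), *disk) for i, disk in enumerate(disks, start=1)]
    # Each column is as wide as its longest cell.
    widths = [max(len(row[col]) for row in rows) for col in range(len(header))]
    return "\n".join(
        "  ".join(cell.ljust(width) for cell, width in zip(row, widths)).rstrip()
        for row in rows
    )


def parse_selection(answer: str, count: int) -> list[int]:
    """Turn an answer such as '1, 3' into zero-based row indexes."""
    picked = []
    for token in answer.split(","):
        token = token.strip()
        if not token:
            continue
        number = int(token)  # raises ValueError on non-numeric input
        if not 1 <= number <= count:
            raise ValueError(f"row {number} is out of range 1..{count}")
        if number - 1 not in picked:
            picked.append(number - 1)
    return picked
```

The same parse_selection call could be reused for the follow-up question about which of the selected disks to wipe.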

ovs-vsctl: unix:/var/run/openvswitch/db.sock: database connection failed (No such file or directory)

microcloud 0+git.136aeb1

ubuntu@mc-1:~$ ip -br a
lo               UNKNOWN        127.0.0.1/8 ::1/128
enp1s0           UP             192.168.122.203/24 metric 100 fe80::5054:ff:fe6e:fa8c/64
enp7s0           UP             fe80::5054:ff:feda:6430/64
$ sudo microcloud init
Using address "192.168.122.203" for MicroCloud
Limit search for other MicroCloud servers to 192.168.122.203/24? (yes/no) [default=yes]: 
Scanning for eligible servers...

 Selected "mc-3" at "192.168.122.206"
 Selected "mc-2" at "192.168.122.186"

Would you like to setup local storage? (yes/no) [default=yes]: 
Select exactly one disk from each cluster member:

 Using "/dev/disk/by-id/scsi-SATA_QEMU_HARDDISK_QM00001" on "mc-1" for local storage pool
 Using "/dev/disk/by-id/scsi-SATA_QEMU_HARDDISK_QM00001" on "mc-2" for local storage pool
 Using "/dev/disk/by-id/scsi-SATA_QEMU_HARDDISK_QM00001" on "mc-3" for local storage pool

Would you like to setup distributed storage? (yes/no) [default=yes]: 
Select from the available unpartitioned disks:

Select which disks to wipe:

 Using 1 disk(s) on "mc-1" for remote storage pool
 Using 1 disk(s) on "mc-2" for remote storage pool
 Using 1 disk(s) on "mc-3" for remote storage pool

Configure distributed networking? (yes/no) [default=yes]: 

 Using "enp7s0" on "mc-2" for OVN uplink
 Using "enp7s0" on "mc-1" for OVN uplink
 Using "enp7s0" on "mc-3" for OVN uplink

Select the IPv4 gateway (CIDR) on the uplink network (empty to skip IPv4): 192.168.122.1/24
Select the first IPv4 address in the range to use with LXD: 192.168.122.11
Select the last IPv4 address in the range to use with LXD: 192.168.122.99
Select the IPv6 gateway (CIDR) on the uplink network (empty to skip IPv6): 
Initializing a new cluster
 Local MicroCloud is ready
 Local LXD is ready
 Local MicroOVN is ready
 Local MicroCeph is ready
Awaiting cluster formation...
 Peer "mc-2" has joined the cluster
 Peer "mc-3" has joined the cluster
Cluster initialization is complete
Error: Failed to get OVN client: Failed to get OVN southbound connection string: Failed to run: ovs-vsctl get open_vswitch . external_ids:ovn-remote: exit status 1 (ovs-vsctl: unix:/var/run/openvswitch/db.sock: database connection failed (No such file or directory))

The path doesn't exist. At the same time, the command with the microovn prefix (and the snap namespace) works.

$ sudo microovn.ovs-vsctl get open_vswitch . external_ids:ovn-remote
"tcp:192.168.122.203:6642,tcp:192.168.122.186:6642,tcp:192.168.122.206:6642"
$ sudo microovn.ovs-vsctl --help | grep -C 2 'default: unix:'
Options:
  --db=DATABASE               connect to DATABASE
                              (default: unix:/var/snap/microovn/common/run/switch//db.sock)
  --no-wait                   do not wait for ovs-vswitchd to reconfigure
  --retry                     keep trying to connect to server forever
$ sudo ls -alF /var/snap/microovn/common/run/switch/db.sock /var/run/openvswitch/db.sock
ls: cannot access '/var/run/openvswitch/db.sock': No such file or directory
srwxr-x--- 1 root root 0 Apr  1 01:34 /var/snap/microovn/common/run/switch/db.sock=
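
To illustrate the mismatch, here is a minimal Python sketch that probes both socket paths seen in the output above and builds an `ovs-vsctl` call that names the database explicitly through the `--db` option listed in the help text. The helper names `find_ovs_socket` and `ovs_vsctl_args` are hypothetical, and this is not a claim about how MicroCloud or LXD resolve the path internally.

```python
import os
import stat

# Candidate paths taken from the output above: the default used by the
# failing call, and the default reported by microovn.ovs-vsctl --help.
CANDIDATE_SOCKETS = [
    "/var/run/openvswitch/db.sock",
    "/var/snap/microovn/common/run/switch/db.sock",
]


def find_ovs_socket(candidates=CANDIDATE_SOCKETS):
    """Return the first candidate that exists and is a UNIX socket, else None."""
    for path in candidates:
        try:
            mode = os.stat(path).st_mode
        except OSError:
            continue  # missing or inaccessible: try the next candidate
        if stat.S_ISSOCK(mode):
            return path
    return None


def ovs_vsctl_args(socket_path):
    """Build an ovs-vsctl command line that names the database explicitly."""
    return [
        "ovs-vsctl",
        f"--db=unix:{socket_path}",
        "get", "open_vswitch", ".", "external_ids:ovn-remote",
    ]
```

On the machine above, `find_ovs_socket()` would be expected to skip the missing `/var/run/openvswitch/db.sock` and return the snap path, provided the caller is allowed to stat it.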

LXD fails to pick up non-pristine disks

In the microcloud init screen, the wizard does not list non-pristine disks. Since it offers to wipe disks on the next screen, I assume this is a bug. If I wipe a non-pristine disk with:

sudo wipefs -a /dev/sdb && sudo dd if=/dev/zero of=/dev/sdb bs=4096 count=100 > /dev/null

then microcloud picks up the disk the next time the wizard is run.
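
To confirm that the `dd` step took effect before re-running the wizard, a small Python check can read back the same range (100 blocks of 4096 bytes) and verify it is all zeros. This is only a sketch: the function name `leading_bytes_are_zero` is hypothetical, reading a device such as /dev/sdb needs root, and it says nothing about the criteria MicroCloud or LXD actually use to decide whether a disk is available.

```python
def leading_bytes_are_zero(path, block_size=4096, count=100):
    """Return True if the first block_size * count bytes of path are all zero.

    Mirrors the range covered by `dd bs=4096 count=100`. A file or device
    shorter than that range passes as long as everything read is zero.
    """
    zero_block = bytes(block_size)
    with open(path, "rb") as dev:
        for _ in range(count):
            chunk = dev.read(block_size)
            if not chunk:
                break  # reached the end of a small device or file
            if chunk != zero_block[:len(chunk)]:
                return False
    return True
```

For example, `leading_bytes_are_zero("/dev/sdb")` run as root after the wipe command should return `True`.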

Allow growing an existing cluster through `microcloud init`

The user should be able to re-run microcloud init to detect new servers and join them to the existing cluster.
If local storage is used, each joining server needs at least one disk available for it.
Additional disks can then be added to Ceph if the user wants.
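
The rules in this request can be sketched in Python as a simple filter over the scan results. Everything here is hypothetical (the function `plan_joiners`, its parameters, and the shape of the `free_disks` mapping); it only illustrates the requested behaviour and is not taken from the MicroCloud code base.

```python
def plan_joiners(members, discovered, free_disks, use_local_storage):
    """Split newly discovered servers into those that can join and those that cannot.

    members: names already in the cluster
    discovered: names found by the scan
    free_disks: mapping of server name to a list of unused disk paths
    use_local_storage: whether the cluster has a local storage pool
    """
    can_join, blocked = [], []
    # Only servers that are not members yet are candidates for joining.
    for name in sorted(set(discovered) - set(members)):
        if use_local_storage and not free_disks.get(name):
            blocked.append(name)  # local storage needs at least one disk
        else:
            can_join.append(name)
    return can_join, blocked
```

With members `mc-1` to `mc-3`, a scan that also finds `mc-4` (one free disk) and `mc-5` (no free disk) would yield `mc-4` as joinable and `mc-5` as blocked when local storage is in use.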

(empty to skip IPv6): Invalid input: invalid CIDR address

microcloud 0+git.badd32b

$ sudo microcloud init
Using address "192.168.122.58" for MicroCloud
Limit search for other MicroCloud servers to 192.168.122.58/24? (yes/no) [default=yes]: 
Scanning for eligible servers...

 Selected "mc-2" at "192.168.122.243"
 Selected "mc-3" at "192.168.122.4"

Would you like to setup local storage? (yes/no) [default=yes]: 
Select exactly one disk from each cluster member:

 Using "/dev/disk/by-id/scsi-SATA_QEMU_HARDDISK_QM00001" on "mc-2" for local storage pool
 Using "/dev/disk/by-id/scsi-SATA_QEMU_HARDDISK_QM00001" on "mc-3" for local storage pool
 Using "/dev/disk/by-id/scsi-SATA_QEMU_HARDDISK_QM00001" on "mc-1" for local storage pool

Would you like to setup distributed storage? (yes/no) [default=yes]: 
Select from the available unpartitioned disks:

Select which disks to wipe:

 Using 1 disk(s) on "mc-1" for remote storage pool
 Using 1 disk(s) on "mc-2" for remote storage pool
 Using 1 disk(s) on "mc-3" for remote storage pool

Configure distributed networking? (yes/no) [default=yes]: 
Select the IPv4 gateway (CIDR) on the uplink network (empty to skip IPv4): 192.168.122.1
Invalid input: invalid CIDR address: 192.168.122.1

Select the IPv4 gateway (CIDR) on the uplink network (empty to skip IPv4): 192.168.122.0/24        
Select the first IPv4 address in the range to use with LXD: 192.168.122.11
Select the last IPv4 address in the range to use with LXD: 192.168.122.99
Select the IPv6 gateway (CIDR) on the uplink network (empty to skip IPv6): 
Invalid input: invalid CIDR address: 

Select the IPv6 gateway (CIDR) on the uplink network (empty to skip IPv6): 
Invalid input: invalid CIDR address: 
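
The behaviour the prompt text promises ("empty to skip") can be sketched in Python with the standard `ipaddress` module. The function names `parse_gateway` and `check_range` are hypothetical and this is not MicroCloud's validator, only an illustration of treating an empty answer as a skip while still rejecting a bare address such as 192.168.122.1.

```python
import ipaddress


def parse_gateway(value, version):
    """Parse a gateway prompt answer; return None when the answer is empty."""
    value = value.strip()
    if not value:
        return None  # empty answer means "skip this address family"
    if "/" not in value:
        # ip_interface() would silently assume /32 or /128, so reject early.
        raise ValueError(f"invalid CIDR address: {value}")
    iface = ipaddress.ip_interface(value)
    if iface.version != version:
        raise ValueError(f"expected an IPv{version} address: {value}")
    return iface


def check_range(gateway, first, last):
    """Check that first..last is an ordered range inside the gateway's network."""
    lo, hi = ipaddress.ip_address(first), ipaddress.ip_address(last)
    if lo not in gateway.network or hi not in gateway.network:
        raise ValueError("range is outside the uplink network")
    if lo > hi:
        raise ValueError("first address is after last address")
    return lo, hi
```

Under these assumptions, `parse_gateway("", 6)` returns `None` instead of raising, which is the outcome the IPv6 prompt in the transcript above should have produced.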

Excessive CPU usage at idle

When microcloud isn't running mDNS and isn't busy forming a cluster, it should do pretty much nothing at all.
Given that, I was very surprised to often see it among the busiest processes on my low-powered ARM boards.

This suggests some kind of internal issue, potentially very aggressive heartbeats or the like:

root@micro01:~# health-check -c -p 493

CPU usage (in terms of 1 CPU):
     PID Process                USR%   SYS% TOTAL%   Duration
     493 microcloudd            6.51   5.15  11.66      60.06  (slight load)

Page Faults:
     PID Process                 Minor/sec    Major/sec    Total/sec
     493 microcloudd                 22.45         0.00        22.45

Context Switches:
     PID Process                Voluntary   Involuntary     Total
                               Ctxt Sw/Sec  Ctxt Sw/Sec  Ctxt Sw/Sec
     790 microcloudd                557.80         0.83       558.63 (quite high)
    2914 microcloudd                202.56         0.80       203.36 (quite high)
     843 microcloudd                146.98         3.18       150.16 (quite high)
     840 microcloudd                141.97         5.96       147.93 (quite high)
     792 microcloudd                132.56         5.38       137.94 (quite high)
   22186 microcloudd                108.63         3.88       112.51 (quite high)
     842 microcloudd                 96.59         3.03        99.62 (moderate)
     793 microcloudd                 92.55         6.16        98.71 (moderate)
    2918 microcloudd                  0.32         0.00         0.32 (very low)
     795 microcloudd                  0.18         0.00         0.18 (very low)
     841 microcloudd                  0.15         0.00         0.15 (very low)
     794 microcloudd                  0.15         0.00         0.15 (very low)
    2915 microcloudd                  0.13         0.00         0.13 (very low)
     493 microcloudd                  0.13         0.00         0.13 (very low)
    2917 microcloudd                  0.10         0.00         0.10 (very low)
    2916 microcloudd                  0.10         0.00         0.10 (very low)
    7488 microcloudd                  0.07         0.00         0.07 (very low)
 Total                           1480.98        29.22      1510.20

System calls traced:
     PID Process              Syscall               Count    Rate/Sec    Total μSecs  % Call Time
     493 microcloudd          futex                     1       0.0167             0      0.0000
     790 microcloudd          nanosleep              9884     164.5812      19389366      4.6710
     790 microcloudd          futex                   672      11.1897      30839574      7.4293
     790 microcloudd          epoll_pwait             401       6.6772        508731      0.1226
     790 microcloudd          tgkill                  283       4.7123        185199      0.0446
     790 microcloudd          getpid                  283       4.7123         51614      0.0124
     790 microcloudd          sched_yield              44       0.7327          5241      0.0013
     790 microcloudd          restart_syscall           1       0.0167        461618      0.1112
     792 microcloudd          read                    733      12.2054        129075      0.0311
     792 microcloudd          futex                   712      11.8557      35544174      8.5627
     792 microcloudd          epoll_pwait             693      11.5393       9711575      2.3395
     792 microcloudd          write                   444       7.3932        411198      0.0991
     792 microcloudd          setsockopt              316       5.2618        177394      0.0427
     792 microcloudd          epoll_ctl               135       2.2479         36417      0.0088
     792 microcloudd          rt_sigreturn             82       1.3654             0      0.0000
     792 microcloudd          getsockname              70       1.1656         12130      0.0029
     792 microcloudd          close                    62       1.0324         19615      0.0047
     792 microcloudd          accept4                  56       0.9325         12856      0.0031
     792 microcloudd          fcntl                    50       0.8326          4150      0.0010
     792 microcloudd          nanosleep                49       0.8159         13009      0.0031
     792 microcloudd          socket                   43       0.7160         12363      0.0030
     792 microcloudd          getpeername              43       0.7160          7592      0.0018
     792 microcloudd          connect                  43       0.7160        300452      0.0724
     792 microcloudd          madvise                  38       0.6327          8477      0.0020
     792 microcloudd          shutdown                 25       0.4163         10134      0.0024
     792 microcloudd          sched_yield              13       0.2165          3164      0.0008
     792 microcloudd          openat                   12       0.1998          2198      0.0005
     792 microcloudd          fstat                     9       0.1499           951      0.0002
     792 microcloudd          getsockopt                7       0.1166          1710      0.0004
     792 microcloudd          getdents64                6       0.0999           880      0.0002
     792 microcloudd          io_setup                  6       0.0999             0      0.0000
     792 microcloudd          newfstatat                2       0.0333           486      0.0001
     793 microcloudd          read                    561       9.3414         91202      0.0220
     793 microcloudd          epoll_pwait             543       9.0416       7877986      1.8978
     793 microcloudd          futex                   448       7.4598      43967605     10.5919
     793 microcloudd          write                   333       5.5449        200483      0.0483
     793 microcloudd          setsockopt              178       2.9639         19120      0.0046
     793 microcloudd          epoll_ctl                76       1.2655         14370      0.0035
     793 microcloudd          rt_sigreturn             50       0.8326             0      0.0000
     793 microcloudd          nanosleep                42       0.6994         10271      0.0025
     793 microcloudd          close                    38       0.6327          9201      0.0022
     793 microcloudd          getsockname              36       0.5994          4332      0.0010
     793 microcloudd          fcntl                    32       0.5328        120715      0.0291
     793 microcloudd          accept4                  27       0.4496          5013      0.0012
     793 microcloudd          madvise                  24       0.3996        145343      0.0350
     793 microcloudd          getpeername              23       0.3830          2485      0.0006
     793 microcloudd          connect                  22       0.3663          6795      0.0016
     793 microcloudd          socket                   22       0.3663        123566      0.0298
     793 microcloudd          sched_yield              19       0.3164          2322      0.0006
     793 microcloudd          shutdown                 16       0.2664          2295      0.0006
     793 microcloudd          openat                    8       0.1332          2025      0.0005
     793 microcloudd          fstat                     6       0.0999           992      0.0002
     793 microcloudd          getdents64                4       0.0666           552      0.0001
     793 microcloudd          io_setup                  4       0.0666             0      0.0000
     793 microcloudd          getsockopt                3       0.0500           340      0.0001
     793 microcloudd          newfstatat                2       0.0333           455      0.0001
     793 microcloudd          faccessat                 1       0.0167             0      0.0000
     793 microcloudd          getpid                    1       0.0167           166      0.0000
     793 microcloudd          tgkill                    1       0.0167           241      0.0001
     794 microcloudd          futex                     1       0.0167             0      0.0000
     795 microcloudd          futex                     1       0.0167             0      0.0000
     840 microcloudd          epoll_pwait             888      14.7863      11465999      2.7622
     840 microcloudd          read                    843      14.0370        238520      0.0575
     840 microcloudd          futex                   658      10.9565      39319706      9.4722
     840 microcloudd          write                   506       8.4255        740072      0.1783
     840 microcloudd          setsockopt              331       5.5116        292177      0.0704
     840 microcloudd          epoll_ctl               124       2.0648         21414      0.0052
     840 microcloudd          nanosleep                71       1.1822         15943      0.0038
     840 microcloudd          close                    71       1.1822         21508      0.0052
     840 microcloudd          rt_sigreturn             59       0.9824             0      0.0000
     840 microcloudd          getsockname              57       0.9491        187842      0.0453
     840 microcloudd          accept4                  54       0.8992         10375      0.0025
     840 microcloudd          getpeername              27       0.4496          3278      0.0008
     840 microcloudd          socket                   26       0.4329        113958      0.0275
     840 microcloudd          connect                  25       0.4163          6993      0.0017
     840 microcloudd          shutdown                 22       0.3663          3610      0.0009
     840 microcloudd          getpid                   17       0.2831          2519      0.0006
     840 microcloudd          tgkill                   17       0.2831          1931      0.0005
     840 microcloudd          sched_yield              11       0.1832          1066      0.0003
     840 microcloudd          madvise                   6       0.0999           678      0.0002
     840 microcloudd          getsockopt                6       0.0999           608      0.0001
     840 microcloudd          io_setup                  3       0.0500             0      0.0000
     840 microcloudd          fcntl                     1       0.0167             0      0.0000
     840 microcloudd          io_destroy                1       0.0167             0      0.0000
     841 microcloudd          futex                     1       0.0167             0      0.0000
     842 microcloudd          epoll_pwait             591       9.8409       7778880      1.8740
     842 microcloudd          read                    570       9.4912        192911      0.0465
     842 microcloudd          futex                   449       7.4764      45243801     10.8994
     842 microcloudd          write                   349       5.8113         73087      0.0176
     842 microcloudd          setsockopt              188       3.1304         21410      0.0052
     842 microcloudd          epoll_ctl                84       1.3987         11404      0.0027
     842 microcloudd          nanosleep                69       1.1489        170231      0.0410
     842 microcloudd          close                    49       0.8159         10516      0.0025
     842 microcloudd          getsockname              35       0.5828          3720      0.0009
     842 microcloudd          accept4                  34       0.5661        162919      0.0392
     842 microcloudd          rt_sigreturn             30       0.4995             0      0.0000
     842 microcloudd          shutdown                 18       0.2997          2165      0.0005
     842 microcloudd          socket                   17       0.2831          3301      0.0008
     842 microcloudd          getpeername              17       0.2831          1793      0.0004
     842 microcloudd          connect                  17       0.2831          4900      0.0012
     842 microcloudd          madvise                  16       0.2664          2723      0.0007
     842 microcloudd          sched_yield              12       0.1998          1185      0.0003
     842 microcloudd          fcntl                    10       0.1665           854      0.0002
     842 microcloudd          getsockopt                3       0.0500           401      0.0001
     842 microcloudd          fstat                     3       0.0500           368      0.0001
     842 microcloudd          newfstatat                2       0.0333           320      0.0001
     842 microcloudd          openat                    2       0.0333           587      0.0001
     842 microcloudd          getpid                    1       0.0167           126      0.0000
     842 microcloudd          tgkill                    1       0.0167           136      0.0000
     842 microcloudd          flock                     1       0.0167             0      0.0000
     842 microcloudd          io_setup                  1       0.0167             0      0.0000
     843 microcloudd          epoll_pwait             878      14.6198       9282394      2.2362
     843 microcloudd          read                    854      14.2202        107619      0.0259
     843 microcloudd          futex                   640      10.6568      38319270      9.2312
     843 microcloudd          write                   497       8.2757        208871      0.0503
     843 microcloudd          setsockopt              279       4.6457        145218      0.0350
     843 microcloudd          epoll_ctl               151       2.5143         29521      0.0071
     843 microcloudd          close                   106       1.7650         19356      0.0047
     843 microcloudd          madvise                  86       1.4320          9246      0.0022
     843 microcloudd          nanosleep                84       1.3987         15997      0.0039
     843 microcloudd          fcntl                    79       1.3155          7223      0.0017
     843 microcloudd          rt_sigreturn             57       0.9491             0      0.0000
     843 microcloudd          getsockname              44       0.7327        113198      0.0273
     843 microcloudd          accept4                  42       0.6994        110871      0.0267
     843 microcloudd          shutdown                 34       0.5661          3524      0.0008
     843 microcloudd          getpeername              22       0.3663          2052      0.0005
     843 microcloudd          connect                  22       0.3663        149337      0.0360
     843 microcloudd          socket                   22       0.3663          3972      0.0010
     843 microcloudd          openat                   20       0.3330          5203      0.0013
     843 microcloudd          sched_yield              18       0.2997          1294      0.0003
     843 microcloudd          fstat                     9       0.1499           959      0.0002
     843 microcloudd          getdents64                8       0.1332          1005      0.0002
     843 microcloudd          newfstatat                5       0.0833           896      0.0002
     843 microcloudd          unlinkat                  5       0.0833          9818      0.0024
     843 microcloudd          getsockopt                4       0.0666           325      0.0001
     843 microcloudd          io_setup                  3       0.0500             0      0.0000
     843 microcloudd          renameat                  2       0.0333          4815      0.0012
     843 microcloudd          getpid                    1       0.0167            54      0.0000
     843 microcloudd          tgkill                    1       0.0167           115      0.0000
     843 microcloudd          set_tid_address           1       0.0167             0      0.0000
     843 microcloudd          fsync                     1       0.0167         10428      0.0025
     843 microcloudd          fchmod                    1       0.0167           154      0.0000
     843 microcloudd          io_getevents              1       0.0167             0      0.0000
     843 microcloudd          setxattr                  1       0.0167             0      0.0000
     843 microcloudd          io_destroy                1       0.0167             0      0.0000
    2914 microcloudd          read                   1903      31.6874        503174      0.1212
    2914 microcloudd          epoll_ctl              1415      23.5616        127226      0.0306
    2914 microcloudd          epoll_pwait            1081      18.0000      54165515     13.0486
    2914 microcloudd          write                   618      10.2905         78783      0.0190
    2914 microcloudd          accept4                 238       3.9630       1005315      0.2422
    2914 microcloudd          close                   136       2.2646         21064      0.0051
    2914 microcloudd          getpid                  119       1.9815          9168      0.0022
    2914 microcloudd          getsockopt              119       1.9815        321631      0.0775
    2914 microcloudd          mprotect                  5       0.0833           649      0.0002
    2914 microcloudd          madvise                   4       0.0666           676      0.0002
    2914 microcloudd          writev                    2       0.0333           303      0.0001
    2914 microcloudd          futex                     2       0.0333           277      0.0001
    2914 microcloudd          io_getevents              1       0.0167           111      0.0000
    2914 microcloudd          io_submit                 1       0.0167           141      0.0000
    2915 microcloudd          futex                     1       0.0167             0      0.0000
    2916 microcloudd          futex                     1       0.0167             0      0.0000
    2917 microcloudd          futex                     1       0.0167             0      0.0000
    2918 microcloudd          futex                     4       0.0666      19533467      4.7057
    2918 microcloudd          io_submit                 1       0.0167           348      0.0001
    2918 microcloudd          io_getevents              1       0.0167          9443      0.0023
    2918 microcloudd          write                     1       0.0167           124      0.0000
    7488 microcloudd          futex                     1       0.0167             0      0.0000
   22186 microcloudd          epoll_pwait             697      11.6059       5945192      1.4322
   22186 microcloudd          read                    631      10.5070         75070      0.0181
   22186 microcloudd          futex                   432       7.1934      27430328      6.6081
   22186 microcloudd          write                   402       6.6938        334661      0.0806
   22186 microcloudd          setsockopt              224       3.7299        167262      0.0403
   22186 microcloudd          madvise                 186       3.0971         17990      0.0043
   22186 microcloudd          epoll_ctl                70       1.1656        125937      0.0303
   22186 microcloudd          nanosleep                43       0.7160          9515      0.0023
   22186 microcloudd          sched_yield              38       0.6327          2450      0.0006
   22186 microcloudd          getsockname              38       0.6327          4713      0.0011
   22186 microcloudd          close                    32       0.5328          5132      0.0012
   22186 microcloudd          rt_sigreturn             28       0.4662             0      0.0000
   22186 microcloudd          accept4                  27       0.4496          5044      0.0012
   22186 microcloudd          socket                   26       0.4329          4567      0.0011
   22186 microcloudd          getpeername              25       0.4163          2192      0.0005
   22186 microcloudd          connect                  25       0.4163          7951      0.0019
   22186 microcloudd          shutdown                 21       0.3497          2238      0.0005
   22186 microcloudd          fcntl                    12       0.1998           905      0.0002
   22186 microcloudd          getsockopt                8       0.1332           879      0.0002
   22186 microcloudd          io_destroy                5       0.0833             0      0.0000
   22186 microcloudd          tgkill                    4       0.0666           538      0.0001
   22186 microcloudd          getpid                    4       0.0666           362      0.0001
   22186 microcloudd          openat                    3       0.0500          1031      0.0002
   22186 microcloudd          renameat                  2       0.0333          5183      0.0012
   22186 microcloudd          unlinkat                  2       0.0333           458      0.0001
   22186 microcloudd          newfstatat                2       0.0333           301      0.0001
   22186 microcloudd          io_setup                  1       0.0167             0      0.0000
   22186 microcloudd          fremovexattr              1       0.0167             0      0.0000
   22186 microcloudd          flock                     1       0.0167             0      0.0000
   22186 microcloudd          fchmod                    1       0.0167           157      0.0000
   22186 microcloudd          fsync                     1       0.0167          9354      0.0023
 Total                                            36760     612.1008     415104713

Top polling system calls:
     PID Process              Syscall             Rate/Sec   Infinite   Zero     Minimum    Maximum    Average
                                                             Timeouts Timeouts   Timeout    Timeout    Timeout
     790 microcloudd          nanosleep             164.5812        0        0  20.0 usec  10.0 msec 115.0 usec
     790 microcloudd          epoll_pwait             6.6772        0      401   0.0 sec    0.0 sec    0.0 sec 
     792 microcloudd          epoll_pwait            11.5393        0      444   0.0 sec   13.4 sec    1.6 sec 
     792 microcloudd          nanosleep               0.8159        0        0   3.0 usec   3.0 usec   3.0 usec
     793 microcloudd          epoll_pwait             9.0416        0      342   0.0 sec   14.2 sec    1.9 sec 
     793 microcloudd          nanosleep               0.6994        0        0   3.0 usec   3.0 usec   3.0 usec
     840 microcloudd          epoll_pwait            14.7863        3      559   0.0 sec   29.3 sec    1.8 sec 
     840 microcloudd          nanosleep               1.1822        0        0   3.0 usec   3.0 usec   3.0 usec
     842 microcloudd          epoll_pwait             9.8409        0      374   0.0 sec   14.3 sec    1.7 sec 
     842 microcloudd          nanosleep               1.1489        0        0   3.0 usec   3.0 usec   3.0 usec
     843 microcloudd          epoll_pwait            14.6198        0      570   0.0 sec   29.3 sec    1.6 sec 
     843 microcloudd          nanosleep               1.3987        0        0   3.0 usec   3.0 usec   3.0 usec
    2914 microcloudd          epoll_pwait            18.0000        0        0   1.0 msec 500.0 msec 328.1 msec
   22186 microcloudd          epoll_pwait            11.6059        0      432   0.0 sec   13.7 sec    1.6 sec 
   22186 microcloudd          nanosleep               0.7160        0        0   3.0 usec   3.0 usec   3.0 usec
 Total                                            266.6535        3     3122

Distribution of poll timeout times:
                                                            10.0  100.0    1.0   10.0  100.0    1.0   10.0  100.0
                                                    up to    to     to     to     to     to     to     to  or more
                                              Zero    9.9   99.9  999.9    9.9   99.9  999.9    9.9   99.9        Infinite
     PID Process              Syscall            sec   usec   usec   usec   msec   msec   msec    sec    sec    sec   Wait
     790 microcloudd          nanosleep            0     -    9682     72     53     77     -      -      -      -       0
     790 microcloudd          epoll_pwait        401     -      -      -      -      -      -      -      -      -       0
     792 microcloudd          epoll_pwait        444     -      -      -       2      7     29    189     22     -       0
     792 microcloudd          nanosleep            0     49     -      -      -      -      -      -      -      -       0
     793 microcloudd          epoll_pwait        342     -      -      -       1     14     22    150     14     -       0
     793 microcloudd          nanosleep            0     42     -      -      -      -      -      -      -      -       0
     840 microcloudd          epoll_pwait        559     -      -      -       5      6     39    254     22     -       3
     840 microcloudd          nanosleep            0     71     -      -      -      -      -      -      -      -       0
     842 microcloudd          epoll_pwait        374     -      -      -       4      2     25    153     33     -       0
     842 microcloudd          nanosleep            0     69     -      -      -      -      -      -      -      -       0
     843 microcloudd          epoll_pwait        570     -      -      -       2     14     35    219     38     -       0
     843 microcloudd          nanosleep            0     84     -      -      -      -      -      -      -      -       0
    2914 microcloudd          epoll_pwait          0     -      -      -      12    133    936     -      -      -       0
   22186 microcloudd          epoll_pwait        432     -      -      -      -       5     66    178     16     -       0
   22186 microcloudd          nanosleep            0     43     -      -      -      -      -      -      -      -       0

Polling system call analysis:
 microcloudd (790), epoll_pwait:
        227 immediate timed out calls with zero timeout (non-blocking peeks)
         99 repeated immediate timed out polled calls with zero timeouts (heavy polling peeks)
 microcloudd (792), epoll_pwait:
        250 immediate timed out calls with zero timeout (non-blocking peeks)
          1 repeated timed out polled calls with non-zero timeouts (light polling)
         62 repeated immediate timed out polled calls with zero timeouts (heavy polling peeks)
 microcloudd (793), epoll_pwait:
        163 immediate timed out calls with zero timeout (non-blocking peeks)
         21 repeated immediate timed out polled calls with zero timeouts (heavy polling peeks)
          1 system call errors
 microcloudd (840), epoll_pwait:
        287 immediate timed out calls with zero timeout (non-blocking peeks)
          5 repeated timed out polled calls with non-zero timeouts (light polling)
         47 repeated immediate timed out polled calls with zero timeouts (heavy polling peeks)
 microcloudd (842), epoll_pwait:
        195 immediate timed out calls with zero timeout (non-blocking peeks)
          3 repeated timed out polled calls with non-zero timeouts (light polling)
         41 repeated immediate timed out polled calls with zero timeouts (heavy polling peeks)
 microcloudd (843), epoll_pwait:
        311 immediate timed out calls with zero timeout (non-blocking peeks)
          2 repeated timed out polled calls with non-zero timeouts (light polling)
         71 repeated immediate timed out polled calls with zero timeouts (heavy polling peeks)
          1 system call errors
 microcloudd (22186), epoll_pwait:
        243 immediate timed out calls with zero timeout (non-blocking peeks)
          3 repeated timed out polled calls with non-zero timeouts (light polling)
         37 repeated immediate timed out polled calls with zero timeouts (heavy polling peeks)

Filesystem Syncs:
     PID  fdatasync    fsync     sync   syncfs    total   total (Rate)
     843          0        1        0        0        1     0.02
   22186          0        1        0        0        1     0.02

Files Sync'd:
     PID  syscall    # sync's filename
     843  fsync             1 /var/snap/microcloud/common/state/database/.cluster.yaml4098083994
   22186  fsync             1 /var/snap/microcloud/common/state/database/.cluster.yaml2182960107

Inotify watches added:
 None.

Memory:
Per Process Memory (K):
     PID Process              Type        Size       RSS       PSS
     493 microcloudd          Stack        132        12        12
     493 microcloudd          Heap     1761484     17496     17496
     493 microcloudd          Mapped     18408     16164     13746

Change in memory (K/second):
     PID Process              Type        Size       RSS       PSS
     493 microcloudd          Heap        0.00     24.31     24.31 (growing moderately fast)

Heap Change via brk():
 None.

Memory Change via mmap() and munmap():
 None.

Open Network Connections:
     PID Process              Proto         Send   Receive  Address
     493 microcloudd          UNIX        0.00 B    0.00 B  /run/dbus/system_bus_socket
     493 microcloudd          TCP6        0.00 B    0.00 B  2602:fc62:d:100::170:45600
     493 microcloudd          TCP6        0.00 B    0.00 B  2602:fc62:d:100::171:34220
     493 microcloudd          UNIX        0.00 B    0.00 B  /var/snap/microovn/common/run/central/ovnnb_db.ctl
     493 microcloudd          TCP6        0.00 B    0.00 B  2602:fc62:d:100::170:53296
     493 microcloudd          TCP6        0.00 B    0.00 B  2602:fc62:d:100::171:34628
     493 microcloudd          TCP6        0.00 B    0.00 B  2602:fc62:d:100::170:53284
     493 microcloudd          TCP6        0.00 B    0.00 B  2602:fc62:d:100::171:41802
     493 microcloudd          TCP6        0.00 B    0.00 B  2602:fc62:d:100::170:40438
     493 microcloudd          TCP6        0.00 B    0.00 B  2602:fc62:d:100::170:45620
     493 microcloudd          TCP6        0.00 B    0.00 B  2602:fc62:d:100::170:40430
     493 microcloudd          TCP6        0.00 B    0.00 B  2602:fc62:d:100::170:45616
     493 microcloudd          TCP6        0.00 B    0.00 B  2602:fc62:d:100::171:34266
     493 microcloudd          TCP6        0.00 B    0.00 B  2602:fc62:d:100::171:34250
     493 microcloudd          TCP6        0.00 B    0.00 B  2602:fc62:d:100::170:40422
     493 microcloudd          TCP6        0.00 B    0.00 B  2602:fc62:d:100::171:34238
     493 microcloudd          TCP6        0.00 B    0.00 B  2602:fc62:d:100::171:34226
     493 microcloudd          TCP6        0.00 B    0.00 B  2602:fc62:d:100::170:45566
     493 microcloudd          UNIX        0.00 B    0.00 B  /var/snap/microcloud/common/state/control.socket
     493 microcloudd          TCP6        0.00 B    0.00 B  :::0
     493 microcloudd          TCP6        0.00 B    0.00 B  2602:fc62:d:100::170:53312
     493 microcloudd          TCP6        0.00 B    0.00 B  2602:fc62:d:100::171:34314
     493 microcloudd          TCP6        0.00 B    0.00 B  2602:fc62:d:100::170:53274
     493 microcloudd          TCP6        0.00 B    0.00 B  2602:fc62:d:100::171:34302
     493 microcloudd          TCP6        0.00 B    0.00 B  2602:fc62:d:100::171:41754
     493 microcloudd          TCP6        0.00 B    0.00 B  2602:fc62:d:100::170:41562
     493 microcloudd          TCP6        0.00 B    0.00 B  2602:fc62:d:100::170:40410
     493 microcloudd          TCP6        0.00 B    0.00 B  2602:fc62:d:100::171:34282
     493 microcloudd          TCP6        0.00 B    0.00 B  2602:fc62:d:100::170:45634
     493 microcloudd          TCP6        0.00 B    0.00 B  2602:fc62:d:100::171:34292
     493 microcloudd          UNIX        0.00 B    0.00 B  /run/systemd/journal/io.systemd.journal
     493 microcloudd          TCP6        0.00 B    0.00 B  2602:fc62:d:100::170:45582
     493 microcloudd          UNIX        0.00 B    0.00 B  /run/systemd/journal/stdout
     493 microcloudd          TCP6        0.00 B    0.00 B  2602:fc62:d:100::170:45576
     493 microcloudd          UNIX        0.00 B    0.00 B  /run/udev/control
     493 microcloudd          TCP6        0.00 B    0.00 B  2602:fc62:d:100::170:45550
     493 microcloudd          UNIX        0.00 B    0.00 B  @snap.microceph.dqlite
     493 microcloudd          TCP6        0.00 B    0.00 B  2602:fc62:d:100::170:45552
     493 microcloudd          TCP6        0.00 B    0.00 B  2602:fc62:d:100::170:41636
     493 microcloudd          TCP6        0.00 B    0.00 B  2602:fc62:d:100::171:41800
     493 microcloudd          TCP6        0.00 B    0.00 B  2602:fc62:d:100::169:35252
     493 microcloudd          TCP6        0.00 B    0.00 B  2602:fc62:d:100::169:9443
     493 microcloudd          TCP6        0.00 B    0.00 B  2602:fc62:d:100::170:45536
     493 microcloudd          TCP6        0.00 B    0.00 B  2602:fc62:d:100::171:34648
     493 microcloudd          TCP6        0.00 B    0.00 B  2602:fc62:d:100::171:34638
     493 microcloudd          TCP6        0.00 B    0.00 B  2602:fc62:d:100::170:9443
     493 microcloudd          UNIX        0.00 B    0.00 B  @snap.microovn.dqlite
     493 microcloudd          UNIX        0.00 B    0.00 B  /var/snap/microceph/302/run/ceph-mon.micro01.asok
     493 microcloudd          TCP6        0.00 B    0.00 B  2602:fc62:d:100::171:9443
     493 microcloudd          UNIX        0.00 B    0.00 B  /var/snap/microovn/common/state/control.socket
     493 microcloudd          UNIX        0.00 B    0.00 B  /run/user/1000/bus
     493 microcloudd          UNIX        0.00 B    0.00 B  /run/user/1000/snapd-session-agent.socket
     493 microcloudd          TCP6        0.00 B    0.00 B  2602:fc62:d:100::170:41592
     493 microcloudd          UNIX        0.00 B    0.00 B  @snap.microcloud.dqlite
     493 microcloudd          TCP6        0.00 B    0.00 B  2602:fc62:d:100::169:44806
 Total                                  0.00 B    0.00 B

Ambiguous initial networking questions

microcloud 0+git.1fb7a9f

The new init dialogue is actually more confusing to me.

$ sudo microcloud init 
Please choose the address MicroCloud will be listening on [default=192.168.122.50]: 
Please choose the subnet for MicroCloud (all/[subnet]) [default=192.168.122.0/24]: 

It's not clear what the difference between those two initial questions is, or what each selection implies.

I suppose it's about 192.168.122.50:9443 vs 0.0.0.0:9443, but from a user experience point of view the question could be user-story based, e.g. "Do you want to publish the LXD API service to all subnets or do you want to restrict it to a specific subnet?" or something like that.

$ sudo ss -tlnp 
State  Recv-Q  Send-Q    Local Address:Port   Peer Address:Port Process                                    
LISTEN 0       4096     192.168.122.50:9443        0.0.0.0:*     users:(("microcloudd",pid=5394,fd=8))     
LISTEN 0       512      192.168.122.50:3300        0.0.0.0:*     users:(("ceph-mon",pid=7080,fd=25))       
LISTEN 0       512      192.168.122.50:6789        0.0.0.0:*     users:(("ceph-mon",pid=7080,fd=26))       
LISTEN 0       4096     192.168.122.50:6443        0.0.0.0:*     users:(("microovnd",pid=1919,fd=9))       
LISTEN 0       512      192.168.122.50:6800        0.0.0.0:*     users:(("ceph-mds",pid=7742,fd=16))       
LISTEN 0       512      192.168.122.50:6801        0.0.0.0:*     users:(("ceph-mds",pid=7742,fd=17))       
LISTEN 0       10       192.168.122.50:6641        0.0.0.0:*     users:(("ovsdb-server",pid=10127,fd=17))  
LISTEN 0       10       192.168.122.50:6642        0.0.0.0:*     users:(("ovsdb-server",pid=10138,fd=17))  
LISTEN 0       512      192.168.122.50:6802        0.0.0.0:*     users:(("ceph-osd",pid=8903,fd=17))       
LISTEN 0       10       192.168.122.50:6643        0.0.0.0:*     users:(("ovsdb-server",pid=10127,fd=20))  
LISTEN 0       512      192.168.122.50:6803        0.0.0.0:*     users:(("ceph-osd",pid=8903,fd=18))       
LISTEN 0       4096     192.168.122.50:7443        0.0.0.0:*     users:(("microcephd",pid=4275,fd=9))      
LISTEN 0       10       192.168.122.50:6644        0.0.0.0:*     users:(("ovsdb-server",pid=10138,fd=20))  
LISTEN 0       512             0.0.0.0:6804        0.0.0.0:*     users:(("ceph-osd",pid=8903,fd=19))       
LISTEN 0       512             0.0.0.0:6805        0.0.0.0:*     users:(("ceph-osd",pid=8903,fd=20))       
LISTEN 0       32           240.50.0.1:53          0.0.0.0:*     users:(("dnsmasq",pid=7078,fd=7))         
LISTEN 0       4096      127.0.0.53%lo:53          0.0.0.0:*     users:(("systemd-resolve",pid=639,fd=14)) 
LISTEN 0       512      192.168.122.50:6806        0.0.0.0:*     users:(("ceph-osd",pid=8903,fd=21))       
LISTEN 0       128             0.0.0.0:22          0.0.0.0:*     users:(("sshd",pid=715,fd=3))             
LISTEN 0       512      192.168.122.50:6807        0.0.0.0:*     users:(("ceph-osd",pid=8903,fd=22))       
LISTEN 0       512             0.0.0.0:6808        0.0.0.0:*     users:(("ceph-osd",pid=8903,fd=23))       
LISTEN 0       512             0.0.0.0:6809        0.0.0.0:*     users:(("ceph-osd",pid=8903,fd=24))       
LISTEN 0       512      192.168.122.50:6810        0.0.0.0:*     users:(("ceph-mgr",pid=7633,fd=25))       
LISTEN 0       512      192.168.122.50:6811        0.0.0.0:*     users:(("ceph-mgr",pid=7633,fd=26))       
LISTEN 0       4096     192.168.122.50:8443        0.0.0.0:*     users:(("lxd",pid=2919,fd=23))            
LISTEN 0       128                [::]:22             [::]:*     users:(("sshd",pid=715,fd=4))

Filter mDNS based on interface of the listening address

When building a cluster, we should only consider machines that are connected to the same network that the initial server is listening on.
This should allow for effectively selecting a separate network/vlan for use of MicroCloud.
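
As a rough illustration of the idea above, the sketch below keeps only discovered peers whose address falls inside the subnet of the listening address. It approximates "same network" with a subnet check rather than an actual interface lookup, and the `peer` type and `filterPeersBySubnet` function are hypothetical names, not part of MicroCloud.

```go
package main

import (
	"fmt"
	"net/netip"
)

// peer is a hypothetical record for a server discovered over mDNS.
type peer struct {
	Name    string
	Address string
}

// filterPeersBySubnet keeps only the peers whose address lies inside subnet.
// The subnet may be given as a host address with a prefix length,
// for example "192.168.122.50/24".
func filterPeersBySubnet(peers []peer, subnet string) ([]peer, error) {
	prefix, err := netip.ParsePrefix(subnet)
	if err != nil {
		return nil, fmt.Errorf("invalid subnet %q: %w", subnet, err)
	}
	// Drop the host bits so the prefix describes the network itself.
	prefix = prefix.Masked()

	var kept []peer
	for _, p := range peers {
		addr, err := netip.ParseAddr(p.Address)
		if err != nil {
			continue // Skip records with a malformed address.
		}
		if prefix.Contains(addr) {
			kept = append(kept, p)
		}
	}
	return kept, nil
}

func main() {
	peers := []peer{
		{Name: "micro02", Address: "192.168.122.51"},
		{Name: "micro03", Address: "10.0.0.7"},
	}
	kept, err := filterPeersBySubnet(peers, "192.168.122.50/24")
	if err != nil {
		panic(err)
	}
	for _, p := range kept {
		fmt.Println(p.Name, p.Address)
	}
}
```

A real implementation would more likely restrict the mDNS lookup to the interface that owns the listening address, so that traffic on other networks or VLANs is never considered in the first place.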

Setup images and backups volumes when local pool is available

When a local storage pool is available, MicroCloud needs to do the following on each server:

  • Create a volume called images on the local pool
  • Create a volume called backups on the local pool
  • Set storage.backups_volume to local/backups
  • Set storage.images_volume to local/images

This will avoid consuming space on the root filesystem which may be very small in MicroCloud.
It will also speed up demos on our test hardware as it will avoid using MicroSD storage when spawning instances.

Configure MicroCloud with a preseed.yaml

We will want to have support for a yaml preseed file passed into microcloud init --preseed <yaml> that will take care of all the otherwise interactive configuration.

To facilitate selecting disks without knowing all the information, we want to be smart about filtering on some key details.

lookup_subnet: 10.0.0.1/24 # The subnet to lookup servers over mDNS
systems:
 - name: micro01 # The hostnames of the specific machines we want in the cluster
   ovn_uplink_interface: eth1 # If unset, will take the first available if setting up OVN.
ovn:
  ipv4_gateway: 10.0.0.1/24
  ipv4_range: [10.0.0.100,10.0.0.254]
  ipv6_gateway: ...
disks: # Separate config for local and ceph storage.
  local:
    - id: nvme0n1 # Provide just a string to match exactly
      model:
        filter: "INTEL" # Provide a struct with <filter> and <mode>. Mode defaults to "contains" for strings and "exact" for ints.
        mode: "prefix"
      size:
        filter: "10GiB" 
        mode: "greater_than"
  ceph:
    max_disks: 3 # Total maximum number of expected disks. Expects unlimited if not specified. Fails if not at least 3.
    - id: nvme1n1
      type: 
        filter: "nvme"
        mode: "prefix"
      limit: 1 # Limit of matching selections per machine. Defaults to unlimited.
      machines: ["micro01"] # Machines to consider the filter for. Others will be ignored. Defaults to all. 
      wipe: false # Wipe disks found for this entry on each machine. 
    - size: 30GiB
     ...

I've been thinking about something like this, where we have two entries, local and ceph. Each takes a slice of objects whose fields resemble those of lxdAPI.ResourcesStorageDisk. Each field accepts either a plain string for an exact match, or an object with filter and mode as options. Additional fields per disk are machines, a list of machines the disk criteria should apply to; limit, the number of matching disks per machine to consider; and wipe, which controls whether matched disks are wiped.
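
To make the filter and mode semantics concrete, here is a minimal sketch of how such a matcher might behave. The `filter` type, its methods, and `parseSize` are hypothetical; the mode names come from the example above, and the unit handling is an assumption that covers only binary suffixes such as GiB.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// filter mirrors the proposed {filter, mode} object from the preseed sketch.
type filter struct {
	Filter string
	Mode   string
}

// matchString applies the filter to a string field. Mode defaults to "contains".
func (f filter) matchString(value string) (bool, error) {
	switch f.Mode {
	case "", "contains":
		return strings.Contains(value, f.Filter), nil
	case "exact":
		return value == f.Filter, nil
	case "prefix":
		return strings.HasPrefix(value, f.Filter), nil
	}
	return false, fmt.Errorf("unsupported string mode %q", f.Mode)
}

// sizeUnits lists the binary suffixes understood by parseSize (an assumption).
var sizeUnits = []struct {
	suffix string
	mult   uint64
}{{"TiB", 1 << 40}, {"GiB", 1 << 30}, {"MiB", 1 << 20}, {"KiB", 1 << 10}}

// parseSize converts strings such as "10GiB" to bytes. Plain numbers are bytes.
func parseSize(s string) (uint64, error) {
	for _, u := range sizeUnits {
		if strings.HasSuffix(s, u.suffix) {
			n, err := strconv.ParseUint(strings.TrimSuffix(s, u.suffix), 10, 64)
			if err != nil {
				return 0, err
			}
			return n * u.mult, nil
		}
	}
	return strconv.ParseUint(s, 10, 64)
}

// matchSize applies the filter to a size in bytes. Mode defaults to "exact".
func (f filter) matchSize(size uint64) (bool, error) {
	want, err := parseSize(f.Filter)
	if err != nil {
		return false, err
	}
	switch f.Mode {
	case "", "exact":
		return size == want, nil
	case "greater_than":
		return size > want, nil
	}
	return false, fmt.Errorf("unsupported size mode %q", f.Mode)
}

func main() {
	model := filter{Filter: "INTEL", Mode: "prefix"}
	size := filter{Filter: "10GiB", Mode: "greater_than"}

	// The model string and the 1 TiB size are made-up example values.
	okModel, _ := model.matchString("INTEL SSD")
	okSize, _ := size.matchSize(1 << 40)
	fmt.Println(okModel, okSize)
}
```

Keeping string and numeric matching in separate methods makes the differing defaults ("contains" for strings, "exact" for ints) explicit instead of inferring them from the value.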

Microcloud init - Error: Failed to get system resources of peer : No auth secret in response

Originally raised in the LXD project so re-raising this here.
Also this was marked as a duplicate of #107 but reading that issue - this seems like a different issue/cause.

When trying to init microcloud in a 3 server cluster (physical servers not VMs) I get the following error:

Error: Failed to get system resources of peer "r520-03": No auth secret in response

I am using the edge version of lxd as I want to try this with the new lxd GUI.
I am using the latest version of microcloud on a fresh installation of Ubuntu 22.04.
Ubuntu and all snaps fully updated.

What can I do to diagnose this issue and resolve it?

To reproduce, run the following on 3 servers with a fresh install of Ubuntu Server 22.04:

sudo snap install lxd
sudo snap refresh lxd --edge
sudo snap install microceph microovn microcloud

Then run the init command:

testuser@r520-02:~$ sudo lxd --version
4.0.9
testuser@r520-02:~$ sudo microcloud --version
0.1
testuser@r520-02:~$ sudo uname -a
Linux r520-02 5.15.0-72-generic #79-Ubuntu SMP Wed Apr 19 08:22:18 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
testuser@r520-02:~$ sudo microcloud init
Using address "192.168.1.127" for MicroCloud
Limit search for other MicroCloud servers to 192.168.1.127/24? (yes/no) [default=yes]:
Scanning for eligible servers ...

 Selected "r520-03" at "192.168.1.128"
 Selected "r520-04" at "192.168.1.123"

Error: Failed to get system resources of peer "r520-04": No auth secret in response

This is shown in the syslog:
May 20 11:12:27 r520-02 kernel: [ 4768.307666] audit: type=1400 audit(1684581147.188:146): apparmor="DENIED" operation="open" profile="snap.microcloud.microcloud" name="/var/lib/snapd/hostfs/etc/ssl/certs/ca-certificates.crt" pid=5993 comm="microcloud" requested_mask="r" denied_mask="r" fsuid=0 ouid=0

More information that may be useful:

testuser@r520-02:~$ snap services
Service            Startup   Current   Notes
lxd.activate       enabled   inactive  -
lxd.daemon         enabled   active    socket-activated
microceph.daemon   enabled   active    -
microceph.mds      disabled  inactive  -
microceph.mgr      disabled  inactive  -
microceph.mon      disabled  inactive  -
microceph.osd      disabled  inactive  -
microceph.rgw      disabled  inactive  -
microcloud.daemon  enabled   active    -
microovn.central   disabled  inactive  -
microovn.chassis   disabled  inactive  -
microovn.daemon    enabled   active    -
microovn.switch    disabled  inactive  -

testuser@r520-02:~$ snap list
Name        Version        Rev    Tracking       Publisher   Notes
core20      20230503       1891   latest/stable  canonical✓  base
core22      20230503       634    latest/stable  canonical✓  base
lxd         git-407205d    23988  4.0/edge       canonical✓  -
microceph   0+git.fdf6d5e  338    latest/stable  canonical✓  -
microcloud  0+git.9cb7ccd  412    latest/stable  canonical✓  -
microovn    0+git.f8a4497  91     latest/stable  canonical✓  -
snapd       2.59.2         19122  latest/stable  canonical✓  snapd
