xcluster-cni's Introduction

xcluster-cni

Xcluster-cni is a CNI-plugin for Kubernetes. It is developed for xcluster but can be used in any K8s cluster. Key features:

  • Multi networking - xcluster-cni can create POD-POD connectivity on secondary networks. Multus can be used to bring in extra interfaces to PODs

  • Flexible - This makes it useful for experiments, but it also places greater responsibility on the user. For instance, network overlays are not directly supported, but you can create them yourself

  • Very small footprint - xcluster-cni sets up routes and watches the K8s node objects. If the K8s node objects don't change often, the overhead created by xcluster-cni is negligible

The main components of xcluster-cni are:

  • kube-node IPAM. Reads information (address ranges) from the K8s node object and assigns addresses by delegating to the host-local ipam

  • The xcluster-cni router binary. Sets up routing to address ranges (CIDRs) between K8s nodes. This is deployed as a DaemonSet that will also install (and upgrade) the CNI-plugin on the nodes.
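The routing idea can be sketched in a few lines. This is a minimal illustration, not the router's actual code: the `nodes` layout, the function name `routes_for` and the exact `ip route` commands are assumptions made for the example.

```python
import ipaddress

def routes_for(own_node, nodes, protocol=202):
    """Return one `ip route replace` command per CIDR owned by another node.

    `nodes` maps a node name to {"addresses": [...], "cidrs": [...]}.
    This layout is hypothetical; it only mirrors the information that is
    read from the K8s node objects.
    """
    commands = []
    for name, info in sorted(nodes.items()):
        if name == own_node:
            continue  # local CIDRs are reached through the local bridge
        # Pick the node address that matches the IP family of each CIDR.
        by_version = {ipaddress.ip_address(a).version: a for a in info["addresses"]}
        for cidr in info["cidrs"]:
            version = ipaddress.ip_network(cidr).version
            gateway = by_version.get(version)
            if gateway is None:
                continue  # no address of this family on the remote node
            commands.append(
                f"ip -{version} route replace {cidr} via {gateway} proto {protocol}"
            )
    return commands

nodes = {
    "vm-002": {"addresses": ["192.168.2.2"], "cidrs": ["192.168.55.64/26"]},
    "vm-003": {
        "addresses": ["192.168.2.3", "1000::1:c0a8:203"],
        "cidrs": ["192.168.55.0/26", "fd00:5::1:0/112"],
    },
}
for command in routes_for("vm-002", nodes, protocol=200):
    print(command)
```

Tagging the routes with a protocol number makes it possible to list or flush only the routes that belong to one instance.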

To install xcluster-cni as the main K8s CNI-plugin, make sure --allocate-node-cidrs=true is given to the kube-controller-manager and run:

kubectl apply -n kube-system -f https://raw.githubusercontent.com/Nordix/xcluster-cni/master/xcluster-cni.yaml

Multi Networking

A key feature is multi networking. Xcluster-cni can be installed on different interfaces on K8s nodes and can create POD-POD connectivity on secondary networks using Multus. Installing xcluster-cni on secondary networks requires annotations on Node objects:

kubectl annotate node vm-003 cidr.example.com/net3=192.168.55.0/26,fd00:5::1:0/112
kubectl annotate node vm-003 adr.example.com/net3=192.168.2.3,1000::1:c0a8:203

and configuring the annotation names as environment variables:

        env:
          - name: NODE_NAME
            valueFrom:
              fieldRef:
                fieldPath: spec.nodeName
          - name: LOG_LEVEL
            value: "debug"
          - name: CIDR_ANNOTATION
            value: "cidr.example.com/net3"
          - name: ADDRESS_ANNOTATION
            value: "adr.example.com/net3"
          - name: PROTOCOL
            value: "200"

The protocol number is used to handle routes and must be unique for each xcluster-cni instance on the node. The default is 202.
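The annotation values in the example are comma-separated, with one entry per IP family. As a rough sketch (the helper name `parse_cidr_annotation` is hypothetical and not part of xcluster-cni), such a value can be split and checked like this:

```python
import ipaddress

def parse_cidr_annotation(value):
    """Split a comma-separated annotation value into networks by IP family."""
    result = {}
    for item in value.split(","):
        # ip_network() raises ValueError on malformed input or set host bits.
        net = ipaddress.ip_network(item.strip())
        result[f"ipv{net.version}"] = net
    return result

nets = parse_cidr_annotation("192.168.55.0/26,fd00:5::1:0/112")
for family, net in nets.items():
    print(family, net, net.num_addresses)
```

For the values above, the /26 range holds 64 IPv4 addresses and the /112 range holds 65536 IPv6 addresses on the node.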

Network overlay

xcluster-cni does not set up a network overlay, but you may configure an overlay yourself (e.g. vxlan) and use the ADDRESS_ANNOTATION environment variable to steer traffic to the overlay.

This area will see some improvements in the future.

Network policies

K8s network policies are not supported.

MTU

If xcluster-cni is installed as the main K8s CNI-plugin, the MTU is handled. You can check with:

# cat /etc/cni/net.d/10-xcluster-cni.conf 
{
    "cniVersion": "0.4.0",
    "name": "xcluster",
    "type": "bridge",
    "bridge": "cbr0",
    "isGateway": true,
    "mtu": 9000,
    "hairpinMode": true,
    "isDefaultGateway": true,
    "ipam": {
        "type": "kube-node",
        "kubeconfig": "/etc/cni/net.d/xcluster-cni.kubeconfig",
        "dataDir": "/run/container-ipam-state/xcluster"
    }
}

In this example "jumbo-frames" with mtu=9000 are used.

For secondary networks the MTU must be set in the Multus NAD object and is the user's responsibility.
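As an illustration of where the MTU goes, the sketch below builds a NAD whose embedded CNI configuration carries an "mtu" field. The network name "net3", the bridge name "cbr3" and the MTU value are made up for the example, and any further kube-node IPAM settings a secondary network may need are not described here and are left out.

```python
import json

def nad_with_mtu(name, bridge, mtu, kubeconfig):
    """Build a NetworkAttachmentDefinition whose CNI config sets the MTU."""
    cni_config = {
        "cniVersion": "0.4.0",
        "name": name,
        "type": "bridge",
        "bridge": bridge,
        "mtu": mtu,  # must fit the MTU of the secondary network
        "ipam": {
            "type": "kube-node",
            "kubeconfig": kubeconfig,
        },
    }
    return {
        "apiVersion": "k8s.cni.cncf.io/v1",
        "kind": "NetworkAttachmentDefinition",
        "metadata": {"name": name},
        # Multus expects the CNI configuration as a JSON string.
        "spec": {"config": json.dumps(cni_config)},
    }

nad = nad_with_mtu("net3", "cbr3", 1450, "/etc/cni/net.d/xcluster-cni.kubeconfig")
print(json.dumps(nad, indent=2))
```

The printed JSON can be saved to a file and loaded with "kubectl apply -f".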

Build the xcluster-cni image

The image is built with "docker build" so docker must be installed.

./build.sh         # Help printout
./build.sh image   # Build the image

xcluster-cni's Issues

MTU is always default

Current

MTU for the bridge plugin is left as default. Usually this means mtu=1500. The MTU of the sit0 tunnel is 1480 and the mismatch causes fragmentation. Also, jumbo frames are not detected.

Wanted

xcluster-cni shall set the MTU of the bridge plugin based on the MTU of the k8s interface. If the k8s interface supports jumbo frames, the larger MTU shall be used. If tunnels are used, the MTU shall be reduced by the size of the tunnel header.
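The wanted calculation could look roughly like the sketch below. The overhead table is an assumption: 20 bytes for a sit tunnel (the outer IPv4 header, which matches the 1480 seen on sit0 with a 1500-byte link) and 50 bytes for VXLAN over IPv4. The function names are hypothetical.

```python
# Bytes added in front of the inner packet by each encapsulation (assumed values).
TUNNEL_OVERHEAD = {
    "none": 0,
    "sit": 20,    # outer IPv4 header
    "vxlan": 50,  # outer IPv4 + UDP + VXLAN + inner Ethernet headers
}

def read_interface_mtu(device):
    """Read the MTU of a network interface from sysfs (Linux only)."""
    with open(f"/sys/class/net/{device}/mtu") as f:
        return int(f.read().strip())

def bridge_mtu(interface_mtu, tunnel="none"):
    """Return the MTU to configure on the bridge for the given link MTU."""
    if tunnel not in TUNNEL_OVERHEAD:
        raise ValueError(f"unknown tunnel type: {tunnel}")
    return interface_mtu - TUNNEL_OVERHEAD[tunnel]

print(bridge_mtu(1500, "sit"))  # standard Ethernet link with sit tunnels
print(bridge_mtu(9000))         # jumbo frames, no tunnel
```

On a node, the first argument would come from something like read_interface_mtu("eth1"), where the interface name depends on the cluster.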

A kubeconfig file must be generated on installation

The kube-node IPAM requires a kubeconfig to read its own node object. The current installation assumes that one is available on all nodes, but it may not be.

A kubeconfig must be generated on installation. For the K8s main network it is configured at installation, but for secondary networks the user must specify it in the NAD.

Add a KinD test/example

  • Show how xcluster-cni can be configured as K8s network in KinD (Kubernetes in Docker)
  • Use xcluster-cni for a secondary network in KinD

There is no MTU support

If the inter-node links have a smaller MTU than the internal POD interfaces there will be fragmentation or worse. This will be the case if a network overlay is used.

This can be fixed by the user, but some support should be available. At the very least a way to manually configure MTU on installation.

Xcluster-cni does not work for K8s < v1.16.0

The podCIDRs field in the node spec is assumed to exist. Log printouts on K8s v1.15.10:

> kubectl logs -n kube-system   xcluster-cni-696z8
xcluster-cni; Fri Dec 13 19:30:54 CET 2019
K8S_NODE=[vm-005]
11:19:46: Generated /opt/cni/bin/podCIDR
11:19:46: K8s node address; 192.168.1.5
11:19:46: Using MTU=1480
11:19:46: Set addr 192.168.1.5/32 on dev sit0
xcluster-cni-router.sh: Using sit tunnels
xcluster-cni-router.sh: First node-info read
jq: error (at <stdin>:1): Cannot iterate over null (null)
jq: error (at <stdin>:1): Cannot iterate over null (null)
jq: error (at <stdin>:1): Cannot iterate over null (null)
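The jq errors come from iterating over a field that is null. One possible way to tolerate older node objects is to fall back to the single-valued podCIDR field, as sketched below. Whether xcluster-cni handles it this way is not stated here, and the function name `pod_cidrs` is hypothetical.

```python
def pod_cidrs(node):
    """Return the pod CIDRs of a node object (parsed JSON) as a list.

    Prefers spec.podCIDRs and falls back to the older single-valued
    spec.podCIDR. Returns an empty list if neither is set.
    """
    spec = node.get("spec") or {}
    cidrs = spec.get("podCIDRs")
    if cidrs:
        return list(cidrs)
    single = spec.get("podCIDR")
    return [single] if single else []

old_node = {"spec": {"podCIDR": "11.0.5.0/24"}}
new_node = {"spec": {"podCIDR": "11.0.5.0/24",
                     "podCIDRs": ["11.0.5.0/24", "1100::500/120"]}}
print(pod_cidrs(old_node))
print(pod_cidrs(new_node))
```

The CIDR values are invented for the example.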

Improve install/upgrade and test it

Install/upgrade is only supported if xcluster-cni is used on the K8s network. Furthermore, the install happens even if the versions of the binaries have not changed.

Install/upgrade should be supported for secondary network installations and same-version updates should be avoided.

These things should also be tested.
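One way to avoid same-version updates is to compare checksums before copying. The sketch below is only an illustration of that idea; the function names are hypothetical and it compares file contents rather than any version number.

```python
import hashlib
import shutil
from pathlib import Path

def file_digest(path):
    """Return the SHA-256 hex digest of a file."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()

def install_if_changed(source, target):
    """Copy source to target unless the target already has the same content.

    Returns True if the file was installed and False if it was skipped.
    """
    source, target = Path(source), Path(target)
    if target.exists() and file_digest(source) == file_digest(target):
        return False
    # Copy to a temporary name first, then rename, so a half-written
    # binary is never visible under the final name.
    tmp = target.with_name(target.name + ".tmp")
    shutil.copy2(source, tmp)
    tmp.replace(target)
    return True
```

A call such as install_if_changed("/bin/kube-node", "/opt/cni/bin/kube-node") would then be a no-op on a restart with an unchanged image. Both paths are made up for the example.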

Test and document IPv4 address preservation

xcluster-cni relies on dividing a CIDR among nodes. While this is not a problem for IPv6, it can easily be a problem for IPv4 in a large cluster. It should be tested and (more importantly) documented how IPv4 can be limited to a few nodes.
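The size of the problem is easy to estimate. The sketch below counts how many per-node ranges fit in a cluster CIDR. The CIDRs and prefix lengths are example values, not xcluster-cni defaults.

```python
import ipaddress

def max_nodes(cluster_cidr, node_prefix):
    """Return how many per-node ranges of length node_prefix fit in cluster_cidr."""
    net = ipaddress.ip_network(cluster_cidr)
    if node_prefix < net.prefixlen or node_prefix > net.max_prefixlen:
        raise ValueError("node prefix does not fit in the cluster CIDR")
    return 2 ** (node_prefix - net.prefixlen)

print(max_nodes("10.0.0.0/16", 24))  # IPv4: a /24 per node
print(max_nodes("fd00::/48", 64))    # IPv6: a /64 per node
```

With these values IPv4 is exhausted after 256 nodes, while the IPv6 range has room for 65536 nodes.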
