

spuiuk commented on June 12, 2024

Notes:

  • A liveness probe indicates that the container is running; a readiness probe indicates that the container is ready to service requests. That is, the liveness probe reflects the state of the container, and the readiness probe reflects the state of the service running in that container. A container whose liveness probe succeeds while its readiness probe fails is up and running, but its service is not yet ready to handle requests.

  • By default, when no liveness probe is defined, Kubernetes considers the container alive as long as its main process (PID 1 in the container) is running. This is fine for cases where only one process runs in the container.

  • By default, when no readiness probe is defined, Kubernetes assumes that the container is ready to receive traffic as long as the container is running.

# Look up existing probes
$ kubectl edit sts smbshare3
..

..
# No probes defined for ctdb container
      - args:
        - run
        - ctdbd
        - --setup=smb_ctdb
        - --setup=ctdb_config
        - --setup=ctdb_etc
        - --setup=ctdb_nodes
        env:
        - name: SAMBA_CONTAINER_ID
          value: smbshare3
        - name: SAMBACC_CONFIG
          value: /etc/container-config/config.json
        - name: HOSTNAME
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: metadata.name
        - name: SAMBACC_CTDB
          value: ctdb-is-experimental
        image: quay.io/samba.org/samba-server:latest
        imagePullPolicy: Always
        name: ctdb
        resources: {}
..
# Both liveness and readiness probes defined for the smbd container
      - args:
        - run
        - smbd
        - --setup=users
        - --setup=smb_ctdb
        env:
        - name: SAMBA_CONTAINER_ID
          value: smbshare3
        - name: SAMBACC_CONFIG
          value: /etc/container-config/config.json
        image: quay.io/samba.org/samba-server:latest
        imagePullPolicy: Always
        livenessProbe:
          failureThreshold: 3
          periodSeconds: 10
          successThreshold: 1
          tcpSocket:
            port: 445
          timeoutSeconds: 1
        name: samba
        ports:
        - containerPort: 445
          name: smb
          protocol: TCP
        readinessProbe:
          failureThreshold: 3
          periodSeconds: 10
          successThreshold: 1
          tcpSocket:
            port: 445
          timeoutSeconds: 1

# Log in to the ctdb container in a clustered pod
[sprabhu@fedora bin]$ kubectl exec -it smbshare3-0 -c ctdb -- /bin/bash

# Process list as seen from the ctdb container.
[root@smbshare3-0 /]# ps -fax
    PID TTY      STAT   TIME COMMAND
    533 pts/0    Ss     0:00 /bin/bash
    635 pts/0    R+     0:00  \_ ps -fax
     89 ?        Ss     0:00 /usr/sbin/smbd --foreground --log-stdout --no-process-group
    105 ?        S      0:00  \_ /usr/sbin/smbd --foreground --log-stdout --no-process-group
    106 ?        S      0:00  \_ /usr/sbin/smbd --foreground --log-stdout --no-process-group
     83 ?        Ss     0:00 /usr/bin/python3 /usr/local/bin/samba-container ctdb-manage-nodes --hostname=smbshare3-0 --take-node-number-from-hostname=after-last-dash
     39 ?        SLs    0:03 /usr/sbin/ctdbd --interactive
     45 ?        S      0:00  \_ /usr/libexec/ctdb/ctdb-eventd -P 39 -S 9
     81 ?        S      0:00  \_ /usr/sbin/ctdbd --interactive
     94 ?        S      0:00      \_ /usr/libexec/ctdb/ctdb_mutex_fcntl_helper /var/lib/ctdb/shared/RECOVERY
      1 ?        Ss     0:00 /pause


from samba-operator.

spuiuk commented on June 12, 2024

Test 1

Set a readinessProbe for the ctdb container as follows:

        readinessProbe:
          failureThreshold: 3
          periodSeconds: 10
          successThreshold: 1
          timeoutSeconds: 1
          exec:
            command:
            - /bin/sh
            - -c
            - "ctdb nodestatus |grep 'OK (THIS NODE)'"
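
The probe command relies only on the exit status of grep: zero means ready, non-zero means not ready. A minimal sketch of that logic follows, using a hypothetical helper name (node_is_ok) and sample lines in the format printed by ctdb nodestatus in this test, so it can be tried without a running cluster:

```shell
#!/bin/sh
# node_is_ok is a hypothetical helper, not part of ctdb or the operator.
# It reads `ctdb nodestatus` output on stdin and succeeds only when the
# local node is reported as OK.
node_is_ok() {
    # -q suppresses output; only the exit status matters to the probe.
    grep -q 'OK (THIS NODE)'
}

# Sample lines in the format printed by `ctdb nodestatus`.
printf 'pnn:0 10.244.1.37      OK (THIS NODE)\n' | node_is_ok && echo "ready"
printf 'pnn:0 10.244.1.37      DISABLED (THIS NODE)\n' | node_is_ok || echo "not ready"
```

Using grep -q in the real probe command would behave the same way while keeping the matched line out of the probe output; whether that is preferable here is a judgment call.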

Exec into the ctdb container of a pod and disable the node with ctdb disable:

[sprabhu@fedora bin]$ kubectl exec -it smbshare3-0 -c ctdb -- /bin/bash
[root@smbshare3-0 /]# ctdb nodestatus
pnn:0 10.244.1.37      OK (THIS NODE)
[root@smbshare3-0 /]# ctdb disable
[root@smbshare3-0 /]# ctdb nodestatus
pnn:0 10.244.1.37      DISABLED (THIS NODE)

We see the following effect in the cluster:

[sprabhu@fedora tests]$ kubectl get pods -w 
NAME                               READY   STATUS    RESTARTS   AGE
samba-ad-server-86b7dd9856-m46sh   1/1     Running   0          43h
smbshare3-0                        3/3     Running   0          28m
smbshare3-1                        3/3     Running   0          31m
smbshare3-0                        2/3     Running   0          28m

smbshare3-0 goes from Ready 3/3 to 2/3.
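
How quickly this transition happens depends on the probe settings. As a rough sketch, assuming the kubelet runs the probe every periodSeconds and needs failureThreshold consecutive failures before marking the container unready (and ignoring timeoutSeconds and scheduling jitter), the delay is bounded by approximately:

```shell
#!/bin/sh
# Values taken from the readinessProbe used in this test.
failure_threshold=3
period_seconds=10

# failure_threshold consecutive failures, one probe every period_seconds,
# take up to roughly this many seconds to accumulate.
detection_window=$((failure_threshold * period_seconds))
echo "approximate time to mark unready: ${detection_window}s"
```

This is an estimate only; the observed delay can be shorter depending on when the first failing probe lands relative to the ctdb disable.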

[sprabhu@fedora tests]$ kubectl describe pod smbshare3-0
..
  Warning  Unhealthy  43s (x120 over 45m)  kubelet            Readiness probe failed:

At this point, the smbshare service should have stopped routing client requests to the pod. However, a failing readiness probe does not restart the container automatically; that requires a liveness probe to be set up instead.


spuiuk commented on June 12, 2024

From the ctdb man page: the status can be any of the following.

       OK
           This node is healthy and fully functional. It hosts public addresses to provide services.

       DISCONNECTED
           This node is not reachable by other nodes via the private network. It is not currently participating in the cluster. It does not host public
           addresses to provide services. It might be shut down.

       DISABLED
           This node has been administratively disabled. This node is partially functional and participates in the cluster. However, it does not host
           public addresses to provide services.

       UNHEALTHY
           A service provided by this node has failed a health check and should be investigated. This node is partially functional and participates in
           the cluster. However, it does not host public addresses to provide services. Unhealthy nodes should be investigated and may require an
           administrative action to rectify.

       BANNED
           CTDB is not behaving as designed on this node. For example, it may have failed too many recovery attempts. Such nodes are banned from
           participating in the cluster for a configurable time period before they attempt to rejoin the cluster. A banned node does not host public
           addresses to provide services. All banned nodes should be investigated and may require an administrative action to rectify.

       STOPPED
           This node has been administratively excluded from the cluster. A stopped node does not participate in the cluster and does not host public
           addresses to provide services. This state can be used while performing maintenance on a node.

       PARTIALLYONLINE
           A node that is partially online participates in a cluster like a healthy (OK) node. Some interfaces to serve public addresses are down, but at
           least one interface is up. See also ctdb ifaces.
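
For probe design, the list above can be reduced to one question: does a node in a given state host public addresses? The sketch below encodes one reading of the man page text. The helper name hosts_public_addresses is hypothetical, and treating PARTIALLYONLINE as serving is an assumption based on the statement that such a node participates "like a healthy (OK) node".

```shell
#!/bin/sh
# hosts_public_addresses is a hypothetical helper, not a ctdb command.
# Returns 0 if a node in the given state hosts public addresses,
# 1 if it does not, and 2 for a status not listed in the man page.
hosts_public_addresses() {
    case "$1" in
        OK|PARTIALLYONLINE) return 0 ;;   # PARTIALLYONLINE: assumption, see lead-in
        DISCONNECTED|DISABLED|UNHEALTHY|BANNED|STOPPED) return 1 ;;
        *) return 2 ;;
    esac
}

for status in OK DISABLED PARTIALLYONLINE; do
    if hosts_public_addresses "$status"; then
        echo "$status: hosts public addresses"
    else
        echo "$status: does not host public addresses"
    fi
done
```

Note that a check matching only 'OK (THIS NODE)' would report a PARTIALLYONLINE node as not ready; whether that is the desired behaviour is an open question.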

