k8s-humio - Ship Logs, Events and Metrics from Kubernetes to Humio

Contains components for shipping logs, events and metrics from kubernetes clusters to humio.

Overview

Here, fluentd is used to forward application- and host-level logs from each kubernetes node to a humio server. This extends the standard fluentd setup for log forwarding in kubernetes. Heapster is deployed to publish various aggregated metrics to humio. For clusters where the master nodes are not accessible (e.g. on GCP) we use eventer to expose events occurring in the kubernetes control plane.

Getting Started

  1. Pre-requisites:
    • Kubernetes cluster
    • A user authorized to administer the cluster via kubectl
    • Default service account with read privileges to API server for use by the kubernetes metadata filter plugin. This should be present by default in the kube-system namespace (even in kubernetes 1.6 with RBAC enabled)
  2. Set up your data space in humio and create an ingest token
  3. Base64 encode your token by running printf 'TOKEN' | base64 and update fluentd/k8s/fluentd-humio-ingest-token-secret.yaml with the value
  4. Create fluentd resources in kubernetes: kubectl apply -f fluentd/k8s/
  5. Create heapster to send metrics to humio: kubectl apply -f heapster/
  6. If master nodes are not schedulable in your cluster, also create eventer to expose control-plane events: kubectl apply -f eventer/
  7. Logs start appearing in humio!
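
Step 3 can be sketched in Python as follows. This is only an illustration, not part of the repository: the secret name and namespace are taken from this README, but the data key "token" is a placeholder, so check fluentd/k8s/fluentd-humio-ingest-token-secret.yaml for the key it actually uses. The sketch also shows why the step uses printf rather than echo: a trailing newline changes the encoded value.

```python
import base64
import json


def encode_token(token: str) -> str:
    """Base64-encode a token, matching `printf 'TOKEN' | base64`."""
    return base64.b64encode(token.encode("utf-8")).decode("ascii")


def build_secret(token: str) -> dict:
    """Build a Secret manifest (as a dict) holding the encoded token.

    The data key "token" is hypothetical; use the key defined in the
    manifest shipped in fluentd/k8s/.
    """
    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {
            "name": "fluentd-humio-ingest-token",
            "namespace": "kube-system",
        },
        "type": "Opaque",
        "data": {"token": encode_token(token)},
    }


if __name__ == "__main__":
    # A trailing newline (as added by `echo`) yields a different value.
    print(encode_token("TOKEN"))    # VE9LRU4=
    print(encode_token("TOKEN\n"))  # VE9LRU4K
    print(json.dumps(build_secret("TOKEN"), indent=2))
```

If the token was encoded with a trailing newline, humio will most likely reject it, so it is worth comparing the stored value against the output of encode_token before applying the secret.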

Node-level Forwarding

In fluentd/docker-image a docker image is defined which specifies how to forward to humio (other settings, such as log sources, are reused from the base image). Kubernetes manifests are defined in fluentd/k8s: a daemonset will deploy fluentd pods across every worker node inside the kube-system namespace, and each pod will read the humio ingest token from the fluentd-humio-ingest-token secret.

As per the normal setup, fluentd output is buffered and uses TLS for log confidentiality. It also appends kubernetes metadata such as pod name and namespace to each log entry, wrapping raw logs in a standard json structure.
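
To make the wrapping concrete, here is a rough Python sketch of what an enriched record might look like. The field names (log, kubernetes.pod_name, kubernetes.namespace_name, kubernetes.labels) are modelled on the output of the fluentd kubernetes metadata filter and should be treated as an assumption; the exact structure this image emits may differ, and the function itself is purely illustrative.

```python
import json


def wrap_log_line(raw_line: str, pod_name: str, namespace: str, labels: dict) -> str:
    """Wrap a raw log line in a json document carrying pod metadata.

    Field names are assumed, modelled on the kubernetes metadata filter.
    """
    record = {
        "log": raw_line,
        "kubernetes": {
            "pod_name": pod_name,
            "namespace_name": namespace,
            "labels": labels,
        },
    }
    return json.dumps(record, sort_keys=True)


if __name__ == "__main__":
    # Hypothetical pod name and labels, for illustration only.
    print(wrap_log_line("GET /healthz 200", "web-0", "default", {"log-type": "accesslog"}))
```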

Log types

For any application running as a pod in kubernetes, the value of the log-type label added to the pod will be used to determine the parser humio uses to parse log lines arriving from the pod. Each value must have a corresponding parser in humio. If the label is unspecified or doesn't correspond to a parser then pod logs will be left as unstructured text.
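
The selection rule described above can be sketched as a small Python function. This is a model of the behaviour for illustration, not code from humio or fluentd; the parser names in the example are hypothetical.

```python
from typing import Optional


def select_parser(pod_labels: dict, known_parsers: set) -> Optional[str]:
    """Return the parser name for a pod, or None for unstructured text.

    A parser is chosen only when the pod carries a log-type label whose
    value matches a parser that exists in humio.
    """
    log_type = pod_labels.get("log-type")
    if log_type in known_parsers:
        return log_type
    return None


if __name__ == "__main__":
    parsers = {"accesslog", "json"}  # hypothetical parsers defined in humio
    print(select_parser({"log-type": "accesslog"}, parsers))  # matching parser
    print(select_parser({"log-type": "custom"}, parsers))     # no such parser
    print(select_parser({"app": "web"}, parsers))             # label missing
```

In the last two cases the function returns None, which corresponds to the pod's logs being left as unstructured text.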

Fluentd Container Variables

We expose three environment variables so the daemonset configuration can be easily changed in different environments:

  • FLUENT_HUMIO_HOST: humio host
  • FLUENT_HUMIO_DATA_SPACE: used to parameterize the path to humio bulk elastic ingest API for your data space
  • FLUENT_HUMIO_INGEST_TOKEN: authorization to push logs into humio

If you need to make further customizations, you will need to mount in an altered version of the fluentd config files /fluentd/etc/fluent.conf and /fluentd/etc/kubernetes.conf, e.g. using ConfigMaps.
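
When changing the daemonset for a new environment, a quick check that all three variables are set can save a debugging round trip. The following Python sketch is one way to do that; the helper name missing_vars and the example host value are hypothetical and not part of this repository.

```python
import os

# The three variables exposed by the fluentd container (see the list above).
REQUIRED_VARS = (
    "FLUENT_HUMIO_HOST",
    "FLUENT_HUMIO_DATA_SPACE",
    "FLUENT_HUMIO_INGEST_TOKEN",
)


def missing_vars(environ=None) -> list:
    """Return the names of required variables that are unset or empty."""
    env = os.environ if environ is None else environ
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_vars()
    if missing:
        raise SystemExit("missing environment variables: " + ", ".join(missing))
    print("all fluentd humio variables are set")
```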

Namespacing and Service Accounts Usage

As noted above, the 'default' service account is used by the fluentd metadata plugin to look up pod/namespace information. This is not particularly in line with the developing RBAC model for service accounts in kubernetes, but causes few problems in the kube-system namespace where services are assumed to be somewhat root-like. Since the 'default' service account is available to all pods in a namespace, think carefully before assigning permissions to this account in order to get fluentd to work outside the kube-system namespace.

Metrics

In addition to ingesting logs and events it can also be helpful to ingest metrics into humio. The standard component for metrics collection is heapster, so that is what we use here to easily get hold of metrics aggregated for hosts, namespaces, pods, containers, and the cluster. As with eventer, heapster is able to use stdout as a sink; however, the existing multi-line formatting is not readily parseable. To solve this, we use a forked version which can output metrics data in a predictable json structure. In this structure:

  • A single log entry/json document is created for each MetricSet. MetricSets are defined for logical components of each aggregation object - e.g. services on a host. This division is important to bound the maximum size of log entries.
  • Key-value metrics info can be found under the 'Metrics' and 'LabeledMetrics' keys. In the case of LabeledMetrics the value is given as a list, to allow for further disambiguation of metrics according to the resource_id label
  • All information defined in the storage-schema docs is passed on
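
As an illustration of consuming this structure, the Python sketch below flattens one MetricSet document into rows. Only the top-level 'Metrics' and 'LabeledMetrics' keys and the resource_id label come from the description above; the inner field names (Name, Labels, Value) and the sample document are assumptions made for the example, so compare them against real output from the forked heapster before relying on them.

```python
import json

# Hypothetical sample document; inner field names are assumed.
SAMPLE = """
{
  "Metrics": {"cpu/usage_rate": 120, "memory/usage": 52428800},
  "LabeledMetrics": [
    {"Name": "filesystem/usage",
     "Labels": {"resource_id": "/dev/sda1"},
     "Value": 1048576}
  ]
}
"""


def flatten_metric_set(document: str) -> list:
    """Turn one MetricSet json document into (name, resource_id, value) rows.

    Plain metrics have no resource_id, so that column is None for them.
    """
    data = json.loads(document)
    rows = []
    for name, value in data.get("Metrics", {}).items():
        rows.append((name, None, value))
    for entry in data.get("LabeledMetrics", []):
        resource_id = entry.get("Labels", {}).get("resource_id")
        rows.append((entry["Name"], resource_id, entry["Value"]))
    return rows


if __name__ == "__main__":
    for row in flatten_metric_set(SAMPLE):
        print(row)
```

Keeping the resource_id as its own column is what lets several entries with the same metric name (for example one per filesystem) stay distinguishable.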

Control-plane Events

For clusters where fluentd cannot run on master nodes, the eventer component of heapster is used to retrieve cluster events from the API server. We forward events to fluentd by simply printing them to stdout, providing a consistent interface for logs coming out of kubernetes. Eventer runs as a deployment with a single instance handling all cluster events, regardless of cluster size. As with heapster, it makes use of the addon-resizer component to update requested resources as load on the eventer changes, causing the eventer pod to be redeployed as cluster activity grows past certain thresholds.

Contributors

benjvi