Code Monkey home page Code Monkey logo

thanos-rule-syncer's Introduction

Observatorium

Build Status Slack

Configuration for Multi-Tenant, Flexible, Scalable, Observability Backend

Observatorium allows you to run and operate effectively a multi-tenant, easy to operate, scalable open source observability system on Kubernetes. This system will allow you to ingest, store and use common observability signals like metrics, logging and tracing. Observatorium is a "meta project" that allows you to manage, integrate and combine multiple well-established existing projects like Thanos, Loki, Tempo/Jaeger, Open Policy Agent etc under a single consistent system with well-defined tenancy APIs and signal correlation capabilities.

As active maintainers and contributors to the underlying projects, we created a reference configuration, with extra software that connects those open source solutions into one unified and easy to use service. It adds missing gaps between those projects like consistency, multi-tenancy, security and resiliency pieces that are needed for a robust backend.

Read more on High Level Architecture docs.

Context

As the Red Hat Monitoring Team, we were focusing on the Observability software and concepts since the CoreOS acquisition. From the beginning, one of our main goals was to establish a stable in-cluster metric collection, querying, and alerting for OpenShift clusters. With the growth of managed OpenShift (OSD) clusters, the scope of the team goal has extended: we had to develop a scalable, global, metric stack that can be run in local as well as a central location for monitoring and telemetry purposes. We also worked together with Red Hat Logging and Tracing teams to implement something similar for logging and tracing. We’re also working on Continuous Profiling aspects.

From the very beginning our teams were leveraging Open Source to accomplish all those goals. We believe that working with the communities is the best way to have long term, successful systems, share knowledge and establish solid APIs. You might have not seen us, but members of our teams have been actively maintaining and contributing to major Open Source standards and projects like Prometheus, Thanos, Loki, Grafana, kube-state-metrics (KSM), prometheus-operator, kube-prometheus, Alertmanager, cluster-monitoring-operator (CMO), OpenMetrics, Jaeger, ConProf, Cortex, SIG CNCF Observability, SIG K8s Instrumentation and more.

What's Included

  • Observatorium is primarily defined in Jsonnet, which allows great flexibility and reusability. The main configuration resources are stored in components directory, and they import further official resources like kube-thanos. Some Examples:

  • We are aware that not everybody speaks Jsonnet, and not everybody have it's own GitOps pipeline, so we designed alternative deployments based on the main Jsonnet resources. Operator project delivers Kubernetes plain Operator that operates Observatorium.

NOTE: Observatorium is set of cloud native, mostly stateless components that mostly do not have special operating logic. For those operations that required automation, specialized controllers were designed. Use Operator only if this is your primary installation logic or if you don't have CI pipeline.

NOTE2: Operator is in heavy progress. There are already plans to streamline its usage and redesign current CustomResourceDefinition in next version. Yet, it's currently used in production by many bigger users, so any changes will be done with care.

  • The Thanos Receive Controller is a Kubernetes controller written in Go that distributes essential tenancy configuration to the desired pods.

  • The API is the facet of Observatorium service. It's a lightweight proxy written in Go that helps with multi-tenancy, tenancy (isolation, cross tenancy requests, rate-limiting, roles, tracing). This proxy should be used for all external traffic with Observatorium.

  • OPA-AMS is our Go library for integrating Open Policy Agent with Red Hat authorization service for smooth OpenShift experience.

  • up is a useful Go service that periodically queries Observatorium and outputs vital metrics on the Observatorium read path healthiness and performance over time.

  • token-refresher is a simple Go CLI allowing to perform OIDC refresh flow.

Getting Started

Status: Work In Progress

While metric and logging part using Thanos and Loki is used in production at Red Hat,documentation, full design, user guides, different configurations support are in progress.

Stay Tuned!

Missing something or not sure?

Let us know! Visit our Slack channel or put a GitHub issue!

thanos-rule-syncer's People

Contributors

bill3tt avatar douglascamata avatar jessicalins avatar metalmatze avatar onprem avatar saswatamcode avatar squat avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar

thanos-rule-syncer's Issues

Add support for multiple tenants

Problem

Thanos Rule Syncer currently supports fetching rules for a single tenant from Observatorium API. This means that if you are running Obs API with multiple tenants (almost always), you'd need to run one Thanos Rule Syncer sidecar for each tenant.

Proposed Solution

We can add support for specifying multiple tenants and their credentials as input to Rule Syncer (maybe as a yaml file) and then we can fetch rules for all of them, combine in a single file, and then follow the same pattern we followed for a single tenant.

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.