
zenit's Introduction

Zenit

Zenit is a daemon that collects metrics from your database and proxy services in small environments. It may save you from running many other agents for this purpose, and you will find it an excellent tool for database administration.

Why use it? This tool is made by DBAs for DBAs. Other tools collect basic information and need many services and complex configs, while this one collects low-level information that others do not, all in one and easy to use.

Description:

This agent collects all the basic hardware metrics plus detailed metrics from MySQL, MongoDB, and ProxySQL services. Metrics are sent only to InfluxDB, where you can analyze and monitor them with Grafana.

Advantages

  • One agent for all, easy to install and configure, low memory consumption and high performance.
  • Auto discover database servers on Amazon Web Services.

Warnings

  • Enabling the Audit and Slow logs compromises disk write performance and other resources; use a separate disk for the logs and make sure you have the resources needed to support this process.

Risks

Zenit is not mature, but all database tools can pose a risk to the system and the database server. Before using this tool, please:

  • Read the tool's documentation.
  • Review the tool's known "BUGS".
  • Test the tool on a non-production server.

As with most tools, if you do these things you should not be surprised.

Install agent

For the moment, this tool only runs on 64-bit (amd64/aarch64) Linux distributions. Paste this at a terminal prompt:

bash < <(curl -s https://debeando.com/zenit.sh)

For more details, please visit the wiki.

How to use it:

See usage with:

zenit --help

zenit's People

Contributors

michaelcoburn, mmoreram, nstrappazzonc


zenit's Issues

Escaping special symbols in HTML format in AuditLog

The values stored in the sqltext field of the audit log contain queries with these special symbols:

  • &gt; stands for the greater-than sign (>)
  • &lt; stands for the less-than sign (<)
  • &#10; stands for a newline
  • &#9; stands for a tab

For example:

select text_value, value from constants&#10; where table_name = 'xxx' and column_name ='status'
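
A minimal sketch of decoding these entities before the query is stored, using Go's standard html package; the input string mirrors the example above:

package main

import (
	"fmt"
	"html"
)

func main() {
	raw := "select text_value, value from constants&#10; where table_name = 'xxx' and column_name ='status'"
	// html.UnescapeString decodes &gt;, &lt;, &#10;, &#9; and the other entities.
	fmt.Println(html.UnescapeString(raw))
}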

MariaDB slow log is not parsed

This bug was reported by: @cyrill3

The slow log format in MariaDB is slightly different and the parser ignores it:

Example:

# administrator command: Ping;
# User@Host: clickuser[clickuser] @ localhost [127.0.0.1]
# Thread_id: 4371  Schema:   QC_hit: No
# Query_time: 0.002916  Lock_time: 0.000020  Rows_sent: 596  Rows_examined: 596
# Rows_affected: 0
SET timestamp=1535364205;
SHOW GLOBAL VARIABLES;
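
A hedged sketch of a pattern that could match the MariaDB-specific header line; exactly which lines the current parser trips on is an assumption based on the example above:

package main

import (
	"fmt"
	"regexp"
)

// MariaDB adds a "# Thread_id: ... Schema: ... QC_hit: ..." header line
// (and a separate "# Rows_affected: ..." line) that a MySQL-only parser skips.
var reMariaDB = regexp.MustCompile(`^# Thread_id: (\d+)\s+Schema: (\S*)\s+QC_hit: (\S+)`)

func main() {
	line := "# Thread_id: 4371  Schema:   QC_hit: No"
	if m := reMariaDB.FindStringSubmatch(line); m != nil {
		fmt.Printf("thread_id=%s schema=%q qc_hit=%s\n", m[1], m[2], m[3])
	}
}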

Allow multiple Servers

With this feature, one daemon can access multiple servers, for example RDS / Aurora.

Functionalities for multiple servers:

  • MySQL
  • ProxySQL

Parsing log files is impossible in this mode; RDS and similar services require a different method to process them.

Collect CPU metric on Aurora MySQL 8.0

For some reason, the master/primary server does not show the CPU value:

mysql> select server_id, cpu from ro_replica_status;
+------------------------------+---------+
| server_id                    | cpu     |
+------------------------------+---------+
| zzz-prd-mysql-xxxxxxx-node01 |       0 |
| zzz-prd-mysql-xxxxxxx-node02 | 26.4988 |
| zzz-prd-mysql-xxxxxxx-node03 |    8.75 |
+------------------------------+---------+
3 rows in set (0.00 sec)

Prometheus exporter unexpected end of input stream

time="2018-09-19T10:46:19Z" level=error msg="Error parsing /usr/local/prometheus/textfile_collector/zenit.prom: text format parsing error in line 1131: unexpected end of input stream" source="textfile.go:99"

Don't send original query, only the query digest

Add an option to the config file to not send the original query, only the query digest, for security reasons.

Example:

[mysql-slowlog]
log_path       = /var/lib/mysql/slow.log
buffer_size    = 100
buffer_timeout = 60
original_query = on

Child processes are not closed

When the application is closing, it does not close or kill its sub-processes; in this case, for example, the tail tool remains running:

root@d1c86f2f36ff:~# ./zenit -stop
root@d1c86f2f36ff:~# ps aux
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root         1  0.0  0.1  20336  2692 pts/0    Ss+  Aug08   0:00 /bin/bash
root     13272  0.0  0.1  20360  3364 pts/2    Ss+  02:43   0:00 /bin/bash
root     13307  0.0  0.1  20364  3356 pts/3    Ss+  02:47   0:01 /bin/bash
root     17918  0.0  0.1  20364  3344 pts/4    Ss   17:26   0:00 /bin/bash
root     18370  0.0  0.0   4268   700 pts/4    S    21:16   0:00 /usr/bin/tail -n 0 -f /var/lib/mysql/audit.log
root     18383  0.0  0.0  17500  2024 pts/4    R+   21:16   0:00 ps aux
root@d1c86f2f36ff:~# kill $(pgrep tail)
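
A sketch of one way for the daemon to own its children: start tail in its own process group and kill the whole group on shutdown. The command mirrors the example above; the shutdown hook itself is an assumption.

package main

import (
	"os/exec"
	"syscall"
)

func main() {
	cmd := exec.Command("tail", "-n", "0", "-f", "/var/lib/mysql/audit.log")
	// Put the child in its own process group so it can be killed as a unit.
	cmd.SysProcAttr = &syscall.SysProcAttr{Setpgid: true}
	if err := cmd.Start(); err != nil {
		panic(err)
	}

	// On shutdown, signal the whole group (a negative pid targets the group).
	_ = syscall.Kill(-cmd.Process.Pid, syscall.SIGTERM)
	_ = cmd.Wait()
}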

Exception when executing on ProxySQL

Message:

panic: interface conversion: interface {} is uint64, not string

goroutine 8 [running]:
github.com/swapbyt3s/zenit/plugins/alerts/proxysql/status.(*ProxyPoolStatus).Collect(0xa440b0)
        /Users/nicola/go/src/github.com/swapbyt3s/zenit/plugins/alerts/proxysql/status/status.go:37 +0x83d
github.com/swapbyt3s/zenit/plugins.Load(0xc420016b20)
        /Users/nicola/go/src/github.com/swapbyt3s/zenit/plugins/plugins.go:65 +0x196
created by main.(*program).run
        /Users/nicola/go/src/github.com/swapbyt3s/zenit/main.go:42 +0x6f
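
The panic is a hard type assertion on a value that arrives as uint64. A sketch of a defensive conversion; the helper is illustrative, not zenit's actual code:

package main

import (
	"fmt"
	"strconv"
)

// toString converts a scanned interface{} value without panicking,
// covering the types a driver may actually return.
func toString(v interface{}) string {
	switch x := v.(type) {
	case string:
		return x
	case []byte:
		return string(x)
	case uint64:
		return strconv.FormatUint(x, 10)
	case int64:
		return strconv.FormatInt(x, 10)
	default:
		return fmt.Sprintf("%v", x)
	}
}

func main() {
	fmt.Println(toString(uint64(42))) // "42" instead of a panic
}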

Retry to open .log files

To avoid restarting the daemon whenever a *.log file to be parsed appears after the service starts, or when logging is enabled in MySQL later, it may be a good idea to retry opening these files when they do not yet exist.
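
A minimal sketch of such a retry loop, assuming a waitForFile helper that runs before the parser starts:

package main

import (
	"log"
	"os"
	"time"
)

// waitForFile blocks until the log file exists, so the daemon does not
// need a restart when logging is enabled later.
func waitForFile(path string, interval time.Duration) {
	for {
		if _, err := os.Stat(path); err == nil {
			return
		}
		log.Printf("W! - Tail - File not exist, retrying: %s", path)
		time.Sleep(interval)
	}
}

func main() {
	waitForFile("/var/lib/mysql/slow.log", 10*time.Second)
}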

Bug - panic: close of closed channel

# zenit -quiet
panic: close of closed channel

goroutine 19 [running]:
github.com/swapbyt3s/zenit/plugins/inputs/mysql/audit.Parser.func1(0xc42005a5a0, 0xc42005a540, 0xc4200d4840)
        /Users/swapbyt3s/go/src/github.com/swapbyt3s/zenit/plugins/inputs/mysql/audit/audit.go:76 +0x1025
created by github.com/swapbyt3s/zenit/plugins/inputs/mysql/audit.Parser
        /Users/swapbyt3s/go/src/github.com/swapbyt3s/zenit/plugins/inputs/mysql/audit/audit.go:19 +0x63
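
A sketch of the usual guard against double-closing a channel, using sync.Once; the type and field names are illustrative:

package main

import "sync"

// eventStream wraps the channel so that Close can be called from
// several goroutines; sync.Once guarantees close() runs exactly once.
type eventStream struct {
	ch   chan string
	once sync.Once
}

func (s *eventStream) Close() {
	s.once.Do(func() { close(s.ch) })
}

func main() {
	s := &eventStream{ch: make(chan string)}
	s.Close()
	s.Close() // safe: no "close of closed channel" panic
}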

InfluxDB output fails with "socket: too many open files"

2019/11/08 08:38:44 E! - Plugin - OutputIndluxDB:Ping - Error: Get http://x.y.z.x:8086/ping: dial tcp 172.30.1.253:8086: socket: too many open files
thn-prd-proxysql /home/ec2-user# cat /proc/$(pgrep -x zenit)/limits
Limit                     Soft Limit           Hard Limit           Units
Max cpu time              unlimited            unlimited            seconds
Max file size             unlimited            unlimited            bytes
Max data size             unlimited            unlimited            bytes
Max stack size            8388608              unlimited            bytes
Max core file size        0                    unlimited            bytes
Max resident set          unlimited            unlimited            bytes
Max processes             14679                14679                processes
Max open files            1024                 4096                 files
Max locked memory         65536                65536                bytes
Max address space         unlimited            unlimited            bytes
Max file locks            unlimited            unlimited            locks
Max pending signals       14679                14679                signals
Max msgqueue size         819200               819200               bytes
Max nice priority         0                    0
Max realtime priority     0                    0
Max realtime timeout      unlimited            unlimited            us
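
The soft limit of 1024 open files suggests descriptors are being leaked, for example by opening a new connection per ping without closing response bodies. A hedged sketch of the usual pattern, assuming the output plugin talks to InfluxDB over plain HTTP:

package main

import (
	"io"
	"net/http"
	"time"
)

// A single shared client reuses connections instead of opening a new
// socket (and file descriptor) for every request.
var influxClient = &http.Client{Timeout: 5 * time.Second}

func ping(url string) error {
	resp, err := influxClient.Get(url)
	if err != nil {
		return err
	}
	// Drain and close the body so the connection can be reused.
	defer resp.Body.Close()
	_, _ = io.Copy(io.Discard, resp.Body)
	return nil
}

func main() {
	_ = ping("http://127.0.0.1:8086/ping")
}

Raising the open-files limit (for example LimitNOFILE in a systemd unit) would also mask the symptom, but reusing connections addresses the leak itself.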

Add logrotate for zenit.log and zenit.err

Debug logging generates a large amount of logs and fills the volume very fast. To prevent this, it may be a good option for the install script to add rules for logrotate.
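
A sketch of a rule the install script could drop into /etc/logrotate.d/zenit; the log paths and retention are assumptions:

/var/log/zenit.log /var/log/zenit.err {
    daily
    rotate 7
    compress
    missingok
    notifempty
    copytruncate
}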

MongoDB Plugin

List of queries used to collect metrics on MongoDB:

rs.status()
db.serverStatus()
db.runCommand({ dbStats: 1 })

use <db_name>
db.runCommand( { collStats : "<collection_name>", scale: 1024 } )
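
A sketch of issuing these commands from Go with the official mongo-driver; the URI and the field printed are illustrative:

package main

import (
	"context"
	"fmt"

	"go.mongodb.org/mongo-driver/bson"
	"go.mongodb.org/mongo-driver/mongo"
	"go.mongodb.org/mongo-driver/mongo/options"
)

func main() {
	ctx := context.Background()
	client, err := mongo.Connect(ctx, options.Client().ApplyURI("mongodb://127.0.0.1:27017"))
	if err != nil {
		panic(err)
	}
	defer client.Disconnect(ctx)

	// Equivalent of db.serverStatus() on the admin database.
	var status bson.M
	if err := client.Database("admin").RunCommand(ctx, bson.D{{Key: "serverStatus", Value: 1}}).Decode(&status); err != nil {
		panic(err)
	}
	fmt.Println(status["uptime"])
}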

Errors and exceptions in zenit when parsing the slow log

Case 1:

  • Getting a value from a line with a single key and value in the slow log:
    the value for bytes_sent is not captured.

Case 2: Exception

  • Slice bounds out of range when parsing slow log.
panic: runtime error: slice bounds out of range

goroutine 53 [running]:
github.com/swapbyt3s/zenit/common/sql/parser/slow.Event(0xc4201a6840, 0xc4201a68a0)
        /Users/nicola/go/src/github.com/swapbyt3s/zenit/common/sql/parser/slow/slow.go:19
created by github.com/swapbyt3s/zenit/plugins/inputs/mysql/slow.Parser
        /Users/nicola/go/src/github.com/swapbyt3s/zenit/plugins/inputs/mysql/slow/slow.go
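
A hedged sketch of the defensive parsing that avoids the slice panic when a header line is shorter than expected; the helper is illustrative, not zenit's actual parser:

package main

import (
	"fmt"
	"strings"
)

// valueAfter returns the first token after "key" on a slow log line,
// or "" when the key is absent, instead of slicing blindly.
func valueAfter(line, key string) string {
	i := strings.Index(line, key)
	if i < 0 {
		return ""
	}
	rest := strings.TrimSpace(line[i+len(key):])
	if j := strings.IndexByte(rest, ' '); j >= 0 {
		return rest[:j]
	}
	return rest
}

func main() {
	fmt.Println(valueAfter("# Query_time: 0.002916  Lock_time: 0.000020", "Query_time:")) // 0.002916
	fmt.Println(valueAfter("# Bytes_sent:", "Bytes_sent:"))                               // empty, no panic
}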

Prometheus exports bad units

Error gathering metrics: 7 error(s) occurred:

  • collected metric zenit_os
    label:<name:"device" value:"/dev/xvda1" >
    label:<name:"name" value:"disk" >
    untyped:<value:7.48 >
    has label dimensions inconsistent with previously collected metrics in the same metric family

  • collected metric zenit_os
    label:<name:"device" value:"lo" >
    label:<name:"name" value:"net" >
    label:<name:"type" value:"receive" >
    untyped:<value:2.62588484339e+12 >
    has label dimensions inconsistent with previously collected metrics in the same metric family

  • collected metric zenit_os
    label:<name:"device" value:"lo" >
    label:<name:"name" value:"net" >
    label:<name:"type" value:"transmit" >
    untyped:<value:2.62588484339e+12 >
    has label dimensions inconsistent with previously collected metrics in the same metric family

  • collected metric zenit_os
    label:<name:"device" value:"eth0" >
    label:<name:"name" value:"net" >
    label:<name:"type" value:"receive" >
    untyped:<value:5.31028131819e+12 >
    has label dimensions inconsistent with previously collected metrics in the same metric family

  • collected metric zenit_os
    label:<name:"device" value:"eth0" >
    label:<name:"name" value:"net" >
    label:<name:"type" value:"transmit" >
    untyped:<value:5.345703981086e+12 >
    has label dimensions inconsistent with previously collected metrics in the same metric family

  • collected metric zenit_os
    label:<name:"name" value:"sysctl" >
    label:<name:"type" value:"nr_open" >
    untyped:<value:1.048576e+06 >
    has label dimensions inconsistent with previously collected metrics in the same metric family

  • collected metric zenit_os
    label:<name:"name" value:"sysctl" >
    label:<name:"type" value:"file_max" >
    untyped:<value:382744 >
    has label dimensions inconsistent with previously collected metrics in the same metric family

Adapt all metrics to this standard: https://prometheus.io/docs/practices/naming/
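
In the Prometheus data model, every sample in a metric family must carry the same label set. A sketch of one way to comply, splitting zenit_os into one family per subsystem; the family and label names here are assumptions:

package main

import (
	"github.com/prometheus/client_golang/prometheus"
)

// One family per subsystem keeps label dimensions consistent.
var netBytes = prometheus.NewGaugeVec(
	prometheus.GaugeOpts{
		Name: "zenit_os_net_bytes_total",
		Help: "Network traffic per device and direction.",
	},
	[]string{"device", "type"},
)

func main() {
	prometheus.MustRegister(netBytes)
	netBytes.WithLabelValues("eth0", "receive").Set(5.31028131819e+12)
	netBytes.WithLabelValues("eth0", "transmit").Set(5.345703981086e+12)
}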

Implement alert condition

This is a proposal.

For example:

mysql:
  dsn: root@tcp(127.0.0.1:3306)/
  overflow: true
  slave: true
  status: true
  tables: true
  variables: true
  slowlog:
    enable: true
    log_path: /var/lib/mysql/slow.log
    buffer_size: 100
    buffer_timeout: 60
  auditlog:
    enable: true
    format: xml-old
    log_path: /var/lib/mysql/audit.log
    buffer_size: 100
    buffer_timeout: 60
  alerts:
    readonly:
      enable: true
    lag:
      enable: false
      warning: 10
      critical: 60
      duration: 30
    connections:
      enable: true
      warning: 70
      critical: 90
      duration: 30
os:
  cpu: true
  disk:
    enable: true
    alerts:
    - volume: /var/lib/mysql/
      enable: true
      warning: 75
      critical: 95
      duration: 30
  • Implemented YAML in #44.
  • Duration is the number of seconds between first encountering the check condition and raising the alert.
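
A minimal sketch of how the warning/critical/duration semantics of this proposal could be evaluated; all names are illustrative:

package main

import (
	"fmt"
	"time"
)

type Alert struct {
	Warning, Critical float64
	Duration          time.Duration // how long the condition must persist
	firstSeen         time.Time
}

// Check returns "warning" or "critical" only after the threshold has
// been exceeded continuously for the configured duration.
func (a *Alert) Check(value float64, now time.Time) string {
	if value < a.Warning {
		a.firstSeen = time.Time{}
		return "ok"
	}
	if a.firstSeen.IsZero() {
		a.firstSeen = now
	}
	if now.Sub(a.firstSeen) < a.Duration {
		return "pending"
	}
	if value >= a.Critical {
		return "critical"
	}
	return "warning"
}

func main() {
	a := &Alert{Warning: 70, Critical: 90, Duration: 30 * time.Second}
	now := time.Now()
	fmt.Println(a.Check(75, now))                     // pending
	fmt.Println(a.Check(95, now.Add(31*time.Second))) // critical
}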

Collect all indexes from all tables

When indexes are applied starting from a slave, or the indexes are inconsistent across the slaves for another reason, this feature lets you find the differences and fix them.

The idea is to collect all indexes from all tables so that another tool can detect the differences between servers; maybe in the future this will also be possible for columns.

+-------+------------+--------------------+--------------+----------------+...+-------------+...+
| Table | Non_unique | Key_name           | Seq_in_index | Column_name    |...| Cardinality |...|
+-------+------------+--------------------+--------------+----------------+...+-------------+...+
| demo  |          0 | PRIMARY            |            1 | id             |...|      368135 |...|
| demo  |          1 | idx_audit_at       |            1 | created_at     |...|          17 |...|
| demo  |          1 | idx_audit_at       |            2 | deleted_at     |...|       12694 |...|
| demo  |          1 | idx_publication_id |            1 | publication_id |...|       61355 |...|
+-------+------------+--------------------+--------------+----------------+...+-------------+...+

Output example:

mysql_indexes{schema="test", table="demo", index="idx_audit_at", column="deleted_at"} = 12694
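
A sketch of a query that could gather this data so another tool can diff it between servers:

SELECT table_schema, table_name, index_name, seq_in_index,
       column_name, cardinality
  FROM information_schema.statistics
 WHERE table_schema NOT IN ('mysql', 'sys', 'performance_schema', 'information_schema')
 ORDER BY table_schema, table_name, index_name, seq_in_index;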

Exits when the tailed file does not exist

It should not exit when the file does not exist:

2018/08/20 15:57:50 I! - ClickHouse - DSN: http://127.0.0.1:8123/?database=zenit
2018/08/20 15:57:50 E! - Tail - File not exist: /mnt/dbstorage/mysql-data/audit.log

Standardized fingerprint of a query

Reported by: Joffrey MICHAIE <@db_cat_twitter>

Related issues: #42

The tools pt-query-digest, percona/go-mysql, and ProxySQL calculate different fingerprints for the same query; maybe it is possible to use a standard. The idea of this issue is to open a thread to discuss it.

Percona rules:

  • Shorten multi-value INSERT statements to a single VALUES() list.
  • Strip comments.
  • Replace all literals, such as quoted strings. For efficiency, the code that replaces literal numbers is somewhat non-selective, and might replace some things as numbers when they really are not. Hexadecimal literals are also replaced. NULL is treated as a literal. Numbers embedded in identifiers are also replaced, so tables named similarly will be fingerprinted to the same values (e.g. users_2009 and users_2010 will fingerprint identically).
  • Collapse all whitespace into a single space.
  • Lowercase the entire query.
  • Replace all literals inside of IN() and VALUES() lists with a single placeholder, regardless of cardinality.
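
A hedged sketch implementing a subset of these rules (literal replacement, whitespace collapse, lowercasing); collapsing IN() lists is covered by the related issue below:

package main

import (
	"fmt"
	"regexp"
	"strings"
)

var (
	reString = regexp.MustCompile(`'[^']*'|"[^"]*"`)
	reNumber = regexp.MustCompile(`\b\d+\b`)
	reSpace  = regexp.MustCompile(`\s+`)
)

// fingerprint applies a subset of the Percona rules above.
func fingerprint(q string) string {
	q = reString.ReplaceAllString(q, "?")
	q = reNumber.ReplaceAllString(q, "?")
	q = reSpace.ReplaceAllString(q, " ")
	return strings.ToLower(strings.TrimSpace(q))
}

func main() {
	fmt.Println(fingerprint(`SELECT name FROM user WHERE id = 42 AND status = 'active'`))
	// select name from user where id = ? and status = ?
}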

Refactoring the metric structure for easier usage

Example:

package main

import (
	"fmt"
)

func main() {
	// Tags identify the series (dimensions).
	tags := make(map[string]string)
	tags["schema"] = "identity"
	tags["table"] = "user"

	// Fields hold the measured values.
	fields := make(map[string]interface{})
	fields["percentage"] = 10

	// An item groups the tags and fields of one measurement.
	item := make(map[string]interface{})
	item["tags"] = tags
	item["fields"] = fields

	// Measurement name -> items.
	metric := make(map[string][]map[string]interface{})
	metric["mysql_overflow"] = append(metric["mysql_overflow"], item)

	// Host -> measurements.
	metrics := make(map[string][]map[string][]map[string]interface{})
	metrics["127.0.0.1"] = append(metrics["127.0.0.1"], metric)

	fmt.Printf("%#v\n", metrics)
}

Output

map[string][]map[string][]map[string]interface {}{
  "127.0.0.1":[]map[string][]map[string]interface {}{
    map[string][]map[string]interface {}{
      "mysql_overflow":[]map[string]interface {}{
        map[string]interface {}{
          "fields":map[string]interface {}{"percentage":10},
          "tags":map[string]string{
            "schema":"identity",
            "table":"user"}}}}}}

Remove tail command and implement native tail

Calling the tail command is unstable: when the application is closed or killed, the tail process stays in the background and is not killed with its parent. For this reason, it may be a good idea to replace it with a native Go tail.
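
A sketch using the hpcloud/tail library as one possible native replacement, assuming that dependency is acceptable:

package main

import (
	"fmt"

	"github.com/hpcloud/tail"
)

func main() {
	// Follow the file like `tail -f`, but in-process: no child
	// process is left behind when zenit exits.
	t, err := tail.TailFile("/var/lib/mysql/audit.log", tail.Config{Follow: true, ReOpen: true})
	if err != nil {
		panic(err)
	}
	for line := range t.Lines {
		fmt.Println(line.Text)
	}
}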

Auto discover MySQL RDS on AWS

---
general:
  aws_access_key_id: xxx
  aws_secret_access_key: xxx

inputs:
  rds:
    enable: true
    username: monitor
    password: monitor
    hostname: [dns|instance_name]
    plugins:
      aurora: false
      overflow: true
      slave: true
      status: true
      tables: true
      variables: true
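
A sketch of the discovery step with aws-sdk-go; the region and the lack of filtering are assumptions:

package main

import (
	"fmt"

	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/aws/session"
	"github.com/aws/aws-sdk-go/service/rds"
)

func main() {
	sess := session.Must(session.NewSession(&aws.Config{Region: aws.String("us-east-1")}))
	svc := rds.New(sess)

	// Each endpoint the credentials can see becomes a candidate MySQL input.
	out, err := svc.DescribeDBInstances(&rds.DescribeDBInstancesInput{})
	if err != nil {
		panic(err)
	}
	for _, db := range out.DBInstances {
		if db.Endpoint == nil { // instance still being created
			continue
		}
		fmt.Printf("%s -> %s:%d\n",
			aws.StringValue(db.DBInstanceIdentifier),
			aws.StringValue(db.Endpoint.Address),
			aws.Int64Value(db.Endpoint.Port))
	}
}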

Replace all values inside of IN() lists

To produce the same fingerprint as pt-query-digest, replace all literals inside IN() lists with a single placeholder, regardless of cardinality.

Related issues: #37

For example:

SELECT name FROM user WHERE id IN (?, ?, ?, ?, ?, ?);

To:

SELECT name FROM user WHERE id IN (?);
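
A sketch of the collapse as a post-processing step, applied after literals have already been replaced with ?:

package main

import (
	"fmt"
	"regexp"
)

// Collapse "IN (?, ?, ?)" into "IN (?)" regardless of cardinality,
// matching pt-query-digest's behavior.
var reInList = regexp.MustCompile(`(?i)\bIN\s*\(\s*\?(?:\s*,\s*\?)*\s*\)`)

func main() {
	q := "SELECT name FROM user WHERE id IN (?, ?, ?, ?, ?, ?);"
	fmt.Println(reInList.ReplaceAllString(q, "IN (?)"))
	// SELECT name FROM user WHERE id IN (?);
}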

Impossible to execute query: Error 1317: Query execution was interrupted

2018/09/28 09:34:23 E! - MySQL:Indexes - Impossible to execute query: Error 1317: Query execution was interrupted
panic: runtime error: invalid memory address or nil pointer dereference
        panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x5937cb]

goroutine 17 [running]:
database/sql.(*Rows).close(0x0, 0x0, 0x0, 0x0, 0x0)
        /usr/local/Cellar/go/1.10.1/libexec/src/database/sql/sql.go:2907 +0x6b
database/sql.(*Rows).Close(0x0, 0x0, 0x0)
        /usr/local/Cellar/go/1.10.1/libexec/src/database/sql/sql.go:2903 +0x33
panic(0x778b40, 0x9db6e0)
        /usr/local/Cellar/go/1.10.1/libexec/src/runtime/panic.go:502 +0x229
database/sql.(*Rows).Next(0x0, 0x35)
        /usr/local/Cellar/go/1.10.1/libexec/src/database/sql/sql.go:2599 +0x30
github.com/swapbyt3s/zenit/plugins/inputs/mysql.Indexes()
        /Users/nicola/go/src/github.com/swapbyt3s/zenit/plugins/inputs/mysql/indexes.go:47 +0x2a0
github.com/swapbyt3s/zenit/plugins/inputs.doCollectPlugins(0xc42013e000)
        /Users/nicola/go/src/github.com/swapbyt3s/zenit/plugins/inputs/inputs.go:60 +0x268
created by github.com/swapbyt3s/zenit/plugins/inputs.Gather
        /Users/nicola/go/src/github.com/swapbyt3s/zenit/plugins/inputs/inputs.go:33 +0xc9
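
The stack shows Close being called on a nil *sql.Rows: when Query returns an error, the returned rows must not be used. A sketch of the safe pattern:

package main

import (
	"database/sql"
	"log"
)

// collectIndexes checks the error before deferring Close, so a failed
// query (e.g. Error 1317) cannot leave a nil *sql.Rows to dereference.
func collectIndexes(db *sql.DB) {
	rows, err := db.Query("SHOW INDEX FROM demo")
	if err != nil {
		log.Printf("E! - MySQL:Indexes - Impossible to execute query: %s", err)
		return
	}
	defer rows.Close()

	for rows.Next() {
		// ... scan columns ...
	}
	if err := rows.Err(); err != nil {
		log.Printf("E! - MySQL:Indexes - Row iteration: %s", err)
	}
}

func main() {}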

Bad string parsing in SQL

It should be object_id = \'?\', not object_id = \\'?\\':

Example:

2018/08/20 08:08:05 D! - ClickHouse - Event capture: AuditLog - map[string]string{"os_login":"", "priv_user":"", "proxy_user":"", "user":"demo", "connection_id":"2529923", "db":"", "host_ip":"127.0.0.1", "os_user":"", "status":"0", "sqltext_digest":"SELECT ... FROM ... WHERE object_id = \\'?\\' object_type = \\'?\\'", "command_class":"error", "host":"", "name":"Query", "sqltext":"SELECT ... FROM ... WHERE .object_id = \\'xxxx\\' AND c0_.object_type = \\'xxxx\\'", "_time":"2018-08-20 08:08:05", "host_name":"localhost", "ip":"127.0.0.1"}
