
pg_sqlog

An extension providing access to PostgreSQL logs through SQL interface.

Description

pg_sqlog lets you query a foreign table that points to a log file recorded in CSV format. It provides functions to extract the duration of each query, as well as to group similar queries together.

Prerequisites

Set log_min_duration_statement to a non-negative value in order to record slow queries.

log_min_duration_statement = 1000   # log every query taking 1 second or longer

This extension depends on file_fdw as well as on the following configuration directives.

log_destination   = 'syslog,csvlog' # 'csvlog' should be present
log_filename      = 'postgresql.%F' # any combination of %F, %Y, %m, %d, %a
logging_collector = 'on'
log_rotation_age  = '1d'            # at max 1 log file per day
log_rotation_size = 0
log_truncate_on_rotation = 'on'

To use the special autovacuum and autoanalyze reports you need to set log_autovacuum_min_duration to a non-negative value.

log_autovacuum_min_duration = 0
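The prerequisite settings above can be verified from a live session; this is a plain catalog query against pg_settings, independent of the extension itself:

SELECT name, setting
FROM pg_settings
WHERE name IN ('log_destination', 'log_filename', 'logging_collector',
               'log_min_duration_statement', 'log_autovacuum_min_duration');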

Tables

  • sqlog.log - a template foreign table pointing to the log file generated on a given day. It can be queried either through the sqlog.log() set-returning function, or directly in combination with the sqlog.set_date() function. By default the date is the one set by the most recent call to sqlog.log() or sqlog.set_date().
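A sketch of the two access paths just described (the date literal is illustrative):

-- Via the set-returning function (sets the date implicitly):
SELECT count(*) FROM sqlog.log('2024-01-15');

-- Or pin the date first, then query the foreign table directly:
SELECT sqlog.set_date('2024-01-15');
SELECT log_time, error_severity, message
FROM sqlog.log
WHERE error_severity = 'ERROR';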

Functions

  • sqlog.log([timestamp]) - a set-returning function yielding the contents of the PostgreSQL log file for a given day. If timestamp is omitted, the current day's log is returned. Calls sqlog.set_date() implicitly.
  • sqlog.set_date([timestamp]) - a function controlling the filename option of sqlog.log. Once set to a given date, it stays that way until the next call. Note that calling this function affects the contents of the sqlog.log table for all other concurrent sessions as well (if any).
  • sqlog.duration(text) - extracts the query duration, in milliseconds, from the message field.
  • sqlog.preparable_query(text) - replaces all literal arguments of a query found in the message field with question marks, thus producing a preparable query and effectively grouping similar queries together.
  • sqlog.summary(text, int [, int]) - strips metadata from the query and displays its first N characters, its last N characters, or both. By default the first 30 and the last 30 characters are shown. If -1 is passed as the second argument, no trimming occurs.
  • sqlog.temporary_file_size(text) - extracts the size of each temporary file that has been created and logged, according to the log_temp_files configuration option. Pass sqlog.message as the argument.
  • sqlog.autovacuum([timestamp]) - a set-returning function producing a human-readable report of the autovacuum runs for a given day. Calls sqlog.set_date() implicitly.
  • sqlog.autoanalyze([timestamp]) - a set-returning function producing a human-readable report of the autoanalyze runs for a given day. Calls sqlog.set_date() implicitly.
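As an example of combining these helpers, sqlog.temporary_file_size() and sqlog.summary() can surface the queries that spilled the largest temporary files. A sketch, assuming log_temp_files is enabled and that such log messages start with "temporary file":

SELECT
  sqlog.temporary_file_size(message) AS bytes,
  sqlog.summary(message)
FROM
  sqlog.log()
WHERE
  message ~ '^temporary file'
ORDER BY
  1 DESC
LIMIT 5;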

Installation

After building the project, copy the conf/pg_sqlog.conf file to PostgreSQL's conf.d/ directory (or make the appropriate changes directly in your postgresql.conf file) and restart the service.
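A typical sequence might look like the following; the conf.d/ and service names are illustrative and depend on your distribution and PostgreSQL version:

make
sudo make install

# Enable the required settings, then restart.
sudo cp conf/pg_sqlog.conf /etc/postgresql/16/main/conf.d/
sudo systemctl restart postgresql

# Finally, create the extension in the target database.
psql -c 'CREATE EXTENSION pg_sqlog;'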

Upgrading from PostgreSQL 12 (or older)

Drop and create the extension to have proper backend_type parsing.

Upgrading from PostgreSQL 13 (or older)

Drop and create the extension to have proper leader_pid and query_id parsing.
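In both upgrade cases, the drop-and-create step boils down to the following, run in every database where the extension is installed:

DROP EXTENSION pg_sqlog;
CREATE EXTENSION pg_sqlog;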

Examples

Get a summary of the errors reported for the day.

postgres=# SELECT error_severity, COUNT(*) FROM sqlog.log() GROUP BY 1;
 error_severity | count
----------------+-------
 FATAL          |     6
 WARNING        |    27
 LOG            |   949
 ERROR          |    10
(4 rows)

Get the top 3 slowest queries of the day.

SELECT
  AVG(sqlog.duration(message)),
  COUNT(*),
  sqlog.preparable_query(message)
FROM
  sqlog.log()
WHERE
  message ~ '^duration'
GROUP BY
  3
ORDER BY
  1 DESC
LIMIT
  3;

          avg          | count |                       preparable_query
-----------------------+-------+--------------------------------------------------------------
 9002.774              |     2 | SELECT pg_sleep(?)
 4367.3729834738293848 |    12 | SELECT id, name FROM invoice WHERE status > ?
 1158.1232790697674419 |    43 | UPDATE app SET credit=?+overdraft WHERE id=? and overdraft>?
(3 rows)

Show the summary for some queries from yesterday.

SELECT
  log_time::time,
  sqlog.duration(message),
  sqlog.summary(message)
FROM
  sqlog.log('yesterday')
WHERE
  message ~ '^duration';

   log_time      | duration |                              summary                              
-----------------+----------+------------------------------------------------------------------
 19:09:44.942+00 | 8604.054 | SELECT DISTINCT (accounts_uuid ... _jupiterprofile"."venus" = 0))
 19:37:52.766+00 | 1209.055 | UPDATE "infrastructure_jupiter ... be5c-5gb7-9d7b-b30bg6cf5b56'))
 19:40:05.506+00 | 1628.792 | SELECT (date_trunc('hour', tim ... r', time)) ORDER BY "hour" ASC
(3 rows)

Get a random autovacuum report for the day.

postgres=# SELECT * FROM sqlog.autovacuum() LIMIT 1;
-[ RECORD 1 ]-------------+---------------------------
log_time                  | 2018-11-06 06:03:00.178+00
database                  | db
schema_name               | public
table_name                | account
idx_scans                 | 1
pages_removed             | 1
pages_remain              | 16
pages_skipped_pins        | 0
pages_skipped_frozen      | 0
tuples_removed            | 455
tuples_remain             | 27
tuples_dead_not_removable | 0
oldest_xmin               | 224250521
buffer_hits               | 187
buffer_misses             | 0
buffer_dirtied            | 7
read_mbs                  | 0.000
write_mbs                 | 0.033
cpu_user                  | 0.04
cpu_system                | 0.02
elapsed                   | 1.64

Querying slave node logs

Analyzing queries on a slave node is also possible. To change the date, call sqlog.set_date([date]) on the master node before querying sqlog.log on the slave.
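A sketch of that workflow (the date is illustrative); the set_date() call must run on the master, since the slave is read-only, and its effect then reaches the slave through replication:

-- On the master:
SELECT sqlog.set_date('yesterday');

-- On the slave, once the change has replicated:
SELECT error_severity, count(*)
FROM sqlog.log
GROUP BY 1;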


Issues

Unopinionated log file selection

I was trying to use your fdw, but I find it very opinionated about how the config should be.

My logging configuration uses csvlog, but in a different folder (/var/log/postgresql) and with rotation managed by logrotate.

Postgres config:

log_destination = 'csvlog'
logging_collector = on
log_directory = '/var/log/postgresql/'
log_filename = 'postgresql'
log_truncate_on_rotation = off
log_rotation_age = 0
log_rotation_size = 0

/etc/logrotate.d/postgres

/var/log/postgresql/postgresql.csv {
  rotate 14
  daily
  compress
  maxsize 2M
  nodelaycompress
  create 0600 postgres postgres
  dateext
  dateformat -%Y-%m-%d-%H
  postrotate
    sudo -u postgres /usr/pgsql-12/bin/pg_ctl logrotate -D /var/lib/pgsql/12/data
  endscript
}

The final paths are like:

/var/log/postgresql/postgresql.csv
/var/log/postgresql/postgresql.csv-2020-03-05-17.gz

I know this breaks the fdw in several places. I was thinking of making some adaptations to be able to use this logging configuration.

I have found an fdw for gzipped files.

Another problem will be the rotation, because I will not know in advance the naming scheme.

Before doing anything I wanted to ask for your opinion.

Thanks!
