fluentd-output-sumologic's Introduction

fluent-plugin-sumologic_output, a plugin for Fluentd

This plugin has been designed to output logs or metrics to Sumo Logic via an HTTP collector endpoint.

License

Released under Apache 2.0 License.

Installation

gem install fluent-plugin-sumologic_output

Configuration

Configuration options for fluent.conf are:

  • data_type - The type of data that will be sent to Sumo Logic, either logs or metrics (default is logs)
  • endpoint - Sumo Logic HTTP Collector URL
  • verify_ssl - Verify the SSL certificate (default is true)
  • source_category* - Set the _sourceCategory metadata field within Sumo Logic (default is nil)
  • source_name* - Set the _sourceName metadata field within Sumo Logic; overrides source_name_key (default is nil)
  • source_name_key - Set the key whose value is used as the source name, so that source_name can be extracted from Fluentd's buffer (default source_name)
  • source_host* - Set the _sourceHost metadata field within Sumo Logic (default is nil)
  • log_format - Format in which to post logs to Sumo (default json)
    • text - Logs will appear in Sumo Logic in text format (taken from the field specified in log_key)
    • json - Logs will appear in Sumo Logic in JSON format.
    • json_merge - Same as json, but merges the content of log_key into the top level and strips log_key
  • log_key - The key used when merging JSON or sending logs in text format (default message)
  • open_timeout - Timeout in seconds to wait until the connection is opened.
  • send_timeout - Timeout in seconds for sending to Sumo Logic. Don't modify this unless you see HTTPClient::SendTimeoutError in your Fluentd logs (default 120)
  • add_timestamp - Add a timestamp (or timestamp_key) field to logs before sending them to Sumo Logic (default true)
  • timestamp_key - Field name used when add_timestamp is on (default timestamp)
  • proxy_uri - URI of the proxy environment, if present.
  • metric_data_format - The format of metrics you will be sending: graphite, carbon2, or prometheus (default is graphite)
  • disable_cookies - Disable cookies on the HTTP client (default is false)
  • compress - Enable compression (default true)
  • compress_encoding - Compression encoding format, either gzip or deflate (default gzip)
  • custom_fields - Comma-separated key=value list of fields to apply to every log.
  • custom_dimensions - Comma-separated key=value list of dimensions to apply to every metric.
  • use_internal_retry - Enable the custom retry mechanism. It is false by default for backward compatibility; we recommend enabling it and configuring retry_min_interval, retry_max_interval, retry_timeout, and retry_max_times (see the Retry Mechanism section below)
  • retry_min_interval - Minimum interval to wait between send attempts (default is 1s)
  • retry_max_interval - Maximum interval to wait between send attempts (default is 5m)
  • retry_timeout - Time after which the data is dropped (default is 72h; 0s means no timeout)
  • retry_max_times - Maximum number of retries (default is 0; 0 means no maximum, retries happen forever)
  • max_request_size - Maximum request size, measured before compression is applied (default is 0k, which means no limit)

NOTE: Options marked with * support placeholders.
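
For instance, a minimal match block combining several of these options might look like the following sketch (the endpoint and field values are placeholders, not recommendations):

<match app.**>
  @type sumologic
  endpoint https://collectors.sumologic.com/receiver/v1/http/XXXXXXXXXX
  log_format json
  source_category prod/someapp/logs
  source_name AppA
  compress true
  compress_encoding gzip
  custom_fields "environment=prod,team=backend"
</match>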

Example Configuration

Reading JSON-formatted log files with in_tail and wildcard filenames:

<source>
  @type tail
  format json
  time_key time
  path /path/to/*.log
  pos_file /path/to/pos/ggcp-app.log.pos
  time_format %Y-%m-%dT%H:%M:%S.%NZ
  tag appa.*
  read_from_head false
</source>

<match appa.**>
 @type sumologic
 endpoint https://collectors.sumologic.com/receiver/v1/http/XXXXXXXXXX
 log_format json
 source_category prod/someapp/logs
 source_name AppA
 open_timeout 10
</match>

Sending metrics to Sumo Logic using in_http:

<source>
  @type http
  port 8888
  bind 0.0.0.0
</source>

<match test.carbon2>
	@type sumologic
	endpoint https://endpoint3.collection.us2.sumologic.com/receiver/v1/http/ZaVnC4dhaV1hYfCAiqSH-PDY6gUOIgZvO60U_-y8SPQfK0Ks-ht7owrbk1AkX_ACp0uUxuLZOCw5QjBg1ndVPZ5TOJCFgNGRtFDoTDuQ2hzs3sn6FlfBSw==
	data_type metrics
	metric_data_format carbon2
	flush_interval 1s
</match>

<match test.graphite>
	@type sumologic
	endpoint https://endpoint3.collection.us2.sumologic.com/receiver/v1/http/ZaVnC4dhaV1hYfCAiqSH-PDY6gUOIgZvO60U_-y8SPQfK0Ks-ht7owrbk1AkX_ACp0uUxuLZOCw5QjBg1ndVPZ5TOJCFgNGRtFDoTDuQ2hzs3sn6FlfBSw==
	data_type metrics
	metric_data_format graphite
	flush_interval 1s
</match>

Example input/output

Assuming the following input is coming from a log file named /var/log/appa_webserver.log:

{"asctime": "2016-12-10 03:56:35+0000", "levelname": "INFO", "name": "appa", "funcName": "do_something", "lineno": 29, "message": "processing something", "source_ip": "123.123.123.123"}

Then the output appears as below within Sumo Logic:

{
    "timestamp":1481343785000,
    "asctime":"2016-12-10 03:56:35+0000",
    "levelname":"INFO",
    "name":"appa",
    "funcName":"do_something",
    "lineno":29,
    "message":"processing something",
    "source_ip":"123.123.123.123"
}

Dynamic Configuration within log message

The plugin supports overriding SumoLogic metadata and log_format parameters within each log message by attaching the field _sumo_metadata to the log message.

NOTE: The _sumo_metadata field will be stripped before posting to SumoLogic.

Example

{
  "name": "appa",
  "source_ip": "123.123.123.123",
  "funcName": "do_something",
  "lineno": 29,
  "asctime": "2016-12-10 03:56:35+0000",
  "message": "processing something",
  "_sumo_metadata": {
    "category": "new_sourceCategory",
    "source": "override_sourceName",
    "host": "new_sourceHost",
    "log_format": "merge_json_log"
  },
  "levelname": "INFO"
}

Retry Mechanism

retry_min_interval, retry_max_interval, retry_timeout and retry_max_times are not the standard buffer retry parameters. For technical reasons, this plugin implements its own exponential back-off retry mechanism. It is disabled by default, but we recommend enabling it by setting use_internal_retry to true.
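
For example, enabling it with the documented defaults written out explicitly (the endpoint is a placeholder):

<match app.**>
  @type sumologic
  endpoint https://collectors.sumologic.com/receiver/v1/http/XXXXXXXXXX
  use_internal_retry true
  retry_min_interval 1s
  retry_max_interval 5m
  retry_timeout 72h
  retry_max_times 0
</match>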

TLS 1.2 Requirement

Sumo Logic only accepts connections from clients using TLS version 1.2 or greater. To utilize the content of this repo, ensure that it's running in an execution environment that is configured to use TLS 1.2 or greater.

fluentd-output-sumologic's People

Contributors

andrzej-stencel, bendrucker, breinero, danmx, duchatran, frankreno, jitran, kobtea, leklund, mar-kolya, mtanda, reillybrogan, rvmiller89, samjsong, sanjitsaluja, stevezau, sumo-drosiek, swiatekm-sumo, taraspos, vsinghal13, zhelyan


fluentd-output-sumologic's Issues

_sumo_metadata field not stripped off before posting to SumoLogic

Info

  1. plugin version: fluent-plugin-sumologic_output (1.3.1)
  2. td-agent version: td-agent 1.2.2

Problem Summary
The fluentd-output-sumologic plugin does not work as expected: _sumo_metadata is not stripped from the log at all.

Background
Our setup involves sending logs from a syslog-ng server to a td-agent server. We are using a hosted collector to send logs to Sumo Logic. On both ends the log messages look exactly the same, including the _sumo_metadata key and the corresponding values.

Expectation
<CONTAINER_NAME>/test should appear in _sourceCategory.

Configuration

<source>
  @type tcp
  tag tcp.events
  <parse>
    @type json
  </parse>
  port 24225
  bind 127.0.0.1
</source>

<source>
  @type http
  port 8888
  bind 127.0.0.1
</source>

<match **>
  @type copy
  <store>
    @type file
    path <to_log_syslog_log_file>
    <buffer time>
      timekey 1s
      timekey_use_utc true
      timekey_wait 1s
    </buffer>
  </store>
  <store>
    @type sumologic
    endpoint https://collectors.us2.sumologic.com/receiver/v1/http/***
    log_format text
    open_timeout 10
    flush_interval 1s
  </store>
</match>

Excerpt from Syslog-ng server log

{"journal":{_HOSTNAME":"ip
-172-31-22-252","_GID":"0","_EXE":"/usr/bin/dockerd","_COMM":"dockerd","_CMDLINE":"/usr/bin/dockerd --raw-logs","TEST":"false","SYSLOG_IDENTIFIER":"confident_leavitt/test","PRIORITY":
"6","MESSAGE":"Hello World!","LOCATION":"west","CONTAINER_TAG":"confident_leavitt/test","CONTAINER_NAME":"confident_leavitt"},"_sumo_metadata":{"source":"journal","host":"ip-172-31-22-252","category":"confident_leavitt/test"},"TAGS":".source.s_src","SOURCEIP":"127.0.0.1","SOURCE":"s_src","PROGRAM":"confident_leavitt/test","PRIORITY":"in
fo","PID":"28XXX","MESSAGE":"Hello World!","HOST_FROM":"ip-172-31-22-252","HOST":"ip-172-31-22-252","DATE":"Sep 21 00:52:51"}

Excerpt from td-agent server log

Sep 21 00:52:51 172.31.22.252 confident_leavitt/test[28XXX]: {"journal":{"_HOSTNAME":"ip-172-31-22-252","_GID":"0","_EXE":"/usr/bin/dockerd","_COMM":"dockerd","_CMDLINE":"/usr/bin/dockerd --raw-logs","TEST":"false","SYSLOG_IDENTIFIER":"confident_leavitt/test","PRIORITY":"6","MESSAGE":"Hello World!","LOCATION":"west","CONTAINER_TAG":"confident_leavitt/test","CONTAINER_NAME":"confident_leavitt"},"_sumo_metadata":{"source":"journal","host":"ip-172-31-22-252","category":"confident_leavitt/test"},"TAGS":".source.s_src","SOURCEIP":"127.0.0.1","SOURCE":"s_src","PROGRAM":"confident_leavitt/test","PRIORITY":"info","PID":"28XXX","MESSAGE":"Hello World!","HOST_FROM":"ip-172-31-22-252","HOST":"ip-172-31-22-252","DATE":"Sep 21 00:52:51"}

Steps to reproduce:

  1. docker run --log-driver=journald --log-opt tag="{{.Name}}/test" --log-opt labels=location --log-opt env=TEST --env "TEST=false" --label location=west ubuntu echo "Hello World!"

More information:
Sumo Logic receives the log in the same format as above, yet the _sourceCategory list only shows Http Input. I am searching with _collector=<collector_name> in the Sumo search field.

Noisy logs "Unknown key" from cookie handling code

I've been evaluating Sumo and this plugin. While using it, we noticed noise in the fluentd logs like:

Unknown key: MAX-AGE = 604800
Unknown key: MAX-AGE = 604800
Unknown key: SameSite = None

repeatedly. It doesn't seem to use the fluentd logging functionality, so there's no way to filter it out.

I dug into it further and it seems to be coming from the httpclient dependency, which doesn't know about these two Set-Cookie attributes; the Sumo API returns them in each API call. (Though I'm not sure why cookies would be needed for stateless ingest of logs.)

I'm guessing there's no easy fix here besides fixing it upstream, though I am curious why cookies/sessions are needed for API ingest, or whether I'm misunderstanding something.


Getting "Unknown output plugin 'sumologic" after installation

I'm trying to redirect application logs to the Sumo Logic collector.

Installed the plugin as mentioned in the docs, with both gem install fluent-plugin-sumologic_output and fluent-gem install fluent-plugin-sumologic_output.

What happened:
Getting the following messages in the log:

2019-05-02 15:48:38 -0700 [info]: parsing config file is succeeded path="/etc/td-agent/td-agent.conf"
2021-12-13 20:33:34 +0000 [error]: config error file="/fluentd/etc/fluent.conf" error_class=Fluent::ConfigError error="Unknown output plugin 'sumologic'. Run 'gem search -rd fluent-plugin' to find plugins"

Environment:

  • Ruby version : ruby 3.0.1p64
  • Distributor ID: Ubuntu
    Description: Ubuntu 18.04.6 LTS
    Release: 18.04
    Codename: bionic
  • Fluentd plugin version : fluent-plugin-sumologic_output-1.7.3


Fluentd.conf:

<source>
  @type forward
  port 24224
  bind 0.0.0.0
</source>

<match *.*>
  @type sumologic
  endpoint https://collectors.us2.sumologic.com/receiver/v1/http/XXXX
  log_format json
  source_category AppLog
  source_name FluentD
  open_timeout 10
</match>

source_name vs source

Hello,
I have a question regarding the use of source_name vs source (which does not exist, but maybe should).

In the example provided in the readme it seems to be used for things like a service name, but looking at the Sumo Logic documentation, source_name is used for file names (source_name maps to _sourceName).

For me it would make sense to use source (_source) for this type of data, but it is not available in this plugin.

Could someone elaborate on this? Perhaps this boils down to some limitation of the http API.

I'm thinking these things eventually affect indexing of some sort on the Sumo Logic side.

Thanks :)

Ref: https://help.sumologic.com/Search/Get-Started-with-Search/Search-Basics/Search-Metadata

Gem name incorrect in Readme

I believe the correct name is "fluent-plugin-sumologic_output" not "fluent-plugin-sumologic_out" as it says in the Readme.

Option to send additional configuration options.

Hi,

Is there a way to embed the following configuration parameters into _sumo_metadata?

"automaticDateParsing"
"multilineProcessingEnabled"
"useAutolineMatching"
"forceTimeZone"
"filters"
"cutoffTimestamp"
"encoding"
"timeZone"
"pathExpression"
"blacklist"
"sourceType"

If there is, how exactly can I do so for each log?

Thank you for help.

Best.

JSON timestamp problem

Hello,
I have a Golang application using the Logrus library with its JSON formatter.
This library creates JSON in the following format:

{"level":"error","msg":"Wrong message type","time":"2017-03-17T15:33:58Z"}

When I feed these JSON messages to Fluentd, I see the following log in Sumo Logic:

{
    "timestamp":2017000,
    "level":"error",
    "msg":"Wrong message type"
}

If I remove the time field from my logs, everything works OK.
There should be a better way to work around this.

Thank you.
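
One possible workaround, assuming the logs are read with in_tail and a Fluentd version whose JSON parser supports keep_time_key: keep the original time field on the record so its value survives parsing (path, pos_file and tag below are illustrative):

<source>
  @type tail
  format json
  path /var/log/app/*.log
  pos_file /var/log/td-agent/app.log.pos
  time_key time
  time_format %Y-%m-%dT%H:%M:%SZ
  keep_time_key true
  tag app.logrus
</source>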

Adding timestamp to logs - why?

Hi,
I noticed that in the code this plugin is adding a 'timestamp' to all emitted logs. Why is this?
In our case we happen to use the identifier 'timestamp' ourselves, which made this extra confusing.

The implementation is also broken in the sense that it uses the symbol :timestamp in the merge operation.
:timestamp will never be == "timestamp", so the merged record will have two "timestamp" keys if you merge and then marshal a JSON object that already contains a "timestamp" field (technically valid JSON, but really not what you want).

Depending on how ingestion and billing are performed on the Sumo Logic side, this could increase costs for customers.
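
If the plugin-generated field is unwanted or collides with an existing key, the options documented in the readme above can disable or rename it; a minimal sketch with a placeholder endpoint:

<match app.**>
  @type sumologic
  endpoint https://collectors.sumologic.com/receiver/v1/http/XXXXXXXXXX
  # drop the generated field entirely...
  add_timestamp false
  # ...or keep it under a non-colliding name:
  # timestamp_key fluentd_timestamp
</match>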

Buffer Output tuning - and consistency with fluentd BufferedOutput

This plugin doesn't inherit from BufferedOutput, so the typical fluentd configuration options for tuning buffers are not available (e.g., memory vs. file, flush interval, etc.).

Is this intentional? It seems like it would be easier to use if it were consistent with other buffered output plugins, using the standard <buffer> type config.

I noticed this when converting from the Splunk output plugin to this output plugin, and specifically seeing that the flush interval was much higher, without an apparent way to tune it.
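
For what it's worth, later releases of the plugin (the 1.x line, rebuilt on the v0.14/v1 output API) do accept the standard <buffer> section, as the log_format text issue further down also shows; a sketch under that assumption, with placeholder endpoint and path:

<match app.**>
  @type sumologic
  endpoint https://collectors.sumologic.com/receiver/v1/http/XXXXXXXXXX
  <buffer>
    @type file
    path /var/log/td-agent/buffer/sumologic
    flush_interval 10s
    chunk_limit_size 1m
  </buffer>
</match>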

Access record in `custom_fields`

Hello, I would like to know if it is possible to extract some field from the record and add the value as a custom field. For example:

        custom_fields "pod=${record['kubernetes']['pod_name']}"

If the record contains this:

{
  "log": "01:22:42.462  DEBUG hello world",
  "stream": "stdout",
  "kubernetes": {
    "pod_name": "my-pod-659698dfb9-smslh"
  }
}

The value my-pod-659698dfb9-smslh would be included as a custom field named pod.

If there's another way to do this in fluentd, please let me know. I'm relatively new to fluentd as well. Thanks for the plugin!
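
One approach worth trying, with the caveat that the readme above only guarantees placeholder support for the asterisked options, so placeholder expansion in custom_fields is an assumption to verify against your plugin version: lift the nested value to the top level with the core record_transformer filter, then reference it as a buffer chunk key:

<filter kubernetes.**>
  @type record_transformer
  enable_ruby true
  <record>
    # copy the nested pod name to a top-level key (key name is illustrative)
    pod_name ${record.dig("kubernetes", "pod_name")}
  </record>
</filter>

<match kubernetes.**>
  @type sumologic
  endpoint https://collectors.sumologic.com/receiver/v1/http/XXXXXXXXXX
  custom_fields "pod=${pod_name}"
  <buffer pod_name>
    flush_interval 10s
  </buffer>
</match>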

Multi-line processing

Unless I'm missing something, multi-line processing doesn't seem to exist. Any way to get this functionality?
The plugin works great otherwise...tagging and sending messages to http endpoint is a breeze.
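
Multi-line grouping is normally done upstream of this output plugin, for example with the third-party fluent-plugin-concat filter (also discussed in a multi-line issue further down); a minimal sketch, where the key and the start regex are illustrative:

<filter app.**>
  @type concat
  key message
  multiline_start_regexp /^\d{2}:\d{2}:\d{2}/
</filter>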

Interpolation inside _sumo_metadata

Using the Docker logging driver for NGINX, values in _sumo_metadata are not interpolated.

<filter nginx>
  @type record_transformer
  <record>
    logger.sumo true
    logger.s3   true
    _sumo_metadata {
      "host": "#{ENV['SUMO_HOST']}",
      "source": "${!container_id}",
      "category": "foo/${!tag}/${!source}"
    }
  </record>
</filter>

In logs, "host" is the hardcoded string #{ENV['SUMO_HOST']}.

Fluentd stops sending logs when log_format is set to 'text'

Edit: found the issue: I had failed to set log_key after the key changed from the default message to a non-default value.

I'm going to leave this open because I think there are some improvements which could have clarified this issue. Unfortunately I have neither the time nor the Ruby experience to open a PR right now.

  • output.rb (and perhaps the plugin as a whole) seems to lack trace logs. These would have been helpful in confirming which parts of the write() function were executing, which is helpful in understanding where the issue occurs.
  • The log_key value did not exist and I had configured log_format=text which means the expected result should be that no logs are ever sent. I think this is an appropriate situation in which to issue a log, maybe even with level warn informing the user that, for the current chunk, their configuration is useless.

I appreciate my suggestions may not be valid for this code. Feel free to close this as my issue related to misconfiguration and has been resolved.


I'm using the kube-logging operator: https://kube-logging.dev/docs/configuration/plugins/outputs/sumologic/
This issue arose as we updated from version 3.17 to 4.2.0 of the operator.
There are various sumologic-related changes in that release, visible by searching on the page of this very large diff.
kube-logging/logging-operator@release-3.17...4.2.0

However, I've come here because the behaviour I'm getting from fluentd based on a single change to the Sumologic output configuration seems unexpected. I hope to get advice on how to move forward with this issue, as it seems like a silent failure.

Feel free to close this issue or advise accordingly, I have not confirmed a bug although it does seem like there might be one.

The Sumologic output is working as expected with log_format: json; however, if I set log_format: text, fluentd stops sending logs.

<source>
  @type forward
  @id main_forward
  bind 0.0.0.0
  port 24240
</source>
<match **>
  @type label_router
  @id main
  metrics true
  <route>
    @label @c1157f02c8c13fd3ea66f8419567c357
    metrics_labels {"id":"flow:mynamespace:my-sumo-flow"}
    <match>
      labels my-sumo-label:enabled
      namespaces mynamespace
      negate false
    </match>
  </route>
  ... ...
</match>
<label @c1157f02c8c13fd3ea66f8419567c357>
  <match kubernetes.**>
    @type tag_normaliser
    @id flow:mynamespace:my-sumo-flow:0
    format ${namespace_name}.${labels.environment}.${pod_name}.${container_name}
  </match>
  <match **>
    @type sumologic
    @id flow:mynamespace:my-sumo-flow:output:mynamespace:my-sumo-output
    endpoint COLLECTOR_URL_REMOVED
    log_format json
    source_name my-sumo-source
    <buffer tag,time>
      @type file
      chunk_limit_size 32m
      flush_interval 60s
      flush_mode interval
      path /buffers/flow:mynamespace:my-sumo-flow:output:mynamespace:my-sumo-output.*.buffer
      retry_forever true
      timekey 10m
      timekey_wait 1m
      total_limit_size 1024m
    </buffer>
  </match>
</label>
... ...

With log_format: json I see fluentd logging sends and I see logs in Sumologic.

$ tail -f -n 100000 /fluentd/log/out | grep "mynamespace:my-sumo"
2023-06-05 14:53:43 +0000 [debug]: #0 [flow:mynamespace:my-sumo-flow:output:mynamespace:my-sumo-output] Sending 2; logs records with source category '', source host '', source name 'my-sumo-source', chunk #5fd630dfe92e8520c65b0b2aedb3524c, try 0, batch 0
2023-06-05 14:58:21 +0000 [debug]: #0 [flow:mynamespace:my-sumo-flow:output:mynamespace:my-sumo-output] Created new chunk chunk_id="5fd6322428e95cb3b7fbd07ad24428b0" metadata=#<struct Fluent::Plugin::Buffer::Metadata timekey=1685976600, tag="mynamespace.unknown.test-logger.container-name", variables=nil, seq=0>
2023-06-05 14:59:22 +0000 [debug]: #0 [flow:mynamespace:my-sumo-flow:output:mynamespace:my-sumo-output] Sending 13; logs records with source category '', source host '', source name 'my-sumo-source', chunk #5fd6322428e95cb3b7fbd07ad24428b0, try 0, batch 0
2023-06-05 14:59:26 +0000 [debug]: #0 [flow:mynamespace:my-sumo-flow:output:mynamespace:my-sumo-output] Created new chunk chunk_id="5fd632622621c3c2585f88717d2c406e" metadata=#<struct Fluent::Plugin::Buffer::Metadata timekey=1685976600, tag="mynamespace.unknown.test-logger.container-name", variables=nil, seq=0>

With log_format: text I no longer see fluentd logging sends and I no longer see logs in Sumologic.

$ tail -f -n 100000 /fluentd/log/out | grep "mynamespace:my-sumo"
2023-06-05 14:41:11 +0000 [debug]: #0 [flow:mynamespace:my-sumo-flow:output:mynamespace:my-sumo-output] Created new chunk chunk_id="5fd62e4de0a29e3c6c8d40ce47a99521" metadata=#<struct Fluent::Plugin::Buffer::Metadata timekey=1685976000, tag="mynamespace.unknown.test-logger.container-name", variables=nil, seq=0>
2023-06-05 14:42:16 +0000 [debug]: #0 [flow:mynamespace:my-sumo-flow:output:mynamespace:my-sumo-output] Created new chunk chunk_id="5fd62e8bdd2eb0db408da82be8213a70" metadata=#<struct Fluent::Plugin::Buffer::Metadata timekey=1685976000, tag="mynamespace.unknown.test-logger.container-name", variables=nil, seq=0>
2023-06-05 14:43:21 +0000 [debug]: #0 [flow:mynamespace:my-sumo-flow:output:mynamespace:my-sumo-output] Created new chunk chunk_id="5fd62ec9da55d6d9601586c71c835446" metadata=#<struct Fluent::Plugin::Buffer::Metadata timekey=1685976000, tag="mynamespace.unknown.test-logger.container-name", variables=nil, seq=0>
2023-06-05 14:50:00 +0000 [debug]: #0 [flow:mynamespace:my-sumo-flow:output:mynamespace:my-sumo-output] Created new chunk chunk_id="5fd63046447c860381363c4d438ee78d" metadata=#<struct Fluent::Plugin::Buffer::Metadata timekey=1685976600, tag="mynamespace.unknown.container-namer-6fd6b8b55c-gmdsf.container-namer", variables=nil, seq=0>
2023-06-05 14:50:51 +0000 [debug]: #0 [flow:mynamespace:my-sumo-flow:output:mynamespace:my-sumo-output] Created new chunk chunk_id="5fd6307703ab3dd92bbfb4c8b3a7dfc4" metadata=#<struct Fluent::Plugin::Buffer::Metadata timekey=1685976600, tag="mynamespace.unknown.test-logger.container-name", variables=nil, seq=0>

The only change happening in the configuration is the value of log_format.

I do see buffers under /buffers.

The logging operator deploys three fluentd containers; I am monitoring and inspecting all of them at once when troubleshooting.

I am using the fluentd debug container and everything seems to be functioning as expected except this one puzzling issue. Any advice is very much appreciated.

Kubernetes metadata no longer visible to plugin

Before upgrading to 1.4.1, we used to dynamically set our Sumo source/host/category based on the K8s metadata, as follows:

<match **>
  @type sumologic
  endpoint https://endpoint1.collection.us2.sumologic.com/receiver/v1/http/XXXX
  log_format json
  source_category ${record['kubernetes']['namespace_name']}
  source_name ${record['kubernetes']['container_name']}
  source_host ${record['kubernetes']['pod_name']}
  open_timeout 10
</match>

With 1.4.1, we're getting the above hardcoded string values (i.e...) instead of the dynamic K8s metadata. I've noticed that I'm still able to access tag values by using something like ${tag[n]} (it used to be tag_parts[n], but that no longer works either). Is this intentional, expected, or am I doing something wrong?
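
If the plugin version in use is built on the v1 output API, per-record values are generally reached through record-accessor placeholders backed by buffer chunk keys rather than ${record[...]}; a sketch under that assumption:

<match **>
  @type sumologic
  endpoint https://endpoint1.collection.us2.sumologic.com/receiver/v1/http/XXXX
  log_format json
  source_category ${$.kubernetes.namespace_name}
  source_name ${$.kubernetes.container_name}
  source_host ${$.kubernetes.pod_name}
  <buffer $.kubernetes.namespace_name,$.kubernetes.container_name,$.kubernetes.pod_name>
    flush_interval 10s
  </buffer>
</match>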

Log message getting interpreted as Array and puking on strip!

We have a log message that generates an odd error. The message is shaped like this:

[Container] 2020/11/18 20:07:50 Phase complete: BUILD State: SUCCEEDED

The error we get (screenshot omitted) shows the message being treated as an Array and strip! raising a NoMethodError.

We think it may be getting tripped up here:

log = record[@log_key]
unless log.nil?
  log.strip!
end

I am not familiar enough with Ruby to submit a bug fix or to know if the fix is appropriate. Should that section be more explicit?

log = (record[@log_key]).to_s
unless log.nil?
  log.strip!
end

Lemme know if I can provide more context.

Sumologic app settings for multi-line

I'm having trouble getting Sumologic and fluentd-kubernetes-sumologic library to cooperate for multi-line messages.

I use a custom multiline start regexp (/\d{2}:\d{2}:\d{2}/, matching any hh:mm:ss timestamp in the log message). When I started running the daemonset in Kubernetes, I noticed that Sumologic wasn't respecting the multiline batching done by the plugin. I fiddled around with some of the HTTP collector settings, and this is what I found:
(Reference doc: https://help.sumologic.com/03Send-Data/Sources/04Reference-Information-for-Sources/Collecting-Multiline-Logs)

Say we have the following records to be sent to Sumo Logic:

{"log": "00:00:00 message1\n00:00 submessage1"}
{"log": "00:00:01 message2\nsubmessage2"}

Using the default settings ("Detect messages spanning multiple lines" checked and "Infer Boundaries" checked):
The multiline grouping done in the plugin is ignored. It feels like Sumologic is taking the batch of messages sent from this plugin, concatenating them all together, and then doing its own multiline boundary processing.

00:00:00 message1
---
00:00 submessage1
---
00:00:01 message2
submessage2

"Detect messages spanning multiple lines" checked and "Boundary Regex" set to my regex (/\d{2}:\d{2}:\d{2}/)):
This works, but also defeats the purpose of the concat plugin. Ideally the concat plugin would do all of the work for multiline grouping, and sumologic would just respect it.

00:00:00 message1
00:00 submessage1
---
00:00:01 message2
submessage2

"Detect messages spanning multiple lines" unchecked and "Enable One Message Per Request" checked:
This causes all the messages in a batch request to get smushed together into one message in the Sumologic UI.

00:00:00 message1
00:00 submessage1
00:00:01 message2
submessage2

"Detect messages spanning multiple lines" unchecked and "Enable One Message Per Request" unchecked:
When everything is disabled, the multiline grouping is ignored, and instead any newline will cause a new message.

00:00:00 message1
---
00:00submessage1
---
00:00:01 message2
---
submessage2

I believe the concat plugin used by the library works fine (I stubbed out the call to this library with the stdout output plugin, and it correctly logs the concatenated logs based on the regex), which is why I'm filing an issue in this repo instead of fluentd-kubernetes-sumologic.

If you believe that I should contact the Sumologic main support channel to ask them about their API behavior, just let me know and I'll reach out to them. I figured I could be missing something obvious since I would expect someone else to have filed an issue by now if there was really a multi-line issue.

Version 1.0.0 spewing errors about "cannot load such file -- fluent/plugin/output"

So we were using version 0.0.4 of the plugin but recently installed 1.0.0. Using the same config, when we started td-agent it began spewing out hundreds of messages a minute into the td-agent.log file, eventually filling up the disk.

This is my config: I had to change < to [ due to formatting issues, but it's correct in my actual config.

  ## Send to Sumologic - Catchall HTTP endpoint ##
  [store>
    @type forest
    subtype sumologic
    escape_tag_separator /
    [template>
      endpoint https://endpoint2.collection.us2.sumologic.com/receiver/v1/http/Zablahblahblahblah9gliErQ==
      log_format json
      verify_ssl true
      source_host ${hostname}
      source_name prodb
      source_category prod/${escaped_tag}
    [/template>
  [/store>

And this was what showed up in the logs over and over again:

/opt/td-agent/embedded/lib/ruby/2.1.0/monitor.rb:211:in `mon_synchronize'
/opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.12.35/lib/fluent/buffer.rb:193:in `emit'
/opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.12.35/lib/fluent/output.rb:593:in `block in emit'
/opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.12.35/lib/fluent/output.rb:592:in `each'
/opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.12.35/lib/fluent/output.rb:592:in `emit'
/opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.12.35/lib/fluent/output.rb:42:in `next'
/opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.12.35/lib/fluent/buffer.rb:199:in `block in emit'
/opt/td-agent/embedded/lib/ruby/2.1.0/monitor.rb:211:in `mon_synchronize'
/opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.12.35/lib/fluent/buffer.rb:193:in `emit'
/opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.12.35/lib/fluent/output.rb:593:in `block in emit'
/opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.12.35/lib/fluent/output.rb:592:in `each'
/opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.12.35/lib/fluent/output.rb:592:in `emit'
/opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.12.35/lib/fluent/output.rb:42:in `next'
/opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.12.35/lib/fluent/plugin/out_copy.rb:78:in `emit'
/opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.12.35/lib/fluent/event_router.rb:90:in `emit_stream'
/opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.12.35/lib/fluent/event_router.rb:81:in `emit'
/opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.12.35/lib/fluent/engine.rb:177:in `block in log_event_loop'
/opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.12.35/lib/fluent/engine.rb:175:in `each'
/opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.12.35/lib/fluent/engine.rb:175:in `log_event_loop'
2017-11-09 20:38:46 +0000 [error]: Cannot output messages with tag 'fluent.error'
2017-11-09 20:38:46 +0000 [error]: failed to configure/start sub output sumologic: cannot load such file -- fluent/plugin/output
2017-11-09 20:38:46 +0000 [error]: /opt/td-agent/embedded/lib/ruby/site_ruby/2.1.0/rubygems/core_ext/kernel_require.rb:54:in `require'
/opt/td-agent/embedded/lib/ruby/site_ruby/2.1.0/rubygems/core_ext/kernel_require.rb:54:in `require'
/opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluent-plugin-sumologic_output-1.0.0/lib/fluent/plugin/out_sumologic.rb:1:in `'
/opt/td-agent/embedded/lib/ruby/site_ruby/2.1.0/rubygems/core_ext/kernel_require.rb:54:in `require'
/opt/td-agent/embedded/lib/ruby/site_ruby/2.1.0/rubygems/core_ext/kernel_require.rb:54:in `require'
/opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.12.35/lib/fluent/plugin.rb:172:in `block in try_load_plugin'
/opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.12.35/lib/fluent/plugin.rb:170:in `each'
/opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.12.35/lib/fluent/plugin.rb:170:in `try_load_plugin'
/opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.12.35/lib/fluent/plugin.rb:130:in `new_impl'
/opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.12.35/lib/fluent/plugin.rb:59:in `new_output'
/opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluent-plugin-forest-0.3.3/lib/fluent/plugin/out_forest.rb:131:in `block in plant'
/opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluent-plugin-forest-0.3.3/lib/fluent/plugin/out_forest.rb:128:in `synchronize'

The above configuration works perfectly fine using version 0.0.4 of the sumologic plugin.

Migrate to use v0.14 API?

Hi, I'm wondering if there are plans to migrate this plugin to the new API? It would be great to get the millisecond support working and it would also allow for the use of tag_parts in the config.

It would allow me to do something like this
source_category "Dev/${tag_parts[0]}/${tag_parts[1]}"

Thanks

Fluentd not finding sumologic output (using version 0.0.5)

When I start fluentd with /opt/td-agent/embedded/bin/fluentd -c /etc/td-agent/td-agent.conf, I am getting the following error message:

# /opt/td-agent/embedded/bin/fluentd -c /etc/td-agent/td-agent.conf
2017-08-23 04:09:29 +0000 [info]: reading config file path="/etc/td-agent/td-agent.conf"
2017-08-23 04:09:29 +0000 [info]: starting fluentd-0.12.35
2017-08-23 04:09:29 +0000 [info]: gem 'fluent-mixin-config-placeholders' version '0.4.0'
2017-08-23 04:09:29 +0000 [info]: gem 'fluent-mixin-plaintextformatter' version '0.2.6'
2017-08-23 04:09:29 +0000 [info]: gem 'fluent-plugin-kafka' version '0.5.5'
2017-08-23 04:09:29 +0000 [info]: gem 'fluent-plugin-mongo' version '0.8.0'
2017-08-23 04:09:29 +0000 [info]: gem 'fluent-plugin-rewrite-tag-filter' version '1.5.5'
2017-08-23 04:09:29 +0000 [info]: gem 'fluent-plugin-s3' version '0.8.2'
2017-08-23 04:09:29 +0000 [info]: gem 'fluent-plugin-scribe' version '0.10.14'
2017-08-23 04:09:29 +0000 [info]: gem 'fluent-plugin-sumologic_output' version '0.0.5'
2017-08-23 04:09:29 +0000 [info]: gem 'fluent-plugin-td' version '0.10.29'
2017-08-23 04:09:29 +0000 [info]: gem 'fluent-plugin-td-monitoring' version '0.2.2'
2017-08-23 04:09:29 +0000 [info]: gem 'fluent-plugin-webhdfs' version '0.4.2'
2017-08-23 04:09:29 +0000 [info]: gem 'fluentd' version '0.12.35'
2017-08-23 04:09:29 +0000 [info]: adding match pattern="**.**" type="sumologic"
2017-08-23 04:09:29 +0000 [error]: config error file="/etc/td-agent/td-agent.conf" error="Unknown output plugin 'sumologic'. Run 'gem search -rd fluent-plugin' to find plugins"
2017-08-23 04:09:29 +0000 [info]: process finished code=256
2017-08-23 04:09:29 +0000 [warn]: process died within 1 second. exit.

Contents of td-agent.conf:

#
# Simple default td-agent.conf
# For more details, see http://docs.fluentd.org/articles/config-file
#

@include /etc/td-agent/conf.d/*.conf

And the output.conf file defining the sumologic output is:

<match **.**>
  @type sumologic

  log_format json
  endpoint https://collectors.sumologic.com/ [...]

  buffer_chunk_limit 256m
</match>

If I run td-agent-gem list, this gem is definitely in the list...

# td-agent-gem list | grep "sumologic_output"
fluent-plugin-sumologic_output (0.0.5)

It seems like this was working okay two weeks ago (last time I did a fresh install and specifically tested fluentd/sumologic). I'm running fluentd 0.12.35.

After last release TD-Agent doesn't work anymore

Apr 27 13:04:06 systemd: Starting td-agent: Fluentd based data collector for Treasure Data...
Apr 27 13:04:06 fluentd: /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluentd-1.2.6/lib/fluent/config/element.rb:221:in `unescape_parameter': undefined method `each_char' for nil:NilClass (NoMethodError)
Apr 27 13:04:06 fluentd: from /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluentd-1.2.6/lib/fluent/config/element.rb:204:in `dump_value'
Apr 27 13:04:06 fluentd: from /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluentd-1.2.6/lib/fluent/config/element.rb:151:in `block in to_s'
Apr 27 13:04:06 fluentd: from /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluentd-1.2.6/lib/fluent/config/element.rb:150:in `each_pair'

The issue here is that the custom_fields value has a null key.

Unknown output plugin 'sumologic'

Hi,
I configured td-agent (td-agent-3.2.1-0.el7.x86_64) on my test machine to ship logs directly to Sumo via the HTTPS endpoint, and I am seeing the output plugin error mentioned below. I installed fluent-plugin-sumologic_output as shown below; when I run gem list I see the plugin is installed, but when I run td-agent-gem list I don't see it. Is there anything I am missing here?

Error Message:

2018-11-29 15:21:36 -0500 [info]: parsing config file is succeeded path="/etc/td-agent/td-agent.conf"
2018-11-29 15:21:36 -0500 [error]: config error file="/etc/td-agent/td-agent.conf" error_class=Fluent::ConfigError error="Unknown output plugin 'sumologic'. Run 'gem search -rd fluent-plugin' to find plugins"
2018-11-29 15:21:37 -0500 [info]: parsing config file is succeeded path="/etc/td-agent/td-agent.conf"
2018-11-29 15:21:37 -0500 [error]: config error file="/etc/td-agent/td-agent.conf" error_class=Fluent::ConfigError error="Unknown output plugin 'sumologic'. Run 'gem search -rd fluent-plugin' to find plugins"
2018-11-29 15:21:37 -0500 [info]: parsing config file is succeeded path="/etc/td-agent/td-agent.conf"
2018-11-29 15:21:37 -0500 [error]: config error file="/etc/td-agent/td-agent.conf" error_class=Fluent::ConfigError error="Unknown output plugin 'sumologic'. Run 'gem search -rd fluent-plugin' to find plugins"
2018-11-29 15:21:38 -0500 [info]: parsing config file is succeeded path="/etc/td-agent/td-agent.conf"
2018-11-29 15:21:38 -0500 [error]: config error file="/etc/td-agent/td-agent.conf" error_class=Fluent::ConfigError error="Unknown output plugin 'sumologic'. Run 'gem search -rd fluent-plugin' to find plugins"
2018-11-29 15:21:38 -0500 [info]: parsing config file is succeeded path="/etc/td-agent/td-agent.conf"
2018-11-29 15:21:38 -0500 [error]: config error file="/etc/td-agent/td-agent.conf" error_class=Fluent::ConfigError error="Unknown output plugin 'sumologic'. Run 'gem search -rd fluent-plugin' to find plugins"

Gem list:

# gem list | grep fluent-plugin
fluent-plugin-sumologic_output (1.3.1)
# td-agent-gem list | grep fluent-plugin
fluent-plugin-elasticsearch (2.11.11)
fluent-plugin-kafka (0.7.9)
fluent-plugin-record-modifier (1.1.0)
fluent-plugin-rewrite-tag-filter (2.1.0)
fluent-plugin-s3 (1.1.6)
fluent-plugin-td (1.0.0)
fluent-plugin-td-monitoring (0.2.4)
fluent-plugin-webhdfs (1.2.3)

Td-agent version:

# rpm -qa | grep td-agent
td-agent-3.2.1-0.el7.x86_64

Td-agent conf:

# cat td-agent.conf
@include /etc/td-agent/conf.d/*

infratest:

# cat infratest

<source>
  @type tail
  format none
  path /var/log/messages
  pos_file /tmp/infra-test.log.pos
  time_format %Y-%m-%dT%H:%M:%S.%NZ
  tag test03.development.infratest
  read_from_head false
</source>

<match **.infratest>
  @type sumologic
  endpoint https://collectors.sumologic.com/receiver/v1/http/XXXXXXXX==
  log_format text
  source_category infra-test
  source_name infra-test
  open_timeout 10
</match>

Plugin stops working after a couple of hours

I'm running 2 Docker containers in my setup. The first container runs our Rails app. The other runs Fluentd, collecting logs and sending them to Sumo Logic.

When the FluentD container is started, everything works as expected. After a couple of hours FluentD stops sending logs to SumoLogic. At that point FluentD is still collecting logs and writing them to the file buffer (buffer_type file). I see the files increasing in size but nothing in SumoLogic. Here are my buffer settings:

    # buffering
    buffer_type file
    buffer_path /fluentd/buffer/
    flush_interval 10s
    buffer_chunk_limit 16m
    buffer_queue_limit 512
    disable_retry_limit true
    flush_at_shutdown true
    buffer_queue_full_action drop_oldest_chunk

How to diagnose or debug this issue? Any pointers appreciated.

Multi-worker support

It doesn't appear that this plugin supports the multi-worker configuration using the latest version of Fluentd 1.2.2 (via td-agent v3.2.0):

error_class=Fluent::ConfigError error="Plugin 'sumologic' does not support multi workers configuration (Fluent::Plugin::Sumologic)"

https://docs.fluentd.org/v1.0/articles/performance-tuning-single-process#multi-workers

The details on the changes required are in this blog post: https://www.fluentd.org/blog/fluentd-v0.14.12-has-been-released
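
For reference, multi-worker mode is enabled with a <system> block like the one below; the error above means the plugin, at that version, declares itself incompatible and Fluentd refuses to start under it:

<system>
  workers 2
</system>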

Hash messages having key of `message` with non-string value will raise: NoMethodError: undefined method `chomp!`

We switched from a different fluent->sumo plugin and ran into this issue. There were a few places where we were doing something like:

log_hash = {
  zoot: 'boot',
  message: { text: 'no way', status: 'bad' },
}
@fluent_logger.post('foo', log_hash)

Because the message key in the hash has a non-string value, it raises NoMethodError: undefined method 'chomp!' here:

record[@log_key].chomp! if record[@log_key]

I'm wondering if it should do something like the following, so that the fluentd formatters can handle the formatting of the message:

# Strip any unwanted newlines
record[@log_key].chomp! if record[@log_key] && record[@log_key].respond_to?(:chomp!)

1.4.1 broke parameter expansion of `${tag_parts[1]}`

Hello.
I'm using the plugin with the following configuration:

<match {app,ecs}.**>
 @type           sumologic
 endpoint        "#{ENV["SUMOLOGIC_ENDPOINT"]}"
 log_format      json
 source_category "#{ENV["SUMOLOGIC_SOURCE_CATEGORY"]}"
 source_name     ${tag_parts[1]} # use second part of the tag as source_name
 open_timeout    300
</match>

Yesterday I upgraded to v1.4.1 and placeholder expansion for ${tag_parts[1]} stopped working. PR #40 broke it; now ${tag[1]} has to be used. Not a big problem for me, but I think you should mention somewhere that this is a breaking change.
