
honeytail's Introduction

honeytail


honeytail is Honeycomb's agent for ingesting log file data into Honeycomb and making it available for exploration. Its favorite format is JSON, but it also understands how to parse a range of other well-known log formats.

See our documentation to read about how to configure and run honeytail, to find tips and best practices, and to download prebuilt versions.

Supported Parsers

honeytail supports reading from STDIN as well as from files on disk.

Our complete list of parsers can be found in the parsers/ directory; as of this writing, honeytail can parse logs in JSON, nginx, MySQL and PostgreSQL slow query, MongoDB, ArangoDB, syslog, keyval, and regex-defined formats.

Installation

Install from source:

go install github.com/honeycombio/honeytail@latest

To install to a specific path:

GOPATH=/usr/local go install github.com/honeycombio/honeytail@latest

The binary will be installed to /usr/local/bin/honeytail.

Use a prebuilt binary: find the latest version on Honeycomb.io

Usage

Using command line arguments:

honeytail --writekey=YOUR_WRITE_KEY --dataset='Best Data Ever' --parser=json --file=/var/log/api_server.log

Using a config file:

honeytail --config honeytail-conf.ini
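
For reference, a minimal sketch of what honeytail-conf.ini could contain; AddFields and ParserName mirror option names that appear elsewhere on this page, while the section header and the remaining key spellings are assumptions:

    [Required Options]
    ParserName = json
    WriteKey = YOUR_WRITE_KEY
    LogFiles = /var/log/api_server.log
    Dataset = Best Data Ever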

For more advanced usage, options, and the ability to scrub or drop specific fields, see our documentation.

Related Work

We've extracted some format-specific logic into a separate repository:

  • mysqltools contains logic specific to normalizing MySQL queries

Contributions

Features, bug fixes, and other changes to honeytail are gladly accepted. Please open an issue or a pull request with your change. Remember to add your name to the CONTRIBUTORS file!

All contributions will be released under the Apache License 2.0.


honeytail's Issues

[docs] Download instructions for non-linux OS

👋 I'm trying to try out Honeycomb and backfill an example data set. My thought process:

  • Oh cool, they have a backfill doc, and a cli tool that can help!
  • I just need to download honeytail... Let's look at the doc

[screenshot: "Running the Honeytail agent" page from the Honeycomb docs]

Oh... I use a mac tho...

finds github repo

Okay, I can use go get

go get github.com/honeycombio/honeytail

That works for me, but I just started using golang not too long ago, and wouldn't have had that toolchain installed a month ago. A brew option, a self-contained installer, or even just the go get instructions in the official docs would have been really helpful here.

mysql parser does not support unix socket to fetch extra info and connection ID

  1. When specifying mysql.host as a unix socket, i.e.:
    mysql.host=@unix(/var/run/mysqld/mysql.sock)
    honeytail produces an error.

  2. Also, in the mysql slow log format, a connection identifier (Id) is passed along with the user and client host attributes. Honeytail just drops the Id completely. It would be nice to get support for that as well.

# administrator command: Prepare;
# User@Host: root[root] @  [10.0.1.76]  Id: 325920
...

Here is a patch supporting all of the above:
mysql.txt

Document --api_host

I couldn't find mention of --api_host on docs.honeycomb.io or in honeytail --help, I just guessed and it worked.
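
For reference, an invocation using it looks like this (flag name from the report; the value shown is Honeycomb's default API endpoint):

honeytail --writekey=YOUR_WRITE_KEY --dataset='Best Data Ever' --parser=json --file=/var/log/api_server.log --api_host=https://api.honeycomb.io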

mysql queries that include the word timestamp break parser

When a query that includes a column named timestamp gets parsed (at least in mysql), the parser stops parsing the other data from the query. I've done some testing on this to try to verify it; I added the following to mysql_test.go:

        { /* 30 */
            rawE: []string{
                "SET timestamp=1459470669;",
                "select /* important_data */ a.name as a_name, a.ts as ts from funtable a;",
            },
            sq: map[string]interface{}{
                queryKey:           "select /* important_data */ a.name as a_name, a.ts as ts from funtable a",
                normalizedQueryKey: "select a.name as a_name,a.ts as ts from funtable as a",
                commentsKey:        "/* important_data */",
                statementKey:       "select",
                tablesKey:          "a funtable",
            },
            timestamp: t1.Truncate(time.Second),
        },
        { /* 31 */
            rawE: []string{
                "SET timestamp=1459470669;",
                "select /* important_data */ a.name as a_name, a.timestamp as ts from funtable a;",
            },
            sq: map[string]interface{}{
                queryKey:           "select /* important_data */ a.name as a_name, a.timestamp as ts from funtable a",
                normalizedQueryKey: "select a.name as a_name,a.timestamp as ts from funtable as a",
                commentsKey:        "/* important_data */",
                statementKey:       "select",
                tablesKey:          "a funtable",
            },
            timestamp: t1.Truncate(time.Second),
        },

test 30 passes just fine, but the result of test 31 is:

res is map[normalized_query:select /* important_data */ a.name as a_name, a.timestamp as ts from funtable a query:select /* important_data */ a.name as a_name, a.timestamp as ts from funtable a statement:]
--- FAIL: TestHandleEvent (0.00s)
    mysql_test.go:630: case num 31: expected to parse 5 fields, got 3
    mysql_test.go:635: case num 31, key normalized_query:
        	expected:	"select a.name as a_name,a.timestamp as ts from funtable as a"
        	got:		"select /* important_data */ a.name as a_name, a.timestamp as ts from funtable a"
    mysql_test.go:635: case num 31, key comments:
        	expected:	"/* important_data */"
        	got:		%!q(<nil>)
    mysql_test.go:635: case num 31, key statement:
        	expected:	"select"
        	got:		""
    mysql_test.go:635: case num 31, key tables:
        	expected:	"a funtable"
        	got:		%!q(<nil>)
FAIL
exit status 1
FAIL	github.com/honeycombio/honeytail/parsers/mysql	0.465s

GetTimestamp says it deletes the time value it returns, but not if it's already a time.Time

Documentation for GetTimestamp() says that it deletes the timestamp that it finds from the map that's passed in.

However, if this timestamp is already a time.Time, it simply returns it (https://github.com/honeycombio/honeytail/blob/0ff79e56031cc7281606b51e781da4e956a5abe7/httime/httime.go#L108).

There should be a delete before this return.

This was found by code inspection.
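
A minimal sketch of the suggested fix (function and field names assumed; this is not honeytail's actual code):

    package main

    import (
        "fmt"
        "time"
    )

    // getTimestamp mimics the documented contract: return the timestamp and
    // delete the field it came from. The delete on the time.Time path is the
    // one the issue says is missing.
    func getTimestamp(m map[string]interface{}, field string) time.Time {
        if t, ok := m[field].(time.Time); ok {
            delete(m, field) // consume the field before returning
            return t
        }
        return time.Time{}
    }

    func main() {
        m := map[string]interface{}{"timestamp": time.Now()}
        ts := getTimestamp(m, "timestamp")
        fmt.Println(ts, len(m)) // field consumed: len(m) == 0
    }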

Ability to perform health checks on honeytail container

Is your feature request related to a problem? Please describe.

When running honeytail in containerized workloads, it'd be nice to perform some sort of health check to indicate that the process/container is "up and doing fine", for instance when running on Kubernetes, Fargate, etc.

Describe the solution you'd like

  • Exposing an endpoint like /health that returns a 200 would be nice; users could then curl it (a minimal sketch follows this list).
  • Or, if an HTTP server/endpoint is not an option, we could introduce a new command like honeytail health that exits with 0 or 1 according to whether validation succeeds. Behind the scenes it could make some sort of auth/validation request against the Honeycomb endpoint to ensure the setup is right (maybe use the API key to perform an auth ping check?). Or, if there is a way to determine that internally without having to thrash the Honeycomb API, that's grand too.
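
A minimal sketch of the first option, assuming honeytail grew an HTTP listener (the port and handler are hypothetical; this is not an existing honeytail feature):

    package main

    import "net/http"

    func main() {
        // Hypothetical health endpoint: returns 200 while the process is up.
        http.HandleFunc("/health", func(w http.ResponseWriter, r *http.Request) {
            w.WriteHeader(http.StatusOK)
            _, _ = w.Write([]byte("ok"))
        })
        // Port choice is an assumption; honeytail does not currently listen on HTTP.
        _ = http.ListenAndServe(":8080", nil)
    }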

Describe alternatives you've considered

Running pgrep -x 'honeytail'

Additional context

I am happy to propose a PR if there is interest and the high-level idea here sounds like something worth building into honeytail :)

Honeytail 1.4.1 only supports a single value for AddFields in .conf

I haven't had a chance to repro this from source or bisect, but in upgrading from an ancient version of Honeytail to 1.4.1, I noticed that the extra fields I'd been adding per-honeytail instance via AddFields key=val in the config were getting dropped.

Previously I was invoking Honeytail something like this:
/usr/bin/honeytail -c /etc/honeytail/honeytail_nginx.conf

But as a work-around I'm now doing this:
/usr/bin/honeytail -c /etc/honeytail/honeytail_nginx.conf --add_field parser=nginx --add_field hostname=api-0.env.hostname.com --add_field service=api etc.

This is mostly 6-of-1, half-dozen of another, but since AddFields still shows up in the default config, I was wondering if there was a regression (or if I'm simply not specifying it correctly).

Old configs looked like:

AddFields = parser=nginx
AddFields = hostname=api-0.env.hostname.com
AddFields = service=api

Thanks!

Allow for fields with periods < . > in the name

Current behaviour: The --add_field parameter allows field names with periods < . > in them, but the conf file for --parser nginx --nginx.format haproxy does not.

Impact: This means that fields which should align with other applications using beelines cannot, e.g. Trace.traceId.

Desired behaviour: Allow for periods in the conf file as well 🙏

My current test data trimmed down:

example log line:

Nov 18 13:53:02 ip-10-120-64-22 haproxy[24977]: 10.120.0.38:36322 [18/Nov/2019:13:53:02.015] stats

working conf file:

log_format haproxy '... .. ..:..:.. $hostname $process[$pid]: '
    '$request_remote_addr:$request_remote_addr_port [$Timestamp] '
    '$haproxy_frontend_listener'

desired conf file:

log_format haproxy '... .. ..:..:.. $hostname $process[$pid]: '
    '$request.remote_addr:$request.remote_addr.port [$Timestamp] '
    '$haproxy.frontend_listener'

command being run:

honeytail --writekey={{HONEYCOMB_KEY}}     --parser=nginx --dataset="{{DATASET_NAME}}"     --file={{PATH_TO_LOG_FILE}} --nginx.conf {{PATH_TO_CONF_FILE}}     --nginx.format haproxy     --backfill --debug

output on failure:

INFO[0000] Starting honeytail                           
DEBU[0000] about to call tail.TailFile                   conf={Paths:[{{PATH_TO_LOG_FILE}}] FilterPaths:[] Type:0 Options:{ReadFrom:beginning Stop:true Poll:false StateFile:}} location=<nil> statefile=/tmp/short.abby.leash.state tailConf={Location:<nil> ReOpen:false MustExist:true Poll:false Pipe:false RateLimiter:<nil> Follow:false MaxLineSize:0 Logger:0xc4200a4b40}
DEBU[0000] Attempting to process nginx log line          line=Nov 18 13:53:02 ip-{{MACHINE_IP}} haproxy[24977]: {{MACHINE_IP}}:36322 [18/Nov/2019:13:53:02.015] stats
DEBU[0000] failed to parse nginx log line                error=access log line 'Nov 18 13:53:02 ip-{{MACHINE_IP}} haproxy[24977]: {{MACHINE_IP}}:36322 [18/Nov/2019:13:53:02.015] stats' does not match given format '^... .. ..:..:.. (?P<hostname>[^ ]*) (?P<process>[^[]*)\[(?P<pid>[^]]*)\]: (?P<request>[^.]*).remote_addr:(?P<request>[^.]*).remote_addr.port \[(?P<Timestamp>[^]]*)\] (?P<haproxy>[^.]*).frontend_listener$' logline=Nov 18 13:53:02 ip-{{MACHINE_IP}} haproxy[24977]: {{MACHINE_IP}}:36322 [18/Nov/2019:13:53:02.015] stats
DEBU[0000] Initializing stats reporting. Will print stats once/60 seconds 
DEBU[0000] lines channel is closed, ending nginx processor 
INFO[0000] Summary of sent events                        avg_duration=0s count=0 count_per_status=map[] errors=map[] fastest=0s lifetime_count=0 response_bodies=map[] slowest=0s
INFO[0000] Total number of events sent                   number sent by response status code=map[] total attempted sends=0
INFO[0000] Honeytail is all done, goodbye!   

allow disabling logs

Is your feature request related to a problem? Please describe.
We use honeytail to forward logs to honeycomb from our ECS containers. In the ECS logs (which I have to look at sometimes as not all logs are forwarded), we keep seeing a ton of log lines caused by this line. These logs pollute the log stream and make it harder to find other relevant logs outside of the Honeycomb interface.

Describe the solution you'd like
I'd love to be able to set an env var (e.g. HONEYTAIL_NO_LOG) that would prevent these logs from being written at all.

Describe alternatives you've considered
I've contemplated not using honeytail at all for logs, but migration away from the status quo is just not something we have capacity for at the moment.

Additional context
n/a

Postgres parsing regular expressions are broken

In postgresql, database names, usernames, and more can contain characters not matched by the character classes in your slow query log regular expressions.

Compare:
slowQueryHeader = `\s*(?P<level>[A-Z0-9]+):\s+duration: (?P<duration>[0-9.]+) ms\s+(?:(statement)|(execute \S+)): `

With the log line:
2020-03-12 09:22:12.102 CET [12422] user@database LOG: duration: 8618.372 ms plan:

This makes honeytail miss all log events in our database.

It's not possible to specify multi-valued fields (slices) in a .ini file

Steps to reproduce

  1. Create an ini file with multiple values for AddFields (or other multi-valued fields)
  2. Specify that ini file with --config; only the first value will be returned.

Additional context

The bug is that the jessevdk/go-flags package doesn't handle multiple entries for slice-valued items by default. You can, however, specify in the tags for the config object a delimiter like this: env-delim:",", which will split the field on that delimiter. The fix is to add that tag specification to all slice fields in main.go.

I tested this with a dummy app and it works.

Internal slack conversation: https://houndsh.slack.com/archives/C012HRW16HZ/p1660257971833609
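
A sketch of that fix applied to a slice-valued option (struct shape and file name are assumptions; env-delim is the suggested addition):

    package main

    import (
        "fmt"

        flags "github.com/jessevdk/go-flags"
    )

    type Options struct {
        // Without env-delim, go-flags keeps only the first AddFields entry
        // read from an .ini file; the tag tells it to split delimited values
        // into multiple entries, per the suggestion above.
        AddFields []string `long:"add_field" env-delim:","`
    }

    func main() {
        var opts Options
        // IniParse reads an ini file into the tagged struct, as honeytail's
        // --config handling does (path hypothetical).
        if err := flags.IniParse("honeytail-conf.ini", &opts); err != nil {
            fmt.Println(err)
            return
        }
        fmt.Println(opts.AddFields)
    }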

Honeytail seems to skip first line with globs

I seem to be getting a bit of an odd behavior (first line in the file skipped) with Honeytail when using globs and one-line files.

try this:

mkdir /tmp/htissue
echo '{"skipped": "maybe"}' >/tmp/htissue/issue.json
honeytail --debug -k $key -d httest -f '/tmp/htissue/issue.json'

I don't see any record of parsing and sending, even though the state file thinks it's been sent on the next run. The debug output is below; I'd expect to see "Attempting to process json log file", but instead I see no such message.

INFO[0000] Starting honeytail
DEBU[0000] getStartLocation failed to open the statefile  error=open /tmp/ossifrage.json.leash.state: no such file or directory starting at=end
DEBU[0000] about to call tail.TailFile                   conf={Paths:[/tmp/ossifrage/*.json] Type:0 Options:{ReadFrom:last Stop:false Poll:false StateFile:}} location=&{Offset:0 Whence:2} statefile=/tmp/ossifrage.json.leash.state tailConf={Location:0xc4203ee610 ReOpen:true MustExist:true Poll:false Pipe:false RateLimiter:<nil> Follow:true MaxLineSize:0 Logger:0xc420050c80}
DEBU[0000] Initializing stats reporting. Will print stats once/60 seconds
^CAborting! Caught signal "interrupt"
Cleaning up...

I can't figure out if this behavior is consistent or not yet. If I add another line to the file, honeytail does seem to pick up on that.

Going to look into this more soon.
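
Worth noting from the debug output above: getStartLocation falls back to "starting at=end" and the tail config shows ReadFrom:last, so with no state file honeytail begins tailing at the end of the file, which would explain a pre-existing one-line file appearing skipped. Rerunning with the read-from flag used elsewhere on this page (whether it applies to this setup is an assumption) would help narrow it down:

honeytail --debug -k $key -d httest -f '/tmp/htissue/issue.json' --tail.read_from=beginning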

honeytail exits without reading from named pipes as input source

I'm attempting to integrate a data collector of mine written in C with honeytail. My first try was with a named pipe (/var/run/syspoll/fifo). My collector spits out JSON data to that named pipe every 60 seconds. When honeytail starts up it waits for data from the named pipe. Once it reads a line of input it immediately exits without uploading (or parsing) anything. It doesn't appear to actually read the line of JSON that gets output.

I was able to get it to work by having the collector print the JSON output to stdout and to tell honeytail to read from stdin, but the named pipe would be preferable.

honeytail panics if it can't find a url in incoming data

I was trying to ingest some data that apparently had a weird log line (second one):

5.135.253.54 - - [01/Feb/2018:11:09:46 +0000] "POST /wp-admin/admin-ajax.php HTTP/1.1" 404 18786 "-" "Mozilla/5.0 (Windows; U; Windows NT 5.1) AppleWebKit/533.26.2 (KHTML, like Gecko) Version/4.0 Safari/533.26.2"
5.135.253.54 - - [01/Feb/2018:11:09:46 +0000] "157" 400 0 "-" "-"

This caused honeytail to panic, which it shouldn't do even if I'm trying to pretend this apache log is nginx.

honeytail -k mykey -p "nginx" -d "allfeorlen" -f "/home/feorlen/tmp/feorlenlogs/*" --nginx.conf "/etc/apache2/sites-enabled/feorlen.org.conf" --nginx.format "combined" --debug

DEBU[0000] Attempting to process nginx log line          line=5.135.253.54 - - [01/Feb/2018:11:09:46 +0000] "157" 400 0 "-" "-"
panic: interface conversion: interface {} is int64, not string

goroutine 2660 [running]:
main.(*requestShaper).requestShape(0xc4204d9540, 0x869721, 0x7, 0xc420f12eb8, 0x801615, 0x19, 0x1, 0x0, 0x0, 0x1, ...)
	/home/travis/gopath/src/github.com/honeycombio/honeytail/leash.go:371 +0xea5
main.modifyEventContents.func1.1(0xc420446cc0, 0xc420335c00, 0xc42032ffb0, 0xc4204d9540, 0x0, 0x0, 0xc420446de0, 0xc420361610)
	/home/travis/gopath/src/github.com/honeycombio/honeytail/leash.go:307 +0x483
created by main.modifyEventContents.func1
	/home/travis/gopath/src/github.com/honeycombio/honeytail/leash.go:324 +0xdd

Allow extra unknown fields at the end

We use the nginx parser to interpret AWS ALB log lines. AWS says: "You should ignore any fields at the end of the log entry that you were not expecting." Extend the nginx parser to optionally allow extra unknown fields at the end of the line; one possible shape is sketched below.
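
One sketch of what that could mean concretely: relax the end anchor of the generated pattern with an optional catch-all group (illustrative only; the prefix is elided and field names are borrowed from the combined-format regex quoted elsewhere on this page):

... "(?P<http_x_forwarded_for>[^"]*)" (?P<server_name>[^ ]*)(?: .*)?$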

honeytail container image on Docker hub

I want to use honeytail to collect logs from some apps running on our servers. I found the Dockerfile that would let me build a container image to use, but I haven't found the image already built on Docker Hub.

I see you already publish several images under the honeycombio account. Could you also build and publish honeytail there?

Installing honeytail to the host system is not an option for us as we want to run everything in containers and keep the host system unmodified.

We can't use Kubernetes solution yet either.

honeytail will fail if gonx regex fields are missing completely

I ran into a somewhat odd case with gonx.

If you have the log format set as recommended at https://honeycomb.io/docs/connect/nginx#missing-default-options:

log_format combined '$remote_addr - $remote_user [$time_local] $host '
    '"$request" $status $bytes_sent $body_bytes_sent $request_time '
    '"$http_referer" "$http_user_agent" $request_length "$http_authorization" '
    '"$http_x_forwarded_proto" "$http_x_forwarded_for" $server_name';

But if server_name isn't set, it seems nginx will leave it off the end of the log line completely?

e.g.:

172.22.0.1 - [16/Feb/2018:22:43:44 +0000] localhost "POST /api/todos/ HTTP/1.1" 200 291 136 0.053 "-" "curl/7.54.0" 228 "-" "-" "-"

This causes the regex parsing to fail completely (thus sending no events at all) when it should ideally give best effort. The generated regexp:

^(?P<remote_addr>[^ ]*) - \[(?P<time_local>[^]]*)\] (?P<host>[^ ]*) "(?P<request>[^"]*)" (?P<status>[^ ]*) (?P<bytes_sent>[^ ]*) (?P<body_bytes_sent>[^ ]*) (?P<request_time>[^ ]*) "(?P<http_referer>[^"]*)" "(?P<http_user_agent>[^"]*)" (?P<request_length>[^ ]*) "(?P<http_authorization>[^"]*)" "(?P<http_x_forwarded_proto>[^"]*)" "(?P<http_x_forwarded_for>[^"]*)" (?P<server_name>[^ ]*)$

Not urgent, since it was somewhat PEBKAC (I didn't set server_name), but ideally we should either account for this in the regex, or alert the user that a missing critical field is the reason for the failure.

Weird behavior for hosted_on attribute for mysql parser

Judging from the code, hosted_on does not seem to work on most self-hosted servers (only RDS).

hosted_on should probably contain the hostname instead of the basedir variable value, or at least work for more than just RDS instances.

It would be valuable to have hosted_on (or another attribute) represent the hostname of the server, because there may be more than one server honeytail parses logs from.

Here we have a patch with our suggestion:
mysql-hosted.txt

API key is leaked via process tree

Description

honeytail accepts the API_KEY via a command line argument: --writekey=API_KEY. This means the key will show up in the process tree, allowing any user on the system to see it.
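
For example, any local user can recover the key with ps (invocation shape borrowed from the Usage section above):

$ ps -eo args | grep [h]oneytail
honeytail --writekey=API_KEY --dataset='Best Data Ever' --parser=json --file=/var/log/api_server.log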

Could honeytail be extended to accept the API_KEY via an environment variable or a file? That way the key can be hidden from other users.

Issue with Honeytail not parsing through Arangodb logs

Hi, I had some issues with parsing through some arangodb logs.
When running with --debug, I get DEBU[0000] logline didn't parse, skipping. throughout the whole log file.
The command I'm using:
./honeytail --writekey=writekey --dataset='example-arango' --parser=arangodb --file=arangod.log --backfill --debug

I attached an example of the log I am running.
arangod.log

honeytail syslogs not working

For the command below, syslogs are not getting uploaded. In debug mode we see "attempting to process line" for each entry, and the final output is as shown below. We used both syslog modes; the result is the same.

command
honeytail --writekey=XXXXXXXXXXXXXXXXXXXXX --parser=syslog --dataset=ubuntu_test
--file=/var/log/auth.log --backfill --syslog.mode=rfc5424 --debug

OUTPUT

authenticating user root 212.47.244.235 port 41990 [preauth]
DEBU[0001] lines channel is closed, ending syslog processor
INFO[0001] Summary of sent events avg_duration=0s count=0 count_per_status=map[] errors=map[] fastest=0s lifetime_count=0 response_bodies=map[] slowest=0s
INFO[0001] Total number of events sent number sent by response status code=map[] total attempted sends=0
INFO[0001] Honeytail is all done, goodbye!


GetTimestamp might delete a field that isn't a timestamp

In the loop at the bottom of GetTimestamp(), the function records a name to delete, and then attempts to parse its value. The result of that parse will be a zero Time value if the parse failed (in which case the loop continues). But if the loop runs without ever finding a valid timestamp, it will still delete the last field that matched one of the names it's searching for.

To be fair, if someone has called a field "Timestamp" and this code can't parse it, there are probably bigger problems. But I noticed this while reviewing the code.

Support for specifying a date for timestamps.

What

I'm looking at adding support for a 'default' date of sorts, for when a parsed strftime'd timestamp comes back as 0000-01-01.

Why

(reposting from the pollinators slack)
Does anyone know a current integration that separates date and time? I'm trying to get Minecraft server logs into :honeycomb:, and the log entries only give a time [%H:%M:%S], implicitly under the server's date. This would also be a useful option for any archived log files like this.

honeytail parsing ends up looking like

INFO[0060] Last parsed event event=map[source:Server thread log_level:INFO message:CaseyLeask left the game] event_timestamp=0000-01-01 10:09:55 +0000 UTC

How

One option for Now(), and another for a different date (supporting log archives)

--date-used=server-time

or

--date-used=2020-10-10

Steps to reproduce

Install Minecraft under Ubuntu. Parse the latest.log file.

ubuntu@ip-255-255-255-255:~$ ./honeytail --writekey=<REDACTED> --parser=regex --dataset="Minecraft" --file=/home/ubuntu/minecraft/logs/latest.log --regex.line_regex="\[(?P<time>\d\d:\d\d:\d\d)\] \
\[(?P<source>(?:\w|\s)+)\/(?P<log_level>\w+)\]: (?P<message>.*$)" --tail.read_from=start

Sample of logs

(Swapped my user-id for $GUID)

[11:22:35] [main/INFO]: Environment: authHost='https://authserver.mojang.com', accountsHost='https://api.mojang.com', sessionHost='https://sessionserver.mojang.com', name='PROD'
[11:22:36] [main/WARN]: Ambiguity between arguments [teleport, destination] and [teleport, targets] with inputs: [Player, 0123, @e, $GUID]
[11:22:36] [main/WARN]: Ambiguity between arguments [teleport, location] and [teleport, destination] with inputs: [0.1 -0.5 .9, 0 0 0]
[11:22:36] [main/WARN]: Ambiguity between arguments [teleport, location] and [teleport, targets] with inputs: [0.1 -0.5 .9, 0 0 0]
[11:22:36] [main/WARN]: Ambiguity between arguments [teleport, targets] and [teleport, destination] with inputs: [Player, 0123, $GUID]
[11:22:36] [main/WARN]: Ambiguity between arguments [teleport, targets, location] and [teleport, targets, destination] with inputs: [0.1 -0.5 .9, 0 0 0]
[11:22:36] [main/INFO]: Reloading ResourceManager: Default
[11:22:37] [Worker-Main-2/INFO]: Loaded 7 recipes
[11:22:38] [Worker-Main-2/INFO]: Loaded 927 advancements
[11:22:40] [Server thread/INFO]: Starting minecraft server version 1.16.1
[11:22:40] [Server thread/INFO]: Loading properties
[11:22:40] [Server thread/INFO]: Default game type: SURVIVAL
[11:22:40] [Server thread/INFO]: Generating keypair
[11:22:41] [Server thread/INFO]: Starting Minecraft server on *:25565
[11:22:41] [Server thread/INFO]: Using epoll channel type
[11:22:41] [Server thread/INFO]: Preparing level "world"
[11:22:41] [Server thread/INFO]: Preparing start region for dimension minecraft:overworld
[11:22:44] [Server thread/INFO]: Preparing spawn area: 0%
[11:22:44] [Server thread/INFO]: Preparing spawn area: 0%
[11:22:44] [Server thread/INFO]: Preparing spawn area: 0%
[11:22:44] [Server thread/INFO]: Preparing spawn area: 0%
[11:22:44] [Server thread/INFO]: Preparing spawn area: 0%
[11:22:44] [Server thread/INFO]: Preparing spawn area: 0%
[11:22:44] [Worker-Main-2/INFO]: Preparing spawn area: 0%
[11:22:45] [Server thread/INFO]: Preparing spawn area: 30%
[11:22:46] [Server thread/INFO]: Time elapsed: 4637 ms
[11:22:46] [Server thread/INFO]: Done (4.838s)! For help, type "help"
[11:22:49] [Server thread/WARN]: Can't keep up! Is the server overloaded? Running 2018ms or 40 ticks behind
[11:23:18] [Server thread/INFO]: com.mojang.authlib.GameProfile@40ed17c4[id=<null>,name=CaseyLeask,properties={},legacy=false] (/180.150.36.177:4977) lost connection: Disconnected
[11:23:33] [Server thread/INFO]: Unknown or incomplete command, see below for error
[11:23:33] [Server thread/INFO]: <--[HERE]
[11:23:46] [User Authenticator #1/INFO]: UUID of player CaseyLeask is $GUID
[11:23:47] [Server thread/INFO]: CaseyLeask[/180.150.36.177:4987] logged in with entity id 515 at (-247.9621645626505, 79.0, -635.3213752486413)
[11:23:47] [Server thread/INFO]: CaseyLeask joined the game
[11:23:49] [Server thread/WARN]: Can't keep up! Is the server overloaded? Running 2297ms or 45 ticks behind
[11:23:59] [Server thread/WARN]: CaseyLeask moved too quickly! 3.212071771716893,-7.263677079418912,6.545505692867891
[11:24:16] [Server thread/WARN]: Can't keep up! Is the server overloaded? Running 12474ms or 249 ticks behind
[11:38:01] [Server thread/WARN]: Fetching packet for removed entity bbg['Air'/3256, l='ServerLevel[world]', x=95.86, y=69.00, z=-380.05]
[12:02:30] [Server thread/WARN]: Fetching packet for removed entity bbg['Air'/6498, l='ServerLevel[world]', x=95.66, y=69.00, z=-380.21]
[12:33:30] [Server thread/WARN]: Fetching packet for removed entity bbg['Air'/10070, l='ServerLevel[world]', x=97.33, y=69.00, z=-381.80]
[12:41:34] [Server thread/INFO]: CaseyLeask tried to swim in lava
[12:41:49] [Server thread/WARN]: Fetching packet for removed entity bbg['Air'/11131, l='ServerLevel[world]', x=97.62, y=69.00, z=-377.08]
[13:21:12] [Server thread/WARN]: Fetching packet for removed entity bbg['Air'/15994, l='ServerLevel[world]', x=98.80, y=69.00, z=-380.27]
[13:24:25] [Server thread/WARN]: Fetching packet for removed entity bbg['Air'/16444, l='ServerLevel[world]', x=97.64, y=69.00, z=-376.71]
[13:25:50] [Server thread/INFO]: CaseyLeask lost connection: Disconnected
[13:25:50] [Server thread/INFO]: CaseyLeask left the game

Panic crashing honeytail

Hey folks,

We've got honeytail crashing in production. Happened yesterday but we didn't have all the logs going into honeycomb yet. Once we set up a heroku logdrain we were able to capture the panic. Here's the trace:

panic: invalid argument to Intn

goroutine 184 [running]:
math/rand.(*Rand).Intn(0xc42001e1b0, 0x8000000000000000, 0xc42035e26c)
	/home/travis/.gimme/versions/go1.10.linux.amd64/src/math/rand/rand.go:166 +0x9c
math/rand.Intn(0x8000000000000000, 0xc4204080c0)
	/home/travis/.gimme/versions/go1.10.linux.amd64/src/math/rand/rand.go:326 +0x37
main.modifyEventContents.func1.1(0xc4203944e0, 0xc420383800, 0xc4204e01c0, 0xc42000e108, 0xc4202eaed0, 0x8c2b40, 0xc42035e240, 0x0, 0x0, 0x0, ...)
	/home/travis/gopath/src/github.com/honeycombio/honeytail/leash.go:401 +0xfd6
created by main.modifyEventContents.func1
	/home/travis/gopath/src/github.com/honeycombio/honeytail/leash.go:343 +0x15b
buildpack=nginx at=exit process=honeytail
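
For context on the panic itself: math/rand.Intn panics for any argument <= 0, and the 0x8000000000000000 in the trace is the bit pattern of math.MinInt64, i.e. a negative int, which suggests an overflowed sample-rate computation upstream. A standalone reproduction:

    package main

    import (
        "math"
        "math/rand"
    )

    func main() {
        // 0x8000000000000000 read as a signed 64-bit integer is math.MinInt64.
        n := math.MinInt64 // assumes a 64-bit platform so the constant fits in int
        rand.Intn(n)       // panics with "invalid argument to Intn" because n <= 0
    }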

It appears to have something to do with dynamic sampling, so here's that app's dynamic sampling config:

AddFields = service_name=<%= ENV['HEROKU_APP_NAME'] %>-nginx
RequestShape = http.request
RequestParseQuery = all
ParserName = keyval
SampleRate = 5
DynSample = http.status
DynSample = http.upstream_cache_status
DynSample = http.request_path

Here's a sample of what these values look like:

Timestamp,http.request_path,http.status,http.upstream_cache_status
2018-10-27T14:13:11.991452953Z,/v1/dashboard,200,EXPIRED
2018-10-27T14:13:11.990323051Z,/v1/exchange-rates,200,STALE
2018-10-27T14:13:11.9560657Z,/v1/markets/interval,200,HIT
2018-10-27T14:13:11.937851771Z,,,
2018-10-27T14:13:11.937807612Z,,,
2018-10-27T14:13:11.928231977Z,/v1/exchange_candles,200,STALE
2018-10-27T14:13:11.875924483Z,/v1/market-cap/sparkline,200,STALE
2018-10-27T14:13:11.86590822Z,,,
2018-10-27T14:13:11.856936627Z,/v1/currencies/sparkline,200,STALE
2018-10-27T14:13:11.828181399Z,,,
2018-10-27T14:13:11.374426611Z,/v1/markets/interval,200,STALE
2018-10-27T14:13:11.263195926Z,/v1/exchange_candles,200,HIT
2018-10-27T14:13:11.176086127Z,/v1/candles,200,STALE
2018-10-27T14:13:11.162250444Z,,,
2018-10-27T14:13:11.081968323Z,/v1/prices,200,HIT
2018-10-27T14:13:10.901980656Z,,,
2018-10-27T14:13:10.901907841Z,,,
2018-10-27T14:13:10.55836332Z,/v1/market-cap/sparkline,200,HIT
2018-10-27T14:13:10.278278303Z,/v1/exchange_candles,200,MISS
2018-10-27T14:13:09.942998359Z,/v1/exchange_candles,200,EXPIRED
2018-10-27T14:13:09.822235538Z,/v1/prices,200,HIT
2018-10-27T14:13:09.613869743Z,/v1/exchange_candles,200,STALE
2018-10-27T14:13:08.692443027Z,,,
2018-10-27T14:13:08.396816014Z,/v1/market-cap/sparkline,200,HIT
2018-10-27T14:13:08.199238458Z,/v1/prices,200,EXPIRED
2018-10-27T14:13:08.191089884Z,/v1/exchange-rates,200,HIT
2018-10-27T14:13:08.189405998Z,,,
2018-10-27T14:13:08.18828458Z,/v1/markets/prices,200,EXPIRED
2018-10-27T14:13:07.980840577Z,/v1/exchange-markets/prices,200,EXPIRED

Also, right before the crash I saw http.upstream_cache_status=-, so it's worth checking whether the - character can cause this crash as well.

Thanks and let me know if you need more information from us.

/cc @spencewood

If nginx uses the default `combined` log format, honeytail has a hard exit.

$>    ./honeytail --writekey="kjafdls" --parser="nginx" --nginx.conf="/usr/local/etc/nginx/nginx.conf" --nginx.format="combined" --file="/usr/local/var/log/nginx/access.log" --tail.read_from=beginning --tail.stop --dataset="Nginx Quickstart" --backoff 
INFO[0000] Starting leash                               
FATA[0000] err initializing parser module                err=`log_format combined` not found in given config parser=nginx

/usr/local/etc/nginx/nginx.conf

#user  nobody;
worker_processes  1;

#error_log  logs/error.log;
#error_log  logs/error.log  notice;
#error_log  logs/error.log  info;

#pid        logs/nginx.pid;


events {
    worker_connections  1024;
}


http {
    include       mime.types;
    default_type  application/octet-stream;

    #log_format  main  '$remote_addr - $remote_user [$time_local] "$request" '
    #                  '$status $body_bytes_sent "$http_referer" '
    #                  '"$http_user_agent" "$http_x_forwarded_for"';

    #access_log  logs/access.log  main;

    sendfile        on;
    #tcp_nopush     on;

    #keepalive_timeout  0;
    keepalive_timeout  65;

    #gzip  on;

    server {
        listen       8080;
        server_name  localhost;

        #charset koi8-r;

        #access_log  logs/host.access.log  main;

        location / {
            root   html;
            index  index.html index.htm;
        }

        #error_page  404              /404.html;

        # redirect server error pages to the static page /50x.html
        #
        error_page   500 502 503 504  /50x.html;
        location = /50x.html {
            root   html;
        }

        # proxy the PHP scripts to Apache listening on 127.0.0.1:80
        #
        #location ~ \.php$ {
        #    proxy_pass   http://127.0.0.1;
        #}

        # pass the PHP scripts to FastCGI server listening on 127.0.0.1:9000
        #
        #location ~ \.php$ {
        #    root           html;
        #    fastcgi_pass   127.0.0.1:9000;
        #    fastcgi_index  index.php;
        #    fastcgi_param  SCRIPT_FILENAME  /scripts$fastcgi_script_name;
        #    include        fastcgi_params;
        #}

        # deny access to .htaccess files, if Apache's document root
        # concurs with nginx's one
        #
        #location ~ /\.ht {
        #    deny  all;
        #}
    }


    # another virtual host using mix of IP-, name-, and port-based configuration
    #
    #server {
    #    listen       8000;
    #    listen       somename:8080;
    #    server_name  somename  alias  another.alias;

    #    location / {
    #        root   html;
    #        index  index.html index.htm;
    #    }
    #}


    # HTTPS server
    #
    #server {
    #    listen       443 ssl;
    #    server_name  localhost;

    #    ssl_certificate      cert.pem;
    #    ssl_certificate_key  cert.key;

    #    ssl_session_cache    shared:SSL:1m;
    #    ssl_session_timeout  5m;

    #    ssl_ciphers  HIGH:!aNULL:!MD5;
    #    ssl_prefer_server_ciphers  on;

    #    location / {
    #        root   html;
    #        index  index.html index.htm;
    #    }
    #}
    include servers/*;
}
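
For what it's worth, combined is one of nginx's built-in formats, so it never appears in nginx.conf, which is presumably why honeytail can't find it. One workaround (an assumption, and the format must use a new name since nginx rejects redefining combined) is to spell the format out explicitly and point --nginx.format at it:

    log_format combined_copy '$remote_addr - $remote_user [$time_local] '
                             '"$request" $status $body_bytes_sent '
                             '"$http_referer" "$http_user_agent"';

...and then run honeytail with --nginx.format="combined_copy" instead of --nginx.format="combined".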

Update go to 1.18beta

Is your feature request related to a problem? Please describe.
Update go to 1.18beta.

Intel requests that we optimize for c6i instances


NGINX log format with escape parameter not matching

When I define the escape parameter in the nginx config, the honeytail parser thinks it's part of the expected value instead of skipping it as a parameter.

nginx conf snippet:
log_format customFormat escape=json '$remote_addr -

honeytail with debug snippet:
does not match given format '^scape=json '(?P<remote_addr>[^ ]*) -

nginx tail doesn't appear to support http_authorization

As recommended by the installer here, adding $http_authorization to the log format for a site secured with Basic auth causes lines to fail to parse.

I could not find examples in the parser's tests to confirm that this is intentional.

Removing the $http_authorization directive from the config allows the line to be parsed.

Here's a scrubbed example of a log line with basic auth that fails parsing:

logline="1.2.3.4 - myserver [17/Jan/2017:16:39:59 +0000] \"GET /2016/slides/slide-4.jpg HTTP/1.1\" 200 79974 \"https://example.com/2016/slides\" \"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/55.0.2883.95 Safari/537.36\" \"-\" server.example.com 667 - Basic RandomString 0.294 80412 server.example.com"

Deterministic Sampler tests are flaky

Could be an actual problem with the sampler:

--- FAIL: TestDeterministicSampler (3.17s)
    deterministic_sampler_test.go:85: Sampled more or less than we should have:  1895 (sample rate  100 )

Syslog parser rfc3164 format not reading

I'm trying to read my application's syslog-generated logfiles on an ubuntu machine.

root@ip-10-100-10-36:/var/log# cat test.log
<34>Jan 16 11:23:35 ip-10-100-10-36 ia-upload-service[16312]: 2020-01-16 11:23:35,173 - root - INFO - Looking for new tasks
Jan 16 11:23:35 ip-10-100-10-36 ia-upload-service[16312]: 2020-01-16 11:23:35,173 - root - INFO - Looking for new tasks

using

honeytail --parser=syslog --writekey=****** --file=/var/log/test.log --dataset='TestI' --syslog.mode=rfc3164 --backfill --debug

gives the output

INFO[0000] Starting honeytail
DEBU[0000] about to call tail.TailFile conf={Paths:[/var/log/test.log] FilterPaths:[] Type:0 Options:{ReadFrom:beginning Stop:true Poll:false StateFile:}} location=<nil> statefile=/tmp/test.leash.state tailConf={Location:<nil> ReOpen:false MustExist:true Poll:false Pipe:false RateLimiter:<nil> Follow:false MaxLineSize:0 Logger:0xc4200aeb40}
DEBU[0000] attempting to process line line=<34>Jan 16 11:23:35 ip-10-100-10-36 ia-upload-service[16312]: 2020-01-16 11:23:35,173 - root - INFO - Looking for new tasks
DEBU[0000] attempting to process line line=Jan 16 11:23:35 ip-10-100-10-36 ia-upload-service[16312]: 2020-01-16 11:23:35,173 - root - INFO - Looking for new tasks
DEBU[0000] Initializing stats reporting. Will print stats once/60 seconds
DEBU[0000] lines channel is closed, ending syslog processor
DEBU[0000] event send record received body= duration=48.721427ms error=<nil> retry_send=false status_code=202 timestamp=2020-01-16 11:23:35 +0000 UTC
INFO[0000] Summary of sent events avg_duration=48.721427ms count=1 count_per_status=map[202:1] errors=map[] fastest=48.721427ms lifetime_count=1 response_bodies=map[:1] slowest=48.721427ms
INFO[0000] Last parsed event event=map[facility:4 severity:2 process:ia-upload-service timestamp:2020-01-16 11:23:35 +0000 UTC hostname:ip-10-100-10-36 content:2020-01-16 11:23:35,173 - root - INFO - Looking for new tasks priority:34] event_timestamp=2020-01-16 11:23:35 +0000 UTC
INFO[0000] Total number of events sent number sent by response status code=map[202:1] total attempted sends=1
INFO[0000] Honeytail is all done, goodbye!

My understanding is that the standard rfc3164 format does not have the <34> 'priority' field at the start, and this is the reason that honeytail doesn't parse my log files.

And looking at https://rsyslog-5-8-6-doc.neocities.org/rsyslog_conf_templates.html it appears that the format honeycomb has marked as rfc3164 could be the ForwardFormat.

Remove dependency on mongodbtools

mongodbtools does not support newer log format and we'd like to sunset it.

Remove the dependency on mongodbtools and bring the code into honeytail. See if we can fix the log format support while we're at it; as a stop-gap measure, we can just file that as a follow-up issue.

[Feature Request] add honeytail command line flag to handle parser failures

Most parsers I have inspected will throw away logs that don't parse:

// skip lines that won't parse

We have to work with a lot of third party tools that log in stupid ways that we can't change, so it would be helpful to have a command line flag that allows logs that can't be parsed to still go to Honeycomb but throw the whole line into a field of your choosing, maybe log by default to sort of match the way the nop parser works.

When running in this mode, instead of skipping lines that fail to parse, honeytail would parse matching lines normally but would generate an event for each non-matching line with the entire contents of the line in the log field; or add an option to choose which field serves as the fallback field for parser failures. A sketch follows below.

The other benefit of this would be an in-dataset searchable record of parser failures to help us improve our agent configurations.
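
A sketch of the proposed mode (all names hypothetical; this is not honeytail's current API):

    package main

    import (
        "errors"
        "fmt"
    )

    // parseOrFallback wraps a parser so lines that fail to parse become
    // events carrying the raw line under a fallback field instead of
    // being dropped.
    func parseOrFallback(line string, parse func(string) (map[string]interface{}, error), fallbackField string) map[string]interface{} {
        ev, err := parse(line)
        if err != nil {
            return map[string]interface{}{fallbackField: line}
        }
        return ev
    }

    func main() {
        failing := func(string) (map[string]interface{}, error) { return nil, errors.New("no match") }
        fmt.Println(parseOrFallback("unparseable line", failing, "log")) // map[log:unparseable line]
    }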

Publish step doesn't work properly

CircleCI is not behaving the way I expect, and I'm not sure why. I had to manually upload the artifacts from CircleCI's artifacts. Just marking this here for the next hapless person who has to cut a new release...

unable to build on FreeBSD 11.2Rp4 amd64

Hi, this is the main release of FreeBSD at the moment. I'd love to have this included in the main ports tree if we can get it working. Thanks!

~> go get github.com/honeycombio/honeytail
# github.com/honeycombio/honeytail/tail
/repos/go/src/github.com/honeycombio/honeytail/tail/tail.go:327:17: invalid operation: state.INode != logStat.Ino (mismatched types uint64 and uint32)
/repos/go/src/github.com/honeycombio/honeytail/tail/tail.go:431:14: cannot use logStat.Ino (type uint32) as type uint64 in assignment

~> go version
go version go1.11.1 freebsd/amd64
~> uname -a
FreeBSD i09 11.2-RELEASE-p4 FreeBSD 11.2-RELEASE-p4 #0: Thu Sep 27 08:16:24 UTC 2018     [email protected]:/usr/obj/usr/src/sys/GENERIC  amd64
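
The mismatch comes from logStat.Ino being uint32 on FreeBSD 11 (per the build errors above), while honeytail tracks it as uint64. A sketch of a portability fix at the two error sites, via explicit conversion (surrounding code shapes assumed):

    package main

    import "fmt"

    func main() {
        var stateINode uint64 = 42 // stands in for state.INode in tail.go
        var statIno uint32 = 42    // stands in for logStat.Ino on FreeBSD 11
        fmt.Println(stateINode != uint64(statIno)) // compare via conversion: false
        stateINode = uint64(statIno)               // assign via conversion
        _ = stateINode
    }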

Honeytail doesn't skip syslog prefix when parsing

We push all logs to a shared /var/log/messages via syslog. This means all log lines have some additional prefixing:

Nov 17 14:29:29 cc-prod-tokumx-blue mongod.27018: Thu Nov 17 19:29:29.136 [conn596789] query code_climate_production.constants ...
^ prefix                                        : ^ parsable mongo log starts

I'm able to work around this by first piping the log through:

cut -d : -f 4- | sed 's/^ //'

Then honeytail processes things successfully.

Note: mloginfo accepts our logs without pre-stripping that prefix.
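
Putting the workaround together, the whole invocation might look like this (the mongo parser name and reading STDIN via --file=- are assumptions, leaning on the STDIN support mentioned in the intro above):

tail -F /var/log/messages | cut -d : -f 4- | sed 's/^ //' | honeytail --parser=mongo --writekey=YOUR_WRITE_KEY --dataset='MongoDB Logs' --file=-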

Add Alpine support

Problem

honeytail currently doesn't work with Alpine:

$ docker run -it alpine sh
/ # curl -sLo /usr/bin/honeytail https://github.com/honeycombio/honeytail/releases/download/v1.1.4/honeytail-linux-amd64
/ # chmod +x /usr/bin/honeytail
/ # which honeytail
/usr/bin/honeytail
/ # honeytail
sh: honeytail: not found

That's because Alpine is based on musl, which doesn't have all the glibc libs. Installing the full glibc fixes it:

/ # # copy-paste from https://github.com/aws/aws-cli/issues/4685#issuecomment-615872019
/ # export GLIBC_VER=2.31-r0
curl -sL https://alpine-pkgs.sgerrand.com/sgerrand.rsa.pub -o /etc/apk/keys/sgerrand.rsa.pub \
    && curl -sLO https://github.com/sgerrand/alpine-pkg-glibc/releases/download/${GLIBC_VER}/glibc-${GLIBC_VER}.apk \
    && curl -sLO https://github.com/sgerrand/alpine-pkg-glibc/releases/download/${GLIBC_VER}/glibc-bin-${GLIBC_VER}.apk \
    && apk add --no-cache \
    glibc-${GLIBC_VER}.apk \
    glibc-bin-${GLIBC_VER}.apk
/ # honeytail
Parser required to be specified with the --parser flag.

Usage: honeytail -p <parser> -k <writekey> -f </path/to/logfile> -d <mydata> [optional arguments]

For even more detail on required and optional parameters, run
honeytail --help

Request

Add alpine/musl as a target in CI 🙏 🤗

Exits if file to tail is not found

When installing honeytail on a new server, there's a race condition if honeytail starts before the server has received any traffic, as the log files it's supposed to watch might not have been created yet. If this happens, systemd will try to restart the job a couple of times, and then stop trying once the restart limit is reached.

If the systemd config specifies something like

RestartSec=5
StartLimitInterval=0

instead, the job will be restarted every 5s forever until it succeeds, which means honeytail will automatically pick up the file it's supposed to watch once it appears.

I'm not sure if this is the best solution to the problem, but at least for us this would prevent new servers from occasionally not submitting data until the agent is restarted since the file is initially missing.
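
For reference, a sketch of a complete unit with those settings in place (paths and unit layout are hypothetical; StartLimitInterval placement follows the snippet above):

    [Unit]
    Description=honeytail log shipper

    [Service]
    ExecStart=/usr/bin/honeytail -c /etc/honeytail/honeytail.conf
    Restart=always
    RestartSec=5
    StartLimitInterval=0

    [Install]
    WantedBy=multi-user.target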

Honeytail nginx parser doesn't recognize log_format defined in included config.

Versions

  • Honeytail: 1.8.2

Steps to reproduce

  1. Have nginx.conf file without log definition
  2. Add include /etc/nginx/conf.d/*.conf
  3. Add file /etc/nginx/conf.d/app.conf with log definition, e.g. log_format combined_apm...
  4. Run honeytail with command: honeytail --parser=nginx --nginx.format=combined_apm --dataset=my-dataset --nginx.conf=/etc/nginx/nginx.conf --file=/var/log/nginx.log --status_interval=1 --add_field service.name=my-service

In this case honeytail fails with error:

FATA[0000] Error initializing nginx parser module: `log_format combined_apm` not found in given config

Additional context

I tried using --nginx.conf=/etc/nginx/conf.d/app.conf, but I get the same error.
