CloudTrail logs are dropped if the batch size is greater than MAX_BATCH_SIZE * BATCH_SIZE_FACTOR

It appears that CloudTrail batches can be much bigger than MAX_BATCH_SIZE * BATCH_SIZE_FACTOR. When this is the case, the entire batch is currently dropped. A screenshot of the error is attached. I can't upload the contents of the dropped batch as it is from a customer.

Ideally, the batch size (the size of the log content) should be checked before it is sent to the Log API. If it's bigger than the limit, the batch should be broken up instead of being dropped altogether.
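
A minimal sketch of the splitting idea, not the handler's actual code. The `split_batch` helper and the `MAX_PAYLOAD_BYTES` value are hypothetical; the real limit would be derived from MAX_BATCH_SIZE * BATCH_SIZE_FACTOR in handler.py.

```python
import json

# Hypothetical limit for illustration only; the real value would come from
# MAX_BATCH_SIZE * BATCH_SIZE_FACTOR in handler.py.
MAX_PAYLOAD_BYTES = 1000


def split_batch(entries, max_bytes=MAX_PAYLOAD_BYTES):
    """Yield lists of entries whose JSON-encoded size stays within max_bytes.

    A single entry that is larger than max_bytes on its own is still
    yielded alone; handling that case is left to the caller.
    """
    batch, batch_size = [], 2  # 2 bytes for the enclosing "[]"
    for entry in entries:
        # +2 is a slightly conservative allowance for the ", " separator.
        entry_size = len(json.dumps(entry).encode("utf-8")) + 2
        if batch and batch_size + entry_size > max_bytes:
            yield batch
            batch, batch_size = [], 2
        batch.append(entry)
        batch_size += entry_size
    if batch:
        yield batch
```

Each yielded list could then be sent as its own request instead of dropping the whole batch.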

Enable monitoring of more than one S3 bucket

Hi everyone, I started using this Lambda to get logs from a bucket and it works well. What if I want to monitor other buckets? Should I deploy one instance for each bucket? How about changing the code to receive a list of buckets to monitor?

I think that with this feature the Lambda would cover more scenarios and, as a consequence, be used more, since it would take less setup overhead than deploying one instance per bucket.
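
For context, S3 event notifications carry the source bucket name in every record, so a single function can in principle serve several buckets if each bucket is configured to notify it. A minimal sketch of reading the bucket and key per record follows; `iter_s3_objects` is a hypothetical helper, not part of the current handler.

```python
from urllib.parse import unquote_plus


def iter_s3_objects(event):
    """Yield (bucket, key) for each record in an S3 event notification."""
    for record in event.get("Records", []):
        s3 = record.get("s3", {})
        bucket = s3.get("bucket", {}).get("name")
        key = s3.get("object", {}).get("key")
        if bucket and key:
            # Object keys arrive URL-encoded in S3 event notifications.
            yield bucket, unquote_plus(key)
```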

Support parsing the timestamp and adding it to the log event when the log message is formatted as JSON

I wonder if it's related to #26.

When a log file contains one JSON object per line and the JSON has a timestamp value, how about parsing the timestamp and using it as the log event timestamp?

{"timestamp": 1622622938, "key1": "value1", "key2": "value"}
{"timestamp": 1622622948, "key1": "valueA", "key2": "valueB"}

At this point, the handler doesn't parse anything, so the log event timestamp is the time when the logs are sent to New Relic Logs.

https://github.com/newrelic/aws_s3_log_ingestion_lambda/blob/master/src/handler.py#L112-L126
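
A rough sketch of what such parsing could look like, assuming the field is named `timestamp` as in the sample lines above and that the Log API accepts an epoch value in a `timestamp` attribute. `extract_timestamp` and `to_log_entry` are hypothetical names, not functions in handler.py.

```python
import json


def extract_timestamp(line, field="timestamp"):
    """Return the numeric timestamp in a JSON log line, or None if unavailable."""
    try:
        record = json.loads(line)
    except ValueError:
        return None  # not JSON, so there is nothing to parse
    if not isinstance(record, dict):
        return None
    value = record.get(field)
    # bool is a subclass of int, so exclude it explicitly.
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        return None
    return value


def to_log_entry(line):
    """Build a log entry, attaching the parsed timestamp when one is present."""
    entry = {"message": line}
    timestamp = extract_timestamp(line)
    if timestamp is not None:
        entry["timestamp"] = timestamp
    return entry
```

Lines without a usable timestamp fall back to the current behaviour, where the time of ingestion is used.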

Add support for encrypted Environment Variables

It would be useful if the lambda supported the "Enable helpers for encryption in transit" feature.

Our customer would like to exclude certain folders from being processed by the log ingest function.

When the CloudTrail integrity check is enabled, it delivers checksum files with the same extension, .json.gz, into a "CloudTrail-Digest" folder. The customer cannot use the Lambda suffix/prefix options in this case because the folder structure is XXX/cloudtrail/AWSLogs//CloudTrail-Digest/XXX.json.gz for the digest and XXX/cloudtrail/AWSLogs//CloudTrail/XXX.json.gz for the logs. These CloudTrail logs come from multiple (100+) AWS accounts and are saved in a central location in one AWS account.
For a use case like this, the AWS S3 Lambda prefix and suffix options do not work because they don't support wildcards. Rather than hardcoding the exclusion of the CloudTrail-Digest folder in the function, the better option would be to add a feature that excludes folders based on a regex passed into the function as an environment variable.
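
A minimal sketch of the regex-based exclusion described above. The environment variable name `S3_IGNORE_PATTERN` and the `is_excluded` helper are assumptions made for illustration.

```python
import os
import re


def is_excluded(key, pattern=None):
    """Return True if the S3 object key matches the exclusion regex.

    When no pattern is passed, it is read from the (hypothetical)
    S3_IGNORE_PATTERN environment variable. An empty pattern excludes nothing.
    """
    if pattern is None:
        pattern = os.environ.get("S3_IGNORE_PATTERN", "")
    return bool(pattern) and re.search(pattern, key) is not None
```

With the pattern set to `/CloudTrail-Digest/`, digest objects would be skipped while objects under `/CloudTrail/` are still processed.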

Add a Terraform module to this codebase

Hi -

It would be super helpful if there were a Terraform module to deploy this S3 log ingestion Lambda, similar to what the New Relic CloudWatch log ingestion Lambda offers here: https://github.com/newrelic/aws-log-ingestion.

Thanks!

Ignore logs that have no `message` field defined

Hello New Relic devs! We have a use case, or really a strong requirement, where we want to ignore all logs being sent to New Relic that don't have a message field. In our case the field can be a string or a JSON struct itself, but if it's missing entirely we don't want to forward the log along. Is there any way, config option, or anything else I can take advantage of to do this?
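
One way such a filter could look, as a sketch only. It assumes each log line is a JSON object and treats lines that are not JSON objects as having no message; `has_message` and `filter_lines` are hypothetical helpers.

```python
import json


def has_message(line):
    """Return True if the line is a JSON object with a "message" key."""
    try:
        record = json.loads(line)
    except ValueError:
        return False  # assumption: non-JSON lines count as missing a message
    return isinstance(record, dict) and "message" in record


def filter_lines(lines):
    """Keep only the lines that define a message field."""
    return [line for line in lines if has_message(line)]
```

The check only looks for the key, so a message that is a string or a nested JSON struct passes either way.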

Make the auto-generated S3 bucket optional and customizable, and allow the user to specify a pre-existing S3 bucket

Could the auto-generated S3 bucket be optional and configurable by name? Currently, the application always creates an S3 bucket named serverlessrepo-newrelic-s3-log-in-sourcelogbucket-XXXXXXXXXXXXX. If the user decides to use the S3 bucket created by the serverlessrepo application, it would be nice if the Parameters accepted a value to customize the generated bucket's name. Similarly, if the user decides not to use the generated S3 bucket, creating it should be optional, and we should be able to specify a pre-existing S3 bucket via the serverlessrepo Application Template Parameters for use with the serverlessrepo-NewRelic-s3-log-ingestion Lambda function.

Using request_creation_time as record timestamp, and additional custom parsing

We would like to use request_creation_time as the timestamp when logs are ingested into New Relic (versus the default, which is the time of ingest). Is it possible to add custom parsing to the Lambda to enable this?

I see there might be potential to parse the data here:
https://github.com/newrelic/aws_s3_log_ingestion_lambda/blob/master/src/handler.py#L112-L126

The second question is whether we are able to parse and add additional fields. Our use case is that we would like to add additional filtering on the URI path (e.g. filter out some URI params such as https://hello.com/user/1234). Such filtering may be difficult or impossible to do using glob patterns with the NR server-side parsing.

Lambda function is created with an invalid environment variable name

In the code, the environment variable S3_CLOUDTRAIL_LOG_PATTERN is read.

https://github.com/newrelic/aws_s3_log_ingestion_lambda/blob/master/src/handler.py#L107

However, if the function is deployed from the AWS Serverless Application Repository, the environment variable S3_CLOUD_TRAIL_LOG_PATTERN is defined instead
("_" is inserted between "CLOUD" and "TRAIL").

https://github.com/newrelic/aws_s3_log_ingestion_lambda/blob/master/template.yml#L91
https://github.com/newrelic/aws_s3_log_ingestion_lambda/blob/master/template.yml#L128

This may lead users to misconfigure the function. Both places should use the same environment variable name.
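
Until the names are aligned, one possible stopgap is to accept either spelling when reading the setting. This is only a sketch; `get_cloudtrail_pattern` is a hypothetical helper and the empty-string default is an assumption.

```python
import os


def get_cloudtrail_pattern(env=os.environ):
    """Return the CloudTrail log pattern from either variable spelling."""
    # The first name is the one read in handler.py, the second is the one
    # defined in template.yml.
    for name in ("S3_CLOUDTRAIL_LOG_PATTERN", "S3_CLOUD_TRAIL_LOG_PATTERN"):
        value = env.get(name)
        if value:
            return value
    return ""  # assumed default when neither variable is set
```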

Decoding Error with S3 objects uploaded by fluent-bit

I just switched from fluentd to the aws-fluent-bit agent (https://docs.fluentbit.io/manual/pipeline/outputs/s3) and noticed that the New Relic ingestion Lambda broke with the following error when specifying content_type=application/gzip or content_type=application/x-gzip.

[ERROR] BadGzipFile: Not a gzipped file (b'{"')
Traceback (most recent call last):
  File "/var/task/handler.py", line 287, in lambda_handler

Then I also tried compression=gzip with the default content-type=binary/octet-stream. The New Relic Lambda does not show any errors, but I was not able to see logs in New Relic, so I am not sure whether this combination is failing silently.
I will investigate this combination further and post back.
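
The error text shows the object body starting with `{"`, which suggests the bytes were plain JSON even though the content type said gzip. A sketch of a more tolerant reader that checks the gzip magic bytes instead of trusting the content type; `read_object_bytes` is a hypothetical helper, not the handler's code.

```python
import gzip

# Every gzip stream starts with these two bytes.
GZIP_MAGIC = b"\x1f\x8b"


def read_object_bytes(data):
    """Decompress data only if it actually looks like a gzip stream."""
    if data[:2] == GZIP_MAGIC:
        return gzip.decompress(data)
    return data
```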

Support CloudTrail logs

CloudTrail's JSON payload begins with an array of Records. The current Lambda places all Records into a single NR Log entry.

The ask is to detect CloudTrail as a source and generate one NR Log entry for each CloudTrail Records entry.
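
A minimal sketch of the per-record split, assuming the payload is a JSON object with a top-level `Records` array and that each log entry carries its text in a `message` attribute. `cloudtrail_entries` is a hypothetical name.

```python
import json


def cloudtrail_entries(payload):
    """Return one log entry per item in the CloudTrail Records array."""
    document = json.loads(payload)
    records = document.get("Records", []) if isinstance(document, dict) else []
    # Each record is re-serialized so it becomes its own log message.
    return [{"message": json.dumps(record)} for record in records]
```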

requirements.txt Not Being Picked Up By serverless-python-requirements

It doesn't appear that src/requirements.txt is being picked up by the serverless-python-requirements plugin. When I move requirements.txt to the root directory of the repo, the plugin does find it.

NewRelic importing logs from S3 bucket/folder_name path

We are creating data partitions from an S3 bucket that contains log files for several components in a stack.
We'd like to write a query like entityName = 'S3BucketName/folder_name' so that we can segregate the data when the data partition is created, without having to apply filters afterwards.
Or, at least, create an attribute (folder_name) for the path, so that filtering is more efficient than searching through text.
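
A sketch of how a folder attribute could be derived from the object key. The attribute names `bucket` and `folder_name` follow the wording of this request and are assumptions, as is the `path_attributes` helper.

```python
import posixpath


def path_attributes(bucket, key):
    """Derive bucket and folder attributes from an S3 object key."""
    # S3 keys always use "/" as the separator, so posixpath is used
    # regardless of the platform the code runs on.
    return {"bucket": bucket, "folder_name": posixpath.dirname(key)}
```

These attributes could be attached to every log entry created from the object so that partitions and queries can match on them directly.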

Remove merged branches

To minimise confusion, it would be good to remove branches that have already been merged.

Thanks

[Repolinter] Open Source Policy Issues

Repolinter Report

🤖This issue was automatically generated by repolinter-action, developed by the Open Source and Developer Advocacy team at New Relic. This issue will be automatically updated or closed when changes are pushed. If you have any problems with this tool, please feel free to open a GitHub issue or give us a ping in #help-opensource.

This Repolinter run generated the following results:

❗ Error	❌ Fail	⚠️ Warn	✅ Pass	Ignored	Total
0	1	0	6	0	7

Fail

❌ `readme-contains-link-to-security-policy`

Doesn't contain a link to the security policy for this repository (README.md). New Relic recommends putting a link to the open source security policy for your project (https://github.com/newrelic/<repo-name>/security/policy or ../../security/policy) in the README. For an example of this, please see the "a note about vulnerabilities" section of the Open By Default repository. For more information please visit https://nerdlife.datanerd.us/new-relic/security-guidelines-for-publishing-source-code.

Passed

✅ `license-file-exists`

Found file (LICENSE). New Relic requires that all open source projects have an associated license contained within the project. This license must be permissive (e.g. non-viral or copyleft), and we recommend Apache 2.0 for most use cases. For more information please visit https://docs.google.com/document/d/1vML4aY_czsY0URu2yiP3xLAKYufNrKsc7o4kjuegpDw/edit.

✅ `readme-file-exists`

Found file (README.md). New Relic requires a README file in all projects. This README should give a general overview of the project, and should point to additional resources (security, contributing, etc.) where developers and users can learn further. For more information please visit https://github.com/newrelic/open-by-default.

✅ `readme-starts-with-community-plus-header`

The first 5 lines contain all of the requested patterns. (README.md). The README of a community plus project should have a community plus header at the start of the README. If you already have a community plus header and this rule is failing, your header may be out of date, and you should update your header with the suggested one below. For more information please visit https://opensource.newrelic.com/oss-category/.

✅ `readme-contains-discuss-topic`

Contains a link to the appropriate discuss.newrelic.com topic (README.md). New Relic recommends directly linking to your appropriate discuss.newrelic.com topic in the README, giving developers an alternate method of getting support. For more information please visit https://nerdlife.datanerd.us/new-relic/security-guidelines-for-publishing-source-code.

✅ `code-of-conduct-should-not-exist-here`

New Relic has moved the CODE_OF_CONDUCT file to a centralized location where it is referenced automatically by every repository in the New Relic organization. Because of this change, any other CODE_OF_CONDUCT file in a repository is now redundant and should be removed. Note that you will need to adjust any links to the local CODE_OF_CONDUCT file in your documentation to point to the central file (README and CONTRIBUTING will probably have links that need updating). For more information please visit https://docs.google.com/document/d/1y644Pwi82kasNP5VPVjDV8rsmkBKclQVHFkz8pwRUtE/view. Did not find a file matching the specified patterns. All files passed this test.

✅ `third-party-notices-file-exists`

Found file (THIRD_PARTY_NOTICES.md). A THIRD_PARTY_NOTICES.md file can be present in your repository to grant attribution to all dependencies being used by this project. This document is necessary if you are using third-party source code in your project, with the exception of code referenced outside the project's compiled/bundled binary (ex. some Java projects require modules to be pre-installed in the classpath, outside the project binary and therefore outside the scope of the THIRD_PARTY_NOTICES). Please review your project's dependencies and create a THIRD_PARTY_NOTICES.md file if necessary. For JavaScript projects, you can generate this file using the oss-cli. For more information please visit https://docs.google.com/document/d/1y644Pwi82kasNP5VPVjDV8rsmkBKclQVHFkz8pwRUtE/view.

Function Description Missing

Could the created AWS Lambda Function have a function description of:
Send log data from a S3 bucket to New Relic Logging.

This should be a very easy lift. Currently, the created function has no function description.

Thanks!

[Repolinter] Open Source Policy Issues

Repolinter Report

🤖This issue was automatically generated by repolinter-action, developed by the Open Source and Developer Advocacy team at New Relic. This issue will be automatically updated or closed when changes are pushed. If you have any problems with this tool, please feel free to open a GitHub issue or give us a ping in #help-opensource.

This Repolinter run generated the following results:

❗ Error	❌ Fail	⚠️ Warn	✅ Pass	Ignored	Total
0	3	0	4	0	7

Fail

❌ `readme-starts-with-community-plus-header`

The README of a community plus project should have a community plus header at the start of the README. If you already have a community plus header and this rule is failing, your header may be out of date, and you should update your header with the suggested one below. For more information please visit https://opensource.newrelic.com/oss-category/. Below is a list of files or patterns that failed:

README.md: The first 5 lines do not contain the pattern(s): Open source Community Plus header (see https://opensource.newrelic.com/oss-category).
- 🔨 Suggested Fix: prepend the latest code snippet found at https://github.com/newrelic/opensource-website/wiki/Open-Source-Category-Snippets#code-snippet-2 to file

❌ `readme-contains-link-to-security-policy`

Doesn't contain a link to the security policy for this repository (README.md). New Relic recommends putting a link to the open source security policy for your project (https://github.com/newrelic/<repo-name>/security/policy or ../../security/policy) in the README. For an example of this, please see the "a note about vulnerabilities" section of the Open By Default repository. For more information please visit https://nerdlife.datanerd.us/new-relic/security-guidelines-for-publishing-source-code.

❌ `readme-contains-forum-topic`

Doesn't contain a link to the appropriate forum.newrelic.com topic (README.md). New Relic recommends directly linking to your appropriate forum.newrelic.com topic in the README, giving developers an alternate method of getting support. For more information please visit https://nerdlife.datanerd.us/new-relic/security-guidelines-for-publishing-source-code.

Passed

✅ `license-file-exists`

Found file (LICENSE). New Relic requires that all open source projects have an associated license contained within the project. This license must be permissive (e.g. non-viral or copyleft), and we recommend Apache 2.0 for most use cases. For more information please visit https://docs.google.com/document/d/1vML4aY_czsY0URu2yiP3xLAKYufNrKsc7o4kjuegpDw/edit.

✅ `readme-file-exists`

Found file (README.md). New Relic requires a README file in all projects. This README should give a general overview of the project, and should point to additional resources (security, contributing, etc.) where developers and users can learn further. For more information please visit https://github.com/newrelic/open-by-default.

✅ `code-of-conduct-should-not-exist-here`

New Relic has moved the CODE_OF_CONDUCT file to a centralized location where it is referenced automatically by every repository in the New Relic organization. Because of this change, any other CODE_OF_CONDUCT file in a repository is now redundant and should be removed. Note that you will need to adjust any links to the local CODE_OF_CONDUCT file in your documentation to point to the central file (README and CONTRIBUTING will probably have links that need updating). For more information please visit https://docs.google.com/document/d/1y644Pwi82kasNP5VPVjDV8rsmkBKclQVHFkz8pwRUtE/view. Did not find a file matching the specified patterns. All files passed this test.

✅ `third-party-notices-file-exists`

Found file (THIRD_PARTY_NOTICES.md). A THIRD_PARTY_NOTICES.md file can be present in your repository to grant attribution to all dependencies being used by this project. This document is necessary if you are using third-party source code in your project, with the exception of code referenced outside the project's compiled/bundled binary (ex. some Java projects require modules to be pre-installed in the classpath, outside the project binary and therefore outside the scope of the THIRD_PARTY_NOTICES). Please review your project's dependencies and create a THIRD_PARTY_NOTICES.md file if necessary. For JavaScript projects, you can generate this file using the oss-cli. For more information please visit https://docs.google.com/document/d/1y644Pwi82kasNP5VPVjDV8rsmkBKclQVHFkz8pwRUtE/view.
