newrelic / aws_s3_log_ingestion_lambda
AWS Lambda that sends log data from S3 to New Relic Logging
License: Apache License 2.0
It appears that CloudTrail batches can be much bigger than MAX_BATCH_SIZE * BATCH_SIZE_FACTOR. When this is the case, the entire batch is currently dropped. A screenshot of the error is attached. I can't upload the contents of the dropped batch as it is from a customer.
Ideally, the batch size / size of log content should be checked prior to sending it to the log API. If it's bigger than the limit, it should be broken up instead of being dropped altogether.
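To make the suggestion concrete, here is a minimal sketch of size-aware batching. It is not the handler's current code: the function name and the MAX_PAYLOAD_BYTES value are hypothetical, and the real limit would come from MAX_BATCH_SIZE * BATCH_SIZE_FACTOR.

```python
from typing import Iterable, Iterator, List

# Hypothetical limit for illustration only; the real handler would derive
# its limit from MAX_BATCH_SIZE * BATCH_SIZE_FACTOR.
MAX_PAYLOAD_BYTES = 1_000_000


def split_into_batches(
    lines: Iterable[str], max_bytes: int = MAX_PAYLOAD_BYTES
) -> Iterator[List[str]]:
    """Yield lists of log lines whose combined UTF-8 size stays within max_bytes.

    A single line larger than max_bytes is yielded on its own, so the caller
    can decide what to do with it without losing the neighbouring lines.
    """
    batch: List[str] = []
    batch_bytes = 0
    for line in lines:
        line_bytes = len(line.encode("utf-8"))
        # Close the current batch before it would exceed the limit.
        if batch and batch_bytes + line_bytes > max_bytes:
            yield batch
            batch, batch_bytes = [], 0
        batch.append(line)
        batch_bytes += line_bytes
    if batch:
        yield batch
```

Each yielded batch could then be sent as its own request, so an oversized CloudTrail file results in several smaller posts rather than one dropped batch.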
Hi everyone, I started using this lambda to get logs from a bucket and it works well. What if I want to monitor other buckets? Should I deploy one instance for each bucket? How about changing the code to accept a list of buckets to monitor?
I think that with this feature the lambda would cover more scenarios and, as a consequence, be used more, since it would require less setup overhead than deploying one instance per bucket.
I wonder if it's related to #26.
When a log file contains one JSON object per line and the JSON has a timestamp value, how about parsing the timestamp and using it as the log event timestamp?
{"timestamp": 1622622938, "key1": "value1", "key2": "value"}
{"timestamp": 1622622948, "key1": "valueA", "key2": "valueB"}
At this point, the handler doesn't parse anything, so the log event timestamp will be the time when the logs are sent to New Relic Logs.
https://github.com/newrelic/aws_s3_log_ingestion_lambda/blob/master/src/handler.py#L112-L126
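A rough sketch of what that parsing could look like, assuming each line is a standalone JSON object and that the log entry may carry an epoch value in a timestamp attribute. The function name line_to_log_entry and the timestamp_key parameter are made up for illustration and are not part of the handler.

```python
import json
from typing import Any, Dict


def line_to_log_entry(line: str, timestamp_key: str = "timestamp") -> Dict[str, Any]:
    """Build a log entry from one line, copying a numeric timestamp if present.

    Lines that are not JSON objects, or that lack a numeric timestamp, are
    passed through with only a message, so the ingest time would apply.
    """
    entry: Dict[str, Any] = {"message": line}
    try:
        parsed = json.loads(line)
    except ValueError:
        return entry
    if isinstance(parsed, dict):
        ts = parsed.get(timestamp_key)
        # bool is a subclass of int, so exclude it explicitly.
        if isinstance(ts, (int, float)) and not isinstance(ts, bool):
            entry["timestamp"] = ts
    return entry
```

Making the key name a parameter would also cover logs that store the time under a different field.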
It would be useful if the lambda supported the "Enable helpers for encryption in transit" feature.
When the CloudTrail integrity check is enabled, it delivers checksum files with the same extension, .json.gz, into a "CloudTrail-Digest" folder. The customer cannot use the lambda suffix/prefix options in this case, as the folder structure is XXX/cloudtrail/AWSLogs//CloudTrail-Digest/XXX.json.gz for the digest and XXX/cloudtrail/AWSLogs//CloudTrail/XXX.json.gz for the logs. These CloudTrail logs are from multiple (100+) AWS accounts and are saved in a central location in one AWS account.
For a use case like this, the AWS S3 Lambda prefix and suffix options do not work, as they don't support wildcards. Rather than hardcoding the exclusion of the CloudTrail-Digest folder in the function, the better option would be to add a feature that excludes folders based on a regex passed into the function as an environment variable.
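A minimal sketch of the idea, assuming the regex is supplied through an environment variable. The variable name S3_IGNORE_PATTERN and both function names are hypothetical, chosen only for this example.

```python
import os
import re
from typing import Optional, Pattern


def load_exclude_pattern(env_var: str = "S3_IGNORE_PATTERN") -> Optional[Pattern[str]]:
    """Compile the exclusion regex from the environment, or return None if unset.

    The variable name is an assumption for this sketch.
    """
    raw = os.environ.get(env_var, "").strip()
    return re.compile(raw) if raw else None


def should_process(key: str, exclude: Optional[Pattern[str]]) -> bool:
    """Return True unless the S3 object key matches the exclusion regex."""
    return exclude is None or exclude.search(key) is None
```

With the variable set to /CloudTrail-Digest/, keys under the digest folder would be skipped while keys under CloudTrail/ would still be processed.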
Hi -
It would be super helpful if there were a Terraform module to deploy this S3 log ingestion Lambda, similar to what exists for the New Relic CloudWatch log ingestion Lambda here: https://github.com/newrelic/aws-log-ingestion.
Thanks!
Hello NewRelic devs! We have a use case, or really a strong requirement, where we want to ignore all logs being sent to NewRelic that don't have a message field. In our case the field can be a string or a JSON struct itself ... but if it's missing entirely we don't want to forward it along. Is there any way or config or whatever I can take advantage of to do this?
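For illustration, a filter like this could be applied before forwarding. This is only a sketch, not an existing option of the lambda: the function name is hypothetical, and it assumes one JSON object per line and that non-JSON lines should be dropped as well.

```python
import json
from typing import Iterable, Iterator


def lines_with_message(lines: Iterable[str], field: str = "message") -> Iterator[str]:
    """Yield only the lines that are JSON objects containing the given field.

    The field's value may be a string or a nested object; only its presence
    is checked. Lines that are not valid JSON are dropped (an assumption).
    """
    for line in lines:
        try:
            record = json.loads(line)
        except ValueError:
            continue
        if isinstance(record, dict) and field in record:
            yield line
```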
Could the auto-generated S3 bucket be optional, and configurable by name? Currently, the auto-generated bucket forces an S3 bucket with the name serverlessrepo-newrelic-s3-log-in-sourcelogbucket-XXXXXXXXXXXXX. If the user decides to take advantage of an S3 bucket created by the serverlessrepo application, it would be nice if the parameters took a value to customize the generated S3 bucket's name. Similarly, if the user decides not to use the generated S3 bucket, auto-creating an S3 bucket should be optional, and we should be able to specify a pre-existing S3 bucket via the serverlessrepo application template parameters to use with the serverlessrepo-NewRelic-s3-log-ingestion Lambda function.
request_creation_time as the timestamp when ingested into New Relic (versus the default, which uses the time of ingest). Is it possible to add custom parsing to the lambda to enable this? I see there might be potential to parse the data here:
https://github.com/newrelic/aws_s3_log_ingestion_lambda/blob/master/src/handler.py#L112-L126
https://hello.com/user/1234). Such filtering may be difficult or impossible to do using glob patterns with the NR server-side parsing.
In the code, an environment variable named S3_CLOUDTRAIL_LOG_PATTERN is read.
https://github.com/newrelic/aws_s3_log_ingestion_lambda/blob/master/src/handler.py#L107
However, if this is deployed from the AWS Serverless Application Repository, the environment variable S3_CLOUD_TRAIL_LOG_PATTERN is defined instead (a "_" is inserted between "CLOUD" and "TRAIL").
https://github.com/newrelic/aws_s3_log_ingestion_lambda/blob/master/template.yml#L91
https://github.com/newrelic/aws_s3_log_ingestion_lambda/blob/master/template.yml#L128
This may lead to misconfiguration by users. The code and the template should use the same environment variable name.
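Until the names are aligned, one possible stopgap is to accept either spelling. The sketch below is an assumption about how that could look, not the handler's code. The function name is hypothetical, and preferring the template's spelling first is an arbitrary choice.

```python
import os
from typing import Mapping, Optional

# Both spellings mentioned above: the one in template.yml first, then the
# one read in handler.py. The order of precedence is an assumption.
_PATTERN_VARS = ("S3_CLOUD_TRAIL_LOG_PATTERN", "S3_CLOUDTRAIL_LOG_PATTERN")


def cloudtrail_log_pattern(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the first non-empty pattern found under either variable name."""
    env = os.environ if env is None else env
    for name in _PATTERN_VARS:
        value = env.get(name, "")
        if value:
            return value
    return ""
```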
Just made the switch from fluentd to the aws-fluent-bit agent (https://docs.fluentbit.io/manual/pipeline/outputs/s3) and noticed the NewRelic ingestion Lambda broke with the following error when specifying content_type=application/gzip or content_type=application/x-gzip.
[ERROR] BadGzipFile: Not a gzipped file (b'{"')
Traceback (most recent call last):
File "/var/task/handler.py", line 287, in lambda_handler
Then I also tried compression=gzip with the default content-type=binary/octet-stream. The NewRelic lambda does not seem to show any errors, but I was not able to see logs in NewRelic, so I am not sure whether this combo is failing silently.
I will investigate this combo further and post back.
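The BadGzipFile error above shows bytes starting with {", which suggests the object was plain JSON despite its gzip content type. One defensive option, sketched here under the assumption that the object body is already in memory as bytes, is to check the gzip magic number instead of trusting the content type or file extension. The function name is hypothetical.

```python
import gzip

# The two-byte magic number that starts every gzip stream.
GZIP_MAGIC = b"\x1f\x8b"


def read_maybe_gzipped(data: bytes) -> bytes:
    """Decompress data only if it actually starts with the gzip magic number."""
    if data[:2] == GZIP_MAGIC:
        return gzip.decompress(data)
    return data
```

This would let the same code path handle both compressed and uncompressed uploads.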
CloudTrail's JSON payload begins with an array of Records. The current lambda places all Records into a single NR Log entry.
The ask is to detect CloudTrail as a source and generate one NR Log entry for each CloudTrail Records entry.
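A minimal sketch of the requested behaviour, assuming the decompressed file content is available as a string and that a top-level Records array is enough to identify CloudTrail. The function name and the entry shape (a dict with a message key) are assumptions for illustration.

```python
import json
from typing import Any, Dict, List


def cloudtrail_to_entries(payload: str) -> List[Dict[str, Any]]:
    """Return one log entry per item in the top-level Records array.

    Payloads that are not JSON, or that have no Records list, are returned
    as a single entry so nothing is lost.
    """
    try:
        doc = json.loads(payload)
    except ValueError:
        return [{"message": payload}]
    records = doc.get("Records") if isinstance(doc, dict) else None
    if not isinstance(records, list):
        return [{"message": payload}]
    # Re-serialise each record so it becomes its own log message.
    return [{"message": json.dumps(record)} for record in records]
```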
It doesn't appear that src/requirements.txt is being picked up by the serverless-python-requirements plugin. When I move requirements.txt to the root directory of the repo, it does find it.
We are creating data partitions from an S3 bucket which contains logfiles for several components in a stack.
We’d like to write a query like this → entityName = 'S3BucketName/folder_name' so that we can segregate the data when creating a data partition, without having to apply filters afterwards.
Or at least, create an attribute (folder_name) for the path, so that the filtering is more efficient than searching through text.
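As a sketch of the second option, the attributes could be derived from the bucket name and object key of the S3 event. The function name and the bucket_name and s3_path attribute names are hypothetical; folder_name is the attribute suggested above.

```python
import posixpath
from typing import Dict


def s3_path_attributes(bucket: str, key: str) -> Dict[str, str]:
    """Derive filterable attributes from an S3 bucket name and object key.

    folder_name is everything in the key before the final path segment;
    it is empty for objects at the bucket root.
    """
    folder = posixpath.dirname(key)
    return {
        "bucket_name": bucket,
        "folder_name": folder,
        "s3_path": f"{bucket}/{folder}" if folder else bucket,
    }
```

Attaching these to every log entry would allow partition rules to match on an attribute instead of searching message text.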
To minimise confusion, it would be good to remove already merged branches.
Thanks
🤖This issue was automatically generated by repolinter-action, developed by the Open Source and Developer Advocacy team at New Relic. This issue will be automatically updated or closed when changes are pushed. If you have any problems with this tool, please feel free to open a GitHub issue or give us a ping in #help-opensource.
This Repolinter run generated the following results:
❗ Error | ❌ Fail | Warn | ✅ Pass | Ignored | Total |
---|---|---|---|---|---|
0 | 1 | 0 | 6 | 0 | 7 |
readme-contains-link-to-security-policy
Doesn't contain a link to the security policy for this repository (README.md). New Relic recommends putting a link to the open source security policy for your project (https://github.com/newrelic/<repo-name>/security/policy or ../../security/policy) in the README. For an example of this, please see the "a note about vulnerabilities" section of the Open By Default repository. For more information please visit https://nerdlife.datanerd.us/new-relic/security-guidelines-for-publishing-source-code.
license-file-exists
Found file (LICENSE). New Relic requires that all open source projects have an associated license contained within the project. This license must be permissive (e.g. non-viral or copyleft), and we recommend Apache 2.0 for most use cases. For more information please visit https://docs.google.com/document/d/1vML4aY_czsY0URu2yiP3xLAKYufNrKsc7o4kjuegpDw/edit.
readme-file-exists
Found file (README.md). New Relic requires a README file in all projects. This README should give a general overview of the project, and should point to additional resources (security, contributing, etc.) where developers and users can learn further. For more information please visit https://github.com/newrelic/open-by-default.
readme-starts-with-community-plus-header
The first 5 lines contain all of the requested patterns (README.md). The README of a community plus project should have a community plus header at the start of the README. If you already have a community plus header and this rule is failing, your header may be out of date, and you should update your header with the suggested one below. For more information please visit https://opensource.newrelic.com/oss-category/.
readme-contains-discuss-topic
Contains a link to the appropriate discuss.newrelic.com topic (README.md). New Relic recommends directly linking to your appropriate discuss.newrelic.com topic in the README, allowing developers an alternate method of getting support. For more information please visit https://nerdlife.datanerd.us/new-relic/security-guidelines-for-publishing-source-code.
code-of-conduct-should-not-exist-here
New Relic has moved the CODE_OF_CONDUCT file to a centralized location where it is referenced automatically by every repository in the New Relic organization. Because of this change, any other CODE_OF_CONDUCT file in a repository is now redundant and should be removed. Note that you will need to adjust any links to the local CODE_OF_CONDUCT file in your documentation to point to the central file (README and CONTRIBUTING will probably have links that need updating). For more information please visit https://docs.google.com/document/d/1y644Pwi82kasNP5VPVjDV8rsmkBKclQVHFkz8pwRUtE/view. Did not find a file matching the specified patterns. All files passed this test.
third-party-notices-file-exists
Found file (THIRD_PARTY_NOTICES.md). A THIRD_PARTY_NOTICES.md file can be present in your repository to grant attribution to all dependencies being used by this project. This document is necessary if you are using third-party source code in your project, with the exception of code referenced outside the project's compiled/bundled binary (ex. some Java projects require modules to be pre-installed in the classpath, outside the project binary and therefore outside the scope of the THIRD_PARTY_NOTICES). Please review your project's dependencies and create a THIRD_PARTY_NOTICES.md file if necessary. For JavaScript projects, you can generate this file using the oss-cli. For more information please visit https://docs.google.com/document/d/1y644Pwi82kasNP5VPVjDV8rsmkBKclQVHFkz8pwRUtE/view.
🤖This issue was automatically generated by repolinter-action, developed by the Open Source and Developer Advocacy team at New Relic. This issue will be automatically updated or closed when changes are pushed. If you have any problems with this tool, please feel free to open a GitHub issue or give us a ping in #help-opensource.
This Repolinter run generated the following results:
❗ Error | ❌ Fail | Warn | ✅ Pass | Ignored | Total |
---|---|---|---|---|---|
0 | 3 | 0 | 4 | 0 | 7 |
readme-starts-with-community-plus-header
The README of a community plus project should have a community plus header at the start of the README. If you already have a community plus header and this rule is failing, your header may be out of date, and you should update your header with the suggested one below. For more information please visit https://opensource.newrelic.com/oss-category/. Below is a list of files or patterns that failed:
README.md: The first 5 lines do not contain the pattern(s): Open source Community Plus header (see https://opensource.newrelic.com/oss-category).
the latest code snippet found at https://github.com/newrelic/opensource-website/wiki/Open-Source-Category-Snippets#code-snippet-2 to file
readme-contains-link-to-security-policy
Doesn't contain a link to the security policy for this repository (README.md). New Relic recommends putting a link to the open source security policy for your project (https://github.com/newrelic/<repo-name>/security/policy or ../../security/policy) in the README. For an example of this, please see the "a note about vulnerabilities" section of the Open By Default repository. For more information please visit https://nerdlife.datanerd.us/new-relic/security-guidelines-for-publishing-source-code.
readme-contains-forum-topic
Doesn't contain a link to the appropriate forum.newrelic.com topic (README.md). New Relic recommends directly linking to your appropriate forum.newrelic.com topic in the README, allowing developers an alternate method of getting support. For more information please visit https://nerdlife.datanerd.us/new-relic/security-guidelines-for-publishing-source-code.
license-file-exists
Found file (LICENSE). New Relic requires that all open source projects have an associated license contained within the project. This license must be permissive (e.g. non-viral or copyleft), and we recommend Apache 2.0 for most use cases. For more information please visit https://docs.google.com/document/d/1vML4aY_czsY0URu2yiP3xLAKYufNrKsc7o4kjuegpDw/edit.
readme-file-exists
Found file (README.md). New Relic requires a README file in all projects. This README should give a general overview of the project, and should point to additional resources (security, contributing, etc.) where developers and users can learn further. For more information please visit https://github.com/newrelic/open-by-default.
code-of-conduct-should-not-exist-here
New Relic has moved the CODE_OF_CONDUCT file to a centralized location where it is referenced automatically by every repository in the New Relic organization. Because of this change, any other CODE_OF_CONDUCT file in a repository is now redundant and should be removed. Note that you will need to adjust any links to the local CODE_OF_CONDUCT file in your documentation to point to the central file (README and CONTRIBUTING will probably have links that need updating). For more information please visit https://docs.google.com/document/d/1y644Pwi82kasNP5VPVjDV8rsmkBKclQVHFkz8pwRUtE/view. Did not find a file matching the specified patterns. All files passed this test.
third-party-notices-file-exists
Found file (THIRD_PARTY_NOTICES.md). A THIRD_PARTY_NOTICES.md file can be present in your repository to grant attribution to all dependencies being used by this project. This document is necessary if you are using third-party source code in your project, with the exception of code referenced outside the project's compiled/bundled binary (ex. some Java projects require modules to be pre-installed in the classpath, outside the project binary and therefore outside the scope of the THIRD_PARTY_NOTICES). Please review your project's dependencies and create a THIRD_PARTY_NOTICES.md file if necessary. For JavaScript projects, you can generate this file using the oss-cli. For more information please visit https://docs.google.com/document/d/1y644Pwi82kasNP5VPVjDV8rsmkBKclQVHFkz8pwRUtE/view.