
graylog-plugin-integrations's Introduction

Integrations Plugin for Graylog

Overview

Integrations are tools that help Graylog work with external systems. This plugin contains all open source integrations features.

Please refer to the documentation for additional details and setup instructions.

graylog-plugin-integrations's People

Contributors

antonebel, bernd, casperbiering, chunters, danotorrey, dennisoelkers, dependabot-preview[bot], dependabot[bot], edmundoa, florianpopp, garybot2, gaya, jaak-pruulmann-sympower, janheise, kingzacko1, kmerz, kroepke, kyleknighted, lingpri, linuspahl, luk-kaminski, moesterheld, mpfz0r, roberto-graylog, ryan-carroll-graylog, supahgreg, thll, todvora, waab76, zeeklop


graylog-plugin-integrations's Issues

Assume Role ARN Authentication

Lennart mentioned that some customers using the existing AWS plugin used an alternate authentication scheme called Assume Role ARN. Apparently this was popular among bigger customers.

We will need to investigate how this works and see if we can support it in the new integration. Most likely this should just be a copy/paste operation from the existing AWS plugin.

From the existing code, it looks like it's just additional args provided during the auth provider setup. See https://github.com/Graylog2/graylog-plugin-aws/blob/ea7f5d91e7dbdfbe809d3cdc007bb5badbe69e79/src/main/java/org/graylog/aws/inputs/transports/KinesisTransport.java#L160 and the associated commit Graylog2/graylog-plugin-aws@3c6ecf7

Improved Setup UX

Description

Provide an effective wizard-like setup flow that:

  • Guides the user through the setup process.
  • Provides input validation during setup, so the user can quickly correct any configuration and permission issues.
  • Helps with post-setup steps such as parsing and stream assignment.
  • Indicates the policy permissions that the user needs at the time they choose a service and Graylog

Tasks (consider breaking out into separate issues to make it easier to estimate)

  • Flow charts
  • UI mocks
  • API endpoint explorations
  • Build it? TBD.

Open Questions

  • tbd

Palo Alto Input parser does not properly capture quoted values

Description

Palo Alto logs are comma-separated, with optional double quotes around values that may include a comma. When a value is quoted, the Palo Alto Input still splits on the comma inside the quotes, which shifts the fields.

Steps To Reproduce

  1. Forward Palo Alto logs to a Palo Alto TCP Input for a firewall with ThreatDetect enabled.
  2. Send a log containing a comma in any field.
  3. Observe as the fields shift.

A packet capture containing the offending messages is included.

palo.pcap.gz

Environment

  • Graylog Version: 3.0
  • Elasticsearch Version: 6.6
  • MongoDB Version: 4.0.2
  • Browser Version: All (tested 4 different browsers)

Get all Supported Plugins into the Repository

From a customer perspective, we should include all officially supported plugins in our repository, so that customers can install and update the plugins via their system tools.

https://github.com/Graylog2?utf8=%E2%9C%93&q=+plugin+&type=&language=

It happens regularly that customers do not update the plugins, because they do not check the release page of each plugin manually. This applies only to plugins that are not shipped by default but are officially supported.

ping @kroepke

Kinesis Transport part 3: Get the transport working

After the KinesisTransport classes have been ported over in #78, the next step is to get the transport working. A working transport should pull logs from Kinesis in real time while the input is running. This is broken out into its own issue because it will probably require some iterative testing and troubleshooting. Certain values might need to be hard-coded (such as the stream name) in the KinesisTransport and KinesisConsumer classes in order to get it working. This is totally OK to start with.

Add API call for CloudWatch - DescribeLogGroups

Implement the DescribeLogGroups API call. This will be needed for both the automated and manual setup.

This should include the web resource endpoint and the code that actually communicates with AWS through the SDK.

Create general AWS input

Create a general input for AWS (named "General AWS").

In the initial version, the user will choose this General AWS input from the list of available inputs on the Inputs page. We will want it to be clear that this is different from the existing AWS plugin inputs listed here.

Some of this input code will be written in #51.

This should tie together the following issues:

  • Kinesis Transport #71
  • Kinesis Codec #70

Auto Kinesis setup step 7: Integrate UI with backend and add error checking

Integrate the UI with the backend and add error checking for the automated Kinesis setup. There are a lot of steps and error paths, so we need to give some deliberate attention to how progress and any errors are presented to the user. Perhaps we need to provide a list of setup actions on a results page and add ✅ or ❌ based on whether an action succeeded or failed. This will help the user troubleshoot any issues.

For example:

✅ Create Stream
❌ Create Policy
Subscribe group to stream

Kinesis Transport part 2: Port existing AWS Kinesis transport over to the new AWS Input

The existing KinesisTransport works well; we spent a lot of time fine-tuning it over the past year. So, the first step in implementing a Kinesis transport for the new AWS integrations is to port the existing one over and get it working. It uses the old 1.x version of the Kinesis Client Library, but that should be OK to start with.

To complete this issue, just copy over all KinesisTransport classes and components, and get to the point where you can start the Graylog server without any syntax errors.

See this class to get started. This class references several other classes that will need to be copied over too.
https://github.com/Graylog2/graylog-plugin-aws/blob/2bc458ed5a5d6a55a9014c6edb72f48aeb2534dc/src/main/java/org/graylog/aws/inputs/transports/KinesisTransport.java

Log message identification and parsing

Part of #35.

After a successful health check (which retrieves one log message from AWS), a detected log type will be returned to the UI (e.g. Flow Logs, unknown). If a known log type is detected, then we should automatically associate the appropriate codec with the integration. If the log type cannot be determined, we should specify a minimal codec that at least stores the message with the minimum fields (e.g. timestamp, etc.). The user can then set up whatever pipeline rules they need separately. Perhaps in a subsequent milestone, we can add some automated/guided parsing at the end of the setup process.
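As an illustration of the detection step, the following is a minimal, hypothetical log-type check. The class, method, and type names, and the pattern itself, are assumptions for this sketch, not plugin code; a VPC Flow Log (version 2) record is 14 space-separated tokens starting with the version and a numeric account id:

```java
import java.util.regex.Pattern;

// Hypothetical sketch of health-check log-type detection. All names here
// are assumptions; the real plugin may structure this very differently.
public class LogTypeDetector {

    // Matches a version-2 VPC Flow Log line: version, account id, an
    // "eni-" interface id, two addresses, seven numeric fields, then
    // the action and log status.
    private static final Pattern FLOW_LOG_PATTERN = Pattern.compile(
            "^\\d+ \\d+ eni-\\S+ \\S+ \\S+ \\d+ \\d+ \\d+ \\d+ \\d+ \\d+ \\d+ \\S+ \\S+$");

    public static String detectLogType(String message) {
        if (FLOW_LOG_PATTERN.matcher(message.trim()).matches()) {
            return "FLOW_LOGS";
        }
        return "UNKNOWN";
    }
}
```

A single anchored regex per known format keeps the detection cheap and easy to extend: adding a new log type means adding one pattern and one return value.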

Kinesis Transport part 5: Test throttling support

This issue requires that #78, #79, and #71 be implemented first.

Throttling support exists already in the code that will be ported over with #78. The goal of this issue is to test and ensure that throttling is still working correctly in the latest version of the KCL.

See the Throttling section in the existing AWS plugin for more info on how throttling works and why it is needed. @danotorrey implemented this functionality for the existing AWS plugin and can provide an overview of how that code works. A similar approach can probably be used for the new AWS plugin.

This is L size, because it can be quite a challenge to implement throttling support with Kinesis (based on past experience). This will also require quite a lot of testing to ensure it works correctly.

Workflow Mocks

Description

Create mocks of the expected interface matching the criteria from #37

What

Why

Add API call for list of available AWS Services

The implementation of this might require:

  • Resource method with annotations.
  • Service method in AWSService, which would probably return a List of service entries.

AWSService class might have these properties:

String name
String description
String policyPermission
String helperText
String learnMoreLink

To support the service-selection page (screenshot attached in the original issue).
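A minimal sketch of what such a value object and service method might look like. Only the five property names come from the issue; the class name, constructor, and the example entry are assumptions, and the real implementation would likely use AutoValue:

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of the AWSService value object described above.
// Field names follow the issue; everything else is an assumption.
public class AWSServiceEntry {
    public final String name;
    public final String description;
    public final String policyPermission;
    public final String helperText;
    public final String learnMoreLink;

    public AWSServiceEntry(String name, String description, String policyPermission,
                           String helperText, String learnMoreLink) {
        this.name = name;
        this.description = description;
        this.policyPermission = policyPermission;
        this.helperText = helperText;
        this.learnMoreLink = learnMoreLink;
    }

    // The service method would probably return a List of these entries
    // for the resource endpoint to serialize.
    public static List<AWSServiceEntry> available() {
        return Arrays.asList(
                new AWSServiceEntry("CloudWatch",
                        "Collect CloudWatch logs via a Kinesis stream.",
                        "logs:DescribeLogGroups",
                        "CloudWatch logs are read through a Kinesis stream.",
                        "https://docs.aws.amazon.com/AmazonCloudWatch/"));
    }
}
```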

Add API call for ListStreams

Implement the ListStreams API call. This should include the web resource endpoint and the code that actually communicates with AWS through the SDK.

Palo Alto Networks Input quoted string with commas

Description

The Palo Alto Input doesn’t ignore commas in a quoted string, and therefore doesn’t index the fields properly.

Steps To Reproduce

  1. Create a Palo Alto Networks Input
  2. Forward Logs from a Palo Alto System via TCP to Graylog
  3. Logs that include commas within a quoted string shift the fields in the index

Example message:

1,2019/04/25 15:32:04,009900009999,SYSTEM,globalprotect,0,2019/04/25 15:32:04,,globalprotectportal-auth-succ,GP_PortalAdm_Optional,0,0,general,informational,"GlobalProtect portal user authentication succeeded. Login from: 10.0.0.1, Source region: DK, User name: pre-logon, Auth type: client certificate.",16657523,0x0,0,0,0,0,,PA-VM
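A fix needs a splitter that ignores commas inside double-quoted values instead of doing a plain split on ','. The following is a minimal illustration of that technique, not the plugin's actual parser:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of a quote-aware CSV splitter for messages like the example
// above. Illustrative only; the real input would also need to handle
// escaped quotes if Palo Alto emits them.
public class QuotedCsvSplitter {
    public static List<String> split(String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (char c : line.toCharArray()) {
            if (c == '"') {
                inQuotes = !inQuotes;      // toggle quoted state, drop the quote
            } else if (c == ',' && !inQuotes) {
                fields.add(current.toString());
                current.setLength(0);      // start the next field
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString());    // last field has no trailing comma
        return fields;
    }
}
```

With this approach, the quoted "GlobalProtect portal user authentication succeeded..." value in the example stays a single field instead of being split at its internal commas.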

Environment

  • Graylog Version: 3.0.1
  • Elasticsearch Version: 6.7.1
  • MongoDB Version: 4.0.9

An error occurs when sending messages into the Palo Alto input

An error occurs when sending messages into the Palo Alto input. The specific error is unknown, but we are working to obtain it. A PCAP has also been obtained, which is being used to investigate the issue further.

More details will be added to this issue shortly as the specific issue that is occurring is identified.

Add retry support when AWS API rate limits are exceeded

The AWS CloudWatch API is subject to rate limits. If these limits are exceeded by Graylog AWS API communication, then an exception will be thrown.

We will likely need to add backoff and retry logic for when any rate limits are exceeded. Most likely a specific exception will be thrown, and we can catch it and retry again after some delay.

Just capturing this as an issue so we don't forget to add this handling.
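A sketch of the kind of backoff-and-retry wrapper this could use. The names are hypothetical, and real code would catch the SDK's specific rate-limit exception rather than any Exception:

```java
import java.util.concurrent.Callable;

// Hedged sketch of exponential backoff for rate-limited AWS calls.
// Illustrative: any exception triggers a retry here, whereas production
// code should only retry the SDK's throttling/rate-limit exceptions.
public class Backoff {
    public static <T> T withRetries(Callable<T> call, int maxAttempts,
                                    long initialDelayMs) throws Exception {
        long delay = initialDelayMs;
        for (int attempt = 1; ; attempt++) {
            try {
                return call.call();
            } catch (Exception e) {
                if (attempt >= maxAttempts) {
                    throw e;             // out of attempts; surface the error
                }
                Thread.sleep(delay);     // back off before retrying
                delay *= 2;              // double the delay each attempt
            }
        }
    }
}
```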

API call for Kinesis HealthCheck: Parsing

Parse the log message obtained from Kinesis and return a JSON structure that looks like:

{
  "full_message": "2 123456789010 eni-abc123de 172.31.16.139 172.31.16.21 20641 22 6 20 4249 1418530010 1418530070 ACCEPT OK",
  "version": 2,
  "account-id": 123456789010,
  "interface-id": "eni-abc123de",
  "src_addr": "172.31.16.139",
  "dst_addr": "172.31.16.21",
  "src_port": 20641,
  "dst_port": 22,
  "protocol": 6,
  "packets": 20,
  "bytes": 4249,
  "start": 1418530010,
  "end": 1418530070,
  "action": "ACCEPT",
  "log-status": "OK"
}

3 of 3 HealthCheck issues
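A sketch of how a flow-log record could be split into those fields. The field names follow the example JSON above; values are kept as strings for brevity, while the real response would use numbers where appropriate:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative parser for version-2 VPC Flow Log records; not the
// plugin's codec. Field names mirror the example JSON response.
public class FlowLogParser {
    private static final String[] FIELDS = {
            "version", "account-id", "interface-id", "src_addr", "dst_addr",
            "src_port", "dst_port", "protocol", "packets", "bytes",
            "start", "end", "action", "log-status"};

    public static Map<String, Object> parse(String message) {
        Map<String, Object> result = new LinkedHashMap<>();
        result.put("full_message", message);
        String[] tokens = message.trim().split(" ");
        // Map each space-separated token onto its positional field name.
        for (int i = 0; i < FIELDS.length && i < tokens.length; i++) {
            result.put(FIELDS[i], tokens[i]);
        }
        return result;
    }
}
```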

Persistence data structures

Identify and implement the correct database document structures to persist:

  • Account credentials (should be encrypted, and should be stored in a structure that supports multi-account management later).
  • All other AWS service integration data (log group name, stream names, log type etc.)
  • Identify how this data is associated with and works with the existing inputs system and structures: We had decided that AWS integration services should run as inputs, so is the data stored in the input config? Does it show on the System/Inputs page?

API call for Kinesis HealthCheck: Structure

Overall structure for the Kinesis HealthCheck (resource endpoint, tie together log retrieval, detection, and parsing).

Three other issues exist as part of this larger issue:

More info:
Note: Before this can be implemented, we first need to identify whether we can read a message from a Kinesis stream without deleting it from the stream (to avoid it being lost). The Kinesis Client Library for Java creates a consumer that retrieves a large number of messages at a time and does not support leaving a message on the stream. We need to do some research to identify whether a workaround exists to read a sample message from the stream and leave it there.

See https://docs.aws.amazon.com/streams/latest/dev/kcl-migration.html for the new KCL config options.

An issue was opened with the Kinesis Client Library team, and they recommended using the direct Kinesis API.

Implement the HealthCheck API call. This should perform the following operations:

Attempt to pull a log message from a Kinesis stream (see the Kinesis consumer code in this issue; also see the sample consumer/subscriber code for how to use the pattern in the latest Kinesis Client Library).

Identify whether the log message is in a known format, and indicate that format in the response message. This will automatically establish a codec/parsing for it. Perhaps we can use a regex for this? See this link, which lists the formats for all AWS services that can publish messages to CloudWatch: https://docs.aws.amazon.com/AmazonCloudWatch/latest/events/EventTypes.html We definitely have to support CloudWatch, and supporting others would be a plus.

  • Return log message for parsing.

Add request payload with AWS credential and region for API calls

Some API calls in AWSResource do not yet have a request payload that passes in AWS credentials and a region:

org.graylog.integrations.aws.resources.AWSResource#getLogGroupNames()
org.graylog.integrations.aws.resources.AWSResource#getKinesisStreams()

Add an AutoValue request object that passes in the credentials and region, similar to org.graylog.integrations.aws.resources.requests.KinesisHealthCheckRequest for the healthCheck call:

public Response kinesisHealthCheck(@ApiParam(name = "JSON body", required = true) @Valid @NotNull KinesisHealthCheckRequest heathCheckRequest) throws ExecutionException, IOException {

Update readme

The README for this repository is out of date. Update it.

Create Kinesis codec

Create the codec that will decode AWS CloudWatch log messages. This will need to look at the input settings to identify which log format was identified during the setup process.

API call for Kinesis HealthCheck: Log retrieval

Retrieve a log message directly from a Kinesis stream using the Kinesis Client library.

Writing a method that retrieves a batch of messages from the Kinesis stream would probably be good. The method can return a list of software.amazon.awssdk.services.kinesis.model.Record objects.

1 of 3 HealthCheck issues

Auto Kinesis setup part 2: API resource method

Add automated setup for CloudWatch/Kinesis. This should be one resource method that calls one AWSService method (which may in turn call other methods, depending on the design). Similar to this resource example:

@POST
@Timed
@Path("/inputs")
@ApiOperation(value = "Create a new AWS input.")
@RequiresPermissions(RestPermissions.INPUTS_CREATE)
@AuditEvent(type = AuditEventTypes.MESSAGE_INPUT_CREATE)
public Response create(@ApiParam(name = "JSON body", required = true)
                       @Valid @NotNull AWSInputCreateRequest saveRequest) throws Exception {
    Input input = awsService.saveInput(saveRequest, getCurrentUser());
    return Response.ok().entity(getInputSummary(input)).build();
}

Test

Description

Steps To Reproduce

Environment

  • Graylog Version:
  • Elasticsearch Version:
  • MongoDB Version:
  • Browser Version:
