Enhancing BIGIP ASM logs with Logstash

For remote logging, F5 ASM supports multiple formats in order to accommodate sending the logs to a variety of SIEM solutions. These formats include:

  • Key-Value Pairs (Splunk)

  • Common Event Format (Arcsight)

  • BIGIQ

  • Comma-Separated Values (CSV), which is the default logging format

Out of the four options, CSV is the only logging format that allows you to select specific fields (out of 70+ in total) and define a customized logging format. In addition, it provides access to the violation_details field, which the Key-Value and CEF formats do not. In BIG-IP 13.x and higher, violation_details provides the complete violation details in XML format (https://my.f5.com/manage/s/article/K9435). For example, on an attack signature (VIOL_ATTACK_SIGNATURE) violation you should expect to find the name of the entity the signature was detected on (parameter, cookie, header) and the entity value that matched the signature, but also things like staging, enforcement_level, etc. With a VIOL_PARAMETER_VALUE_METACHAR violation, you should expect to find the hex number of the meta-character as well as the name of the entity it was detected on (e.g. the parameter name). The values within the violation_details XML are base64 encoded. An example of the violation_details field looks as follows:

<?xml version='1.0' encoding='UTF-8'?><BAD_MSG><violation_masks><block>4100010c03c0c00-3a03030830000072-8000000000000000-0</block><alarm>475f0ffcbbd0fea-befbf358b000007e-f000000000000000-0</alarm><learn>0-0-0-0</learn><staging>0-0-0-0</staging></violation_masks><request-violations><violation><viol_index>42</viol_index><viol_name>VIOL_ATTACK_SIGNATURE</viol_name><context>parameter</context><parameter_data><value_error/><enforcement_level>global</enforcement_level><name>dGVzdA==</name><auto_detected_type>alpha-numeric</auto_detected_type><value>PHNjcmlwdD4=</value><location>query</location><expected_location></expected_location><is_base64_decoded>false</is_base64_decoded><param_name_pattern>*</param_name_pattern><staging>0</staging></parameter_data><staging>0</staging><sig_data><sig_id>200001475</sig_id><blocking_mask>3</blocking_mask><kw_data><buffer>dGVzdD08c2NyaXB0Pg==</buffer><offset>6</offset><length>7</length></kw_data></sig_data><sig_data><sig_id>200000098</sig_id><blocking_mask>3</blocking_mask><kw_data><buffer>dGVzdD08c2NyaXB0Pg==</buffer><offset>5</offset><length>7</length></kw_data></sig_data></violation><violation><viol_index>24</viol_index><viol_name>VIOL_PARAMETER_VALUE_METACHAR</viol_name><parameter_data><value_error/><enforcement_level>global</enforcement_level><name>dGVzdA==</name><auto_detected_type>alpha-numeric</auto_detected_type><value>PHNjcmlwdD4=</value><location>query</location><expected_location></expected_location><is_base64_decoded>false</is_base64_decoded></parameter_data><wildcard_entity>*</wildcard_entity><staging>0</staging><language_type>4</language_type><metachar_index>60</metachar_index><metachar_index>62</metachar_index></violation><violation><viol_index>122</viol_index><viol_name>VIOL_BOT_CLIENT</viol_name></violation></request-violations></BAD_MSG>

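To make sense of the example, the base64-encoded values can be decoded with any base64 tool; this is purely an illustration and not part of the repo. The decoded values reveal the parameter name, its value and the matched keyword buffer:

echo -n 'dGVzdA==' | base64 --decode              # test           (parameter name)
echo -n 'PHNjcmlwdD4=' | base64 --decode          # <script>       (parameter value)
echo -n 'dGVzdD08c2NyaXB0Pg==' | base64 --decode  # test=<script>  (matched buffer)
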
Within this repo we will take you through the process of how you can:

  • Create a custom CSV logging format on ASM
  • Set up a Logstash instance to receive the ASM logs
  • Transform the CSV log fields into values your SIEM can easily understand
  • Send the transformed log to your SIEM of choice.

Step 1. Configure a logging profile on BIGIP

On BIGIP go to Event Logs -> Logging Profiles and create a new profile. Configure the following settings:

  • Application Security: Enabled
  • Storage Destination: Remote Storage
  • Logging Format: Comma-Separated Values
  • Protocol: TCP
  • Storage Format: User-Defined
date_time="%date_time%",is_truncated="%is_truncated%",ip_client="%ip_client%",vs_name="%vs_name%",dest_ip="%dest_ip%",dest_port="%dest_port%",attack_type="%attack_type%",blocking_exception_reason="%blocking_exception_reason%",method="%method%",policy_name="%policy_name%",protocol="%protocol%",request_status="%request_status%",response_code="%response_code%",severity="%severity%",sig_cves="%sig_cves%",sig_ids="%sig_ids%",sig_names="%sig_names%",sig_set_names="%sig_set_names%",sub_violations="%sub_violations%",support_id="%support_id%",threat_campaign_names="%threat_campaign_names%",unit_hostname="%unit_hostname%",uri="%uri%",violation_rating="%violation_rating%",x_forwarded_for_header_value="%x_forwarded_for_header_value%",violations="%violations%",violation_details="%violation_details%",request="%request%"
  • Maximum Request Size: 2048
  • Maximum Entry Length: 10K

Step 2. Set up Logstash to receive the logs from BIGIP

While you have multiple options for running Logstash, for simplicity we will document the Docker option. As a prerequisite, you need to have Docker and docker-compose installed on your machine.

Clone the repo

Clone this repo to your local machine using https://github.com/skenderidis/bigip-logs and change the working directory to bigip-logs:

git clone https://github.com/skenderidis/bigip-logs
cd bigip-logs

Install Logstash using docker-compose

TZ=Asia/Dubai docker-compose up -d
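
Once the containers are up, you can verify that everything started correctly. The commands below assume the Logstash service is named logstash in docker-compose.yaml; adjust the service name if it differs in your copy of the repo:

docker-compose ps
docker-compose logs -f logstash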

NOTES:

  • Change the timezone used in the docker containers by altering the inline environment variable in the command above according to your location. A list of TZ Database Names can be found here.
  • The TCP port that Logstash listens on is 8515. This can be changed in the docker-compose.yaml file.

Step 3. Understanding logstash.conf

In this section we will go through the most important parts of the logstash.conf.

Input

This is where you define the port and type of information you expect Logstash to receive. In our case, it listens on TCP port 8515 and expects syslog messages.

input {
  syslog {
    port => 8515
    codec => plain {
      charset => "ISO-8859-1"
    }
  }
}
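
Before pointing BIGIP at Logstash, you can optionally check that the input is reachable by sending a test syslog message to it; this is just a quick sanity check and assumes the util-linux logger utility is available on the host:

logger --server 127.0.0.1 --port 8515 --tcp "test message for the bigip-logs pipeline"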

Grok

Grok is a great way to parse unstructured log data into something structured and queryable. Grok works by combining text patterns into something that matches your logs. In our configuration we use Grok to extract each of the fields configured in the ASM CSV logging format; the key on the left of each pattern has to match the key name used in the user-defined storage format, and break_on_match is set to false so that Logstash applies all of the patterns instead of stopping at the first one that matches.

  grok {
    match => {
      "message" => [
        ",attack_type=\"%{DATA:attack_type}\"",
        ",blocking_exception_reason=\"%{DATA:blocking_exception_reason}\"",
        ",date_time=\"%{DATA:date_time}\"",
        ",dest_ip=\"%{DATA:dest_ip}\"",
        ",dest_port=\"%{DATA:dest_port}\"",
        ",ip_client=\"%{DATA:ip_client}\"",
        ",is_truncated=\"%{DATA:is_truncated}\"",
        ",method=\"%{DATA:method}\"",
        ",policy_name=\"%{DATA:policy_name}\"",
        ",protocol=\"%{DATA:protocol}\"",
        ",request_status=\"%{DATA:request_status}\"",
        ",request=\"%{DATA:request}\"",
        ",request_body_base64=\"%{DATA:request_body_base64}\"",
        ",response_code=\"%{DATA:response_code}\"",
        ",severity=\"%{DATA:severity}\"",
        ",sig_cves=\"%{DATA:sig_cves}\"",
        ",sig_ids=\"%{DATA:sig_ids}\"",
        ",sig_names=\"%{DATA:sig_names}\"",
        ",sig_set_names=\"%{DATA:sig_set_names}\"",
        ",src_port=\"%{DATA:src_port}\"",
        ",staged_sig_cves=\"%{DATA:staged_sig_cves}\"",
        ",staged_sig_ids=\"%{DATA:staged_sig_ids}\"",
        ",staged_sig_names=\"%{DATA:staged_sig_names}\"",
        ",staged_threat_campaign_names=\"%{DATA:staged_threat_campaign_names}\"",
        ",sub_violations=\"%{DATA:sub_violations}\"",
        ",support_id=\"%{DATA:support_id}\"",
        ",threat_campaign_names=\"%{DATA:threat_campaign_names}\"",
        ",unit_hostname=\"%{DATA:unit_hostname}\"",
        ",uri=\"%{DATA:uri}\"",
        ",violations=\"%{DATA:violations}\"",
        ",violation_details=\"%{DATA:violation_details_xml}\"",
        ",violation_rating=\"%{DATA:violation_rating}\"",
        ",vs_name=\"%{DATA:vs_name}\"",
        ",x_forwarded_for_header_value=\"%{DATA:x_forwarded_for_header_value}\""
      ]
    }
    break_on_match => false
  }

Converting violation details from XML to JSON

If the violation_details field is present (i.e. not "N/A"), we convert it from XML to a JSON object so that it can be parsed more easily later on.

  if [violation_details_xml] != "N/A" {
    # change from XML to JSON Obj
    xml {
      source => "violation_details_xml"
      target => "violation_details"
      remove_namespaces => true
      force_array => false
    }
  }
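
For the example request shown earlier, the resulting violation_details object would look roughly as follows (trimmed to the first violation for readability; shown in rubydebug-style notation):

"violation_details" => {
  "request-violations" => {
    "violation" => [
      {
        "viol_index" => "42",
        "viol_name" => "VIOL_ATTACK_SIGNATURE",
        "context" => "parameter",
        "parameter_data" => { "name" => "dGVzdA==", "value" => "PHNjcmlwdD4=", ... },
        "sig_data" => [ ... ]
      },
      ...
    ]
  }
}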

GeoIP

In the next section we use the GeoIP functionality of Logstash to create a GeoHash for the client IP address. This allows us to display a geolocation map showing where clients originate. The first part of the code checks whether the client IP is carried in the X-Forwarded-For header and, if so, extracts the IP details from there; the second part checks whether the client is coming from an internal IP address so that Geo tagging can be skipped for it.

  if [x_forwarded_for_header_value] != "N/A" {
    mutate { add_field => { "source_host" => "%{x_forwarded_for_header_value}"}}
  } 
  else {
    mutate { add_field => { "source_host" => "%{ip_client}"}}
  }

  cidr {
    address => [ "%{source_host}" ]
    add_tag => [ "internalIP" ]
    network => [ "10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16" ]
  }
  if "internalIP" not in [tags] {
    geoip {
      source => "source_host"
      ecs_compatibility => disabled
    }
  } 
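
For reference, with ecs_compatibility disabled the geoip filter writes its results into a geoip object using the legacy (non-ECS) field layout, roughly like this (the values below are placeholders, not real data):

"geoip" => {
  "ip"            => "x.x.x.x",
  "country_name"  => "...",
  "country_code2" => "...",
  "location"      => { "lat" => 0.0, "lon" => 0.0 }
}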

Mutating fields

In the following section we perform general mutations on the ASM fields: substituting the URL-encoded double quote (%22) with ' using gsub, splitting comma-separated strings into arrays with split (for example, a violations value of "VIOL_ATTACK_SIGNATURE,VIOL_BOT_CLIENT" becomes an array of two entries), and finally removing fields that we no longer need.

  mutate {
    gsub => ["sig_names","%22","'"]
    split => { "attack_type" => "," }
    split => { "sig_cves" => "," }
    split => { "sig_ids" => "," }
    split => { "sig_names" => "," }
    split => { "sig_set_names" => "," }
    split => { "staged_sig_cves" => "," }
    split => { "staged_sig_ids" => "," }
    split => { "staged_sig_names" => "," }
    split => { "staged_threat_campaign_names" => "," }
    split => { "sub_violations" => "," }
    split => { "threat_campaign_names" => "," }
    split => { "violations" => "," }
    remove_field => [
      "[violation_details][violation_masks]",
      "violation_details_xml",
      "message", "priority","facility","facility_label", "log", "[event][original]",
      "[geoip][postal_code]","[geoip][region_code]","[geoip][timezone]",
      "[geoip][city_name]","[geoip][continent_code]","[geoip][country_code3]"
    ]
  }

Creating a cloned log for further processing

The following configuration creates a full clone of the log while keeping the original log intact. It also adds a tag (decoded) so that we can distinguish the clone from the original.

  if [violation_details][request-violations] {
    clone {  
        clones => ["decoded"]
        add_field => { "decoded" => "yes" }
    }
  }

Transforming the violation_details fields for the cloned log

For the cloned log, we execute a Ruby script that base64-decodes the encoded values. Depending on the violation type, the appropriate fields are parsed and modified.

 ruby {
    code => "
            require 'base64';
            type = event.get('[decoded]');
            data = event.get('[violation_details][request-violations]');
            ...
            ...
            ...
    "
 }
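
The full script lives in logstash.conf. As an illustration only, a heavily simplified sketch of the same idea is shown below: it walks over the request-violations, base64-decodes the parameter name and value of each violation and stores the result in a new details field, which the later stages of the pipeline split and enrich. The field handling here is a simplified assumption, not the repo's actual code:

  ruby {
    code => "
      require 'base64'
      violations = event.get('[violation_details][request-violations][violation]')
      # with force_array => false a single violation arrives as a hash, not an array
      violations = [violations] unless violations.is_a?(Array)
      violations.each do |v|
        param = v['parameter_data']
        next unless param
        # decode the base64-encoded entity name and value
        param['name']  = Base64.decode64(param['name'])  if param['name']
        param['value'] = Base64.decode64(param['value']) if param['value']
      end
      # expose the decoded violations so the split filter in the next section can work on them
      event.set('details', violations)
    "
  }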

Splitting the cloned log into multiple logs

In the violation_details field (now converted to JSON) we will have an array of all the request violations identified on the specific HTTP transaction; in the example request shown earlier these are VIOL_ATTACK_SIGNATURE, VIOL_PARAMETER_VALUE_METACHAR and VIOL_BOT_CLIENT.

If the cloned log contains multiple request violations, we create a separate log entry for each one. This is recommended because it makes it easier to aggregate the logs and create useful reports for the teams that analyze the data.

  if "decoded" in [tags] 
  {
    if [details][0] { 
      split {
        field => ["[details]"]
      }
    }
    mutate {
      copy => { "[details][sig_data][sig_id]" => "[sig_id]" }
      remove_field => [
        "violation_details", "bot_anomalies","enforced_bot_anomalies", "client_application","bot_category",
        "sig_set_names","sig_names","client_application_version","sig_ids", "violations","request","geoip", "decoded"
      ]
    }
  }

Output of the logs

Currently the output configuration is as follows:

output {
  if "decoded" in [tags] 
  {
    elasticsearch {
      hosts => ["elasticsearch:9200"]
      index => "waf-decoded-logs-%{+YYYY.MM.dd}"
    }
  }
  else {
    elasticsearch {
      hosts => ["elasticsearch:9200"]
      index => "waf-logs-%{+YYYY.MM.dd}"
    }
  }
}

Open logstash.conf and modify the output plugin of the configuration. If you are using Elasticsearch you only need to change the hostname/IP of Elasticsearch, but if you are using another SIEM solution you need to modify the output accordingly. More information on Logstash output plugins can be found at https://www.elastic.co/guide/en/logstash/current/output-plugins.html
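
For example, if your SIEM accepts JSON over a plain TCP socket, an output along these lines could be used instead; the host and port below are placeholders, not values from this repo:

output {
  tcp {
    host  => "siem.example.com"
    port  => 5514
    codec => json_lines
  }
}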

Step 4. Configuring the index template for Elasticsearch (only if you are using ELK)

Create the index template on Elasticsearch:

curl -d "@index-template.json" -H 'Content-Type: application/json' -X PUT 'http://ELASTIC_IP:9200/_index_template/bigip-waf-logs'

Expected Response: {"acknowledged":true}
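
You can verify that the template was stored by reading it back (using the same ELASTIC_IP placeholder as above):

curl 'http://ELASTIC_IP:9200/_index_template/bigip-waf-logs'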

Step 5. Importing Grafana Dashboards that work with Elasticsearch

TBD.

Support

Please open a GitHub issue for any problem or enhancement you might need.
