brunotm / elasticsplunk
A Search command to explore Elasticsearch data within Splunk.
License: MIT License
I am trying to get the basic cluster health status using the app.
curl works:
curl -u username:password -XGET "https://elasticsearchdev.domain.com:443/_cluster/health?pretty"
{
"cluster_name" : "newdev",
"status" : "green",
"timed_out" : false,
"number_of_nodes" : 2,
"number_of_data_nodes" : 2,
"active_primary_shards" : 1608,
"active_shards" : 3216,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 0,
"delayed_unassigned_shards" : 0,
"number_of_pending_tasks" : 0,
"number_of_in_flight_fetch" : 0,
"task_max_waiting_in_queue_millis" : 0,
"active_shards_percent_as_number" : 10.0
}
How do I make this work using the app? The app is installed on the same Splunk server where I ran curl and saw successful results.
Thanks
Use this docker compose setup:
# docker-compose.yml
version: '3'
services:
  splunkenterprise:
    hostname: splunkenterprise
    image: splunk/splunk
    environment:
      SPLUNK_START_ARGS: --accept-license --answer-yes --seed-passwd somepass123456789
      SPLUNK_ENABLE_LISTEN: 9997
      SPLUNK_ADD: tcp 1514
      OPTIMISTIC_ABOUT_FILE_LOCKING: 1
    volumes:
      - ./opt-splunk-etc:/opt/splunk/etc
      - ./opt-splunk-var:/opt/splunk/var
    ports:
      - "8000:8000"
      - "9997:9997"
      - "8088:8088"
      - "1514:1514"
  elasticsearch:
    image: docker.elastic.co/elasticsearch/elasticsearch:6.3.1
    container_name: elasticsearch
    environment:
      - cluster.name=docker-cluster
      - bootstrap.memory_lock=true
      - "ES_JAVA_OPTS=-Xms512m -Xmx512m"
    ulimits:
      memlock:
        soft: -1
        hard: -1
    volumes:
      - ./esdata:/usr/share/elasticsearch/data
    ports:
      - 9200:9200
  kibana:
    image: docker.elastic.co/kibana/kibana:6.3.1
    depends_on:
      - elasticsearch
    environment:
      ELASTICSEARCH_URL: http://elasticsearch:9200
    ports:
      - 5601:5601
This will spin up Splunk, Elasticsearch and Kibana.
In your local ./opt-splunk-etc/apps directory, run:
git clone https://github.com/brunotm/elasticsplunk.git
Then stop and restart the infrastructure:
docker-compose down && docker-compose up
Use Kibana on http://localhost:5601 and, through the Dev Tools console, run the following query:
PUT _bulk
{"index": {"_index": "pear", "_type": "default"}}
{"foo":"bar", "type":"fruit", "subtype":"pear", "ts":"2018-07-19"}
{"index": {"_index": "banana", "_type": "default"}}
{"foo2":"bar2", "type":"fruit", "subtype":"banana", "ts":"2018-07-19"}
{"index": {"_index": "appple", "_type": "default"}}
{"foo3":"bar3", "type":"fruit", "subtype":"apple", "ts":"2018-07-19"}
Log in to Splunk at http://localhost:8000 (admin/somepass123456789) and run the following query:
|ess eaddr=elasticsearch:9200 action=query query="foo:bar" index=pear fields=foo include_es=true include_raw=true stype=doc_type tsfield=ts
The following error will be displayed on screen:
External search command 'ess' returned error code 1. Script output = "None error_message=TypeError at "/opt/splunk/etc/apps/elasticsplunk/bin/splunklib/searchcommands/internals.py", line 520 : 'NoneType' object is not iterable "
while my expectation was to see the following record:
{"foo":"bar", "type":"fruit", "subtype":"pear", "ts":"2018-07-19"}
This is an enhancement proposal.
When retrieving documents with different fields from an Elasticsearch index (e.g. index="metricbeat-*" query="*"), the first document determines the column names of the whole table! The content of further documents with other fields is not shown, because there is no corresponding column name.
The following modification inserts an additional first document containing all fields of all documents (and a _time value < 0, to be filtered out later). The header fields are determined depending on the scan option: via an esclient.indices.get_field_mapping call when scanning, otherwise by scanning the hits list.
You can additionally determine the display order of the columns with the fields parameter, e.g. fields="beat.name,system.load.*,beat.*" will show _time and beat.name first, then all system.load fields, and after that the remaining beat fields (without beat.name, of course).
Unfortunately I am not familiar with pull requests / GitHub development, so here is a code proposal (modify as you like):
# KAEM BEGIN extension to get column names via get_field_mapping
# if self.scan:  # does not work, because it is a string type and always true
if self.scan in ["true", "True", 1]:
    head = OrderedDict()
    head["_time"] = -2
    f0 = config[KEY_CONFIG_FIELDS] or ['*']
    res = esclient.indices.get_field_mapping(index=config[KEY_CONFIG_INDEX], fields=f0)
    for nx in res:
        for ty in res[nx]["mappings"]:
            for m0 in f0:
                for fld in sorted(res[nx]["mappings"][ty]):
                    if fld in head: continue
                    if fld.endswith(".keyword"): continue
                    if re.match(m0.replace('*', '.*'), fld): head[fld] = ""
    yield head
    # KAEM END

    # Execute search
    res = helpers.scan(esclient,
        ....
else:
    res = esclient.search(index=config[KEY_CONFIG_INDEX],
                          size=config[KEY_CONFIG_LIMIT],
                          _source_include=config[KEY_CONFIG_FIELDS],
                          doc_type=config[KEY_CONFIG_SOURCE_TYPE],
                          body=body)
    # KAEM BEGIN extension to get column names via hits scanning
    head = OrderedDict()
    head["_time"] = -1
    head0 = {}
    f0 = config[KEY_CONFIG_FIELDS] or ['*']
    for hit in res['hits']['hits']:
        for fld in self._parse_hit(config, hit): head0[fld] = ""
    for m0 in f0:
        for fld in sorted(head0):
            if fld in head: continue
            if re.match(m0.replace('*', '.*'), fld): head[fld] = head0[fld]
    head["_time"] = -1  # set again, because it was overwritten by the hits in the meantime
    yield head
    # KAEM END
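For reference, the wildcard-based column ordering used by the proposal can be exercised in isolation. This is a minimal sketch using the same re.match test as the code above; order_fields is a hypothetical helper name, not part of the app:

```python
import re
from collections import OrderedDict

def order_fields(patterns, fields):
    # Hypothetical helper mirroring the proposal's matching test:
    # re.match(m0.replace('*', '.*'), fld)
    head = OrderedDict()
    for m0 in patterns:
        for fld in sorted(fields):
            if fld in head:
                continue
            if re.match(m0.replace('*', '.*'), fld):
                head[fld] = ""
    return list(head)

print(order_fields(["beat.name", "system.load.*", "beat.*"],
                   ["beat.hostname", "beat.name", "system.load.1", "system.load.5"]))
# -> ['beat.name', 'system.load.1', 'system.load.5', 'beat.hostname']
```

Each pattern claims its matching fields in order, and a field already claimed by an earlier pattern is skipped, which reproduces the "beat.name first, then system.load, then the remaining beat fields" behaviour described above.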
I am able to run the following and get results returned:
| ess eaddr="http://localhost:9200" action="cluster-health"
| ess eaddr="http://localhost:9200" action="indices-list"
But when I run something like the query below, I get no errors and no results returned. When I try the same query from within Kibana, I get results.
| ess eaddr="http://localhost:9200" tsfield=@timestamp index="logs*" query="directionName:Local" limit="50"
Are there expected versions for compatibility? Currently the Lucene version is 6.6.0. Any guidance is appreciated.
I'm running on Ubuntu 14 using Splunk 7.0.1.
When I run indices-list, I get the following (using the Collective Intelligence Framework's Elastic instance):
_time | mappings | number_of_replicas | aliases | number_of_shards | creation_date | uuid | name
---|---|---|---|---|---|---|---
2018-01-20 13:38:39 | tokens | 1 | | 5 | 1516475310127 | 85PhRO_QQVq-4Vaw_leSkA | cif.tokens
2018-01-20 13:38:39 | observables | 1 | | 5 | 1516476414154 | jrsH4eNBRE2vqfDhue_VeQ | cif.observables-2018.01
When I run the following command line query, I see results:
curl -XGET 'http://localhost:9200/cif.observables-2018.01/_search?pretty=1&otype=fqdn' | more
When I run | ess eaddr="http://localhost:9200" index="cif.observables-2018.01" query="otype:fqdn", I get:
External search command 'ess' returned error code 1.
I tried to enable INFO level logging in the logging.conf file, but I never see any kind of log file show up in /opt/splunk/var/log/splunk.
How do I go about troubleshooting this?
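In case it helps troubleshooting, here is a minimal logging.conf sketch, assuming the app loads a standard Python fileConfig-style file; the log path and section names below are assumptions, not the app's shipped configuration:

```ini
[loggers]
keys = root

[handlers]
keys = file

[formatters]
keys = plain

[logger_root]
level = DEBUG
handlers = file

[handler_file]
class = FileHandler
level = DEBUG
formatter = plain
args = ('/opt/splunk/var/log/splunk/elasticsplunk.log', 'a')

[formatter_plain]
format = %(asctime)s %(levelname)s %(name)s %(message)s
```

Independently of this, stderr from the command surfaces through ScriptRunner lines in splunkd.log, which is often the quickest place to look for the underlying traceback.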
Thank you very much for the implementation of this elasticsplunk interface.
Unfortunately there is a small issue when executing the code in elasticsplunk.py (line 256):
if self.scan:
    res = ....
Even though it is defined as a bool in Splunk, in Python self.scan is a string, so the if statement always evaluates to true and the else branch is never executed. If you substitute it with, e.g.,
if self.scan in ["true", "True", 1]:
everything works fine.
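The same fix can be centralized instead of string-matching at each call site. This is a sketch with a hypothetical to_bool helper; depending on the splunklib version bundled with the app, attaching validate=validators.Boolean() to the Option declaration may also achieve this:

```python
def to_bool(value):
    """Normalize a Splunk option value (which arrives as a string)
    into a real boolean. to_bool is a hypothetical helper, not app code."""
    if isinstance(value, bool):
        return value
    return str(value).strip().lower() in ("1", "t", "true", "yes")

# With this, the branch reads naturally:
# if to_bool(self.scan): ... else: ...
print(to_bool("false"), to_bool("True"))
# -> False True
```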
Queries that previously worked have stopped working and return this error:
External search command 'ess' returned error code 1. Script output = "error_message=ScanError at "D:\Splunk\etc\apps\elasticsplunk-master\bin\elasticsearch\helpers\__init__.py", line 394 : Scroll request has only succeeded on 3492 shards out of 3544. "
Any queries at any time ranges return this same error. The queries run OK directly from the Elasticsearch 6.3.1 API:
curl -k -X GET "http://es-host/winlogbeat-*/_search?q=event_data.param1:*AP001175*&pretty"
Ten records are returned, and the response begins with:
{
"took" : 33521,
"timed_out" : false,
"num_reduce_phases" : 7,
"_shards" : {
"total" : 3544,
"successful" : 3492,
"skipped" : 0,
"failed" : 0
},
In order to retrieve data from my instance of ES: we don't support Basic Authorization (i.e. the auth used by urllib3). Is there any way to utilize the requests module to allow for further authentication?
Perhaps the requests module, where you can specify a username and password by creating a session first.
Not sure if I am using the fields incorrectly, but when I populate them (e.g. now-6M or now-1y) the time doesn't actually reflect those values. Instead it just takes whatever time value is populated in the Splunk GUI.
tsfield="@timestamp" latest=now earliest="now-1y"
If Splunk is set to look at the last 15 minutes, the HTTP query sent to Elasticsearch covers the last 15 minutes, not back one year.
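As a sketch of what a fix could pass through: Elasticsearch date math already understands expressions such as now-1y, so the earliest/latest options could be forwarded directly in a range filter instead of the GUI time range. build_time_filter is a hypothetical helper, and the surrounding body layout is an assumption about how the query is assembled:

```python
def build_time_filter(tsfield, earliest, latest):
    """Range filter on the timestamp field. Elasticsearch date math
    accepts expressions such as 'now-1y' directly, so the earliest/latest
    options can be passed straight through.
    build_time_filter is a hypothetical helper."""
    return {"range": {tsfield: {"gte": earliest, "lte": latest}}}

# How the filter might sit inside a search body (layout is an assumption):
body = {"query": {"bool": {
    "must": [{"query_string": {"query": "host:1.2.3.4"}}],
    "filter": [build_time_filter("@timestamp", "now-1y", "now")],
}}}
```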
External search command 'ess' returned error code 1. Script output = "error_message=KeyError at ".../bin/elasticsplunk.py", line 188 : u'fields.release_date' "
Curious: is there a way to make the latest and earliest fields work in other apps? The fix in the other issue does work in the ElasticSplunk app, but if I try to use the ess command in something like Search & Reporting it still defaults to the time in the GUI rather than the time in the search command.
Basically, I want to search against an ES cluster where the Shield plugin is implemented, so I need to provide authentication details. With the current implementation of this app, I am getting the following error:
External search command 'ess' returned error code 1. Script output = "error_message=AuthenticationException at "/Users/jigsaw/Documents/splunk/etc/apps/elasticsplunk/bin/elasticsearch/connection/base.py", line 125 : TransportError(401, u'security_exception', u'missing authentication token for REST request [/quotingreport*/_search?size=10000&scroll=5m]') "
Please help me work out how to solve this.
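Not an official fix, but one workaround sketch: the elasticsearch-py client accepts credentials embedded in the host URL (and an http_auth argument when constructing the client directly), so credentials could ride along in the address the command already takes. with_credentials is a hypothetical helper and the values are placeholders:

```python
from urllib.parse import quote

def with_credentials(addr, user, password):
    """Embed credentials into an eaddr-style URL:
    http://host:9200 -> http://user:pass@host:9200.
    with_credentials is a hypothetical helper; values are placeholders."""
    scheme, rest = addr.split("://", 1)
    return "%s://%s:%s@%s" % (scheme, quote(user, safe=""),
                              quote(password, safe=""), rest)

print(with_credentials("http://localhost:9200", "elastic", "p@ss"))
# -> http://elastic:p%40ss@localhost:9200
```

When constructing the client directly, elasticsearch-py also accepts http_auth=("user", "pass").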
Hi,
I'm testing out this plugin, and it works fine as long as the queries don't use wildcards. Is this supported?
For example, this works fine:
But this just spins:
Hi,
How do I execute a command similar to the one below across multiple indexes?
index=* OR index=_* NOT index=main NOT index=history NOT sourcetype=stash
Can we do the same thing here as well? I tried, but it's not working. If you could let me know how to do it, that would be very helpful. Thanks.
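On the Elasticsearch side, the index argument can carry a comma-separated list of patterns, and (in 6.x) a leading '-' excludes a pattern, which roughly approximates Splunk's NOT index= clauses. A minimal sketch with a hypothetical helper:

```python
def build_index_arg(include, exclude):
    """Build an Elasticsearch multi-index expression: a comma-separated
    list of patterns, where a leading '-' excludes a pattern.
    build_index_arg is a hypothetical helper."""
    return ",".join(list(include) + ["-" + e for e in exclude])

print(build_index_arg(["logs-*", "metrics-*"], ["logs-2018.01*"]))
# -> logs-*,metrics-*,-logs-2018.01*
```

Note that sourcetype-based exclusions have no direct Elasticsearch equivalent; those would have to be expressed in the query string instead.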
The dataset (name: shakespeare) has no timestamp field. How can I get this data?
https://www.elastic.co/guide/en/kibana/current/tutorial-load-dataset.html
Currently ElasticSplunk doesn't handle nested documents in search hits from Elastic (it does handle them in searches, though - object.object.attr:value in the query argument).
This can be handled the way Kibana does it in results, by flattening nested documents like:
{
  "l1": {
    "l2": {
      "attr1": "value"
    }
  }
}
Into:
l1.l2.attr1:value
The same structure can be used for searches.
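The flattening described above can be sketched in a few lines; flatten is a hypothetical helper, not the app's code:

```python
def flatten(doc, prefix=""):
    """Flatten nested dicts into Kibana-style dotted keys:
    {"l1": {"l2": {"attr1": "value"}}} -> {"l1.l2.attr1": "value"}.
    flatten is a hypothetical helper, not the app's code."""
    flat = {}
    for key, value in doc.items():
        name = prefix + key
        if isinstance(value, dict):
            flat.update(flatten(value, name + "."))
        else:
            flat[name] = value
    return flat

print(flatten({"l1": {"l2": {"attr1": "value"}}}))
# -> {'l1.l2.attr1': 'value'}
```

Since Elasticsearch already accepts the dotted form in queries, applying this only to hits keeps field names consistent between search input and results.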
Thanks @heipei for bringing this up.
Hello!
When performing a search with elasticsplunk, the results always appear in the Statistics tab instead of the Events tab. It seems to be related to the search_command settings, but I've tried adding retainsevents = true in commands.conf and nothing has changed. Is this the normal behaviour?
Thanks,
Hi,
I installed this app on a server and pointed it at a local ELK setup, and it works as long as I just use "localhost:9200" for the ELK server. If I try to include either http or https in the index address, it fails. Any suggestions?
Hi,
I'm trying to query an elastic search instance, but I never get any results and the logs are showing an error. Am I doing something wrong?
Query:
|ess eaddr="http://1.2.3.4:9200" tsfield="@timestamp" index=netflow-2018.05.01 earliest="now-2h" query="host:1.2.3.4" fields=host
Error:
05-02-2018 12:40:47.197 ERROR ScriptRunner - stderr from '/opt/splunk/bin/python /opt/splunk/etc/apps/elasticsplunk-master/bin/elasticsplunk.py EXECUTE eaddr="http://1.2.3.4:9200" tsfield="@timestamp" index=netflow-2018.05.01 earliest="now-2h" query="host:1.2.3.4" fields=host': 2018-05-02 12:40:47,197, Level=DEBUG, Pid=3948, Logger=splunklib, File=search_command.py, Line=624, ElasticSplunk.process finished under protocol_version=1
05-02-2018 12:40:47.238 INFO script - Invoked script ess with 399 input bytes (0 events). Returned 0 output bytes in 403 ms.
Hello!
I'm trying the splunk application and when I try to perform a search in elasticsearch I get an error. The query is:
|ess eaddr="http://localhost:9200" index=test-index-* query="test"
And in the splunkd.log:
############
11-28-2017 10:10:01.011 +0100 ERROR ScriptRunner - stderr from '/opt/splunk/bin/python /opt/splunk/etc/apps/elasticsplunk/bin/elasticsplunk.py GETINFO eaddr="http://localhost:9200" index=test-index- query="test"': 2017-11-28 10:10:01,010, Level=DEBUG, Pid=23565, Logger=splunklib, File=search_command.py, Line=572, ElasticSplunk.process started under protocol_version=1
11-28-2017 10:10:01.011 +0100 ERROR ScriptRunner - stderr from '/opt/splunk/bin/python /opt/splunk/etc/apps/elasticsplunk/bin/elasticsplunk.py GETINFO eaddr="http://localhost:9200" index=test-index- query="test"': 2017-11-28 10:10:01,011, Level=DEBUG, Pid=23565, Logger=splunklib, File=search_command.py, Line=579, Writing configuration settings
11-28-2017 10:10:01.012 +0100 ERROR ScriptRunner - stderr from '/opt/splunk/bin/python /opt/splunk/etc/apps/elasticsplunk/bin/elasticsplunk.py GETINFO eaddr="http://localhost:9200" index=test-index- query="test"': 2017-11-28 10:10:01,011, Level=DEBUG, Pid=23565, Logger=splunklib, File=search_command.py, Line=508, metadata={u'action': u'getinfo', u'searchinfo': {u'earliest_time': None, u'dispatch_dir': None, u'owner': None, u'args': ['/opt/splunk/etc/apps/elasticsplunk/bin/elasticsplunk.py', 'GETINFO', 'eaddr="http://localhost:9200"', 'index=test-index-', 'query="test"'], u'latest_time': None, u'splunk_version': u'7.0.0', u'sid': u'searchparsetmp_1116702311', u'username': None, u'search': u'|ess eaddr="http://localhost:9200" index=test-index- query="test"', u'app': None, u'session_key': None, u'splunkd_uri': None, u'raw_args': ['/opt/splunk/etc/apps/elasticsplunk/bin/elasticsplunk.py', 'GETINFO', 'eaddr="http://localhost:9200"', 'index=test-index-', 'query="test"']}, u'preview': True}, input_header={u'truncated': u'0', u'sid': u'searchparsetmp_1116702311', u'keywords': u'""', u'splunkVersion': u'7.0.0', u'preview': u'0', u'realtime': u'0', u'allowStream': u'1', u'search': u'|ess eaddr="http://localhost:9200" index=test-index- query="test"'}
############
I'm using Splunk 7 and ES 6.
Thanks in advance.
Hi,
Is there a way to add LDAP connection details so it can be used with an Elasticsearch cluster secured with readonlyREST?