
tyk-pump's People

Contributors

alephnull, asoorm, asutosh, bender, buger, christtyk, dependabot[bot], deployinbinary, duythinht, ermirizio, excieve, furkansenharputlu, gernest, ilijabojanovic, joshblakeley, keyur9, kofoworola, komalsukhani, konrad-sol, letzya, lonelycode, marksou, matiasinsaurralde, mativm02, mvdan, runivn, sedkis, sredxny, tbuchaillot, zavierazo


tyk-pump's Issues

Raw data is sent in base64 to Elasticsearch

When you use the Elasticsearch pump, raw data is sent base64-encoded, which is pretty useless with Kibana if you want to search in the body of the requests or the responses.
There should be an automatic base64 decode before raw_data is sent to Elasticsearch.
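
A minimal sketch of such a decode step, assuming the record carries its raw request/response as base64 strings (the function and field names are illustrative, not the pump's actual API):

package main

import (
	"encoding/base64"
	"fmt"
)

// Sketch: decode the base64 payloads before the record is indexed, so the
// request/response bodies become searchable in Kibana. Invalid base64 is
// left untouched.
func decodeRaw(rawRequest, rawResponse string) (string, string) {
	if req, err := base64.StdEncoding.DecodeString(rawRequest); err == nil {
		rawRequest = string(req)
	}
	if resp, err := base64.StdEncoding.DecodeString(rawResponse); err == nil {
		rawResponse = string(resp)
	}
	return rawRequest, rawResponse
}

func main() {
	req, resp := decodeRaw("R0VUIC9waW5n", "cG9uZw==")
	fmt.Println(req, resp) // GET /ping pong
}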

Error using Elastisearch 5+

I have this error message using ES 5+ with tyk pump,

Elasticsearch connection failed: no Elasticsearch node available

it works fine when I use ES 2+, this is my pump.conf

{ "analytics_storage_type": "redis", "analytics_storage_config": { "type": "redis", "host": "localhost", "port": 6379, "hosts": null, "username": "", "password": "", "database": 0, "optimisation_max_idle": 100, "optimisation_max_active": 0, "enable_cluster": false }, "purge_delay": 10, "pumps": { "elasticsearch": { "name": "elasticsearch", "meta": { "index_name": "tyk_analytics", "elasticsearch_url": "127.0.0.1:9200", "enable_sniffing": false, "document_type": "tyk_analytics", "rolling_index": false, "extended_stats": false, "version": "5" } } }, "dont_purge_uptime_data": true }

I'm using tyk pump 0.4.2

Demo data not building

I'd like to build the demo set, but with go run *.go in the repo root, it gives me:

./main.go:191:56: cannot use writeToPumps (type func([]interface {}, *"tyk-pump/vendor/github.com/gocraft/health".Job, time.Time)) as type func([]interface {}, *"github.com/TykTechnologies/tyk-pump/vendor/github.com/gocraft/health".Job, time.Time) in argument to demo.GenerateDemoData

Very new to Golang. I guess that's an easy issue, but I can't get it resolved...

Metric prefix for statsd pump

It seems that it's currently not possible to specify a prefix to apply to all metrics pushed by the statsd pump (or am I wrong?). It would be really useful to be able to create a namespace for Tyk metrics (even hardcoding it to tyk would be an improvement).
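
For what it's worth, a client like quipo/statsd already accepts a prefix at construction time, so the pump would mostly need to plumb a new meta option through. A sketch (the "prefix" option itself is hypothetical):

package main

import "github.com/quipo/statsd"

// Sketch: pass a configurable namespace prefix straight to the statsd client;
// every metric the pump emits would then start with e.g. "tyk.".
func newClient(address, prefix string) (*statsd.StatsdClient, error) {
	client := statsd.NewStatsdClient(address, prefix) // prefix e.g. "tyk."
	if err := client.CreateSocket(); err != nil {
		return nil, err
	}
	return client, nil
}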

All pumps should be configurable with env vars

In some setups (e.g. running containers) it might be easier to do configuration through environment variables. We already support that for MongoDB pumps using PMP_MONGO, PMP_MONGOAGG, etc. env prefixes but that's missing from the ElasticSearch pump and possibly some others. This needs to be reviewed and implemented for all the pumps.
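
For example, the Elasticsearch pump could follow the same convention (the prefix and variable names below are illustrative, not an existing API):

PMP_ELASTICSEARCH_INDEXNAME=tyk_analytics
PMP_ELASTICSEARCH_ELASTICSEARCHURL=http://localhost:9200
PMP_ELASTICSEARCH_VERSION=5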

No pre-built binaries

I may be missing this, but the release page only has the source code, and there doesn't seem to be a surefire way of getting the binary apart from through apt/yum.

Would it be possible to release a packaged binary as part of the github release similarly to how other components are managed?

Add TLS support for MongoDB connections

The MongoDB driver supports TLS connections, but it requires a custom dial function. This is currently not implemented, and it prevents users from using hosted / SaaS MongoDB setups where, because of the open-network hop, SSL is enforced.

This is related to this issue.
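
With mgo, the usual pattern is to parse the URL into a DialInfo and supply a TLS-aware DialServer; a minimal sketch (a real implementation would load CA certificates into the tls.Config rather than use the zero value):

package main

import (
	"crypto/tls"
	"net"

	mgo "gopkg.in/mgo.v2"
)

// Sketch: dial MongoDB over TLS by overriding mgo's DialServer, which the
// driver otherwise leaves as a plain TCP dial.
func dialMongoTLS(mongoURL string) (*mgo.Session, error) {
	dialInfo, err := mgo.ParseURL(mongoURL)
	if err != nil {
		return nil, err
	}
	dialInfo.DialServer = func(addr *mgo.ServerAddr) (net.Conn, error) {
		return tls.Dial("tcp", addr.String(), &tls.Config{})
	}
	return mgo.DialWithInfo(dialInfo)
}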

Add config to disable background MongoDB indexing

AWS DocumentDB allows only one background index build to run at a time on a collection, and further builds fail, as the documentation states:

Amazon DocumentDB allows only one background index build to occur on a collection at any given time. If DDL (Data Definition Language) operations such as rename collection or create index occur during a background index build, the background index build fails.

Currently we create all MongoDB indexes in a background mode so there's a chance for that to fail. We should add an option to disable background indexing (with background enabled by default). This shouldn't have any performance impact on new collections but should be documented somewhere in case people are migrating large existing DBs from somewhere else as index build might block other ops.
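
A sketch of what the option could drive in the mongo pump (the flag name is hypothetical; background builds stay the default):

package main

import mgo "gopkg.in/mgo.v2"

// Sketch: make the Background flag configurable; a hypothetical
// disable-background-indexing option turns it off for DocumentDB.
func ensureTimestampIndex(c *mgo.Collection, backgroundDisabled bool) error {
	return c.EnsureIndex(mgo.Index{
		Key:        []string{"-timestamp"},
		Background: !backgroundDisabled,
	})
}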

bug: time.Sleep should specify number of seconds to sleep for

tyk/rpc_analytics_purger.go:37:3: time.Sleep(sleep)
tyk-pump/pumps/elasticsearch.go:137:3: time.Sleep(5)
tyk-pump/pumps/influx.go:72:3: time.Sleep(5)
tyk-pump/pumps/mongo.go:125:3: time.Sleep(5)
tyk-pump/pumps/mongo_aggregate.go:333:3: time.Sleep(5)
tyk-pump/pumps/mongo_selective.go:104:3: time.Sleep(5)
tyk-pump/pumps/statsd.go:63:3: time.Sleep(5)
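
In Go, time.Sleep takes a time.Duration, which is measured in nanoseconds, so time.Sleep(5) sleeps for five nanoseconds rather than the intended five seconds. Each call site should multiply by a unit:

time.Sleep(5 * time.Second)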

Statsd pump: Allow escaping/replacing of dots in tag values

The Statsd pump concatenates the values of the configured tags using a dot as the separator, see statsd.go at line 119. This itself is fine, as statsd metrics are usually separated by dots. In fact, the prometheus statsd-exporter expects this to be the case.

Unfortunately, some tag values (like ip_address or sometimes even path) contain dots themselves, thus breaking (or at least ruining) the generated metric name. Because of this, I would like to have the option to replace the dots inside the tag values with another character, like the underscore for example.
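
A sketch of the kind of sanitisation requested, with the replacement character as a hypothetical config option:

package main

import (
	"fmt"
	"strings"
)

// Sketch: replace dots inside tag values before they are joined into a
// statsd metric name; the "_" replacement would come from configuration.
func metricName(tagValues []string) string {
	safe := make([]string, len(tagValues))
	for i, v := range tagValues {
		safe[i] = strings.Replace(v, ".", "_", -1)
	}
	return strings.Join(safe, ".")
}

func main() {
	fmt.Println(metricName([]string{"GET", "10.0.0.1", "example.com"}))
	// GET.10_0_0_1.example_com
}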

tyk_analytics Alias mapping

I'm pretty new to Tyk and Pump, but after having a dig around what is stored in Redis and the mappings here, I was just wondering why Alias is not defined in the pump mappings.

I'd like to define our users ID in the alias so we can link analytics to our users.

Happy to open a PR if this is something you are open to, but thought I'd ask first as there may be a genuine reason why Alias has not been included.

Add log browser index to analytics collections

Currently there's no efficient index created for analytics collections in MongoDB. This results in the dashboard log browser failing to display anything due to mongo timeouts on large enough collections. This also causes excessive load on the MongoDB itself.

So far the following composite index has proved to work pretty well:

db.tyk_analytics.createIndex({"timestamp" : -1, "orgid" : 1, "apiid" : 1, "apikey" : 1, "responsecode" : 1}, {background: true})

This index starts with the descending timestamp, which is always present in the log browser requests and includes all the optional parameters that it can use. Note that orgid should not be required for on-prem installations with a single org.

The index above can grow quite large too, though, so limiting it to just timestamp + orgid (when needed) may be enough, especially for the log browser case, which shows only a limited number of results per page.

In addition to this, pump currently creates a number of composite indices that are never utilised due to starting with an _id field. Those should be removed.

Unstable packages

The pump is the last project left without an unstable packages pipeline and unstable Docker images.

Health Check

Do you want to request a feature or report a bug?
Feature

What is the current behavior?
N/A

What is the expected behavior?
Tyk Pump should have a health check functionality

Which versions of Tyk are affected by this issue? Did this work in previous versions of Tyk?
All

Some MongoDB index names are too long

AWS DocumentDB has limitations on index name length (up to 63 chars together with the collection name). When going over the limit, index creation fails:

  • One notable case in the mongo pump is the "log browser" index, created at:
    https://github.com/TykTechnologies/tyk-pump/blob/master/pumps/mongo.go#L239
    This should be replaced with a custom shorter name instead of automatically generating from field names.
  • Other indexes need to be reviewed too as e.g. in combination with a selective pump the full entity name limit (up to 127 chars) may be reached.
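
With mgo, the fix is to set an explicit short Name on the index instead of letting the driver derive one from the field list; a sketch (the name "logBrowser" is illustrative):

package main

import mgo "gopkg.in/mgo.v2"

// Sketch: a short explicit index name keeps "collection.index" under
// DocumentDB's 63-char limit regardless of how many fields the key spans.
func ensureLogBrowserIndex(c *mgo.Collection) error {
	return c.EnsureIndex(mgo.Index{
		Key:        []string{"-timestamp", "orgid", "apiid", "apikey", "responsecode"},
		Name:       "logBrowser",
		Background: true,
	})
}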

uptime_pump: allow configuration via env vars

Docs indicate that the uptime pump may be configured via environment variables; however, the type is simply interface{}, which will not allow this.

We should be able to set env vars as follows:

TYK_PMP_UPTIMEPUMPCONFIG_COLLECTIONNAME
TYK_PMP_UPTIMEPUMPCONFIG_MONGOURL
TYK_PMP_UPTIMEPUMPCONFIG_MONGOUSESSL
TYK_PMP_UPTIMEPUMPCONFIG_MONGOSSLINSECURESKIPVERIFY
TYK_PMP_UPTIMEPUMPCONFIG_MAXINSERTBATCHSIZEBYTES
TYK_PMP_UPTIMEPUMPCONFIG_MAXDOCUMENTSIZEBYTES
TYK_PMP_UPTIMEPUMPCONFIG_COLLECTIONCAPMAXSIZEBYTES
TYK_PMP_UPTIMEPUMPCONFIG_COLLECTIONCAPENABLE
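
A sketch of the fix: replace the interface{} with a concrete type that an envconfig-style loader can populate. The field set below mirrors the variables above and is illustrative:

package main

// Sketch: a typed uptime pump config that TYK_PMP_UPTIMEPUMPCONFIG_* vars
// can map onto, instead of the current interface{}.
type UptimePumpConfig struct {
	CollectionName             string `json:"collection_name"`
	MongoURL                   string `json:"mongo_url"`
	MongoUseSSL                bool   `json:"mongo_use_ssl"`
	MongoSSLInsecureSkipVerify bool   `json:"mongo_ssl_insecure_skip_verify"`
	MaxInsertBatchSizeBytes    int    `json:"max_insert_batch_size_bytes"`
	MaxDocumentSizeBytes       int    `json:"max_document_size_bytes"`
	CollectionCapMaxSizeBytes  int    `json:"collection_cap_max_size_bytes"`
	CollectionCapEnable        bool   `json:"collection_cap_enable"`
}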

If endpoint path contains dot, then mongodb data gets corrupted

Analytics data for an endpoint is recorded in a field whose name is the same as the path.
MongoDB doesn't support dots in field names, so if the path contains a dot, the collection gets corrupted.

Example

If analytics is enabled for the endpoint /get/(.*), then instead of a field named /get/(.*), an embedded document /get/( containing a field *) gets created.
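
A sketch of the kind of key escaping that would avoid this, applied before the document is inserted (the "%2E" stand-in is illustrative and reversible on read):

package main

import (
	"fmt"
	"strings"
)

// Sketch: escape dots in field names so MongoDB doesn't treat them as
// sub-document separators, recursing into nested maps.
func escapeKeys(doc map[string]interface{}) map[string]interface{} {
	out := make(map[string]interface{}, len(doc))
	for k, v := range doc {
		if nested, ok := v.(map[string]interface{}); ok {
			v = escapeKeys(nested)
		}
		out[strings.Replace(k, ".", "%2E", -1)] = v
	}
	return out
}

func main() {
	fmt.Println(escapeKeys(map[string]interface{}{"/get/(.*)": 1}))
	// map[/get/(%2E*):1]
}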

Upgrade to Elastic v5

The current version of the Tyk Pump only supports ElasticSearch version 3. Can pump/elasticsearch.go please be updated to use version 5 of the Elastic package (gopkg.in/olivere/elastic.v5), which supports ElasticSearch v5?
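
For reference, a minimal sketch of constructing the v5 client (sniffing disabled, which single-node and container setups typically need):

package main

import (
	"log"

	elastic "gopkg.in/olivere/elastic.v5"
)

func main() {
	// Sketch: the v5 client speaks the Elasticsearch 5.x API.
	client, err := elastic.NewClient(
		elastic.SetURL("http://127.0.0.1:9200"),
		elastic.SetSniff(false),
	)
	if err != nil {
		log.Fatal(err)
	}
	_ = client
}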

mongo pump should add default indexes

Right now the mongo collection used by the log viewer does not have any indexes, unlike the aggregated mongo pump, which has proper indexes.

We should add at least "timestamp" index to mongo collection.

No reconnecting to database.

Hi everyone.

I was getting an error with the stats that Pump handles: the stats weren't being shown. The relevant line in the syslog:

level=error msg="Problem inserting to mongo collection: EOF"

It seems that tyk-pump can't reconnect to the Mongo database. It seems that this link is not totally correct (I mean, someone says the problem is fixed... but it isn't).

If I stop and restart Mongo I get this error, and then if I restart Pump... it works again.

Support for sending only aggregated data from hybrid pump

In an MDCB/Hybrid installation, slaved gateways send an analytics request for each record, causing a huge amount of network traffic to MDCB.

We need a way to pre-calculate aggregate analytics on the local pump side, so it will send only aggregated data to MDCB layer.

The change will require modifications on both the Pump and MDCB sides. For Pump, we suggest extending the hybrid pump with a new "aggregated" field like this:

        "hybrid": {
            "name": "hybrid",
            "meta": {
                "rpc_key": "5b5fd341e6355b5eb194765e",
                "api_key": "008d6d1525104ae77240f687bb866974",
                "aggregated": true,
                "connection_string": "localhost:9090",
            }
        }

Possible memory leak on production

I had this memory problem today. Since I'm using Dynatrace as a monitoring service, I have the timeline of what happened.

Jul 13 18:42:52 ip-172-31-11-143 kernel: [1135762.108931] Out of memory: Kill process 27863 (tyk-pump) score 507 or sacrifice child
Jul 13 18:42:52 ip-172-31-11-143 kernel: [1135762.115685] Killed process 27863 (tyk-pump) total-vm:701084kB, anon-rss:530936kB, file-rss:0kB
Jul 13 18:42:52 ip-172-31-11-143 kernel: [1135762.209707] init: tyk-pump main process (27863) killed by KILL signal
Jul 13 18:42:52 ip-172-31-11-143 kernel: [1135762.209719] init: tyk-pump main process ended, respawning

[attached: "memoryleak" memory usage graph]

Output to ELK

A brilliant feature would be creating an output to Elasticsearch, as that's a powerful tool for analytics such as this.

Could use something like this https://github.com/olivere/elastic

We (Crunch) may look at doing a pull request if we create this functionality at some point, but thought I'd make an Issue here so it could be tracked

Cheers,

Generate demo data

As a developer, I would like to be able to test on my local machine all the dashboard sections that present statistics (charts).

The pages that are currently missing data are:
/geo/
/geo/:isoCode
/uptime
/uptime/url
/uptime/urls

Allow Pumps to Identify Keys

Currently Pumps can only track an AnalyticsRecord anonymously, e.g. based on API key. When a key is created, more information is available that can be used to associate the API key with more permanent user information such as email address/username, Tyk developer ID, and meta fields.

With this information it would be possible to track developers across key changes.

Specifically in the case of the Segment pump this information can be used with the Identify operation https://segment.com/docs/libraries/go/#identify.

logrus.Fields

I keep getting the following error (using GoLand), even after installing all the imports, when the code uses log.WithFields(logrus.Fields{}).

The error is: [screenshot attached]

Am I missing something?
Thanks

add new pump type "hybrid" to send analytics data to MDCB

This requires adding a new pump type, "hybrid", which would re-use Tyk's RPC layer to send analytics data to MDCB.

Also, we should add a new option in Tyk to disable sending analytics via RPC, so Tyk and tyk-pump won't clash processing analytics data.

Pump tries to init MongoDB, even if not configured - then hangs forever

My tyk-pump 0.4.2 hangs on startup:

tyk-pump[24603]: time="Sep 4 14:54:55" level=info msg="## Tyk Analytics Pump, v0.4.2 ##"
tyk-pump[24603]: time="Sep 4 14:54:55" level=debug msg="CSV Initialized"
tyk-pump[24603]: time="Sep 4 14:54:55" level=info msg="Init Pump: CSV Pump"
tyk-pump[24603]: time="Sep 4 14:54:55" level=info msg="-- No max batch size set, defaulting to 10MB"
tyk-pump[24603]: time="Sep 4 14:54:55" level=info msg="-- No max document size set, defaulting to 10MB"

That is the complete log of tyk-pump.

Looking at the code, the string "No max document size set" is only logged by the MongoDB pumps. I think it hangs forever at m.connect().

But I don't want to use MongoDB and I haven't configured it. This is my pump.conf:

{
    "analytics_storage_type": "redis",
    "analytics_storage_config": {
        "type": "redis",
        "host": "localhost",
        "port": 6379,
        "hosts": null,
        "username": "",
        "password": "",
        "database": 0,
        "optimisation_max_idle": 100,
        "optimisation_max_active": 0,
        "enable_cluster": false
    },
    "purge_delay": 10,
    "pumps": {
	"csv": {
            "name": "csv",
            "meta": {
                "csv_dir": "./"
            }
	}
    }
}

bug: WriteUptimeData best-effort insert logic potential data loss

WriteUptimeData attempts to insert into the AnalyticsCollection. Upon failure, an error is logged and, if the failure was due to a disconnect, a reconnect is attempted. WriteUptimeData doesn't currently return anything.

https://github.com/TykTechnologies/tyk-pump/blob/master/pumps/mongo.go#L267

The problem with this logic is that the caller, StartPurgeLoop, gets no feedback from WriteUptimeData as to whether the operation was successful. As such, data gets purged from the analytics storage despite not having been successfully written to Mongo.

Is it intended functionality for WriteUptimeData to be on a best-effort basis and a degree of data loss acceptable for this logic?
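
One way out is to have WriteUptimeData return the insert error and let the purge loop skip the delete on failure. A sketch of the shape, against the existing MongoPump type (field names illustrative, internals elided):

// Sketch: surface the insert failure so the purge loop only deletes records
// from analytics storage after a successful write.
func (m *MongoPump) WriteUptimeData(data []interface{}) error {
	c := m.dbSession.DB("").C(m.dbConf.CollectionName)
	if err := c.Insert(data...); err != nil {
		return err // caller keeps the batch and retries next cycle
	}
	return nil
}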

Should there be a v0.6 release here?

The title of this release over in the tyk repo strongly hints that tyk-pump v0.6 should exist, but I cannot find it, and it is not in the Ubuntu package repo (the latest there is 0.5.4). Is this intended, or an error? I was hoping to upgrade to a tyk-pump with ElasticSearch v6 support, which was added a few months back.

Potential memory leak in the pump over time

Tyk Pump has a potential memory leak over time. It is confirmed with the standard mongo pumps, but it is exacerbated by the elasticsearch pump.

The behaviour was seen with 10s purge delays.

tyk-pump: error: unknown long flag '--c', try --help

Hi All,

Please help me with this. I am running TYK_quickstart in my local virtual machine. It worked fine for me last week, but I just removed the Docker images and reloaded using docker-compose, without any change in the files.

Now tyk-pump is throwing this error.

Docker Version:
Client:
Version: 17.05.0-ce
API version: 1.29
Go version: go1.7.5
Git commit: 89658be
Built: Thu May 4 22:06:25 2017
OS/Arch: linux/amd64

Server:
Version: 17.05.0-ce
API version: 1.29 (minimum version 1.12)
Go version: go1.7.5
Git commit: 89658be
Built: Thu May 4 22:06:25 2017
OS/Arch: linux/amd64
Experimental: false

pump.conf

{
    "analytics_storage_type": "redis",
    "analytics_storage_config": {
        "type": "redis",
        "host": "redis",
        "port": 6379,
        "hosts": null,
        "username": "",
        "password": "",
        "database": 0,
        "optimisation_max_idle": 100,
        "optimisation_max_active": 0,
        "enable_cluster": false
    },
    "purge_delay": 2,
    "pumps": {
        "mongo": {
            "name": "mongo",
            "meta": {
                "collection_name": "tyk_analytics",
                "mongo_url": "mongodb://mongo:27017/tyk_analytics"
            }
        }
    },
    "uptime_pump_config": {
        "collection_name": "tyk_uptime_analytics",
        "mongo_url": "mongodb://mongo:27017/tyk_analytics"
    },
    "dont_purge_uptime_data": false
}

Looking for help on this.

docker-compose file:

version: '2'
services:
  tyk_redis:
    image: redis:3.2
    hostname: redis
    ports:
      - "6379:6379"
    networks:
      gateway:
        aliases:
          - redis
  tyk_mongo:
    image: mongo:3.0
    command: ["mongod", "--smallfiles"]
    hostname: mongo
    ports:
      - "27017:27017"
    networks:
      gateway:
        aliases:
          - mongo
  tyk_dashboard:
    image: tykio/tyk-dashboard:latest
    ports:
      - "3000:3000"
      - "5000:5000"
    volumes:
      - ./tyk_analytics.conf:/opt/tyk-dashboard/tyk_analytics.conf
    networks:
      - gateway
  tyk_pump:
    image: tykio/tyk-pump-docker-pub:latest
    networks:
      - gateway
    volumes:
      - ./pump.conf:/opt/tyk-pump/pump.conf
  tyk_gateway:
    image: tykio/tyk-gateway:latest
    ports:
      - "80:8080"
      - "8080:8080"
    volumes:
      - ./tyk.conf:/opt/tyk-gateway/tyk.conf
    networks:
      - gateway

networks:
  gateway:

[TT-1337] Tag fields should be mongo-safe

When creating aggregates for tag fields, they should be made safe for inclusion in Mongo; tags that contain dots (".") will generate insert/upsert errors because the field name is not usable.

Make the tyk_analytics collection capacity configurable in pump.conf

At present, the tyk_analytics collection in MongoDB can grow until it hits the in-memory limit. This occurs because Pump neither applies a cap on the size of the collection nor any indexing.
The appropriate cap will vary from one installation to another, based on the amount of RAM installed on the machine. A new feature should be added to make the capacity configurable from within the pump.conf file.
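
For illustration, the shape could mirror the collection-cap settings that appear in the uptime pump env var list elsewhere in this tracker (option names below are illustrative, not an existing API):

    "mongo": {
        "name": "mongo",
        "meta": {
            "collection_name": "tyk_analytics",
            "mongo_url": "mongodb://mongo:27017/tyk_analytics",
            "collection_cap_enable": true,
            "collection_cap_max_size_bytes": 1073741824
        }
    }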
