
event_exporter's People

Contributors

anoopwebs, bbbmj, caicloud-bot, ddysher, denysvitali, lichuan0620, pendoragon, reason2010, seymourtang, supereagle, turbotankist, zionwu

event_exporter's Issues

UnexpectedAdmissionError Alerting

Requesting the "help wanted" label.
We've encountered an issue where pods end up in the UnexpectedAdmissionError state. I found this tool and it looks like exactly what I need to get monitoring set up. I was wondering whether I should have log.level set to warning, since the error is a "Warning" type.

Events:
  Type     Reason                    Age    From                    Message
  ----     ------                    ----   ----                    -------
  Normal   Scheduled                 2m51s  default-scheduler       Successfully assigned default/backend-549f576d5f-xzdv4 to std-16gb-g7mo
  Warning  UnexpectedAdmissionError  2m51s  kubelet, std-16gb-g7m

Would the event-exporter catch this? If so, what would the Prometheus query on the event_reason label look like?
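
For reference, a Prometheus alerting rule along these lines could target that reason, assuming the kubernetes_events metric and labels shown elsewhere in these issues (this is a sketch, not an official example; the threshold and duration are illustrative):

groups:
  - name: kubernetes-events
    rules:
      - alert: UnexpectedAdmissionError
        # event_type / event_reason labels as exposed by kubernetes_events
        expr: kubernetes_events{event_type="Warning", event_reason="UnexpectedAdmissionError"} > 0
        for: 1m
        labels:
          severity: warning
        annotations:
          summary: "UnexpectedAdmissionError events in {{ $labels.event_namespace }}"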

Add an option to filter events

Is this a BUG REPORT or FEATURE REQUEST?:

/kind feature

What happened: All events are published into Prometheus

What you expected to happen: Some variable to select only certain events

How to reproduce it (as minimally and precisely as possible): N/A

Anything else we need to know?: N/A

* collected metric "kubernetes_events" was collected before with the same name and label values

Is this a BUG REPORT or FEATURE REQUEST?:

Uncomment only one, leave it on its own line:

/kind bug

/kind feature

What happened:

Error returned to prometheus scraping target: server returned HTTP status 500 Internal Server Error

An error has occurred while serving metrics:
2 error(s) occurred:
* collected metric "kubernetes_events" { label:<name:"event_kind" value:"Ingress" > label:<name:"event_message" value:"Ingress xxx/docker-registry-cache-docker-registry-caching-proxy" > label:<name:"event_name" value:"docker-registry-cache-docker-registry-caching-proxy" > label:<name:"event_namespace" value:"xxx" > label:<name:"event_reason" value:"UPDATE" > label:<name:"event_source" value:"/nginx-ingress-controller" > label:<name:"event_subobject" value:"" > label:<name:"event_type" value:"Normal" > gauge:<value:1 > } was collected before with the same name and label values
* collected metric "kubernetes_events" { label:<name:"event_kind" value:"Ingress" > label:<name:"event_message" value:"Ingress xxx/docker-registry-cache-docker-registry-caching-proxy" > label:<name:"event_name" value:"docker-registry-cache-docker-registry-caching-proxy" > label:<name:"event_namespace" value:"xxx" > label:<name:"event_reason" value:"UPDATE" > label:<name:"event_source" value:"/nginx-ingress-controller" > label:<name:"event_subobject" value:"" > label:<name:"event_type" value:"Normal" > gauge:<value:1 > } was collected before with the same name and label values

What you expected to happen:

A healthy endpoint with healthy metrics, and the error reported in the logs.

How to reproduce it (as minimally and precisely as possible):

This is specific to one of our production clusters; unfortunately we could not reproduce it.

Anything else we need to know?:

Checking the events with kubectl, we do see three events created at the same time with the same TYPE/REASON/OBJECT.

Question: What do the values mean for a kubernetes_events?

/kind feature

I noticed that some of the kubernetes_event metrics have a value of 0.
I'm assuming that a kubernetes_event with a value of 1 means that the event happened around the time of scraping, but I'm not sure what an event with a value of 0 means.

Like in the example:
# HELP kubernetes_events State of kubernetes events
# TYPE kubernetes_events gauge
kubernetes_events{event_kind="Pod",event_name="nginx-pc-534913751-2yzev",event_namespace="allen",event_reason="BackOff",event_source="kube-node-3/kubelet",event_subobject="spec.containers{nginx}",event_type="Normal"} 1
kubernetes_events{event_kind="Pod",event_name="nginx-pc-534913751-2yzev",event_namespace="allen",event_reason="Failed",event_source="kube-node-3/kubelet",event_subobject="spec.containers{nginx}",event_type="Warning"} 0

My question is, what do the 0 and 1 values mean for the kubernetes_events ?

Thank you!

Update vendors

Is this a BUG REPORT or FEATURE REQUEST?: FEATURE REQUEST

/kind feature

What happened:
The vendored dependencies are getting old; the last update was 3 years ago. Can we get them updated?

What you expected to happen:
Updated Vendors

How to reproduce it (as minimally and precisely as possible):

Anything else we need to know?:

Major Release Proposal (v1.0)

#28 has exposed a serious bug in our current design.

We are currently grouping events by the following labels:

label             from
event_namespace   event.InvolvedObject.Namespace
event_name        event.InvolvedObject.Name
event_kind        event.InvolvedObject.Kind
event_reason      event.Reason
event_type        event.Type
event_subobject   event.InvolvedObject.FieldPath
event_message     event.Message
event_source      event.Source.Host/event.Source.Component

Due to the absence of both event.Name and event.UID, if one object produced multiple events with the same reason and message at the same time, our code would attempt to expose multiple metrics with identical label sets, and #28 would happen.

#36 attempted to address this problem by adding an event_metaname label. This would fix the problem, but it would make the label names confusing: it is not immediately obvious what event_metaname and event_name each stand for. Another problem related to label naming is that Kubernetes has adopted a new metrics design best practice, and a metrics overhaul was implemented in Kubernetes 1.14; our naming does not fit that standard.

For the reasons explained above, I propose that we cut a major release (v1.0) that completely redefines the metrics. The code could use a cleanup in the process as well. We could test the changes alongside Compass 2.11.0 (which is also going through a non-compatible metrics overhaul), and release event_exporter v1.0.0 at the same time as Compass 2.11.0.

What etcd version is event_exporter using?

Is this a BUG REPORT or FEATURE REQUEST?:

Uncomment only one, leave it on its own line:

/kind bug

/kind feature

What happened:
What you expected to happen:

How to reproduce it (as minimally and precisely as possible):

Anything else we need to know?:
We need this information for doing some security scans on the app.

Feature: Add configuration to limit reading events to a specific namespace

/kind feature

What happened: Unable to specify a particular namespace to read events from.

What you expected to happen:
In our cluster, we do not have the rights to see all cluster resources. When starting the application, we get the following error.

E0421 17:03:18.033794       1 reflector.go:153] k8s.io/client-go/informers/factory.go:135: Failed to list *v1.Event: events is forbidden: User "system:serviceaccount:dev:event-exporter" cannot list resource "events" in API group "" at the cluster scope

How to reproduce it (as minimally and precisely as possible):

  • Create a new namespace
  • Create a new ServiceAccount in that namespace
  • Create a new RoleBinding to the view ClusterRole.
  • Bind the pod to the namespaced service account
  • Should see errors

Anything else we need to know?:
I tried installing it on OpenShift in a newly created namespace/project.
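
For reference, if the exporter gained a namespace-scoped mode, the RBAC could be narrowed to a namespaced Role and RoleBinding instead of a ClusterRole. This is only a hypothetical sketch (the namespace and ServiceAccount name are taken from the error message above); with the current cluster-scope listing it will not remove the error by itself:

apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: event-exporter
  namespace: dev
rules:
- apiGroups: [""]
  resources: ["events"]
  verbs: ["get", "watch", "list"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: event-exporter
  namespace: dev
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: event-exporter
subjects:
- kind: ServiceAccount
  name: event-exporter
  namespace: dev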

Event message no longer included as label in v1.0.0

Is this a BUG REPORT or FEATURE REQUEST?:
Potential bug report/regression or a feature request depending on motivations for v1.0.0

Uncomment only one, leave it on its own line:

/kind bug

What happened:
The event message is no longer exported as a label.

I believe it was lost in the refactoring. I'm not sure if this was on purpose or by mistake.
This looks like the line in the PR where the event_message used to be: https://github.com/caicloud/event_exporter/pull/43/files#diff-ed648181f98484bd12541509e0ae7b5ad1d1de7674ab1721814b47ddd5c95de4L49

With the new implementation of the event metric defined here: https://github.com/caicloud/event_exporter/pull/43/files#diff-56f9d3288b78a9692046117d91600fe9b201066802b9db18b7f59d283808cd39R154
The message is now no longer present.

What you expected to happen:
I would expect message to be a label on the event metric that we could then pattern match with alerting rules.

Anything else we need to know?:
As I said, I'm not sure whether this was by design or not for the v1.0.0 refactor.

Deploy event_exporter to Kubernetes cluster, no metrics in Prometheus

This event_exporter is nice! We would like to use it to monitor our cluster behavior and health.
We tried to deploy it in the cluster with the following:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: event-exporter-sa
  namespace: kyma-system
  labels:
    app: event-exporter
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRole
metadata:
  name: event-exporter 
rules:
- apiGroups: [""]
  resources: ["events"]
  verbs: ["get", "watch", "list"]
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
  name: event-exporter-rb
  labels:
    app: event-exporter
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: event-exporter
subjects:
- kind: ServiceAccount
  name: event-exporter-sa
  namespace: kyma-system

Deployment

{
   "apiVersion": "apps/v1",
   "kind": "Deployment",
   "metadata": {
      "labels": {
         "name": "event-exporter"
      },
      "name": "event-exporter"
   },
   "spec": {
      "replicas": 1,
      "revisionHistoryLimit": 2,
      "selector": {
         "matchLabels": {
            "app": "event-exporter"
         }
      },
      "strategy": {
         "type": "RollingUpdate"
      },
      "template": {
         "metadata": {
            "annotations": {
               "prometheus.io/path": "/metrics",
               "prometheus.io/port": "9102",
               "prometheus.io/scrape": "true"
            },
            "labels": {
               "app": "event-exporter",
            }
         },
         "spec": {
            "containers": [
               {
                  "command": [
                     "./event_exporter"
                  ],
                  "env": [ ],
                  "image": "caicloud/event-exporter:v0.2.0",
                  "imagePullPolicy": "Always",
                  "name": "event-exporter",
                  "ports": [
                     {
                        "containerPort": 9102,
                        "name": "http"
                     }
                  ],
                  "resources": {
                     "limits": {
                        "memory": "100Mi"
                     },
                     "requests": {
                        "memory": "40Mi"
                     }
                  }
               }
            ],
            "serviceAccountName": "event-exporter-sa",
            "terminationGracePeriodSeconds": 30
         }
      }
   }
}

Service

apiVersion: v1
kind: Service
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"v1","kind":"Service","metadata":{"annotations":{"prometheus.io/scrape":"true"},"labels":{"name":"event-exporter"},"name":"event-exporter","namespace":"kyma-system"},"spec":{"ports":[{"name":"http","port":80,"targetPort":9102}],"selector":{"app":"event-exporter"}}}
    prometheus.io/scrape: "true"
  creationTimestamp: "2020-04-16T15:53:31Z"
  labels:
    name: event-exporter
  name: event-exporter
  namespace: kyma-system
spec:
  clusterIP: 1*****
  ports:
  - name: http
    port: 80
    protocol: TCP
    targetPort: 9102
  selector:
    app: event-exporter
  sessionAffinity: None
  type: ClusterIP
status:
  loadBalancer: {}

In the event-exporter container, the logs show errors like this:

INFO  0417-02:21:32.480+00 store.go:110 | start event store...
INFO  0417-02:21:32.480+00 main.go:105 | Starting event_exporter (version=v0.0.1, branch=master, revision=6590adfed64518de2429d2a3becc588671704380)
INFO  0417-02:21:32.481+00 main.go:106 | Build context (go=go1.12.9, user=root@26f1ca322850, date=20190906-09:51:13)
INFO  0417-02:21:32.481+00 main.go:113 | Listening on :9102
E0417 02:21:32.482142       1 reflector.go:126] github.com/caicloud/event_exporter/store.go:111: Failed to list *v1.Event: Get https://100.64.0.1:443/api/v1/events?limit=500&resourceVersion=0: dial tcp 100.64.0.1:443: connect: connection refused
E0417 02:21:33.486319       1 reflector.go:126] github.com/caicloud/event_exporter/store.go:111: Failed to list *v1.Event: Get https://100.64.0.1:443/api/v1/events?limit=500&resourceVersion=0: dial tcp 100.64.0.1:443: connect: connection refused
E0417 02:21:34.487073       1 reflector.go:126] github.com/caicloud/event_exporter/store.go:111: Failed to list *v1.Event: Get https://100.64.0.1:443/api/v1/events?limit=500&resourceVersion=0: dial tcp 100.64.0.1:443: connect: connection refused

Opening the Prometheus metrics page at http://localhost:9090/metrics, we cannot find any "kubernetes_events".
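
Worth noting: the prometheus.io/* annotations on the Deployment above only take effect if the Prometheus configuration honors them. A typical annotation-driven pod scrape job looks roughly like this (a generic sketch, not the cluster's actual configuration). Separately, the connection-refused errors above mean the pod cannot reach the API server at all, which has to be fixed before any metrics appear:

scrape_configs:
  - job_name: kubernetes-pods
    kubernetes_sd_configs:
      - role: pod
    relabel_configs:
      # only scrape pods annotated with prometheus.io/scrape: "true"
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
        action: keep
        regex: "true"
      # honor prometheus.io/path
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
        action: replace
        target_label: __metrics_path__
        regex: (.+)
      # honor prometheus.io/port
      - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
        action: replace
        regex: ([^:]+)(?::\d+)?;(\d+)
        replacement: $1:$2
        target_label: __address__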

pull v1.0 - manifest unknown

Is this a BUG REPORT or FEATURE REQUEST?:

Uncomment only one, leave it on its own line:

/kind bug

/kind feature

What happened:
Trying to pull v1.0 as stated in the deploy.yml ended in:
Error response from daemon: manifest for caicloud/event-exporter:v1.0.0 not found: manifest unknown: manifest unknown
I'm guessing you haven't released it yet..?

If that's the case, is the source buildable on the master branch?

thanks!

grafana dashboard

/kind feature

You probably already have some Grafana dashboards, and it would be helpful to have a starting point. Could you add an example Grafana dashboard (or upload it to the Grafana site and link to it in the documentation)?

Helm chart available?

Sorry for not following the template, but it's rather a question than a feature request: Is there an existing Helm Chart available for event_exporter? Asking because comments in https://github.com/caicloud/event_exporter/blob/master/deploy/deploy.yaml suggest the usage of helm template, but I am unable to find the source.

I also see that the included ClusterRoleBinding is bound to the general-purpose view ClusterRole. Is there a list of the required privileges, so that a dedicated ClusterRole can be created for that purpose?

Thanks!
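
For reference, based on the RBAC manifests posted in another issue in this thread, a dedicated ClusterRole with only the privileges the exporter appears to need might look like this (a sketch, not an official chart; it already uses the non-deprecated rbac.authorization.k8s.io/v1 API):

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: event-exporter
rules:
- apiGroups: [""]
  resources: ["events"]
  verbs: ["get", "watch", "list"]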

Does it support use on large-scale production clusters?

Is this a BUG REPORT or FEATURE REQUEST?:
No
What happened:
May I ask whether this exporter supports use on large-scale production clusters, and how it should be deployed for high availability?
Also, will the volume of events metric data cause Prometheus to consume too much storage? Are there any test records?
What you expected to happen:

How to reproduce it (as minimally and precisely as possible):

Anything else we need to know?:

rbac.authorization.k8s.io/v1beta1 ClusterRole is deprecated in v1.17+

Is this a BUG REPORT or FEATURE REQUEST?:
BUG REPORT

/kind bug

What happened:
When running the helm command, there is a warning:
rbac.authorization.k8s.io/v1beta1 ClusterRole is deprecated in v1.17+, unavailable in v1.22+; use rbac.authorization.k8s.io/v1 ClusterRole

What you expected to happen:
Update apiVersion: rbac.authorization.k8s.io/v1beta1 to apiVersion: rbac.authorization.k8s.io/v1.
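
In manifest terms the expected change is just the apiVersion line, for example:

# before: apiVersion: rbac.authorization.k8s.io/v1beta1
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole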

How to reproduce it (as minimally and precisely as possible):

Anything else we need to know?:

Failed to list *api.Event: events is forbidden

/kind bug

What happened:
I removed these two annotations because I'm not running on AWS.

         "service.beta.kubernetes.io/aws-load-balancer-backend-protocol": "http",
         "service.beta.kubernetes.io/aws-load-balancer-ssl-ports": "https"
E1001 06:28:54.277852       1 reflector.go:216] github.com/caicloud/event_exporter/store.go:111: Failed to list *api.Event: events is forbidden: User "system:serviceaccount:monitoring:event-exporter" cannot list resource "events" in API group "" at the cluster scope

What you expected to happen:
Events metrics (kubernetes_events) to be exported.

How to reproduce it (as minimally and precisely as possible):
kubectl --context current_context_name -n kube-system apply -f deploy.yaml
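
For reference, the forbidden error names system:serviceaccount:monitoring:event-exporter, so whatever ClusterRoleBinding gets applied has to reference the ServiceAccount in the namespace it actually lives in. A sketch based on the manifests elsewhere in these issues (the subject namespace is taken from the error message):

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: event-exporter
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: event-exporter
subjects:
- kind: ServiceAccount
  name: event-exporter
  namespace: monitoring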

Debian stretch EOL

Is this a BUG REPORT or FEATURE REQUEST?:

Uncomment only one, leave it on its own line:

/kind bug

/kind feature

What happened:
Debian stretch reached EOL.
What you expected to happen:
Update the Linux distro to a supported version.
How to reproduce it (as minimally and precisely as possible):

Anything else we need to know?:

docs to integrate with prometheus

Is this a BUG REPORT or FEATURE REQUEST?:

Uncomment only one, leave it on its own line:

/kind bug

/kind feature

What happened:

There are no docs about how to integrate with Prometheus and Alertmanager.

What you expected to happen:

There are docs about how to integrate with Prometheus and Alertmanager.

How to reproduce it (as minimally and precisely as possible):

Anything else we need to know?:
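
Until such docs exist, the integration is the standard Prometheus wiring: scrape the exporter, load alert rules, and point Prometheus at Alertmanager. A minimal prometheus.yml sketch, where the target addresses and file names are placeholders that must match the actual deployment (the 9102 port comes from the exporter logs and manifests in these issues):

global:
  scrape_interval: 30s
rule_files:
  - /etc/prometheus/rules/kubernetes-events.rules.yml
alerting:
  alertmanagers:
    - static_configs:
        - targets: ["alertmanager:9093"]
scrape_configs:
  - job_name: event-exporter
    static_configs:
      - targets: ["event-exporter:9102"]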

Existing event alert keeps getting deleted and fired again

Is this a BUG REPORT or FEATURE REQUEST?:
/kind bug

What happened:

TL;DR: The alert gets deleted and recreated while the Kubernetes event still exists.

The exporter reports an existing event as a new event every few minutes. Even while a Kubernetes event still exists, the exporter keeps reporting the event as resolved (it disappears) and then recreates it (because the event still exists), which causes Prometheus to keep deleting and recreating alerts for the same event.

What you expected to happen:
When an event occurs, the exporter should report it and keep it until the event is deleted (and Prometheus will keep the alert firing).

How to reproduce it (as minimally and precisely as possible):
Every event that keeps repeating itself behaves as I described.

Anything else we need to know?:
It could be a misconfiguration of the alert rules in Prometheus, but I wanted to check whether this is a known bug.
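
One possible Prometheus-side mitigation (a sketch only, not a fix for the exporter behavior itself) is to smooth over short gaps in the series before alerting, e.g. with max_over_time, so the alert does not resolve and re-fire every time a series briefly disappears:

groups:
  - name: kubernetes-events-smoothed
    rules:
      - alert: KubernetesWarningEvent
        # the 15m window is illustrative; it should exceed the observed gap
        expr: max_over_time(kubernetes_events{event_type="Warning"}[15m]) > 0
        labels:
          severity: warning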

V1.0 image - "exec format error"

Is this a BUG REPORT or FEATURE REQUEST?:
BUG REPORT

/kind bug

What happened:
After building v1.0 from the Makefile, I created an image,
started a new project on OpenShift,
and deployed with the v1.0 image.
Got this error:
standard_init_linux.go:178: exec user process caused "exec format error"

What you expected to happen:
Expected the container to start.

How to reproduce it (as minimally and precisely as possible):
git pull && make
new project on openshift 3.11
deploy with deploy file

Exporting specific events (filter by specific Event Reasons)

/kind feature

Is there a way today to specify a filter of event reasons to the collector and generate only those metrics?
For example: I want this tool to export metrics only for Reason="OOMKilled" events.

Is it possible? What code changes would need to be made to achieve this?
Thanks!
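
Short of a code change, one workaround is to filter on the Prometheus side with metric_relabel_configs, keeping only kubernetes_events series with the desired event_reason. A sketch using the reason from the question above (note that this also drops the exporter's own runtime metrics for this job):

scrape_configs:
  - job_name: event-exporter
    static_configs:
      - targets: ["event-exporter:9102"]
    metric_relabel_configs:
      # keep only kubernetes_events with event_reason="OOMKilled"
      - source_labels: [__name__, event_reason]
        regex: "kubernetes_events;OOMKilled"
        action: keep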

Docker image version is not up-to-date

Is this a BUG REPORT or FEATURE REQUEST?:
/kind bug

What happened:
Docker image version is still v0.0.1

What you expected to happen:
Docker image version is v0.1.0

How to reproduce it (as minimally and precisely as possible):
Run docker pull cargo.caicloud.io/sysinfra/event-exporter
Then start the container; the log says the version is v0.0.1.

time="2019-06-20T14:09:54+08:00" level=info msg="start event store..." source="store.go:110" 
time="2019-06-20T14:09:54+08:00" level=info msg="Starting event_exporter (version=v0.0.1, branch=doc, revision=4bc2df81dbed60cd33c379f6eabe4e3b8bcd6dac)" source="main.go:98" 
time="2019-06-20T14:09:54+08:00" level=info msg="Build context (go=go1.7.3, user=vagrant@vagrant-ubuntu-trusty-64, date=20161122-03:59:17)" source="main.go:99" 
time="2019-06-20T14:09:54+08:00" level=info msg="Listening on :9102" source="main.go:106"

Anything else we need to know?:
Would be nice to be able to run docker pull cargo.caicloud.io/sysinfra/event-exporter:v0.1.0

Question: Why not report the timestamp

/kind feature

I'm interested in using this project, but I need to find a way to maintain order of events that are scraped in the same interval. For monitoring purposes the scrape timestamp is good enough, but if multiple events are created within a scrape interval it would be nice to have the option to sort them chronologically.

I want to be able to replay events in a time-series way in my monitoring solution. If Prometheus isn't appropriate for this, do you have any suggestions on what to back up events to for analytics purposes?

Container crashes

After running for a while, I see this printed over and over:

W0904 15:26:14.287525       1 reflector.go:334] github.com/caicloud/event_exporter/store.go:111: watch of *api.Event ended with: The resourceVersion for the provided watch is too old.
W0904 15:36:14.476497       1 reflector.go:334] github.com/caicloud/event_exporter/store.go:111: watch of *api.Event ended with: The resourceVersion for the provided watch is too old.

Then:

fatal error: concurrent map writes

goroutine 31137 [running]:
runtime.throw(0x1478e97, 0x15)
	/usr/local/go/src/runtime/panic.go:566 +0x95 fp=0xc42054ebe0 sp=0xc42054ebc0
runtime.mapassign1(0x12fd240, 0xc42028b890, 0xc42054ed58, 0xc42054ed68)
	/usr/local/go/src/runtime/hashmap.go:458 +0x8ef fp=0xc42054ecc8 sp=0xc42054ebe0
github.com/caicloud/event_exporter/vendor/k8s.io/kubernetes/pkg/client/cache.(*DeltaFIFO).queueActionLocked(0xc4200904d0, 0x14657ad, 0x4, 0x143ee20, 0xc420120800, 0x1, 0x0)
	/home/vagrant/gocode/src/github.com/caicloud/event_exporter/vendor/k8s.io/kubernetes/pkg/client/cache/delta_fifo.go:314 +0x249 fp=0xc42054edb0 sp=0xc42054ecc8
github.com/caicloud/event_exporter/vendor/k8s.io/kubernetes/pkg/client/cache.(*DeltaFIFO).Resync(0xc4200904d0, 0x0, 0x0)
	/home/vagrant/gocode/src/github.com/caicloud/event_exporter/vendor/k8s.io/kubernetes/pkg/client/cache/delta_fifo.go:498 +0x179 fp=0xc42054ee88 sp=0xc42054edb0
github.com/caicloud/event_exporter/vendor/k8s.io/kubernetes/pkg/client/cache.(*Reflector).ListAndWatch.func1(0xc42032e0f0, 0xc420516ae0, 0xc4202ad860, 0xc420517740, 0xc42032e0f8)
	/home/vagrant/gocode/src/github.com/caicloud/event_exporter/vendor/k8s.io/kubernetes/pkg/client/cache/reflector.go:289 +0x1b5 fp=0xc42054ef88 sp=0xc42054ee88
runtime.goexit()
	/usr/local/go/src/runtime/asm_amd64.s:2086 +0x1 fp=0xc42054ef90 sp=0xc42054ef88
created by github.com/caicloud/event_exporter/vendor/k8s.io/kubernetes/pkg/client/cache.(*Reflector).ListAndWatch
	/home/vagrant/gocode/src/github.com/caicloud/event_exporter/vendor/k8s.io/kubernetes/pkg/client/cache/reflector.go:296 +0x4c0

goroutine 1 [IO wait, 2756 minutes]:
net.runtime_pollWait(0x7f396b5cf870, 0x72, 0x0)
	/usr/local/go/src/runtime/netpoll.go:160 +0x59
net.(*pollDesc).wait(0xc420354990, 0x72, 0xc4204d3ab8, 0xc42001c0b0)
	/usr/local/go/src/net/fd_poll_runtime.go:73 +0x38
net.(*pollDesc).waitRead(0xc420354990, 0x1e48ba0, 0xc42001c0b0)
	/usr/local/go/src/net/fd_poll_runtime.go:78 +0x34
net.(*netFD).accept(0xc420354930, 0x0, 0x1e46e60, 0xc4202835c0)
	/usr/local/go/src/net/fd_unix.go:419 +0x238
net.(*TCPListener).accept(0xc4201b0058, 0x29e8d60800, 0x0, 0x0)
	/usr/local/go/src/net/tcpsock_posix.go:132 +0x2e
net.(*TCPListener).AcceptTCP(0xc4201b0058, 0xc4204d3be0, 0xc4204d3be8, 0xc4204d3bd8)
	/usr/local/go/src/net/tcpsock.go:209 +0x49
net/http.tcpKeepAliveListener.Accept(0xc4201b0058, 0x150cab0, 0xc420368b00, 0x1e53360, 0xc4201afb00)
	/usr/local/go/src/net/http/server.go:2608 +0x2f
net/http.(*Server).Serve(0xc420368080, 0x1e52c20, 0xc4201b0058, 0x0, 0x0)
	/usr/local/go/src/net/http/server.go:2273 +0x1ce
net/http.(*Server).ListenAndServe(0xc420368080, 0xc420368080, 0x2)
	/usr/local/go/src/net/http/server.go:2219 +0xb4
net/http.ListenAndServe(0x1465f8d, 0x5, 0x0, 0x0, 0xc4203e64b0, 0x0)
	/usr/local/go/src/net/http/server.go:2351 +0xa0
main.main()
	/home/vagrant/gocode/src/github.com/caicloud/event_exporter/main.go:107 +0x59a

goroutine 17 [syscall, 2756 minutes, locked to thread]:
runtime.goexit()
	/usr/local/go/src/runtime/asm_amd64.s:2086 +0x1

goroutine 7 [chan receive]:
github.com/caicloud/event_exporter/vendor/github.com/golang/glog.(*loggingT).flushDaemon(0x1e6ecc0)
	/home/vagrant/gocode/src/github.com/caicloud/event_exporter/vendor/github.com/golang/glog/glog.go:879 +0x7a
created by github.com/caicloud/event_exporter/vendor/github.com/golang/glog.init.1
	/home/vagrant/gocode/src/github.com/caicloud/event_exporter/vendor/github.com/golang/glog/glog.go:410 +0x21d

goroutine 23 [syscall, 2756 minutes]:
os/signal.signal_recv(0x0)
	/usr/local/go/src/runtime/sigqueue.go:116 +0x157
os/signal.loop()
	/usr/local/go/src/os/signal/signal_unix.go:22 +0x22
created by os/signal.init.1
	/usr/local/go/src/os/signal/signal_unix.go:28 +0x41

goroutine 24 [chan receive, 2756 minutes]:
main.(*EventStore).Run(0xc4201b5220)
	/home/vagrant/gocode/src/github.com/caicloud/event_exporter/store.go:112 +0xf5
created by main.main
	/home/vagrant/gocode/src/github.com/caicloud/event_exporter/main.go:95 +0x1e4

goroutine 11122 [select, 3 minutes]:
net/http.(*persistConn).readLoop(0xc4204ee500)
	/usr/local/go/src/net/http/transport.go:1541 +0x9c9
created by net/http.(*Transport).dialConn
	/usr/local/go/src/net/http/transport.go:1062 +0x4e9

A bit more of that and then this repeating:

goroutine 15512 [select, 1 minutes]:
github.com/caicloud/event_exporter/vendor/k8s.io/kubernetes/pkg/client/cache.(*Reflector).ListAndWatch.func1(0xc4201b00f8, 0xc420516ae0, 0xc4202ad860, 0xc4204818c0, 0xc4201b0100)
	/home/vagrant/gocode/src/github.com/caicloud/event_exporter/vendor/k8s.io/kubernetes/pkg/client/cache/reflector.go:283 +0x303
created by github.com/caicloud/event_exporter/vendor/k8s.io/kubernetes/pkg/client/cache.(*Reflector).ListAndWatch
	/home/vagrant/gocode/src/github.com/caicloud/event_exporter/vendor/k8s.io/kubernetes/pkg/client/cache/reflector.go:296 +0x4c0

Then the container is killed by k8s.

Here's the container information from k8s:

Containers:
  app:
    Container ID:  docker://87c603584b7d193716ecb79f70a10a8ed49e4050fc1e89412322e09a233473a9
    Image:         cargo.caicloud.io/sysinfra/event-exporter:latest
    Image ID:      docker-pullable://cargo.caicloud.io/sysinfra/event-exporter@sha256:826f54f71c3802f59164a108b87a7a5b002efccbd80e0d13c3da94140baf5c3a
    Port:          9102/TCP
    Host Port:     0/TCP
    Args:
      --logtostderr
    State:          Running
      Started:      Wed, 04 Sep 2019 14:03:32 +0200
    Last State:     Terminated
      Reason:       Error
      Exit Code:    2
      Started:      Mon, 02 Sep 2019 16:07:08 +0200
      Finished:     Wed, 04 Sep 2019 14:03:31 +0200
    Ready:          True
    Restart Count:  1
    Limits:
      cpu:     200m
      memory:  128Mi
    Requests:
      cpu:        100m
      memory:     128Mi
    Environment:  <none>

/kind bug

What happened:

Container crashes

What you expected to happen:

Container not to crash.

How to reproduce it (as minimally and precisely as possible):

Just run it for 1-5 days.

Feature request: OOM or cpu limit events

Is this a BUG REPORT or FEATURE REQUEST?:
/kind feature

I think it is a feature, but don't even know if it is possible :)

What happened:
My K8s setup had a pod being OOM-killed and another hitting a CPU limit. I was expecting the k8s events to show this, but all I could see was the healthcheck failure.

What you expected to happen:
Some k8s event for when the pod limits are triggered.

How to reproduce it (as minimally and precisely as possible):
Set up a pod with low memory and CPU limits and start it.

Anything else we need to know?:
I'm currently on k8s 1.15.
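
To reproduce this as described, a minimal pod with deliberately tight limits can be used (a sketch; the image and limit values are only illustrative):

apiVersion: v1
kind: Pod
metadata:
  name: limits-test
spec:
  restartPolicy: Never
  containers:
  - name: stress
    # example memory-hungry workload; any image that allocates beyond the limit works
    image: polinux/stress
    command: ["stress", "--vm", "1", "--vm-bytes", "64M", "--vm-hang", "1"]
    resources:
      requests:
        memory: "32Mi"
        cpu: "50m"
      limits:
        memory: "32Mi"
        cpu: "50m"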

Security fix: upgrade golang-runtime version

/kind feature

What happened:
This image runs on top of a very old Go runtime (1.13.11), while there are newer versions that include security fixes (1.20).

What you expected to happen:
Upgrade the Go runtime to the latest version.

How to reproduce it (as minimally and precisely as possible):
Scan the image via Black Duck binary analysis and you will see the issues.
See the screenshot for more details.

Anything else we need to know?:
No
