Go Client Library for Loggregator
License: Apache License 2.0
There are now predefined selectors that enable selecting/filtering events based on type. How do we use a selector to select a specific subtype of event? Say I only care about application metrics and access logs, so I only want container metrics and http start/stop events. How can we do that with the go-loggregator client?
This is useful if you have to compile a test binary to run on another machine. That way you don't also have to copy a bunch of filesystem state around with the test binary. The test brings its own data dependencies.
Some of the runtime metrics emitted by dropsonde aren't emitted here, like memoryStats.numMallocs, memoryStats.numFrees, and numCPUS. Should metrics match between dropsonde and this package, or were those metrics not useful enough to include here?
It would be super cool if the pulseemitter functionality were exposed through the ingress client, so that we could interact with gauges and counters in the same way as timers and events.
The original author of go-bindata deleted their GitHub account, and someone else remade the repo with the exact same name. This is a bit sketchy. I also couldn't find any go-bindata forks that appear to be actively maintained.
We could keep using this tool, but it seems like we should go back to just having a fixtures directory with some certs for testing, as required.
The EnvelopeStreamConnector takes a context to keep track of a stream's lifecycle. However, the underlying connection is never closed. This results in a goroutine leak.
The v2 API should be the primary code path for this package. I propose we have the following structure:
Current:
/ - compatibility layer
/v1 - v1 client
/v2 - v2 client
Desired:
/ - v2 client
/compat - v1 client and compatibility layer
This will allow for an easy rm -r compat in the future.
These need to be updated:
https://github.com/cloudfoundry/go-loggregator/tree/041998b54f880b3e5460fcc4e7d4d77742dc3d86#example
For instance, examples/main.go doesn't exist anymore.
While working on updating downstream consumers (e.g. code.cloudfoundry.org/diego-logging-client), I noticed the following items are not working for this repo: the latest tag is missing the v prefix. The tag should be in the format v8.0.4. This is causing Go modules to fail to pull the latest changes.

The output types of NewCounterMetric(...) and NewGaugeMetric(...) (*CounterMetric and *GaugeMetric respectively) do not offer a user a straightforward path to create a spy. While a PulseEmitter could be injected, CounterMetric and GaugeMetric force the user to submit an envelope and then query the underlying contents of the envelope. This implies the user has to have intimate knowledge of Loggregator envelopes, which goes against the goal of go-loggregator.
Update NewCounterMetric(...) and NewGaugeMetric(...) to return interface types:
type Gauge interface {
Set(int64)
}
and
type Counter interface {
Increment(uint64)
}
This would allow the entire PulseEmitter to be replaced, in full, with a spy/mock. It would also enable the deletion of GetDelta() from CounterMetric. That method is only useful from a test perspective, and it could easily be "cleaned" away and break test compatibility.
When using IngressClient, if the application exits before batchFlushInterval time has elapsed after calling EmitLog(), the message will never be delivered, because EmitLog() doesn't immediately send the message; it batches requests and delivers them at some point in the future.
I attempted to work around this by configuring the client using WithBatchMaxSize(0), but this didn't seem to help.
What's the recommended way to immediately deliver a single message without relying on time.Sleep()?
Thanks!
Pivotal uses GITBOT to synchronize GitHub issues and pull requests with Pivotal Tracker.
Please add your new repo to the GITBOT config-production.yml in the Gitbot configuration repo.
If you don't have access, you can send an ask ticket to the CF admins. We prefer teams to submit their changes via a pull request.
Steps: add your repo to the config-production.yml file. If there are any questions, please reach out to [email protected].
It would be nice if there were a wrapper method that would take care of both creating and sending a timer metric.
The recent introduction of vendored libraries whose types are used in go-loggregator interfaces (in 78f871c) breaks non-module builds of libraries which use go-loggregator. For example, building an app that uses diego-logging-client results in the following error:
# code.cloudfoundry.org/diego-logging-client
src/code.cloudfoundry.org/diego-logging-client/client.go:99:64: cannot use "google.golang.org/grpc".WithBlock() (type "google.golang.org/grpc".DialOption) as type "code.cloudfoundry.org/go-loggregator/vendor/google.golang.org/grpc".DialOption in argument to loggregator.WithDialOptions:
"google.golang.org/grpc".DialOption does not implement "code.cloudfoundry.org/go-loggregator/vendor/google.golang.org/grpc".DialOption (missing "code.cloudfoundry.org/go-loggregator/vendor/google.golang.org/grpc".apply method)
src/code.cloudfoundry.org/diego-logging-client/client.go:99:84: cannot use "google.golang.org/grpc".WithTimeout(time.Second) (type "google.golang.org/grpc".DialOption) as type "code.cloudfoundry.org/go-loggregator/vendor/google.golang.org/grpc".DialOption in argument to loggregator.WithDialOptions:
"google.golang.org/grpc".DialOption does not implement "code.cloudfoundry.org/go-loggregator/vendor/google.golang.org/grpc".DialOption (missing "code.cloudfoundry.org/go-loggregator/vendor/google.golang.org/grpc".apply method)
This is caused by using the vendored grpc.DialOption as a parameter type here: go-loggregator/ingress_client.go, line 22 in 78f871c.
Builds using modules still work, because then vendored libraries are ignored. From what I could find, it doesn't seem to be a good idea to have vendored types in interfaces (see this SO answer).
User scenario:
We are interested in fetching container metrics and http start/stop events from the v2 API with a selector subscription. Referring to the code at https://github.com/cloudfoundry/go-loggregator/blob/master/examples/envelope_stream_connector/main.go, we can open a stream with a fixed selector, but the connection can't easily be closed if the selector changes on the fly.
For example, we first subscribe to the metrics for appA and appB; after a while, we would like to subscribe to the metrics for appB and appC. In this case, we would like to close the previous connection and open a new one for the new subscription.
Issue found:
The envelope_stream_connector implementation (https://github.com/cloudfoundry/go-loggregator/blob/master/envelope_stream_connector.go) doesn't expose a method for the client to close the stream. This is the issue I hope to see addressed here.
Further concern:
For the above user scenario, if we choose to close the stale connection and open a new one, some of appB's metrics may be lost during the connection switch. Is it possible to modify the subscription on the fly?
The log-store has run into an issue trying to use the loggregator.WithEnvelopeStreamBuffer and Stream methods. When the provided context is canceled, which we do in our code to gracefully shut down and handle hanging connections, a nil pointer is dereferenced, causing a panic.
Here's a failing test with which we were able to reproduce the error:
It("wont panic when context canceled", func() {
	producer, err := newFakeEventProducer()
	Expect(err).NotTo(HaveOccurred())

	// Producer will grab a port on start. When the producer is restarted,
	// it will grab the same port.
	producer.start()
	defer producer.stop()

	tlsConf, err := loggregator.NewIngressTLSConfig(
		fixture("CA.crt"),
		fixture("server.crt"),
		fixture("server.key"),
	)
	Expect(err).NotTo(HaveOccurred())

	var (
		mu     sync.Mutex
		missed int
	)
	addr := producer.addr
	c := loggregator.NewEnvelopeStreamConnector(
		addr,
		tlsConf,
		loggregator.WithEnvelopeStreamBuffer(5, func(m int) {
			mu.Lock()
			defer mu.Unlock()
			missed += m
		}),
	)

	// Use a context that can be canceled
	ctx, cancel := context.WithCancel(context.Background())
	rx := c.Stream(ctx, &loggregator_v2.EgressBatchRequest{})

	var count int
	// Read to allow the diode to notice it dropped data
	go func() {
		for range time.Tick(500 * time.Millisecond) {
			// Do not invoke rx while mu is locked
			l := len(rx())
			mu.Lock()
			count += l
			mu.Unlock()
		}
	}()

	Eventually(func() int {
		mu.Lock()
		defer mu.Unlock()
		return missed
	}).ShouldNot(BeZero())

	// When the context is canceled, the client panics
	cancel()

	mu.Lock()
	l := count
	mu.Unlock()
	Expect(l).ToNot(BeZero())
})
We copied this from "enables buffering" in envelope_stream_connector_test.go but added a cancelable context.
When running we get:
❯ ./scripts/test
+ ginkgo -r -race
[1599068942] GoLoggregator Suite - 43/43 specs ••••••••••••••••••••••••••••••••panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x181778b]
goroutine 120 [running]:
code.cloudfoundry.org/go-loggregator/v8.(*OneToOneEnvelopeBatch).Next(...)
/Users/jmcbride/workspace/go-loggregator/one_to_one_envelope_batch_diode.go:45
code.cloudfoundry.org/go-loggregator/v8_test.glob..func1.4.2(0xc000561a30, 0xc0000396e0, 0xc000039700)
/Users/jmcbride/workspace/go-loggregator/envelope_stream_connector_test.go:186 +0x8c
created by code.cloudfoundry.org/go-loggregator/v8_test.glob..func1.4
/Users/jmcbride/workspace/go-loggregator/envelope_stream_connector_test.go:183 +0x7b6
Ginkgo ran 1 suite in 8.758037312s
Test Suite Failed
We would expect this client library to be able to take a context that can be canceled, as we cancel our context to start a graceful shutdown of our nozzle. Let me know if you have any questions!
This is redundant and not needed:
https://github.com/cloudfoundry-incubator/go-loggregator/blob/17682e3bc1157ea3b83e292ef6ee974ba992918c/ingress_client.go#L161-L163
https://github.com/cloudfoundry-incubator/go-loggregator/blob/17682e3bc1157ea3b83e292ef6ee974ba992918c/ingress_client_test.go#L85-L106
SourceId on the envelope is where this is stored.
I wrote some code following the example (https://github.com/cloudfoundry/go-loggregator/blob/master/examples/envelope_stream_connector/main.go).
One problem is that when there is an error in the tlsConfig, the stream fails and continuously retries, printing too many error logs.
My question: is it possible to set a max retry count, so that after the maximum number of retries it returns an error?
streamConnector := loggregator.NewEnvelopeStreamConnector(
os.Getenv("LOGS_API_ADDR"),
tlsConfig,
loggregator.WithEnvelopeStreamLogger(loggr),
)
Hello, not sure if this is the correct place for this question, but to test code that emits to Loggregator locally, you would typically emit on localhost:3457 to a Metron agent. If you are testing on your laptop, how do you simulate this? Is the best way an SSH tunnel into a BOSH VM with a Metron agent, or is there some other way to execute the samples? Thanks!
This project should switch to using google.golang.org/protobuf/proto instead.
We have WithSourceInfo (which is an EmitLogOption), WithCounterSourceInfo, WithGaugeSourceInfo, and WithTimerSourceInfo, but there is no EmitEventOption corollary.
Hi, would it be possible to update some of the dependencies that this project uses? Now that this project is using Go modules, it should be possible to use Dependabot to create PRs for dependency updates. To make full use of this, it would be worth creating a basic build/test process, maybe using GitHub Actions, to validate these changes. I'm happy to take a look at this if it would help.
As a user of go-loggregator I would like to be able to pass the ingress client keepalive.ClientParameters or any other grpc.DialOption.
Since we are encouraging users of loggregator to set source_id and instance_id for use in log-cache, we should update our examples to set the source info.
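A sketch of what an updated example might show, assuming the WithSourceInfo EmitLogOption takes (sourceID, sourceType, sourceInstance); the GUID is a placeholder and client is an already-constructed IngressClient:

```go
// Set source_id and instance_id explicitly so log-cache can index the
// envelope under the right source.
client.EmitLog(
	"request handled",
	loggregator.WithSourceInfo("f47ac10b-app-guid", "APP", "0"),
)
```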
Hello,
We generate a FakeIngressServer in diego-logging-client for testing our Send*Log methods in the library. Since the protobuf update in PR #81, we see that there is a new private method, mustEmbedUnimplementedIngressServer, that makes our FakeIngressServer implementation no longer valid. Can you please advise how we are supposed to solve this problem, so that we can continue generating a FakeIngressServer?
Context: We got here because we had to upgrade lager, which meant we had to upgrade ginkgo to v2 everywhere in diego-release, which meant we had to upgrade from go-loggregator v8 to v9, which meant we had to upgrade diego-logging-client, which meant we can no longer generate FakeIngressServer.
Do you have any suggestions for a workaround or other solution to this problem?
Recently the gRPC Go maintainers shared their intention to deprecate grpc.Dial and grpc.DialContext and to encourage users to use grpc.NewClient instead.
go-loggregator accepts arbitrary grpc.DialOptions, some of which will not be honoured after switching to grpc.NewClient. As a result, when we switch we should major-version bump our module.
The current plan is to move off grpc.Dial and grpc.DialContext.
Hey 👋! I'm working on porting datadog-firehose-nozzle to the loggregator V2 API [1]. I noticed that it's possible to pass a log object to the NewRLPGatewayClient function to get logs from the underlying logic. I'd like to request adding an option to pass a gosteno Logger, as (IIUC) this is a logging implementation used/developed for Cloud Foundry purposes, and thus it would be nice to have it supported.
Note that for our use case, it would also be sufficient to implement a channel of errors that the RLPGatewayClient would push to; we could read from it and do the logging ourselves. I requested this in [2]. (Although technically the errors channel would not be useful for debug/info logs, so this would be useful either way.)
Thanks for considering!
I would expect all the required dependencies to be in the vendor directory, however there are a few missing.
Hey 👋! I'm working on porting datadog-firehose-nozzle to the loggregator V2 API [1]. With the noaa/consumer package for V1, we instantiated FilteredFirehose, which returned a channel of messages (envelopes) and also a channel of errors. We could then have custom logic reacting to the errors (emitting custom logging messages etc.). Additionally, we were able to use methods like SetMaxRetryCount to make the nozzle fail if there's e.g. a configuration problem and the URL we're trying to connect to has a typo in it.
The RLPGatewayClient structure has no means of surfacing errors and/or setting a maximum number of connection retries. This means that we have no way of customizing error behavior (either through reading some sort of error channel or using specialized methods like SetMaxRetryCount).
The biggest issue with the current codebase is that if a user misconfigures the nozzle, it will keep trying to connect forever, which it shouldn't. A misconfigured nozzle should fail (soon-ish). The smaller issue is that we have no way of customizing what error messages will look like and at what levels they will be printed.
Would it be possible to add either an error channel or the specialized methods for error handling (or, ideally, both)? I realize that this RFE might in fact result in the implementation of two unrelated features, but I hope that's OK with you, as I see them as very related.
Thanks for considering!
As part of diego's transition from v1 to v2, they will need to use the runtimeemitter on v1. We discussed simply adding EmitGauge(opts ...v2.EmitGaugeOption) to the loggregator.Client interface and then implementing that method on the v1.Client.