Comments (6)
Thanks a lot! I think that's all my questions at the moment!
Really appreciate the answers and all the great work!
from cloudwatchlogsbeat.
Hello!
The application logs will help you troubleshoot your situation (run ./cloudwatchlogsbeat -e -d '*' to view the debug logs as well). Feel free to post some portion of the output here.
I am assuming you've read the comments in the provided configuration sample and set the appropriate values for your use case. I would suggest deactivating hot streams (hot_stream_event_horizon: 0) and troubleshooting the basic functionality first.
As for your scaling question: the application process is not distributed, in the sense that it does not scale horizontally. If you fire up more instances, the logs will be processed as many times as you have processes running. However, the log groups are processed in parallel within the application process (using goroutines), so your main bottlenecks would be network- or AWS-related (e.g. AWS API throttling limits).
Hope the above helps - good luck!
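To illustrate the "parallel within one process" point, here is a minimal sketch of fanning work out over log groups with goroutines. It is not the beat's actual code: processGroups, the handle callback and the group names are hypothetical and only show the general pattern.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// processGroups is a hypothetical helper (not part of cloudwatchlogsbeat).
// It runs handle once per log group, each call in its own goroutine,
// waits for all of them, and returns the per-group results.
func processGroups(groups []string, handle func(group string) int) map[string]int {
	var (
		mu      sync.Mutex
		wg      sync.WaitGroup
		results = make(map[string]int, len(groups))
	)
	for _, g := range groups {
		wg.Add(1)
		go func(group string) {
			defer wg.Done()
			n := handle(group)
			mu.Lock() // the map is shared, so guard writes
			results[group] = n
			mu.Unlock()
		}(g)
	}
	wg.Wait()
	return results
}

func main() {
	// Hypothetical group names; handle just returns a fake event count.
	groups := []string{"/aws/app-group/", "/aws/other-group/"}
	results := processGroups(groups, func(group string) int { return len(group) })

	names := make([]string, 0, len(results))
	for name := range results {
		names = append(names, name)
	}
	sort.Strings(names)
	for _, name := range names {
		fmt.Printf("%s: %d events\n", name, results[name])
	}
}
```

All goroutines share one process and one set of AWS credentials, so they also share the same API rate limits. Starting a second process would repeat the same work rather than split it.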
from cloudwatchlogsbeat.
Hi, thanks for the reply!
Yes, I've gone through the config several times and the source code a little bit. I'm not a Go expert, so it's a bit hard for me to understand the whole thing.
So when I debugged earlier, I was basically not seeing my stream get monitored by the group for a while, but once it was picked up, it got processed relatively fast. Some example output for illustration:
2020-04-16T16:34:36.976Z INFO cwl/group.go:110 report[group] 251 2 0 /aws/app-group/ 3m0s
2020-04-16T16:34:42.895Z INFO cwl/stream.go:143 report[stream] 82 /aws/app1/logs 3m0s
2020-04-16T16:34:45.111Z INFO cwl/stream.go:143 report[stream] 4 /aws/app2/logs 3m0s
I was hoping to see test-app pop up in the logs like above, but for a very long period of time it didn't show up. Is it because I have too many streams in the group and the round-robin load balancer sometimes missed it because other streams are more active?
And I noticed that report_frequency actually affects how the group picks up newly active streams? I ran a test again this morning: it took about 16 minutes to finish, but ever since then it has run smoothly.
So is it true that a new stream would not be picked up until the report interval has passed (i.e. after 5 minutes) if the stream has not been active for longer than stream_event_horizon (i.e. 2 hours)?
I will debug with hot_stream off and, if I see any useful info, I will let you know. Thanks!
from cloudwatchlogsbeat.
The report interval should not be affecting the stream monitoring - it's just a log statement printing out some counters.
My understanding is that your issue has to do with the delay in picking up streams when the application starts. After that, things tend to go "smoothly".
One reason why this happens is that the application has to process all the events within the stream_event_horizon; any event older than this value is ignored. So for stream_event_horizon=2h, the application has to process 2 hours' worth of events when it first starts, which is why it appears to be slower. On top of that, because of the larger amount of data that needs to be fetched from CloudWatch Logs, you would also get more AWS API throttling errors (these are handled gracefully by the application).
Once the application has processed all the events within the stream_event_horizon at startup, only new events are considered, so things appear to work smoothly.
Hope this helps.
from cloudwatchlogsbeat.
Thanks a lot for the reply!
I experimented a bit with tweaking stream_event_horizon, and the logs do come up faster and with less latency. After that, things run much more smoothly.
Hopefully one last question, regarding the throttling. After retuning the settings, I notice that even though the app itself is not returning throttling: rate limit exceeded as often, I'm constantly being throttled when manually using the AWS Console for CloudWatch. So I'm curious whether the app is always set to use as many AWS API calls as possible, or whether there is a limit I can set, e.g. only process 10 streams in a given time range for x amount of time?
Is queue_size something that I can use to control this?
Thanks!
from cloudwatchlogsbeat.
The queue_size is not relevant to the AWS throttling errors. A countermeasure for decreasing the number of AWS API calls (and reducing throttling errors) is to try increasing the values of the frequency parameters, for example:
- stream_event_refresh_frequency (how often to inquire for new events in streams)
- hot_stream_event_refresh_frequency (how often to inquire for new events in "hot" streams)
- group_refresh_frequency (how often to inquire for new log groups)
- stream_refresh_frequency (how often to inquire for new event streams within log groups)
The AWS client uses the default retry policy (10 retries with exponential backoff). Limits involving API calls per unit of time have not been implemented.
from cloudwatchlogsbeat.