Comments (14)
It's pretty close to what I want and would certainly make a good first pass of the feature. The only enhancement I would consider is making it path-aware, as http://test/0123/blah and http://test/0123/foo/blah would both be classified as http://test/*/blah under a basic wildcard approach, when really they should be two entries.
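To make the distinction above concrete, here is a minimal sketch (not part of the SDK) that compares a basic wildcard match, where `*` can swallow slashes, with a path-aware match, where `*` stands for exactly one path segment. The function name `matches_path_aware` is hypothetical and used only for illustration.

```python
from fnmatch import fnmatchcase
from urllib.parse import urlsplit


def matches_path_aware(url: str, pattern: str) -> bool:
    """Return True if url matches pattern, where '*' stands for one path segment.

    Hypothetical helper for illustration; it is not an aws-xray-sdk API.
    """
    u, p = urlsplit(url), urlsplit(pattern)
    # Scheme and host must match exactly in this sketch.
    if (u.scheme, u.netloc) != (p.scheme, p.netloc):
        return False
    url_parts = u.path.strip("/").split("/")
    pattern_parts = p.path.strip("/").split("/")
    # A path-aware match requires the same number of segments.
    if len(url_parts) != len(pattern_parts):
        return False
    return all(pp == "*" or pp == up for up, pp in zip(url_parts, pattern_parts))


pattern = "http://test/*/blah"

# Basic wildcard matching: '*' also matches '/', so both URLs collapse together.
basic = [fnmatchcase(u, pattern)
         for u in ("http://test/0123/blah", "http://test/0123/foo/blah")]

# Path-aware matching: only the URL with the same number of segments matches.
aware = [matches_path_aware(u, pattern)
         for u in ("http://test/0123/blah", "http://test/0123/foo/blah")]
```

With the path-aware version, http://test/0123/foo/blah would need its own pattern, such as http://test/*/foo/blah, and would therefore end up as a separate entry.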
from aws-xray-sdk-python.
Seen this issue in some of our Python work at Custom Ink. Subscribed to the issue to see where we land on this solution. Thanks ahead of time!
In the meantime, I think we have used wrapt to freedom patch record_subsegment, and that seems to have been working well.
from aws-xray-sdk-python.
Hi,
Thank you for taking the time to write this up. I agree with you. With path-based routing, the SDK doesn't have enough information to do the grouping correctly. In your example it is very likely that resource/* is served by a single service. But there are also counterexamples: data/1 is routed to a control-plane fleet while data/2 is routed to a data-plane fleet.
I think it really depends on who owns those services. If you own the services that serve resource/* and they are also integrated with X-Ray, they will also actively emit segments. The X-Ray service graph will respect the segment emitted from the server side rather than the one from the caller. So if resource/1 and resource/2 are both served by a Django app that emits a segment named my_service, then in the service graph you will only see one node, called my_service.
If you don't own the service that serves those requests, then the question comes down to whether you want to aggregate them, and up to which level of the URL scheme. Generally I would recommend not aggregating anything, because you don't know what happens behind the scenes and it can be difficult to identify issues if all URLs are aggregated. But I would definitely like to know more about your use case here. The SDK would certainly provide a better customer experience by offering more flexibility in user configuration.
Please let me know your thoughts.
from aws-xray-sdk-python.
Thanks for your answer @haotianw465. Some comments on your reply:
"I think it really depends on who owns those services. If you own the services that serve resource/* and they are also integrated with X-Ray, they will also actively emit segments. The X-Ray service graph will respect the segment emitted from the server side rather than the one from the caller. So if resource/1 and resource/2 are both served by a Django app that emits a segment named my_service, then in the service graph you will only see one node, called my_service."
Well, in an organization that is just starting to adopt tracing, where only a few services have it in place, this is exactly the scenario where you can't rely on the caller to name the node with the proper service name. It's true that, at least for the services you own, all of the nodes will eventually be named properly once tracing is adopted everywhere. But that will take some time.
In any case, there is also the case where you call an external service over which you have no control. There, the node name that persists will always be the URL, with all of the issues I've described.
Going back to your example of two different URLs that share the same hostname but can end up in different services, data/1 and data/2: this scenario is quite similar to one of the use cases in our infrastructure, where an HTTP middleware implements common concerns such as auth and rate limiting and routes the traffic to downstream services, using part of the URI to identify each one uniquely. In that case, I still think the best option is to use the hostname to identify the node. That gives a single node in the AWS X-Ray console for all calls to this intermediate layer, while the downstream services behind each URL path can still be identified in the console by their proper names.
And last but not least, take into account that the full URL is always recorded as a tag value.
I still think that using the first part of the URL is the least bad option.
from aws-xray-sdk-python.
Thank you for the explanation. I totally understand your concerns, and this use case is very reasonable and common. I'm open to discussing a way of configuring custom URL name capture. Please let me know if you have any suggestions for how you would like to configure URL renaming conveniently.
We have a dynamic naming configuration for middleware that names the segment based on the Host header of the incoming request. See https://docs.aws.amazon.com/xray/latest/devguide/xray-sdk-python-middleware.html#xray-sdk-python-middleware-naming. Please let me know if you would like to see something similar for subsegment naming on outgoing HTTP requests.
from aws-xray-sdk-python.
What I was hoping for was either a config where I could specify something like
http://hostname/path/*/blah/*
where each * gets aggregated with a placeholder in the uploaded segments,
or alternatively being able to specify a custom name for that segment prior to using requests.
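The wildcard idea above could be sketched roughly as follows. This is only an illustration under stated assumptions: the helper names `compile_patterns` and `name_for_url` are hypothetical and are not SDK configuration options, and `*` is assumed to stand for exactly one path segment. A URL that matches a configured pattern is named after the pattern itself, so the `*` acts as the placeholder; anything else keeps its full URL.

```python
import re
from typing import Iterable, List, Tuple


def compile_patterns(patterns: Iterable[str]) -> List[Tuple[str, "re.Pattern"]]:
    """Turn wildcard URL patterns into (pattern, compiled regex) pairs.

    Hypothetical helper for illustration; not part of aws-xray-sdk.
    """
    compiled = []
    for pattern in patterns:
        # Escape the whole pattern, then let each '*' match one path segment.
        regex = re.escape(pattern).replace(r"\*", "[^/?#]+")
        # Tolerate a trailing slash and an optional query string or fragment.
        compiled.append((pattern, re.compile(regex + r"/?(?:[?#].*)?")))
    return compiled


def name_for_url(url: str, compiled: List[Tuple[str, "re.Pattern"]]) -> str:
    """Return the first matching pattern as the name, else the URL unchanged."""
    for pattern, regex in compiled:
        if regex.fullmatch(url):
            return pattern
    return url


rules = compile_patterns(["http://hostname/path/*/blah/*"])
name_a = name_for_url("http://hostname/path/somepath/blah/unknown", rules)
name_b = name_for_url("http://hostname/path/diffpath/blah/somestring", rules)
name_c = name_for_url("http://hostname/other", rules)
```

Here `name_a` and `name_b` both come out as the pattern string, so the two calls would share a name, while `name_c` keeps its own URL.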
from aws-xray-sdk-python.
Hey, thanks for the feedback! We'll definitely take this into consideration for the UX of this feature request. Do you have any recommendations for how custom names for downstream requests should be specified in the SDK? Would something similar to what @haotianw465 mentioned above be a good experience for naming these traces? https://docs.aws.amazon.com/xray/latest/devguide/xray-sdk-python-middleware.html#xray-sdk-python-middleware-naming
from aws-xray-sdk-python.
I have also ended up with a graph much like the one @TheSkorm has. As an intermediate solution, would it be possible to add an option to use only the domain name? (A boolean config rather than custom path matching.)
from aws-xray-sdk-python.
Can you elaborate on what you mean?
As an example, let's say we have a key AggregateDomainName and it's a boolean flag.
Setting it to true would mean that all subsegments with the same domain name are aggregated into a single node, because they are all given the same name containing just the domain. This would mean that https://amazon.com/get/food and https://amazon.com/get/some/books would both aggregate as https://amazon.com/.
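A minimal sketch of what such a flag might do, assuming a hypothetical helper `aggregate_by_host` and a hypothetical `aggregate_domain_name` parameter (neither is an existing SDK option):

```python
from urllib.parse import urlsplit


def aggregate_by_host(url: str, aggregate_domain_name: bool = True) -> str:
    """Return the subsegment name to use for an outgoing request URL.

    Hypothetical helper for illustration; not part of aws-xray-sdk.
    """
    if not aggregate_domain_name:
        # Flag off: keep the full URL, one name per distinct path.
        return url
    parts = urlsplit(url)
    # Flag on: keep only scheme and host, dropping path, query, and fragment.
    return f"{parts.scheme}://{parts.netloc}/"


food = aggregate_by_host("https://amazon.com/get/food")
books = aggregate_by_host("https://amazon.com/get/some/books")
```

Both `food` and `books` end up with the same name, so the two calls would be drawn as one node.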
Is this an accurate depiction of what you're requesting?
from aws-xray-sdk-python.
Yes, that's along the lines of what I was thinking. It's not as configurable as the wildcard style matching mentioned previously but perhaps easier to implement?
Right now the requests tracing is not helpful in my particular situation, as there is no aggregation on response times/codes for HTTP requests (my graph looks similar to @TheSkorm's).
from aws-xray-sdk-python.
Thanks for the response. We will consider both cases and mark this accordingly as a feature request.
The first case would be to allow customers to enable wildcards for URL names and aggregate them based on the expression.
For example, http://hostname/path/*/blah/* would aggregate the following into the same node:
http://hostname/path/somepath/blah/unknown
http://hostname/path/diffpath/blah/somestring
The second case would be to aggregate whole nodes based on their host names:
This would mean that https://amazon.com/get/food and https://amazon.com/get/some/books would both aggregate as https://amazon.com/.
In either case, we would need a centralized system to keep track of downstream calls. It would probably be better to use the existing centralized sampling rules to aggregate known domain names together, and have the SDKs name the downstream calls accordingly.
from aws-xray-sdk-python.
It turns out that the subsegment name should only contain the hostname of the endpoint being targeted; this behavior has been consistent across all our other SDKs except this one.
With PR #192, does this fix the issue you were having? All the nodes were intended to be aggregated rather than split into a different node for each unique path.
from aws-xray-sdk-python.
Can confirm that the 2.4.3 release that includes #192 fixes the issue enough for me to make requests tracing usable. Thanks!
from aws-xray-sdk-python.
Hi @sreid
That's great! Hope this fixes the issue for everyone. In case the issue still persists, feel free to reopen this or create a new one.
from aws-xray-sdk-python.