aws / aws-xray-sdk-python
AWS X-Ray SDK for the Python programming language
License: Apache License 2.0
The AWS X-Ray SDK errors out when executed outside of a segment/subsegment.
Expected behavior: warn, but do not fail, on this error.
File "/......../parsers.py", line 40, in .............
df = pandas.read_csv(s3_url)
File "/var/task/pandas/io/parsers.py", line 655, in parser_f
return _read(filepath_or_buffer, kwds)
File "/var/task/pandas/io/parsers.py", line 405, in _read
parser = TextFileReader(filepath_or_buffer, **kwds)
File "/var/task/pandas/io/parsers.py", line 764, in __init__
self._make_engine(self.engine)
File "/var/task/pandas/io/parsers.py", line 985, in _make_engine
self._engine = CParserWrapper(self.f, **self.options)
File "/var/task/pandas/io/parsers.py", line 1605, in __init__
self._reader = parsers.TextReader(src, **kwds)
File "pandas/_libs/parsers.pyx", line 562, in pandas._libs.parsers.TextReader.__cinit__ (pandas/_libs/parsers.c:6175)
File "pandas/_libs/parsers.pyx", line 751, in pandas._libs.parsers.TextReader._get_header (pandas/_libs/parsers.c:9268)
File "pandas/_libs/parsers.pyx", line 953, in pandas._libs.parsers.TextReader._tokenize_rows (pandas/_libs/parsers.c:11755)
File "pandas/_libs/parsers.pyx", line 2173, in pandas._libs.parsers.raise_parser_error (pandas/_libs/parsers.c:28589)
File "/var/task/s3fs/core.py", line 1243, in _fetch_range
return resp['Body'].read()
File "/var/runtime/botocore/response.py", line 74, in read
chunk = self._raw_stream.read(amt)
File "/var/runtime/botocore/vendored/requests/packages/urllib3/response.py", line 239, in read
data = self._fp.read()
File "/var/task/aws_xray_sdk/ext/httplib/patch.py", line 106, in _xray_traced_http_client_read
xray_data = getattr(instance, _XRAY_PROP)
AttributeError: 'HTTPResponse' object has no attribute '_xray_prop'
We use Psycopg for our connections to Postgres. Are there any plans to support Psycopg? It implements DB API 2.0 (PEP 249), so we should be able to use the existing aws_xray_sdk/ext/dbapi2.py module. I will happily work on this if you approve and no one else is working on it. =)
There is an aws_xray_sdk.ext.sqlalchemy.util.decerators module. Is this a typo of decorator?
Right now, calls to end_segment will break when context_missing='LOG_ERROR'. Should that happen?
>>> xray_recorder.end_segment()
cannot find the current segment/subsegment, please make sure you have a segment open
No segment to end
cannot find the current segment/subsegment, please make sure you have a segment open
Traceback (most recent call last):
...
File ".../aws-xray-sdk/aws_xray_sdk/core/recorder.py", line 229, in end_segment
if self.current_segment().ready_to_send():
AttributeError: 'NoneType' object has no attribute 'ready_to_send'
AWSXRayRecorder is throwing a TypeError when creating a new segment. The exception is thrown because the recorder calls should_trace on the sampler, which, starting in 2.0, requires sample_req.
Stack trace:
should_trace() takes exactly 2 arguments (1 given): TypeError
Traceback (most recent call last):
File "/var/task/src/setup_database.py", line 55, in handler
xray_recorder.begin_segment('setup_database.handler')
File "/var/task/aws_xray_sdk/core/recorder.py", line 208, in begin_segment
decision = self._sampler.should_trace()
TypeError: should_trace() takes exactly 2 arguments (1 given)
Currently this project does not patch Chalice. We are using it and hope for first-party support.
Ref: http://flask.pocoo.org/docs/0.12/reqcontext/#callbacks-and-errors
Flask requires that a teardown_request function never fail. The request teardown implementation in the X-Ray SDK is not robust enough: if before_request is not executed, context.current_segment returns None, and any operation on that None segment fails.
When using aws-xray-sdk 0.92 on Lambda to trace a Flask application, adding the Flask middleware results in this log and traceback.
[1509032914734] Traceback (most recent call last):
[1509032914734] File "/var/task/flask/app.py", line 1982, in wsgi_app
[1509032914734] response = self.full_dispatch_request()
[1509032914734] File "/var/task/flask/app.py", line 1615, in full_dispatch_request
[1509032914734] return self.finalize_request(rv)
[1509032914734] File "/var/task/flask/app.py", line 1632, in finalize_request
[1509032914734] response = self.process_response(response)
[1509032914734] File "/var/task/flask/app.py", line 1856, in process_response
[1509032914734] response = handler(response)
[1509032914734] File "/var/task/aws_xray_sdk/ext/flask/middleware.py", line 59, in _after_request
[1509032914734] segment.put_http_meta(http.STATUS, response.status_code)
[1509032914734] File "/var/task/aws_xray_sdk/core/models/facade_segment.py", line 42, in put_http_meta
[1509032914734] raise FacadeSegmentMutationException(MUTATION_UNSUPPORTED_MESSAGE)
[1509032914734] aws_xray_sdk.core.exceptions.exceptions.FacadeSegmentMutationException: FacadeSegments cannot be mutated.
[1509032914734] FacadeSegments cannot be mutated.
Please check the forums for detailed info:
https://forums.aws.amazon.com/thread.jspa?messageID=803898&tstart=0
Right now, most of the extensions that patch HTTP clients such as aiohttp, requests, or httplib use strip_url by default, either as a fallback when no name is given [1] or even as the primary way to name a subsegment [2][3].
Because this function's output is relied on whenever there is no alternative name, the number of nodes created in the AWS X-Ray console equals the number of distinct URLs generated. So the following URLs each end up creating a different node, while common sense says that all of them refer to the same service:
>>> strip_url("http://localhost/resource/1?debug=true")
'http://localhost/resource/1'
>>> strip_url("http://localhost/resource/2")
'http://localhost/resource/2'
>>> strip_url("http://localhost/resource/1,2,3")
'http://localhost/resource/1,2,3'
So, IMO we must address this situation, or at least give the user a way of deriving the segment name from the URL host, or any strategy that lets the user group queries that belong to the same node. Taking the previous set of URLs as an example, all of them would then be named http://localhost.
I guess there is no silver bullet here, but before digging into a proper solution I would like to gather your feedback on the issue itself, and of course if you have any proposals I would like to hear them.
[1] https://github.com/aws/aws-xray-sdk-python/blob/master/aws_xray_sdk/ext/aiohttp/client.py#L25
[2] https://github.com/aws/aws-xray-sdk-python/blob/master/aws_xray_sdk/ext/requests/patch.py#L29
[3] https://github.com/aws/aws-xray-sdk-python/blob/master/aws_xray_sdk/ext/httplib/patch.py#L54
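One possible grouping strategy, sketched here with a hypothetical helper name (not anything in the SDK), is to name subsegments by scheme and host only:

```python
from urllib.parse import urlparse

def host_based_name(url):
    """Name a subsegment by scheme and host only, so every request to
    the same service collapses into a single node in the console."""
    parts = urlparse(url)
    return '{}://{}'.format(parts.scheme, parts.netloc)

# All of the URLs from the example above map to one name:
host_based_name('http://localhost/resource/1?debug=true')  # 'http://localhost'
host_based_name('http://localhost/resource/1,2,3')         # 'http://localhost'
```

This is only one strategy; a user-supplied callback for naming would cover cases where path prefixes matter.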
Currently the following characters are allowed: string.ascii_letters + string.digits + '_.:/%&#=+\-@ ', per entity.py. However, the requests module (and the httplib module I'm proposing based on it) uses the URL, which can include a ? character, and that yields a lot of warnings. Was the intention that everything after '?' would be stripped off? If so, the requests module isn't doing this either. We could either do simple string clipping or use something like yarl.
Django==1.11.8
aws-xray-sdk==0.95
Traceback (most recent call last):
File "/Users/mikehelmick/.virtualenvs/test-x-ray/lib/python2.7/site-packages/django/utils/autoreload.py", line 228, in wrapper
fn(*args, **kwargs)
File "/Users/mikehelmick/.virtualenvs/test-x-ray/lib/python2.7/site-packages/django/core/management/commands/runserver.py", line 147, in inner_run
handler = self.get_handler(*args, **options)
File "/Users/mikehelmick/.virtualenvs/test-x-ray/lib/python2.7/site-packages/django/contrib/staticfiles/management/commands/runserver.py", line 28, in get_handler
handler = super(Command, self).get_handler(*args, **options)
File "/Users/mikehelmick/.virtualenvs/test-x-ray/lib/python2.7/site-packages/django/core/management/commands/runserver.py", line 68, in get_handler
return get_internal_wsgi_application()
File "/Users/mikehelmick/.virtualenvs/test-x-ray/lib/python2.7/site-packages/django/core/servers/basehttp.py", line 44, in get_internal_wsgi_application
return get_wsgi_application()
File "/Users/mikehelmick/.virtualenvs/test-x-ray/lib/python2.7/site-packages/django/core/wsgi.py", line 14, in get_wsgi_application
return WSGIHandler()
File "/Users/mikehelmick/.virtualenvs/test-x-ray/lib/python2.7/site-packages/django/core/handlers/wsgi.py", line 151, in __init__
self.load_middleware()
File "/Users/mikehelmick/.virtualenvs/test-x-ray/lib/python2.7/site-packages/django/core/handlers/base.py", line 58, in load_middleware
mw_instance = mw_class()
TypeError: __init__() takes exactly 2 arguments (1 given)
I tried setting up my environment based on documentation I pieced together from:
settings.py
XRAY_RECORDER = {
'AUTO_INSTRUMENT': True, # If turned on built-in database queries and template rendering will be recorded as subsegments
'AWS_XRAY_CONTEXT_MISSING': 'LOG_ERROR',
'AWS_XRAY_TRACING_NAME': 'Site',
'DYNAMIC_NAMING': '*.local.site.net', # defines a pattern that host names should match
}
I added the MIDDLEWARE and the INSTALLED_APP.
Prior to this error I was getting something similar to #4
aws_xray_sdk.core.exceptions.exceptions.SegmentNotFoundException: cannot find the current segment/subsegment, please make sure you have a segment open
I had to set the AWS_XRAY_CONTEXT_MISSING setting to LOG_ERROR to avoid it, and then got this error. Any help is appreciated.
We are deploying our Lambda functions with the Serverless Framework, and when we use version 0.95 our Lambda functions throw this error:
Unable to import module 'lib/cancel_trial_subscription': No module named pkg_resources
When using version 0.94 there isn't an issue. Also, if a deployment is made from a Mac instead of through our CI/CD pipeline, which runs on Linux, it's also not an issue. Any ideas how this could be happening?
I understand OpenTracing lacks a standardised interface specification; however, I think it would be a great step forward if its terminology were adopted by this project.
See also: aws/aws-xray-sdk-go#31
And perhaps provide some support for the Flask tracer.
Currently the SNS publish operation shows up with a minimum set of metadata:
"aws": {
"operation": "Publish",
"region": "us-east-1",
"request_id": "a939cee1-7c48-5675-b385-9ae2206dc121"
}
This should include at least the known internal AWS resources like TopicArn or TargetArn, and maybe even the PhoneNumber.
It may be good to have a kafka client support.
Consider this code:
s = xray_recorder.begin_subsegment('test')
s.put_annotation('log.mydata', 42)
It executes without any error or exception, but the annotation just does not appear in the X-Ray console.
If I look at the raw segment data I see "annotations": {}, an empty object for annotations. It looks like this is either an issue with the X-Ray back-end (silently ignoring keys that contain special characters like dot . or colon :) or an issue with the docs and this framework (the docs should mention the limitation, and the framework should prohibit such names).
Underscore seems to be allowed, though.
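A hedged workaround, assuming the back-end only accepts letters, digits, and underscores in annotation keys (observed behavior, not documented):

```python
import re

def safe_annotation_key(key):
    # Replace any character outside [A-Za-z0-9_] so the key survives
    # the back-end's silent filtering (assumption based on observation).
    return re.sub(r'[^A-Za-z0-9_]', '_', key)

safe_annotation_key('log.mydata')  # 'log_mydata'
```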
boto3 has a few high-level S3 API calls, like upload_file and download_file, which depend on s3transfer to perform multi-threaded object puts/gets to increase throughput. This results in a SegmentNotFoundException as the X-Ray recorder tries to capture the "real" S3 API call but loses the context, because the actual HTTP call happens in a worker thread from the thread pool.
The S3 transfer manager under the hood uses futures.ThreadPoolExecutor per https://github.com/boto/s3transfer/blob/develop/s3transfer/futures.py#L370, but there is no proper API at the boto3 client level to propagate context from user code. And requiring user code changes is not a good customer experience.
The SDK should somehow monkey-patch the S3 transfer manager so that it automatically propagates context to all worker threads, so that each outbound HTTP call is captured and attached to its parent segment or subsegment properly.
Libraries like django-storages, or any storage library that supports S3 as a back-end and uses these boto3 APIs, might face the same issue.
A more detailed technical deep dive can be found here: #4.
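The failure mode can be reproduced with plain thread-locals, which is essentially how the SDK's default context stores the current entity (the helper names below are illustrative, not the SDK's API):

```python
import threading
from concurrent.futures import ThreadPoolExecutor

_local = threading.local()  # stands in for the SDK's thread-local context

def set_trace_entity(entity):
    _local.entity = entity

def get_trace_entity():
    return getattr(_local, 'entity', None)

def worker():
    # A fresh pool thread has an empty thread-local, which is why the
    # real SDK raises SegmentNotFoundException here.
    return get_trace_entity()

def worker_with_context(entity):
    set_trace_entity(entity)  # explicit propagation from the parent
    return get_trace_entity()

set_trace_entity('segment-1')
with ThreadPoolExecutor(max_workers=1) as pool:
    assert pool.submit(worker).result() is None
    assert pool.submit(worker_with_context, get_trace_entity()).result() == 'segment-1'
```

Automatic propagation would mean the transfer manager does the `worker_with_context` step on the user's behalf.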
If you compare X-Ray (https://pypi.python.org/pypi/aws-xray-sdk/0.96) to aiohttp (https://pypi.python.org/pypi/aiohttp), you can tell something is broken :)
The segment/subsegment name supports unicode characters per schema provided in https://docs.aws.amazon.com/xray/latest/devguide/xray-api-segmentdocuments.html. Here is the content of the schema regarding name property:
"name": {
"type": "string",
"pattern": "^[A-Za-z\\u00aa\\u00b5\\u00ba\\u00c0-\\u00d6\\u00d8-\\u00f6\\u00f8-\\u02c1\\u02c6-\\u02d1\\u02e0-\\u02e4\\u02ec\\u02ee\\u0370-\\u0374\\u0376\\u0377\\u037a-\\u037d\\u037f\\u0386\\u0388-\\u038a\\u038c\\u038e-\\u03a1\\u03a3-\\u03f5\\u03f7-\\u0481\\u048a-\\u052f\\u0531-\\u0556\\u0559\\u0561-\\u0587\\u05d0-\\u05ea\\u05f0-\\u05f2\\u0620-\\u064a\\u066e\\u066f\\u0671-\\u06d3\\u06d5\\u06e5\\u06e6\\u06ee\\u06ef\\u06fa-\\u06fc\\u06ff\\u0710\\u0712-\\u072f\\u074d-\\u07a5\\u07b1\\u07ca-\\u07ea\\u07f4\\u07f5\\u07fa\\u0800-\\u0815\\u081a\\u0824\\u0828\\u0840-\\u0858\\u08a0-\\u08b4\\u0904-\\u0939\\u093d\\u0950\\u0958-\\u0961\\u0971-\\u0980\\u0985-\\u098c\\u098f\\u0990\\u0993-\\u09a8\\u09aa-\\u09b0\\u09b2\\u09b6-\\u09b9\\u09bd\\u09ce\\u09dc\\u09dd\\u09df-\\u09e1\\u09f0\\u09f1\\u0a05-\\u0a0a\\u0a0f\\u0a10\\u0a13-\\u0a28\\u0a2a-\\u0a30\\u0a32\\u0a33\\u0a35\\u0a36\\u0a38\\u0a39\\u0a59-\\u0a5c\\u0a5e\\u0a72-\\u0a74\\u0a85-\\u0a8d\\u0a8f-\\u0a91\\u0a93-\\u0aa8\\u0aaa-\\u0ab0\\u0ab2\\u0ab3\\u0ab5-\\u0ab9\\u0abd\\u0ad0\\u0ae0\\u0ae1\\u0af9\\u0b05-\\u0b0c\\u0b0f\\u0b10\\u0b13-\\u0b28\\u0b2a-\\u0b30\\u0b32\\u0b33\\u0b35-\\u0b39\\u0b3d\\u0b5c\\u0b5d\\u0b5f-\\u0b61\\u0b71\\u0b83\\u0b85-\\u0b8a\\u0b8e-\\u0b90\\u0b92-\\u0b95\\u0b99\\u0b9a\\u0b9c\\u0b9e\\u0b9f\\u0ba3\\u0ba4\\u0ba8-\\u0baa\\u0bae-\\u0bb9\\u0bd0\\u0c05-\\u0c0c\\u0c0e-\\u0c10\\u0c12-\\u0c28\\u0c2a-\\u0c39\\u0c3d\\u0c58-\\u0c5a\\u0c60\\u0c61\\u0c85-\\u0c8c\\u0c8e-\\u0c90\\u0c92-\\u0ca8\\u0caa-\\u0cb3\\u0cb5-\\u0cb9\\u0cbd\\u0cde\\u0ce0\\u0ce1\\u0cf1\\u0cf2\\u0d05-\\u0d0c\\u0d0e-\\u0d10\\u0d12-\\u0d3a\\u0d3d\\u0d4e\\u0d5f-\\u0d61\\u0d7a-\\u0d7f\\u0d85-\\u0d96\\u0d9a-\\u0db1\\u0db3-\\u0dbb\\u0dbd\\u0dc0-\\u0dc6\\u0e01-\\u0e30\\u0e32\\u0e33\\u0e40-\\u0e46\\u0e81\\u0e82\\u0e84\\u0e87\\u0e88\\u0e8a\\u0e8d\\u0e94-\\u0e97\\u0e99-\\u0e9f\\u0ea1-\\u0ea3\\u0ea5\\u0ea7\\u0eaa\\u0eab\\u0ead-\\u0eb0\\u0eb2\\u0eb3\\u0ebd\\u0ec0-\\u0ec4\\u0ec6\\u0edc-\\u0edf\\u0f00\\u0f40-\\u0f47\\u0f49-\\u0f6c\\u0f88-\\u0f8c\\u1000-\\u102a\\u103f\\u1050-\\u1055\\u105a-\\u105d\\u1061\\u1065\\u1066\\u106e-\\u1070\\u1075-\\u1081\\u108e\\u10a0-\\u10c5\\u10c7\\u10cd\\u10d0-\\u10fa\\u10fc-\\u1248\\u124a-\\u124d\\u1250-\\u1256\\u1258\\u125a-\\u125d\\u1260-\\u1288\\u128a-\\u128d\\u1290-\\u12b0\\u12b2-\\u12b5\\u12b8-\\u12be\\u12c0\\u12c2-\\u12c5\\u12c8-\\u12d6\\u12d8-\\u1310\\u1312-\\u1315\\u1318-\\u135a\\u1380-\\u138f\\u13a0-\\u13f5\\u13f8-\\u13fd\\u1401-\\u166c\\u166f-\\u167f\\u1681-\\u169a\\u16a0-\\u16ea\\u16f1-\\u16f8\\u1700-\\u170c\\u170e-\\u1711\\u1720-\\u1731\\u1740-\\u1751\\u1760-\\u176c\\u176e-\\u1770\\u1780-\\u17b3\\u17d7\\u17dc\\u1820-\\u1877\\u1880-\\u18a8\\u18aa\\u18b0-\\u18f5\\u1900-\\u191e\\u1950-\\u196d\\u1970-\\u1974\\u1980-\\u19ab\\u19b0-\\u19c9\\u1a00-\\u1a16\\u1a20-\\u1a54\\u1aa7\\u1b05-\\u1b33\\u1b45-\\u1b4b\\u1b83-\\u1ba0\\u1bae\\u1baf\\u1bba-\\u1be5\\u1c00-\\u1c23\\u1c4d-\\u1c4f\\u1c5a-\\u1c7d\\u1ce9-\\u1cec\\u1cee-\\u1cf1\\u1cf5\\u1cf6\\u1d00-\\u1dbf\\u1e00-\\u1f15\\u1f18-\\u1f1d\\u1f20-\\u1f45\\u1f48-\\u1f4d\\u1f50-\\u1f57\\u1f59\\u1f5b\\u1f5d\\u1f5f-\\u1f7d\\u1f80-\\u1fb4\\u1fb6-\\u1fbc\\u1fbe\\u1fc2-\\u1fc4\\u1fc6-\\u1fcc\\u1fd0-\\u1fd3\\u1fd6-\\u1fdb\\u1fe0-\\u1fec\\u1ff2-\\u1ff4\\u1ff6-\\u1ffc\\u2071\\u207f\\u2090-\\u209c\\u2102\\u2107\\u210a-\\u2113\\u2115\\u2119-\\u211d\\u2124\\u2126\\u2128\\u212a-\\u212d\\u212f-\\u2139\\u213c-\\u213f\\u2145-\\u2149\\u214e\\u2183\\u2184\\u2c00-\\u2c2e\\u2c30-\\u2c5e\\u2c60-\\u2ce4\\u2ceb-\\u2cee\\u2cf2\\u2cf3\\u2d00-\\u2d25\\u2d27\\u2d2d\\u2d30-\\u2d67\\u2d6f\\u2d80-\\u2d96\\u2da0-\\u2da6\\u2da8-\\u2dae\\u2db0-\\u2db6\\u2db8-\\u2dbe\\u2dc0-\\u2dc6\\u2dc8-\\u2dce\\u2dd0-\\u2dd6\\u2dd8-\\u2dde\\u2e2f\\u3005\\u3006\\u3031-\\u3035\\u303b\\u303c\\u3041-\\u3096\\u309d-\\u309f\\u30a1-\\u30fa\\u30fc-\\u30ff\\u3105-\\u312d\\u3131-\\u318e\\u31a0-\\u31ba\\u31f0-\\u31ff\\u3400-\\u4db5\\u4e00-\\u9fd5\\ua000-\\ua48c\\ua4d0-\\ua4fd\\ua500-\\ua60c\\ua610-\\ua61f\\ua62a\\ua62b\\ua640-\\ua66e\\ua67f-\\ua69d\\ua6a0-\\ua6e5\\ua717-\\ua71f\\ua722-\\ua788\\ua78b-\\ua7ad\\ua7b0-\\ua7b7\\ua7f7-\\ua801\\ua803-\\ua805\\ua807-\\ua80a\\ua80c-\\ua822\\ua840-\\ua873\\ua882-\\ua8b3\\ua8f2-\\ua8f7\\ua8fb\\ua8fd\\ua90a-\\ua925\\ua930-\\ua946\\ua960-\\ua97c\\ua984-\\ua9b2\\ua9cf\\ua9e0-\\ua9e4\\ua9e6-\\ua9ef\\ua9fa-\\ua9fe\\uaa00-\\uaa28\\uaa40-\\uaa42\\uaa44-\\uaa4b\\uaa60-\\uaa76\\uaa7a\\uaa7e-\\uaaaf\\uaab1\\uaab5\\uaab6\\uaab9-\\uaabd\\uaac0\\uaac2\\uaadb-\\uaadd\\uaae0-\\uaaea\\uaaf2-\\uaaf4\\uab01-\\uab06\\uab09-\\uab0e\\uab11-\\uab16\\uab20-\\uab26\\uab28-\\uab2e\\uab30-\\uab5a\\uab5c-\\uab65\\uab70-\\uabe2\\uac00-\\ud7a3\\ud7b0-\\ud7c6\\ud7cb-\\ud7fb\\uf900-\\ufa6d\\ufa70-\\ufad9\\ufb00-\\ufb06\\ufb13-\\ufb17\\ufb1d\\ufb1f-\\ufb28\\ufb2a-\\ufb36\\ufb38-\\ufb3c\\ufb3e\\ufb40\\ufb41\\ufb43\\ufb44\\ufb46-\\ufbb1\\ufbd3-\\ufd3d\\ufd50-\\ufd8f\\ufd92-\\ufdc7\\ufdf0-\\ufdfb\\ufe70-\\ufe74\\ufe76-\\ufefc\\uff21-\\uff3a\\uff41-\\uff5a\\uff66-\\uffbe\\uffc2-\\uffc7\\uffca-\\uffcf\\uffd2-\\uffd7\\uffda-\\uffdc \\u00a0\\u1680\\u2000-\\u200a\\u2028\\u2029\\u202f\\u205f\\u30000-9\\u00b2\\u00b3\\u00b9\\u00bc-\\u00be\\u0660-\\u0669\\u06f0-\\u06f9\\u07c0-\\u07c9\\u0966-\\u096f\\u09e6-\\u09ef\\u09f4-\\u09f9\\u0a66-\\u0a6f\\u0ae6-\\u0aef\\u0b66-\\u0b6f\\u0b72-\\u0b77\\u0be6-\\u0bf2\\u0c66-\\u0c6f\\u0c78-\\u0c7e\\u0ce6-\\u0cef\\u0d66-\\u0d75\\u0de6-\\u0def\\u0e50-\\u0e59\\u0ed0-\\u0ed9\\u0f20-\\u0f33\\u1040-\\u1049\\u1090-\\u1099\\u1369-\\u137c\\u16ee-\\u16f0\\u17e0-\\u17e9\\u17f0-\\u17f9\\u1810-\\u1819\\u1946-\\u194f\\u19d0-\\u19da\\u1a80-\\u1a89\\u1a90-\\u1a99\\u1b50-\\u1b59\\u1bb0-\\u1bb9\\u1c40-\\u1c49\\u1c50-\\u1c59\\u2070\\u2074-\\u2079\\u2080-\\u2089\\u2150-\\u2182\\u2185-\\u2189\\u2460-\\u249b\\u24ea-\\u24ff\\u2776-\\u2793\\u2cfd\\u3007\\u3021-\\u3029\\u3038-\\u303a\\u3192-\\u3195\\u3220-\\u3229\\u3248-\\u324f\\u3251-\\u325f\\u3280-\\u3289\\u32b1-\\u32bf\\ua620-\\ua629\\ua6e6-\\ua6ef\\ua830-\\ua835\\ua8d0-\\ua8d9\\ua900-\\ua909\\ua9d0-\\ua9d9\\ua9f0-\\ua9f9\\uaa50-\\uaa59\\uabf0-\\uabf9\\uff10-\\uff19_.:\/%&#=+\\-@]{1,200}$",
"java_pattern": "([\\p{L}\\p{Z}\\p{N}_.:/%&#=+\\-@]*)$"
},
However, the SDK is dropping characters that are not ASCII: https://github.com/aws/aws-xray-sdk-python/blob/master/aws_xray_sdk/core/models/entity.py#L18.
If a segment/subsegment name has invalid characters, that segment/subsegment will not be accepted by the X-Ray service back-end. But since the X-Ray daemon sends segments in batches, the invalid segment/subsegment will appear under "Unprocessed" in the PutTraceSegments API's response body while the call itself succeeds. This is the background for the sanitization happening on the SDK side: to make sure valuable data is not dropped due to invalid characters in names.
The SDK should have restrictions equal to, or at least looser than, the back-end's. Doing a full regex match using ([\\p{L}\\p{Z}\\p{N}_.:/%&#=+\\-@]*)$ adds performance overhead, since the match would happen for every single segment/subsegment captured.
The proposal is to switch the SDK to blacklist-based sanitization: drop common invalid characters like ? * $ ; ( ) [ ] { }. This keeps the design lightweight and lets Unicode letters from non-English natural languages pass through.
Any feedback is welcome.
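A minimal sketch of the blacklist approach described above (the exact character set and function name are assumptions, not the SDK's implementation):

```python
import re

# Drop only the characters proposed as "common invalid" above; everything
# else, including non-ASCII letters, passes through untouched.
_INVALID_CHARS = re.compile(r'[?*$;()\[\]{}]')

def sanitize_name(name):
    return _INVALID_CHARS.sub('', name)

sanitize_name('my-service?')   # 'my-service'
sanitize_name('日本語サービス')  # unchanged
```

A single precompiled character-class substitution is much cheaper than a full Unicode-property whitelist match per entity.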
I'm using aws-xray-sdk to trace an application which uses the Google Pub/Sub module. Unfortunately this module uses various threads, which means I get a lot of SegmentNotFoundException exceptions, causing my application to fail.
It's going to be impossible for me to track down all the locations where it creates threads, pass the trace entity, and call xray_recorder.set_trace_entity.
How about if this SDK provides support for a "root thread" which it will use to resolve the segment/subsegment if the current context does not have one?
Back again with more async stuff 😄
I've been using the aiobotocore library, which essentially wraps botocore and uses aiohttp to perform async HTTP requests.
I was looking into patching parts of it so the SDK creates subsegments when queries to AWS are made.
I've gotten it to work, but it's a bit messy.
The following is based on the botocore patcher. (The header-injection bit is exactly the same, bar a different class needing patching.)
wrapt.wrap_function_wrapper(
'aiobotocore.client',
'AioBaseClient._make_api_call',
_xray_traced_aiobotocore,
)
async def _xray_traced_aiobotocore(wrapped, instance, args, kwargs):
service = instance._service_model.metadata["endpointPrefix"]
return await xray_recorder.record_subsegment_async(
wrapped, instance, args, kwargs,
name=service,
namespace='aws',
meta_processor=aws_meta_processor,
)
When wrapt runs, the wrapped function is a coroutine which needs to be awaited. Until it's awaited, the header injection won't run.
I created a copy of record_subsegment called record_subsegment_async, which is also a coroutine and runs return_value = await wrapped(*args, **kwargs); during that call the headers are injected, and then the result is returned. After that, everything else works.
I tried awaiting the wrapped function inside of _xray_traced_aiobotocore, but when the headers are injected the subsegment hasn't begun, as that's done inside of record_subsegment.
What would you reckon is the best way of solving this? I was thinking of subclassing xray_recorder, but if there is a simpler way I'm up for that. Then another PR will come your way 😄.
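The shape of the async wrapper being described, reduced to plain asyncio (this record_subsegment_async is a sketch, not the SDK function):

```python
import asyncio

async def record_subsegment_async(wrapped, *args, **kwargs):
    # begin_subsegment() and header injection would happen here, before
    # the await, so the subsegment is open while the request runs.
    result = await wrapped(*args, **kwargs)
    # end_subsegment() would happen here, after the response arrives.
    return result

async def fake_api_call(x):
    await asyncio.sleep(0)  # stands in for the aiohttp request
    return x * 2

loop = asyncio.new_event_loop()
assert loop.run_until_complete(record_subsegment_async(fake_api_call, 21)) == 42
loop.close()
```

The key point is that the begin/end bookkeeping must live inside a coroutine so it brackets the awaited call, which is why a synchronous record_subsegment cannot do the job.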
Given the current construction (send on non-sampled segment end), it looks like the emitter may optionally want some buffering. The use case here is deployments operating sans daemon.
File "/var/task/sqlalchemy/engine/default.py", line 412, in connect
return self.dbapi.connect(*cargs, **cparams)
File "/var/task/aws_xray_sdk/ext/psycopg2/patch.py", line 20, in _xray_traced_connect
dbname = kwargs['dbname'] if 'dbname' in kwargs else re.search(r'dbname=(\S+)\b', args[0]).groups()[0]
I created a Lambda calling itself recursively 3 times.
The Lambda is set to "Enable active tracing" and calls patch_all(). It uses boto3 to send the request.
import boto3
import json
import requests
from aws_xray_sdk.core import xray_recorder
from aws_xray_sdk.core import patch_all
patch_all()
client = boto3.client('lambda')
def lambda_handler(event, context):
action = event['action']
return globals()[action](event, context)
# input event['i']
# output {'result': j}, where j = 1^2 + 2^2 + .. + i^2
def square_sum(event, context):
i = event['i']
if i <= 0:
return {'result': 0}
if i == 1:
return {'result': 1}
payload_str = json.dumps({'action': 'square_sum', 'i': i-1})
payload_b = bytes(payload_str, 'utf-8')
tailcall = client.invoke(FunctionName='Test-Xray-Caller',
InvocationType='RequestResponse',
Payload=payload_b)
tailcall_result = json.loads(tailcall['Payload'].read())
result = tailcall_result['result'] + i * i
return {'result': result}
In my understanding, a "remote" service means X-Ray does not recognize the endpoint, e.g. a non-AWS service. A call to AWS Lambda should have no "remote" services, just one clean edge from the caller AWS service to the callee AWS service.
This redundant edge occurs in more places than here: when I call from Lambda to an Elastic Beanstalk endpoint, it still shows "remote" services.
How do I get rid of them?
I'm running long-running functions on AWS Lambda and using a signal.signal(signal.SIGALRM, ...) handler to raise a special exception to reschedule them when they approach the Lambda timeout. (They're structured in such a way that work done is saved in the DB, so no progress is lost.)
Unfortunately, it occasionally happens that the signal fires at the same time as some object is being serialized by the X-Ray SDK, inside the try/except in serialize() (https://github.com/aws/aws-xray-sdk-python/blob/master/aws_xray_sdk/core/models/entity.py#L231). I end up with a stack trace like:
Traceback (most recent call last):
File "/var/task/aws_xray_sdk/core/models/entity.py", line 238, in serialize
return jsonpickle.encode(self, unpicklable=False)
File "/var/task/jsonpickle/__init__.py", line 132, in encode
numeric_keys=numeric_keys)
File "/var/task/jsonpickle/pickler.py", line 43, in encode
return backend.encode(context.flatten(value, reset=reset))
File "/var/task/jsonpickle/pickler.py", line 156, in flatten
return self._flatten(obj)
File "/var/task/jsonpickle/pickler.py", line 160, in _flatten
return self._pop(self._flatten_obj(obj))
File "/var/task/jsonpickle/pickler.py", line 176, in _flatten_obj
return flatten_func(obj)
File "/var/task/jsonpickle/pickler.py", line 234, in _ref_obj_instance
return self._flatten_obj_instance(obj)
File "/var/task/jsonpickle/pickler.py", line 262, in _flatten_obj_instance
has_getnewargs = util.has_method(obj, '__getnewargs__')
File "/var/task/jsonpickle/util.py", line 52, in has_method
def has_method(obj, name):
File "/var/task/...", line ..., in ...
raise Reschedule()
The try/except was added in the initial commit, so it's not clear whether it really needs to be so broad; maybe it could just catch TypeError? The jsonpickle docs for encode() don't even list a set of potential exceptions. There are also a bunch of other try / except Exception blocks littered through the library whose scope might be reduced.
Tried using X-Ray with an async function, but I keep getting this error:
Traceback (most recent call last):
File "app.py", line 333, in <module>
xray_recorder.end_segment()
File "/usr/local/lib/python3.6/site-packages/aws_xray_sdk/core/recorder.py", line 206, in end_segment
if self.current_segment().ready_to_send():
AttributeError: 'NoneType' object has no attribute 'ready_to_send'
PIP
aiobotocore 0.9.4
aws-xray-sdk 1.1
boto3 1.7.58
botocore 1.10.58
N.B.: I know the latest version of aws-xray-sdk is not supported with aiobotocore.
Thanks
We hit an issue where patch_all() was used inside a Lambda function: after the 1000th call we saw a RecursionError when calling requests.get().
Example:
from aws_xray_sdk.core import patch_all
import requests
import sys
def lambda_handler(event, _context):
for i in range(sys.getrecursionlimit()):
patch_all()
requests.get('https://google.com') # Triggers RecursionError
We should have called patch_all() outside of the function, but this still feels like a gotcha.
core.patcher._patch has this code:
def _patch(module_to_patch):
# [...]
if module_to_patch in _PATCHED_MODULES:
log.debug('%s already patched', module_to_patch)
So it recognises the multiple patch, but repatches anyway.
Should _patch
return early if it recognises the patch is already in place?
Hi, we would need to send some annotations (or metadata) along with the data that is sent automatically as a segment or subsegment. Note that this data is grabbed from the context, so it changes with each request. A good example is the request_id that is propagated as a header between all of the services.
We are keen on the aiohttp use case, but it might be worth having a solution that fits all of the integrations.
A possible solution that wouldn't introduce much technical debt would be adding a new method to the Context that would be called each time a segment is closed/sent, to add all extra metadata/annotations. By default it would do nothing, but the developer would have the chance to override this method by subclassing the Context class and implementing their own logic.
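A rough sketch of that hook; all names here are illustrative, not the SDK's real API:

```python
class Context:
    """Stand-in for the SDK's Context with the proposed hook."""

    def extra_annotations(self):
        return {}  # default: contribute nothing

    def close_segment(self, segment):
        # Called when a segment is closed/sent; merge in per-request data.
        segment.setdefault('annotations', {}).update(self.extra_annotations())
        return segment

class RequestContext(Context):
    def __init__(self, request_id):
        self.request_id = request_id

    def extra_annotations(self):
        # request_id is whatever was propagated via headers upstream.
        return {'request_id': self.request_id}

segment = {'name': 'checkout', 'annotations': {}}
RequestContext('abc-123').close_segment(segment)
```

The default no-op keeps existing users unaffected, while subclasses get a single override point.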
Thoughts, any other ideas?
When using MySQL with Flask-SQLAlchemy, I get this log:
Removing Segment/Subsugment Name invalid characters.
My SQLAlchemy URI is:
mysql://user:[email protected]:3306/semiprod?charset=utf8mb4
Even if I remove the ?charset, it seems to be automatically added by SQLAlchemy.
Datadog tracing (https://github.com/DataDog/dd-trace-py) has the ability to disable tracing via an enabled parameter to their configure method (https://github.com/DataDog/dd-trace-py/blob/master/ddtrace/tracer.py#L82). This is really useful: one can control tracing via an environment variable, for example, without having to change code, allowing different local and prod behaviors. I suggest a similar mechanism for X-Ray.
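In the meantime, a pattern along these lines can approximate the feature in user code (the XRAY_ENABLED variable name is an assumption, not something the SDK reads):

```python
import os

def tracing_enabled():
    # Opt out by setting XRAY_ENABLED=false (or 0) in the environment.
    return os.environ.get('XRAY_ENABLED', 'true').lower() in ('1', 'true')

def setup_tracing():
    if not tracing_enabled():
        return False  # skip patch_all() / middleware registration locally
    # from aws_xray_sdk.core import patch_all; patch_all()  # prod path
    return True
```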
Hello,
I've looked at : https://pypi.python.org/simple/aws-xray-sdk/
And currently, there are:
aws-xray-sdk-0.91.tar.gz
aws_xray_sdk-0.91.1-py2.py3-none-any.whl
aws_xray_sdk-0.92-py2.py3-none-any.whl
aws_xray_sdk-0.92.1-py2.py3-none-any.whl
aws_xray_sdk-0.92.2-py2.py3-none-any.whl
aws_xray_sdk-0.93-py2.py3-none-any.whl
Could you publish both the source (tar.gz) and the wheel for all versions?
Thanks in advance.
I'm getting the following, running aws-xray-sdk==1.0:
Removing Segment/Subsugment Name invalid characters from https://my.elasticsearch.cluster/iis_log_prod-*/_search.
Elasticsearch does use somewhat odd URL naming, but AFAIK those URLs are fully conformant.
For info, I'm not manually naming X-Ray subsegments, but I'm assuming this happens under the covers when I use the requests package to invoke URL requests.
Celery exposes some useful signals that the SDK can hook into to propagate the trace.
References:
Hi,
Do you have any ETA for the next 1.2 release with the latest aiohttp fixes? We have some dependencies blocked waiting for a new release.
Thanks!
I'm not sure if this project looks after the relevant AWS docs page but I followed this page to do the specific module patch (for boto3):
https://docs.aws.amazon.com/xray/latest/devguide/xray-sdk-python-patching.html
Following "Example main.py – patch specific libraries" I wrote this:
from aws_xray_sdk.core import patch
libraries = ('boto3')
patch(libraries)
Which returned:
Exception: modules o, 3, t, b are currently not supported for patching
It seems to work if I put it in as:
libraries = (['boto3'])
So it may be worth changing the docs to show that if you are patching a single module it needs to be in [ ] brackets? I assume it treats the first one as a list of chars, but the second as a single-element list? (I'm new to Python, so that may be wrong!)
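The reporter's guess is essentially right; this is plain Python semantics, not anything X-Ray-specific:

```python
# ('boto3') is just a parenthesized string; iterating over it yields
# characters, which is why patch() complains about modules o, 3, t, b.
assert ('boto3') == 'boto3'
assert sorted(set('boto3')) == ['3', 'b', 'o', 't']

# A trailing comma (or a list) makes a real one-element sequence:
assert ('boto3',) != 'boto3'
assert list(('boto3',)) == ['boto3']
assert ['boto3'] == list(['boto3'])
```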
Occasionally I'll get the following error from the udp_emitter:
2018-02-14 16:28:46,744 - aws_xray_sdk.core.emitters.udp_emitter - ERROR - failed to send data to X-Ray daemon.
Traceback (most recent call last):
File "/Users/amohr/.pyenv/versions/3.6.3/lib/python3.6/site-packages/aws_xray_sdk/core/emitters/udp_emitter.py", line 55, in _send_data
self._port))
OSError: [Errno 40] Message too long
My guess is that there's too much data in the trace. This sounds like a library problem; if there's too much data, it should split it into multiple packets.
Currently the tox file installs:
django >= 1.10, <2.0
Please test with Django 2.0 and 2.1 and confirm compatibility with them. 2.0+ is Python 3 only; perhaps that's why this restriction was added in the first place?
There is a bit of ambiguity in the patch_all() approach on aws-xray-sdk-python. Using it automatically enables new patchers regardless of how well they've been tested against real use cases (#48, #90). This in turn can lead to a nasty surprise where the deployment of a Lambda function stops working because the calls are patched.
Some ideas on making the situation better:
- Deprecate patch_all() due to its unsafe nature.
- Don't add new patchers to patch_all() outside of major versions.
- Split patch_all() into two functions, patch_all() and patch_all_bleeding_edge(), where new, un-vetted patchers are first introduced into patch_all_bleeding_edge() and, after they've been tested, are promoted to patch_all().
I'm using Django 2.0 on Python 3.6. With a view that raises PermissionDenied, which Django converts to a 403 response, the XRayMiddleware dies:
[ERROR] 2018-08-29T15:48:07.91Z ec1dfd19-aba2-11e8-870f-1f51d52b6dae Internal Server Error: /graphql/
Traceback (most recent call last):
File "/var/task/django/core/handlers/base.py", line 126, in _get_response
response = wrapped_callback(request, *callback_args, **callback_kwargs)
File "/var/task/django/views/generic/base.py", line 69, in view
return self.dispatch(request, *args, **kwargs)
File <redacted>
raise PermissionDenied('Not authenticated')
django.core.exceptions.PermissionDenied: Not authenticated
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/var/task/django/core/handlers/exception.py", line 35, in inner
response = get_response(request)
File "/var/task/django/core/handlers/base.py", line 128, in _get_response
response = self.process_exception_by_middleware(e, request)
File "/var/task/django/core/handlers/base.py", line 168, in process_exception_by_middleware
response = middleware_method(request, exception)
File "/var/task/aws_xray_sdk/ext/django/middleware.py", line 84, in process_exception
segment.put_http_meta(http.STATUS, 500)
File "/var/task/aws_xray_sdk/core/models/facade_segment.py", line 42, in put_http_meta
raise FacadeSegmentMutationException(MUTATION_UNSUPPORTED_MESSAGE)
aws_xray_sdk.core.exceptions.exceptions.FacadeSegmentMutationException: FacadeSegments cannot be mutated.
I tried to replicate it in the test suite, but this passes for me (using Python 2.7 rather than 3.X, because I only have 3.7 on my local machine and various parts of this test suite don't work on 3.7):
diff --git a/tests/ext/django/app/views.py b/tests/ext/django/app/views.py
index 6bb5edf..dd05ce3 100644
--- a/tests/ext/django/app/views.py
+++ b/tests/ext/django/app/views.py
@@ -1,5 +1,6 @@
import sqlite3
+from django.core.exceptions import PermissionDenied
from django.http import HttpResponse
from django.conf.urls import url
from django.views.generic import TemplateView
@@ -13,6 +14,10 @@ def ok(request):
return HttpResponse(status=200)
+def denied(request):
+ raise PermissionDenied('Not allowed')
+
+
def fault(request):
{}['key']
@@ -24,11 +29,9 @@ def call_db(request):
return HttpResponse(status=201)
-# def template(request):
-
-
urlpatterns = [
url(r'^200ok/$', ok, name='200ok'),
+ url(r'^403denied/$', denied, name='403denied'),
url(r'^500fault/$', fault, name='500fault'),
url(r'^call_db/$', call_db, name='call_db'),
url(r'^template/$', IndexView.as_view(), name='template'),
diff --git a/tests/ext/django/test_middleware.py b/tests/ext/django/test_middleware.py
index 66e9648..d1eea72 100644
--- a/tests/ext/django/test_middleware.py
+++ b/tests/ext/django/test_middleware.py
@@ -1,5 +1,5 @@
import django
-from django.core.urlresolvers import reverse
+from django.urls import reverse
from django.test import TestCase
from aws_xray_sdk.core import xray_recorder
@@ -42,6 +42,18 @@ class XRayTestCase(TestCase):
assert request['client_ip'] == '127.0.0.1'
assert response['status'] == 404
+ def test_denied(self):
+ self.client.get('/403denied/')
+ segment = xray_recorder.emitter.pop()
+ assert segment.error
+
+ request = segment.http['request']
+ response = segment.http['response']
+
+ assert request['method'] == 'GET'
+ assert request['client_ip'] == '127.0.0.1'
+ assert response['status'] == 403
+
def test_fault(self):
url = reverse('500fault')
try:
It could be related to changes in Python 3.x, or to Django 2.0, which isn't being tested here yet and for which I opened #85.
Currently the PynamoDB integration doesn't provide stack traces for failed requests to DynamoDB. The reason is pretty simple: since requests needs to be patched for it, there simply is no exception yet when the response is processed for X-Ray, because the PynamoDB code hasn't been called at that point.
Any idea what'd be the best way to handle that?
Hi folks,
I'm thinking in work on a pair of PR to improve the functionality of AWS-XRay for the Aiohttp framework.
First of all, I would like to add a new TraceConfig object - the new aiohttp client tracing system since 3.0 [1] - that would allow users to inject the tracing headers with something like this:
import aiohttp
from aws_xray_sdk.ext.aiohttp.client import TraceConfig

async def main():
    async with aiohttp.ClientSession(trace_configs=[TraceConfig()]) as client:
        await client.get('http://example.com/some/redirect/')
My main concern with that is how to deal with different aiohttp versions; is there any requirement or advice on that?
Second, I would like to deprecate (remove?) the current middleware implementation, which uses a pattern deprecated since aiohttp 2.x; the new one [2] is strongly recommended. Any concerns?
[1] https://docs.aiohttp.org/en/stable/client_advanced.html#client-tracing
[2] https://docs.aiohttp.org/en/stable/web_advanced.html?highlight=middleware#middlewares
When you create a session in SQLAlchemy, the bind parameter can be either an Engine or a Connection, but if a Connection is used, parse_bind will break.
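A possible fix would be to normalize the bind before inspecting it: in SQLAlchemy both Engine and Connection expose the underlying Engine via the .engine attribute (Engine.engine returns itself). This is only a sketch with stand-in classes, not the SDK's actual parse_bind code:

```python
def engine_from_bind(bind):
    # Works for both Engine and Connection binds: Connection.engine
    # is the parent Engine, and Engine.engine returns itself.
    return bind.engine

# Stand-in classes mimicking the relevant SQLAlchemy attributes.
class FakeEngine:
    url = 'postgresql://user@host/db'
    @property
    def engine(self):
        return self

class FakeConnection:
    def __init__(self, engine):
        self.engine = engine

engine = FakeEngine()
assert engine_from_bind(engine).url == 'postgresql://user@host/db'
assert engine_from_bind(FakeConnection(engine)).url == 'postgresql://user@host/db'
```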
My Python may not be super strong, but in my reading of try: except: finally:, it seems that the return_value is being trashed in the LOG_ERROR case.
From https://github.com/aws/aws-xray-sdk-python/blob/master/aws_xray_sdk/core/recorder.py:
return_value = None
try:
    return_value = wrapped(*args, **kwargs)
    return return_value
except Exception as e:
    exception = e
    stack = traceback.extract_stack(limit=self._max_trace_back)
    raise
finally:
    # No-op if subsegment is `None` due to `LOG_ERROR`.
    if subsegment is None:
        return
Isn't that last return actually equivalent to return None?
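Yes: a bare return in a finally block returns None regardless of what the try block returned, and it also swallows any exception that is in flight. A minimal demonstration:

```python
def discards_return_value():
    try:
        return "value"
    finally:
        return  # overrides the try's return: the caller sees None

def swallows_exception():
    try:
        raise ValueError("boom")
    finally:
        return "no exception seen"  # the ValueError never propagates

assert discards_return_value() is None
assert swallows_exception() == "no exception seen"
```

So in the LOG_ERROR case the wrapped function's return value (and any raised exception) is indeed discarded by that final return.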
This is a feature request to be able to use the functionality of AWSXRayRecorder.capture() as a context manager.
The motivation is to replace paired begin_subsegment() and end_subsegment() calls with a context manager that does proper exception handling. That's especially important when using nested subsegments, as subsegment traces in one "branch" below the current segment are only streamed to X-Ray if all subsegments in that branch were closed properly. Without careful manual exception handling to ensure every subsegment is closed, that quickly leads to missing data in X-Ray. What contributes to this is that AWSXRayRecorder.end_subsegment() doesn't close a specific subsegment, but rather the current open one, so AWSXRayRecorder can't notice if one got missed.
Currently one would have to do something like that for every subsegment:
def some_function():
    print("code without custom subsegment")
    subsegment = xray_recorder.begin_subsegment("my_subsegment")
    exception = None
    stack = None
    try:
        print("code to trace as custom subsegment")
    except Exception as e:
        # Bind to a separate name: `except ... as exception` would
        # delete `exception` when the except block ends in Python 3.
        exception = e
        stack = traceback.extract_stack(limit=self.max_trace_back)
        raise
    finally:
        if subsegment is not None:
            if exception:
                subsegment.add_exception(exception, stack)
            xray_recorder.end_subsegment()
With capture() available as a context manager, that would be as simple as:
def some_function():
    print("code without custom subsegment")
    with xray_recorder.capture("my_subsegment"):
        print("code to trace as custom subsegment")
As a workaround we currently use the following custom context manager, but of course it would be nicer if AWSXRayRecorder.capture() supported this directly:
class XRayCaptureContextManager:
    def __init__(self, name):
        self.name = name
        self.subsegment = None

    def __enter__(self):
        self.subsegment = xray_recorder.begin_subsegment(self.name)

    def __exit__(self, exc_type, exc_val, exc_tb):
        if self.subsegment is not None:
            if exc_val:
                self.subsegment.add_exception(exc_val, traceback.extract_tb(exc_tb))
            xray_recorder.end_subsegment()
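The same workaround can also be written compactly with contextlib.contextmanager. This sketch uses a minimal stub recorder so it is self-contained and runnable; with the real SDK you would call xray_recorder instead:

```python
from contextlib import contextmanager

class StubRecorder:
    """Minimal stand-in for xray_recorder, just to show the pattern."""
    def __init__(self):
        self.events = []
    def begin_subsegment(self, name):
        self.events.append(('begin', name))
        return object()
    def end_subsegment(self):
        self.events.append(('end',))

recorder = StubRecorder()

@contextmanager
def capture(name):
    subsegment = recorder.begin_subsegment(name)
    try:
        yield subsegment
    finally:
        # Runs whether or not the body raised, so the
        # subsegment is always closed.
        if subsegment is not None:
            recorder.end_subsegment()

with capture('my_subsegment'):
    pass
assert recorder.events == [('begin', 'my_subsegment'), ('end',)]
```

The finally clause guarantees end_subsegment() runs even when the body raises, which is exactly the invariant that keeps nested branches streaming to X-Ray.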
I have the following requirements:
Django==1.11.6
boto3==1.4.7
django-storages==1.6.5
aws-xray-sdk==0.93
I want to store media files in an S3 bucket. To trace that I do:
if 'aws_xray_sdk.ext.django' in settings.INSTALLED_APPS:
    from aws_xray_sdk.core import patch_all
    patch_all()
After that, when I try to upload file to my site, a SegmentNotFoundException is raised.
If I do:
if 'aws_xray_sdk.ext.django' in settings.INSTALLED_APPS:
    from aws_xray_sdk.core import patch
    patch(['requests'])
then all is good, but, obviously, there are no traces for boto3.
The xray middleware is in the list of middleware classes, and the xray app is in the INSTALLED_APPS list.
I have a Lambda that uses patch() with active tracing; boto3 and the requests library are what gets patched. After the Lambda completes, the X-Ray timeline says the function is "Pending" with a duration of 2.2 minutes. When I remove the patch call and stop importing the SDK, the duration is 3.7 minutes (in the X-Ray console). When patch(boto3) is included, the first 30-40 boto3 client API calls appear in the raw data (StartQueryExecution, GetDatabases). I turned on debug-level logging for the Lambda function. There is a line "streaming subsegments", then "sending:
{
    "format": "json",
    "version": 1
    ...
    "aws": {
        "operation": "StartQueryExecution",
        "region": "us-west-2",
        ...
    }
}"
But let's say on the 41st client function call the logs say the same thing ("streaming subsegments", "sending: {...}"), yet what is in the "sending: {...}" line does not appear in the raw data view in X-Ray. So at some point the later subsegments are no longer transmitted to the trace.
The X-Ray SDK for Python doesn't seem to properly trace S3 pre-signed URL and pre-signed POST operations. Check out the following example:
import boto3
from aws_xray_sdk.core import patch as xray_patch

def handler(event, context):
    xray_patch(['boto3'])
    client = boto3.client('s3')
    client.list_buckets()
    client.generate_presigned_url(ClientMethod='get_object',
                                  Params={'Bucket': 'my-bucket',
                                          'Key': 'foo'})
    client.generate_presigned_post(Bucket='my-bucket', Key='bar')
    client.list_buckets()
I would like to get the actual queries being executed, along with the time it took to execute them.
Right now the most informative text I can see is sqlalchemy.orm.query.one (indicating that I performed a one() query).
Is there any way to get the actual SQL query data, or the queries themselves?