twisted / treq Goto Github PK

View Code? Open in Web Editor NEW

583.0 27.0 137.0 873 KB

Python requests like API built on top of Twisted's HTTP client.

License: Other

Python 100.00%

python twisted http

treq's Introduction

treq: High-level Twisted HTTP Client API

_ _ _ _

treq is an HTTP library inspired by requests but written on top of Twisted's Agents.

It provides a simple, higher level API for making HTTP requests when using Twisted.

>>> import treq

>>> def done(response):
...     print(response.code)
...     reactor.stop()

>>> treq.get("https://github.com").addCallback(done)

>>> from twisted.internet import reactor
>>> reactor.run()
200

For more info read the docs.

Contributing

treq development is hosted on GitHub.

We welcome contributions: feel to fork and send contributions over. See CONTRIBUTING.rst for more info.

Code of Conduct

Refer to the Twisted code of conduct.

Copyright and License

treq is made available under the MIT license. See LICENSE for legal details and copyright notices.

treq's People

Contributors

Stargazers

Watchers

Forkers

julian pythonmobile gurteshwar jml tomprince jonathanj jerith klizhentas fateiswar itsjustpiper dpnova cyroxx qq40660 alex kouk ralphm dstufft webknjaz jimrollenhagen hawkowl scout24 rakiru koodaamo bmuller lieryan cdunklau pombreda milesforks jrossi cwaldbieser ldanielburr ksmaheshkumar infin8 is00hcw rafallo mksh darcyg lextoumbourou fxiii jsandovalc saghul cyli opalmer hayd glyph mithrandi shuyunqi zloy531 habnabit far lapki simudream dikoufu weieast larsx2 junaidloonat jmyles rebrec jayh5 ii0 abhishesh antisvin jameshilliard adrcad markrwilliams twm shyba bizzbyster pawelmhm gaobo07 cutso dmurvihill quorumus adetokunbo stethd jsfs2019 jtbandes yangkf1985 darjeeling a917464280 jamesfunnk akaptur awesome-python andys8 felixonmars xp15417788 zhukovalexander alephc qtlab admjls nischithak rantideb simonsunvip vraoresearch jvgutierrez zhaoshiling1017 damjanovl raphapassini chevah dalavancloud

treq's Issues

testtools matching utility for requests

So... I wrote this thing, not sure where to put it:

class RequestSequence(treq_RequestSequence):
    @contextmanager
    def consume(self, sync_failure_reporter):
        yield
        if not self.consumed():
            sync_failure_reporter("\n".join(
                ["Not all expected requests were made.  Still expecting:"] +
                ["- {0!r})".format(e) for e, _ in self._sequence]))

    def __call__(self, method, url, params, headers, data):
        """
        :return: the next response in the sequence, provided that the
            parameters match the next in the sequence.
        """
        req = (method, url, params, headers, data)
        if len(self._sequence) == 0:
            self._async_reporter(
                None, Never(),
                "No more requests expected, but request {0!r} made.".format(
                    req))
            return (500, {}, "StubbingError")
        matcher, response = self._sequence[0]
        self._async_reporter(req, matcher)
        self._sequence = self._sequence[1:]
        return response

You use it by passing in self.expectThat as the "async reporter" (I kinda overloaded this to mean something else), and by passing in a matcher instead of a 5-tuple to match against the request.

You might construct the matcher something like this:

MatchesListwise([
    Equals(b'POST'),
    Equals(u'https://example.org/'),
    ContainsDict({b'foo': Equals(b'bar')}),
    Always(),
    Equals(b'')])

Because you have the full power of testtools matchers, you don't need to resort to tricks like HasHeaders or mock.ANY to do something other than an exact equality match.

This is probably only of interest to people using testtools, though.

RTDocs claim Release version, but docs are clearly built against newer code

I was excited by http://treq.readthedocs.org/en/latest/api.html#cookies and tried to use it, but Python tells me treq has no cookies attribute. I was confused why it wasn't working with the 0.2.1 release, since the docs claim here http://treq.readthedocs.org/en/latest/index.html that the version is "Release 0.2.1".

Please add at least the latest release docs to RTD along with the docs built off master.

Please release a new version to pypi for Cookies support.

The version on Pypi's documentation on RTD explains the Cookie usage but support for it is missing in version 0.2.1 and when a cookies kwarg is passed along with a request it doesn't not raise an error making it difficult to pinpoint what's going wrong.

Include an in memory implementation of the treq API.

Also document how to write testable software that uses treq.

The gist of writing testable code that uses treq is dependency injection.

Pass the treq module or an instance of a thing that conforms to treq's interface, to code that
needs to use treq.

thingy = ThingThatWantsTreq(treq)

in tests:

stub_treq = StubTreq()
thingy = ThingThatWantsTreq(stub_treq)

StubTreq should of course include an API for inspecting the requests that were made and programming the responses to return.

pypi release doesn't have multipart goodness

If possible please release a new version to pypi?

Authorization wrapper should be added before any redirect wrappers

treq.client adds redirect wrappers before it adds the auth wrapper. This will break authorization implementations that need to go through a series of request-response cycles, like HTTP Digest authorization.

Move the default agent behavior to HTTPClient

If you have an object which wishes to make HTTP requests but allow for injecting an alternate object (for testing or customization purposes, say injecting a StubTreq), HTTPClient is seemingly the right place to do so.

It would be nice if the treq behavior to automatically use the global agent was moved to this layer rather than up in treq.api, so that one could default to just HTTPClient() and not need to poke at additional treq internals.

Specifically, it would be nice I think if https://github.com/twisted/treq/blob/master/treq/api.py#L114-L122 was part of HTTPClient's defaults.

(And for anyone finding this, in the meantime, what I think is the thing to do is to instead have objects get treq.request injected into them rather than HTTPClient).

Timeout whilst getting response's body

Hi,

I've started using treq on unstable network and I realized that there is no way to define timeout while getting response content. So sometimes I can't process request as it is hanged.

My idea to solve that is to pass request_kwargs to _Response and _BufferedResponse and collect function should look like:

def collect(response, request_kwargs, collector):
    """
    Incrementally collect the body of the response.

    This function may only be called **once** for a given response.

    :param IResponse response: The HTTP response to collect the body from.
    :param collector: A callable to be called each time data is available
        from the response body.
    :type collector: single argument callable

    :rtype: Deferred that fires with None when the entire body has been read.
    """
    if response.length == 0:
        return succeed(None)

    d = Deferred()
    response.deliverBody(_BodyCollector(d, collector))
    if request_kwargs.get('timeout'):
        delayedCall = default_reactor(request_kwargs.get('reactor')).callLater(
            request_kwargs.get('timeout'), d.cancel)

        def gotResult(result):
            if delayedCall.active():
                delayedCall.cancel()
            return result

        d.addBoth(gotResult)
    return d

Guys, can you confirm that issue makes sense?
If yes, I'll prepare more proper fix with tests.

Cheers!

Support PyPy 2.0

I've manually verified that PyOpenSSL doesn't segfault on PyPy 2.0.

But travis doesn't support it yet:
travis-ci/travis-ci#1106

StringStubbingResource docstring is wrong

It says:

        :param dict headers: A dictionary of headers mapping header keys to
            a list of header values (sorted alphabetically)

While they should be lists, the code actually requires them to be single values, since the code calls setHeader. Using a list will result in the header field having a value of the repr of the list.

Requests with Accept-Encoding: gzip return GzipDecoder instead of t.w._newclient.Response

When a web page returns gzipped content, then instead of t.w._newclient.Response, the treq.get returns twisted.web.client.GzipDecoder object. and you can't call things like .status_code on GzipDecoder. Here's a minimalistic bug reproduction.

from twisted.internet.task import react
import treq

def echo(a):
    print a

def main(reactor, *args):
    d = treq.get('http://httpbin.org/get')
    d.addCallback(echo)
    dd=d.addCallback(lambda x: treq.get('http://pastebin.com/archive'))
    dd.addCallback(echo)
    return d

react(main, [])

konrads@mint-dev ~ $ python basic_get.py 
<twisted.web._newclient.Response object at 0x135e750>
<twisted.web.client.GzipDecoder object at 0x135ed10>

[Build Failed]

It is displayed build failed on travis-ci
https://travis-ci.org/dreid/treq

License update.

Can you please add an MIT License if possible to this code :). I would like to work on it as well. I did run the tests and the tests pass for me.

Port to Python 3

Is there a plan to support Python 3?

Tag a release / pop some champagne

Hey!

Any chance a release is in the cards?

Would love to have one with browser_like_redirects.

TIA

Twisted relies on OpenSSL to configure trust roots, resulting in an empty trust store by default on Windows and non-functional HTTPS in treq

Not sure if this is my setup, but I've just done a fresh install of Python 2.7.10 x64 and installed Twisted and Treq (and the required libs) through pip, using MingW to compile where appropriate.

Let's say I'm attempting to access https://google.com:

url = "https://google.com"

treq.get(url).addCallback(callback).addErrback(errback)

My errback gets called with an error of type ResponseNeverReceived, and iterating over error.value.reasons gives me the following Failure:

Failure: [Failure instance: Traceback: <class 'OpenSSL.SSL.Error'>: [('SSL routines', 'ssl3_get_server_certificate', 'certificate verify failed')]
C:\Python27\lib\site-packages\twisted\internet\selectreactor.py:149:_doReadOrWrite
C:\Python27\lib\site-packages\twisted\internet\tcp.py:209:doRead
C:\Python27\lib\site-packages\twisted\internet\tcp.py:215:_dataReceived
C:\Python27\lib\site-packages\twisted\protocols\tls.py:415:dataReceived
--- <exception caught here> ---
C:\Python27\lib\site-packages\twisted\protocols\tls.py:554:_write
C:\Python27\lib\site-packages\OpenSSL\SSL.py:1271:send
C:\Python27\lib\site-packages\OpenSSL\SSL.py:1187:_raise_ssl_error
C:\Python27\lib\site-packages\OpenSSL\_util.py:48:exception_from_error_queue
]

I've tried a bunch of stuff, including trusting the cacerts.pem provided by cURL in my OS's trust store, installing idna and service_identity and reinstalling OpenSSL. Any ideas on this?

System info:

Windows 10 Insider Preview x64 (Build 10130)
Python 2.7.10 x64
Twisted 15.2.1
Treq 15.0.0

API for checking if response code indicated success

Often the question about a response is not "what is the exact response code", but simply "did it succeed". It would be nice if treq could answer this question.

Issue Uploading File

Hello

I found a issue using i386 arch.

I am using this box https://atlas.hashicorp.com/ubuntu/boxes/trusty32.

I am getting length value as 'long' type this is the error log

      File "venv/local/lib/python2.7/site-packages/treq/api.py", line 31, in post
    return _client(**kwargs).post(url, data=data, **kwargs)
  File "venv/local/lib/python2.7/site-packages/treq/client.py", line 103, in post
    return self.request('POST', url, data=data, **kwargs)
  File "venv/local/lib/python2.7/site-packages/treq/client.py", line 156, in request
    data + files, boundary=boundary)
  File "venv/local/lib/python2.7/site-packages/treq/multipart.py", line 61, in __init__
    self.length = self._calculateLength()
  File "venv/local/lib/python2.7/site-packages/treq/multipart.py", line 124, in _calculateLength
    for i in self._writeLoop(consumer):
  File "venv/local/lib/python2.7/site-packages/treq/multipart.py", line 162, in _writeLoop
    yield self._writeField(name, value, consumer)
  File "venv/local/lib/python2.7/site-packages/treq/multipart.py", line 172, in _writeField
    name, filename, content_type, producer, consumer)
  File "venv/local/lib/python2.7/site-packages/treq/multipart.py", line 197, in _writeFile
    consumer.write(producer.length)
  File "venv/local/lib/python2.7/site-packages/treq/multipart.py", line 297, in write
    self.length += len(value)
  exceptions.TypeError: object of type 'long' has no len()

drop twisted < 14.0

New Treq should really just mean new Twisted. Since HTTPS didn't really work correctly until 14.0 anyway, it seems like we should stop having such a huge build matrix for old versions of Twisted. We should also draw up a plan for what versions of Twisted are supported.

Tie pypi Releases to Github Releases

When looking at a release on pypi, it is difficult to see what has changed in that release, or even if the current Github content has been released. It would be great to see releases on pypi match releases on Github.

treq.collect(...) hangs indefinitely (never calls collect(None))

I don't know if I'm using this correctly, but it seems treq.collect(...) doesn't call collect(None) when the input is finished. Here is a simple test script which illustrates this:

from __future__ import absolute_import, division, print_function, unicode_literals

import sys
import treq

def cbCollect(a_str):
    if a_str is None:
        # Never called!
        print('FINISHED')
        sys.exit()

    print('GOT: ' + repr(a_str[:3] + ' ... ' + a_str[-3:]))

def cbGet(a_response):
    d = treq.collect(a_response, cbCollect)
    d.addErrback(ebDebug)

def ebDebug(*a_args, **a_kwargs):
    local = dict(globals())
    local.update(locals())
    import code
    code.interact(local = local)

def test():
    d = treq.get(b'http://google.com/')
    d.addCallback(cbGet)
    d.addErrback(ebDebug)

if __name__ == '__main__':
    import twisted.internet.reactor
    twisted.internet.reactor.callWhenRunning(test)
    print(sys.version)
    twisted.internet.reactor.run()

When I run this (as test.py), I get:

% python test.py
2.7.6 (default, Nov 19 2013, 15:51:54)
[GCC 4.2.1 Compatible Apple Clang 3.0 (tags/Apple/clang-211.10.1)]
GOT: u'<!d ... ati'
GOT: u'on. ... l,k'
GOT: u'){v ... rip'
GOT: u't>< ... ilt'
GOT: u'er: ... x}#'
GOT: u'gbz ... g:5'
GOT: u'px  ... #f5'
GOT: u'f5f ... ace'
GOT: u':no ... us:'
GOT: u'foc ... -li'
GOT: u'nea ... m(#'
GOT: u'fff ... om:'
GOT: u'1}. ... .2)'
GOT: u',rg ... int'
GOT: u'er; ... =fu'
GOT: u'nct ... 8d1'
GOT: u'996 ... a[b'
GOT: u'].a ... men'
GOT: u'tBy ... unc'
GOT: u'tio ... n()'
GOT: u'{re ... oad'
GOT: u'Tim ... 2E"'
GOT: u').r ... ,b,'
GOT: u'c){ ... oad'
GOT: u'(a. ... rse'
GOT: u'Int ... ;b='
GOT: u'a.h ... ass'
GOT: u'=gb ... m i'
GOT: u'd=g ... b=w'
GOT: u'e"> ... v c'
GOT: u'las ... las'
GOT: u's=" ... gle'
GOT: u'.y) ... a.d'
GOT: u'eta ... ml>'
^C%

Note at the end, "FINISHED" is never printed, and the reactor just hangs until it receives a SIG_INT (CTRL-C).

This might be user error, in which case I apologize for any inconvenience.

Cookies top-level function does not exist, but shows up in docs

http://treq.readthedocs.org/en/latest/howto.html#cookies

The example here does not work on current master, since treq does not have "cookies" as a top level name. The closest thing I found was the treq.response._Response.cookies method.

Reactor unclear on a twisted unit test when using content()

Hello

I am on twisted 15 on windows with latest version of treq and I have the following issue:

when, in a unit test, I execute treq.content(response), everything work perfecly until the end of the trial unit test. reponse is the response of treq.get(). It issues a Reactor unclear exception (delayed response). do you have seen such issue? Only occurs when retrieving the content, if I don't do the treq.content(), there is not problem with the reactor.

thanks

Errbacks fire on success when inside CooperativeTask loop

I am running a twisted beanstalkd client such as the example given in pybeanstalkd repo. The only difference is that in the executor callback I am just additionally making a normal treq POST request that registers both callback and errback.

Everything works smooth except that almost all (670 out of 685 or so I tested) successful requests also fire the errback, with the following error:

Failure instance: Traceback (failure with no frames): 
<class 'twisted.web._newclient.RequestTransmissionFailed'>:
 [<twisted.python.failure.Failure <class 'twisted.internet.error.ConnectionDone'>>

Here's further inspection of failure.value.reason:

Failure instance: Traceback (failure with no frames): <class 'twisted.internet.error.ConnectionDone'>: Connection was closed cleanly.

Add support for an `auth` callable ala requests

This would be A+ great!

Incorrect behavior when using timeout parameter

Hello,
I've recently ran into an issue when trying to handle request timeouts in my code.

As I understand from the documentation, when the timeout fires, the request should fail raising a CancelledError. However, I could not reliably catch it.

Below there's a minimal example script to illustrate the problem:

import sys

import treq
from twisted.internet.defer import CancelledError, inlineCallbacks
from twisted.internet.task import react

# request to the URL below takes about 5 seconds to complete:
TEST_URL = 'http://httpbin.org/drip?duration=5'
TIMEOUT = float(sys.argv[-1])


@inlineCallbacks
def main(_):
    try:
        response = yield treq.get(TEST_URL, timeout=TIMEOUT)
        print response.code
    except CancelledError:
        print 'timeout'

react(main)

As stated in the comment, the request to the test URL requires about 5 seconds to complete. Provided that the timeout is long enough (say, 10 seconds), the request indeed completes correctly:

$ python example.py 10
200

In case the timeout is too short, I'd expect the script to print 'timeout'. Instead it raises an unhandled failure:

$ python example.py 4
main function encountered error
Traceback (most recent call last):
Failure: twisted.web._newclient.ResponseNeverReceived: [<twisted.python.failure.Failure twisted.internet.defer.CancelledError: >]

Make the timeout even shorter, so that the request is canceled even before establishing the connection, and it will raise different error instead:

$ python example.py 0.01
main function encountered error
Traceback (most recent call last):
Failure: twisted.internet.error.ConnectingCancelledError: IPv4Address(TCP, 'httpbin.org', 80)

Again, I believe that in both these cases the excepted behavior is that treq.get raises a CancelledError in deferred, which should be handled by the script above. Am I correct?

Tested with treq 15.0.0 and Twisted 15.4.0.

print_response example in docs

Great library, I am looking forward to using it.

It took me a bit longer to get it working because I had to figure out what handler like print_repsonse might look like. I know there is a fair amount of variation, but a default example like this one from the tests:

https://github.com/dreid/treq/blob/master/treq/test/test_treq_integration.py#L33

Might be handy to help people get started.

I can write patch if this is a desirable change, or I may be missing something.

Cookies not set during redirects

I am using treq version = '0.2.1'.

I am trying to make a GET request to a site that redirects to the help docs. The help docs site has separate anonymous and logged-in views, so it redirects to an authentication page that either authenticates the requestor based on a cookie or redirects back to the help docs. In all the browsers I use, this behavior seems to work fine, because the help-docs site won't keep redirecting once you have a session. treq doesn't seem to be putting anything in the cookiejar, though, so I get a twisted.web.error.InfiniteRedirection exception.

The sites are public facing . Here is the example code:

import cookielib
from twisted.internet.task import react
import treq


def main(reactor, *args):
    #jar = {}
    jar = cookielib.CookieJar()
    method = 'GET'
    url = 'https://password.lafayette.edu/'
    options = dict(allow_redirects=True, cookies=jar)
    d = treq.request(method, url, **options)

    def logit(resp):
        jar = resp.cookies()

        print "Code:", resp.code
        print
        print 'Cookies:', jar
        print
        print "Response history:", resp.history()
        print
        print "Headers:", resp.headers
        print

        return resp


    d.addCallback(logit)
    d.addCallback(treq.content)

    return d

react(main, [])

Also, if I set allow_redirects to False, the code that attempts to get the cookies (taken from the readthedocs site), fails and says that resp doesn't have a "cookies" attribute.

Thanks,
Carl

Document treq's default behavior to buffer responses

treq's documentation is devoid of explanation of the unbuffered parameter.

https://www.google.com/search?q=site%3Atreq.readthedocs.org+%22buffer%22&oq=site%3Atreq.readthedocs.org+%22unbuffered%22

No comments, either.

$ grep -riI "unbuffered" treq/
./client.py:        if not kwargs.get('unbuffered', False):
./test/test_client.py:        d = self.client.get('http://www.example.com', unbuffered=True)

The behavior to buffer responses by default differs from t.w.c.Agent, which doesn't. This behavior is unexpected. It should be documented.

Where should it be documented? I'll submit a pull request.

Include documentation into source packages

Please, consider adding to MANIFEST.in the following code to include Sphinx documentation into the sdist package building.

recursive-include docs *
prune docs/build

Thanks,
Jordi

Handle Internationalized Domain Names

https://en.wikipedia.org/wiki/Internationalized_domain_name

We really want to use treq with Twisted for making requests, but we are afraid that our code will not work if/when we encounter internationalized urls.

treq hangs if response is not consumed

I am using treq in a project where requests are made to a second server and I am only interested in the actual response code-- not the response body. the call to treq.get or treq.post is in my deferred chain, but the next callback in the sequence just discards the result, assuming if there was a problem, the errBack will handle it.

This seemed to work for the first request, but on subsequent requests made, treq logs it made the request and the response status is 200, but it just hangs unless I specify a timeout.

I found that if I placed a treq.content after the request in the deferred chain, the issue goes away.

Is this the expected behavior? I couldn't find anywhere in the docs that suggested you must read the response body or otherwise complete the request in some way.

Thanks,
Carl

Support multipart/mixed POST

There are some API services out there in the wild that require a content-type of "multipart/mixed", so that a POST may contain, for example, a JSON payload and a binary image.

treq inspects the kwargs for a POST, and if it finds "data", it sets the content-type to "application/x-www-form-urlencoded"; if it finds "files", it sets the content-type to "multipart/form-data". In order to support "multipart/mixed", treq could probably add a conditional check for "data and files", which if true would generate the appropriate request header and body.

Treq / requests GET params behaviour difference on None values

While trying to port some code from requests to treq, I've encountered a weird compatiblity issue when a GET param has a None value. When using requests a None value is not sent. Meaning that

requests.get("http://server.example.com/", params={"foo": None})

actually sends a GET / HTTP/1.1, while the same kind of code with treq

treq.get("http://server.example.com/", params={"foo": None})

sends GET /?foo=None HTTP/1.1

Looking at the requests code, the "params" item is processed through the requests.sessions.merge_setting function, which removes None values.

I'd like to think treq should have the same behaviour.

treq unable to determine the length of a response

When using treq i've noticed that often times it is unable to get the length of the response.

Reproducible Using:

import requests
import treq


def done(response):
    print(response.length)
    reactor.stop()

treq.get("https://travis-ci.org/pypa/warehouse.png?branch=master").addCallback(done)

from twisted.internet import reactor
reactor.run()

print(requests.get("https://travis-ci.org/pypa/warehouse.png?branch=master").headers["Content-Length"])

Output:

twisted.web.iweb.UNKNOWN_LENGTH
1492

Curl:

$ curl https://travis-ci.org/pypa/warehouse.png\?branch\=master --location -I
HTTP/1.1 301 Moved Permanently
Content-length: 0
Content-Type: text/html;charset=utf-8
Location: https://api.travis-ci.org/pypa/warehouse.png?branch=master
Connection: keep-alive

HTTP/1.1 200 OK
Access-Control-Allow-Credentials: true
Access-Control-Allow-Origin: *
Access-Control-Expose-Headers: Content-Type, Cache-Control, Expires, Etag, Last-Modified
Age: 0
Cache-Control: no-cache
Content-Disposition: inline; filename="passing.png"
Content-length: 1461
Content-Type: image/png
Date: Mon, 03 Mar 2014 22:35:18 GMT
Etag: "33e721b0e117a07064572eb8537344a6"
Expires: Mon, 03 Mar 2014 22:35:17 GMT
Last-Modified: Mon, 03 Mar 2014 06:27:20 GMT
Pragma: no-cache
Server: nginx/1.5.7
Status: 200 OK
Strict-Transport-Security: max-age=31536000
Vary: Accept,Accept-Encoding
X-Accepted-Oauth-Scopes: public
X-Content-Digest: aba9e7b121a52e3fdbbfd0b060dba6a3bbcf1bed
X-Endpoint: Travis::Api::App::Endpoint::Repos
X-Oauth-Scopes: public
X-Pattern: /:owner_name/:name
X-Rack-Cache: miss, store
Connection: keep-alive

Automatically log requests and response codes

The project I'm currently working on wraps treq so that it logs requests made.

Now that the new twisted logger is in place, it seems to be a useful thing to automatically include in treq as well, logging at either the INFO or DEBUG level.

I was thinking of implementing this in the following manner:

Generate a request ID per request, and log the request ID, the method, the url, the headers, the data (if included), and whether or not files were included, before actually issuing the request.
Once a response is received, log the request ID, the headers, and the response code.
If an error occurs while making the request, log the request ID and the error (not at the ERROR level - just the same log level everything else is logged).

But came across the following caveats:

I'd like to be able to pass in extra parameters to include in the log event, which may be useful in tracing why a request was made. For instance, if this request was made in response to some event, then include the event ID, which we may not necessarily want to include in the request headers.
Logging the request body, or even the headers or URL, might be a security risk, since maybe the request body contains password information if you are making an auth request. Maybe the headers contain API keys. Maybe the URL contains API keys or capability hooks you don't want logged.

I'm not sure how to solve the first - it seems like it'd require treq either taking a logging callable with kwargs already bound to it, or taking an extra parameter containing extra logging kwargs.

And as for the second - it seems like there need to be more fine-grained filters on what kind of information can be logged? Maybe not logging request bodies or headers at all, or only logging some? Maybe not providing logging at all for some URLs... Not sure what this should look like.

A Content-Length header is added even if one is provided

This results in an invalid header value of 10,10 or something.

treq should either replace the provided value or notice that it's there and use it (possibly throwing an exception if it's wrong) but it shouldn't send an invalid value with the request.

"Handling Streaming Responses" buffers the file in memory

Hi,

The following example on http://treq.readthedocs.org/en/latest/howto.html:

def download_file(reactor, url, destination_filename):
    destination = file(destination_filename, 'w')
    d = treq.get(url)
    d.addCallback(treq.collect, destination.write)
    d.addBoth(lambda _: destination.close())
    return d

buffers the whole file in memory, making the process use 5+ GiB when downloading a 4 GiB file.
Is this intended?

Importing treq within a twistd plugin results in "reactor already installed"

Simplest example:

/Users/Foo/SomeProject/twisted/plugins/foo_plugin.py

import treq

def makeService(options):
    pass

The output, when running with twistd:

bash-4.3$ twistd --nodaemon --reactor=kqueue --version

[Options output elided]

/Users/Foo/virtualenvs/SomeProject/bin/twistd: The specified reactor cannot be used, failed with error: reactor already installed

This looks to me like a problem with treq importing names from twisted.web.client, which imports the reactor at module level.

Missing a Dict-like Headers Interface

Unless I'm missing it there's no dictlike interface to headers in treq either. At best it appears there is a twisted Headers class which requires me to do something like response.headers.getRawHeaders("My-Header")[0] which will blow up with TypeError: 'NoneType' object is not subscriptable.

Ideally this would be something that supports plain old dictionary access and returns a single str as it's value. If there are multiple items in the list then it does ",".join(that_list) which would be in line with RFC2616. This makes the common case of a single value way simpler, gives a more reasonable error on failure (KeyError) and still makes getting a list pretty simple (headers["MyHead"].split(",")).

At the very least this should be documented as something that treq doesn't have.

[Question] Keep-Alive & Connection Pooling isn't working

I wirite a very simple spider program to fetch webpages from single site.

Here is a minimized version https://gist.github.com/Preffer/dad9b1228fcd75cebd75

For a few requests, like 100, it works fine. But for massive requests, it will failed.

I expect all of the requests(around 3000) will be automatically pooled, scheduled and pipelined, just as described in treq's feature list.

But it doesn't, the connections are not keep-alive nor pooled.

In this programm, it did establish the connections incrementally, but the connections didn't pooled, each connecction will close after body received, and later requests never wait in the pool for an available connecction.

So it will take thousands of sockets, and finally failed due to timeout, because the remote server has a connection timeout set to 30s. Thousands of requests can't be done within 30s.

Could you please give me some help on this?

treq blows up if you give it a unicode url

If you pass a unicode url into treq it will blow up with an exception:

2014-03-03 11:57:29-0500 [HTTPChannel,0,127.0.0.1] Unhandled Error
    Traceback (most recent call last):
      File "/Users/dstufft/.virtualenvs/smuggler/lib/python2.7/site-packages/twisted/web/server.py", line 238, in render
        body = resrc.render(self)
      File "/Users/dstufft/.virtualenvs/smuggler/lib/python2.7/site-packages/klein/resource.py", line 111, in render
        d = defer.maybeDeferred(_execute)
      File "/Users/dstufft/.virtualenvs/smuggler/lib/python2.7/site-packages/twisted/internet/defer.py", line 139, in maybeDeferred
        result = f(*args, **kw)
      File "/Users/dstufft/.virtualenvs/smuggler/lib/python2.7/site-packages/klein/resource.py", line 105, in _execute
        **kwargs)
    --- <exception caught here> ---
      File "/Users/dstufft/.virtualenvs/smuggler/lib/python2.7/site-packages/twisted/internet/defer.py", line 139, in maybeDeferred
        result = f(*args, **kw)
      File "/Users/dstufft/.virtualenvs/smuggler/lib/python2.7/site-packages/klein/app.py", line 87, in execute_endpoint
        return endpoint_f(self._instance, *args, **kwargs)
      File "/Users/dstufft/.virtualenvs/smuggler/lib/python2.7/site-packages/klein/app.py", line 172, in _f
        return _call(instance, f, request, *a, **kw)
      File "/Users/dstufft/.virtualenvs/smuggler/lib/python2.7/site-packages/klein/app.py", line 27, in _call
        return f(*args, **kwargs)
      File "/Users/dstufft/projects/smuggler/smuggler.py", line 15, in hello
        d = treq.get(url)
      File "/Users/dstufft/.virtualenvs/smuggler/lib/python2.7/site-packages/treq/api.py", line 20, in get
        return _client(**kwargs).get(url, headers=headers, **kwargs)
      File "/Users/dstufft/.virtualenvs/smuggler/lib/python2.7/site-packages/treq/client.py", line 79, in get
        return self.request('GET', url, **kwargs)
      File "/Users/dstufft/.virtualenvs/smuggler/lib/python2.7/site-packages/treq/client.py", line 129, in request
        method, url, headers=headers, bodyProducer=bodyProducer)
      File "/Users/dstufft/.virtualenvs/smuggler/lib/python2.7/site-packages/twisted/web/client.py", line 1584, in request
        deferred = self._agent.request(method, uri, headers, bodyProducer)
      File "/Users/dstufft/.virtualenvs/smuggler/lib/python2.7/site-packages/twisted/web/client.py", line 1652, in request
        deferred = self._agent.request(method, uri, headers, bodyProducer)
      File "/Users/dstufft/.virtualenvs/smuggler/lib/python2.7/site-packages/twisted/web/client.py", line 1282, in request
        parsedURI = _URI.fromBytes(uri)
      File "/Users/dstufft/.virtualenvs/smuggler/lib/python2.7/site-packages/twisted/web/client.py", line 598, in fromBytes
        scheme, netloc, path, params, query, fragment = http.urlparse(uri)
      File "/Users/dstufft/.virtualenvs/smuggler/lib/python2.7/site-packages/twisted/web/http.py", line 161, in urlparse
        raise TypeError("url must be bytes, not unicode")
    exceptions.TypeError: url must be bytes, not unicode

This should mimic what requests does, which is encode the host with idna, and then encode everything with utf8. This will work in the vast bulk of situations, and in situations it won't people can still pass in bytes.

The requests code is: https://github.com/kennethreitz/requests/blob/0caa2432123bab2d991e635ce558226d019d7bc7/requests/models.py#L352 and https://github.com/kennethreitz/requests/blob/0caa2432123bab2d991e635ce558226d019d7bc7/requests/models.py#L369-L378.

Missing content in a few sections on the Read The Docs page

The Making Requests and Accessing Content sections in the documentation are empty. Looking at the sources, it's supposed to auto generate the doc strings from a number of methods but for whatever reason, the content isn't being generated. Is it a misconfigured Sphinx conf.py?

https://treq.readthedocs.io/en/latest/api.html
https://github.com/twisted/treq/blob/master/docs/api.rst

Feature Request: verify=boolean

Like the requests argument. Allow you to easily check or not check the SSL cert.

This is especially helpful for dev environments where you use a self-signed cert.
I realize with Twisted 14+ and service_identity the certs are verified by default, but a way to disable the cert checking would be helpful in development scenarios.

Thanks,
Carl

Read the Docs has old docs

The docs I find at http://treq.readthedocs.org/en/latest/ (linked from the readme here and PyPI) appear to be for treq 0.2.1. I want to learn about stubbing but this makes it hard. 😭

very big passwords break basic auth

In treq/auth.py a call is made to str.encode('base64') in order to create the HTTP Basic Auth header value. However when the password is sufficiently long the base64 codec in python produces a multiline string. The code as is only strips the last newline and so authentication fails. A fix would be to use the base64 module. I will prepare a pull request.

twisted memory leak

hi,
i am new to learn treq lib.
when i practice the provided examples( basic_get.py ), i change the code like this:

@inlineCallbacks
def request_done(resp):
       text = yield treq.content(resp)

  def main(reactor, *args):
         while True:
              time.sleep(0.01)
              d = treq.get('http://www.google.com')
              d.addCallback(request_done)
         return d

react(main, [])

when i use the # top command to observe the %Mem, i find it will increase gradually;
so, i would like to ask for such a reason and the way to avoid this case.

thanks very much

Allow passing custom contextFactory for client authentication etc

I would like to use treq with a setup requiring a custom context factory. My specific use case now is talking to an endpoint with needs client authentication / MutualTLS:

http://twisted.readthedocs.org/en/latest/core/howto/ssl.html#client-authentication

I think this could be enabled by passing an optional contextFactory keyword argument (in **kwargs) in treq.api functions down to private _client() api and then pass this (if provided) to Agent's init.

Doesn't allow passing a contextFactory to Agent.

This means that there is no way for a user to make treq do any certificate verification.

Document behavior of treq's pool

Currently using treq in tests leads to reactor unclean errors which unnecessarily confuses people (#86).

Maybe also add a convenience method to close it to spare people this dance.

twisted / treq Goto Github PK

treq's Introduction

treq: High-level Twisted HTTP Client API

Contributing

Code of Conduct

Copyright and License

treq's People

Contributors

Stargazers

Watchers

Forkers

treq's Issues

Recommend Projects

Recommend Topics

Recommend Org