
aiohttp-client-cache's Introduction


Summary

requests-cache is a persistent HTTP cache that provides an easy way to get better performance with the Python requests library.

Complete project documentation can be found at requests-cache.readthedocs.io.

Features

  • ๐Ÿฐ Ease of use: Keep using the requests library you're already familiar with. Add caching with a drop-in replacement for requests.Session, or install globally to add transparent caching to all requests functions.
  • ๐Ÿš€ Performance: Get sub-millisecond response times for cached responses. When they expire, you still save time with conditional requests.
  • ๐Ÿ’พ Persistence: Works with several storage backends including SQLite, Redis, MongoDB, and DynamoDB; or save responses as plain JSON files, YAML, and more
  • ๐Ÿ•— Expiration: Use Cache-Control and other standard HTTP headers, define your own expiration schedule, keep your cache clutter-free with backends that natively support TTL, or any combination of strategies
  • โš™๏ธ Customization: Works out of the box with zero config, but with a robust set of features for configuring and extending the library to suit your needs
  • ๐Ÿงฉ Compatibility: Can be combined with other popular libraries based on requests

Quickstart

First, install with pip:

pip install requests-cache

Then, use requests_cache.CachedSession to make your requests. It behaves like a normal requests.Session, but with caching behavior.

To illustrate, we'll call an endpoint that adds a delay of 1 second, simulating a slow or rate-limited website.

This takes 1 minute:

import requests

session = requests.Session()
for i in range(60):
    session.get('https://httpbin.org/delay/1')

This takes 1 second:

import requests_cache

session = requests_cache.CachedSession('demo_cache')
for i in range(60):
    session.get('https://httpbin.org/delay/1')

With caching, the response will be fetched once, saved to demo_cache.sqlite, and subsequent requests will return the cached response near-instantly.
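You can confirm whether a given response came from the cache via the from_cache attribute that requests-cache adds to responses; a quick check, assuming from_cache behaves as documented:

from requests_cache import CachedSession

session = CachedSession('demo_cache')
response = session.get('https://httpbin.org/delay/1')
print(response.from_cache)  # False: fetched over the network
response = session.get('https://httpbin.org/delay/1')
print(response.from_cache)  # True: served from the cache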

Patching

If you don't want to manage a session object, or just want to quickly test it out in your application without modifying any code, requests-cache can also be installed globally, and all requests will be transparently cached:

import requests
import requests_cache

requests_cache.install_cache('demo_cache')
requests.get('https://httpbin.org/delay/1')
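If you need to bypass or remove the global patching temporarily, requests-cache also provides helpers for that; a minimal sketch, assuming requests_cache.disabled() and uninstall_cache() behave as documented:

import requests
import requests_cache

requests_cache.install_cache('demo_cache')

# Temporarily bypass the cache for a block of requests
with requests_cache.disabled():
    requests.get('https://httpbin.org/delay/1')

# Remove the global patching entirely
requests_cache.uninstall_cache()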

Headers and Expiration

By default, requests-cache will keep cached responses indefinitely. In most cases, you will want to use one of the two following strategies to balance cache freshness and performance:

Define exactly how long to keep responses:

Use the expire_after parameter to set a fixed expiration time for all new responses:

from requests_cache import CachedSession
from datetime import timedelta

# Keep responses for 360 seconds
session = CachedSession('demo_cache', expire_after=360)

# Or use timedelta objects to specify other units of time
session = CachedSession('demo_cache', expire_after=timedelta(hours=1))

See Expiration for more features and settings.

Use Cache-Control headers:

Use the cache_control parameter to enable automatic expiration based on Cache-Control and other standard HTTP headers sent by the server:

from requests_cache import CachedSession

session = CachedSession('demo_cache', cache_control=True)

See Cache Headers for more details.

Settings

The default settings work well for most use cases, but there are plenty of ways to customize caching behavior when needed. Here is a quick example of some of the options available:

from datetime import timedelta
from requests_cache import CachedSession

session = CachedSession(
    'demo_cache',
    use_cache_dir=True,                # Save files in the default user cache dir
    cache_control=True,                # Use Cache-Control response headers for expiration, if available
    expire_after=timedelta(days=1),    # Otherwise expire responses after one day
    allowable_codes=[200, 400],        # Cache 400 responses as a solemn reminder of your failures
    allowable_methods=['GET', 'POST'], # Cache whatever HTTP methods you want
    ignored_parameters=['api_key'],    # Don't match this request param, and redact it from the cache
    match_headers=['Accept-Language'], # Cache a different response per language
    stale_if_error=True,               # In case of request errors, use stale cache data if possible
)

Next Steps

To find out more about what you can do with requests-cache, see the complete project documentation at requests-cache.readthedocs.io.

aiohttp-client-cache's People

Contributors

0hax, aaraney, akursar, dependabot[bot], fajfer, fbergroth, iliastsa, jamim, jophish, jwcook, kirkhansen, layday, mjkanji, netomi, olk-m, patrick-zippenfenig, reclosedev, rozefound, rudcode, saulshanabrook


aiohttp-client-cache's Issues

CachedResponse.headers are case-sensitive and do not support multiple values for the same key

CachedResponse's headers attribute is a regular dict. This makes it case-sensitive, and it does not support multiple values for the same key. The original ClientSession makes headers available through a CIMultiDictProxy, as called out in aiohttp's docs: https://docs.aiohttp.org/en/stable/client_advanced.html#response-headers-and-cookies
Multiple values may be an edge case, but this breaks compatibility for users who call the getall() method on the headers object. The case sensitivity has a greater impact, producing lookup misses when you use a different case for "content-type" or "cache-control".
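A likely direction for a fix is to store headers in a CIMultiDict from the multidict package (the same structure aiohttp uses), which restores both case-insensitive lookups and getall(). A minimal sketch of the desired behavior, not the library's actual implementation:

from multidict import CIMultiDict, CIMultiDictProxy

# Rebuild a case-insensitive, multi-value view from a list of header pairs
headers = CIMultiDictProxy(CIMultiDict([
    ('Content-Type', 'application/json'),
    ('Set-Cookie', 'a=1'),
    ('Set-Cookie', 'b=2'),
]))
print(headers['content-type'])       # Case-insensitive lookup
print(headers.getall('Set-Cookie'))  # All values for a repeated header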

SQLite Backend not working as intended

The problem

Hello, I'm trying to use the SQLite backend to cache API queries. I query one API, parse the results, and make another request with the parsed info. If I use the standard aiohttp.ClientSession(), or a CachedSession with the base cache, everything works. Changing the CachedSession to use the SQLite backend produces errors like Task <Task pending name='Task-3' coro=<task() running at ......

Expected behavior

I expected that changing the backend wouldn't raise those errors. It could be an issue in my own code, but since both the standard ClientSession and the default CachedSession work, I assume this behaviour is not intended.

Steps to reproduce the behavior

I built a minimal non-working example with the Google Books API.

import asyncio
from aiohttp_client_cache import CachedSession, SQLiteBackend
from datetime import timedelta


isbn_list = [
    '9780002005883',
    '9780002238304',
    '9780002261982',
    '9780006163831',
    '9780006178736',
    '9780006280897',
    '9780006280934',
    '9780006353287',
    '9780006380832',
    '9780006470229'
]

# Cache definition
cache = SQLiteBackend(
    cache_name='./.cache/books_cache.sqlite',
    expire_after=timedelta(days=100),
)

GOOGLE_BOOKS_URL = "https://www.googleapis.com/books/v1/volumes?q=isbn:{id}"


def parse_response(response):
    """Extract fields from the API's response"""
    if response:
        item = response.get("items", [{}])[0]
        isbn = item["volumeInfo"]["industryIdentifiers"][0]["identifier"]
        return isbn


async def fetch_api(session, url, format=None):
    try:
        response = await session.get(url)
        if response.ok:
            if format == "json":
                return await response.json()
            else:
                return await response.text()
        else:
            return None
    except Exception as err:
        print(f"An error has occured: {err}")


async def task(session, isbn):
    response = await fetch_api(session, GOOGLE_BOOKS_URL.format(id=isbn), format="json")
    second_id = parse_response(response)
    metadata = await fetch_api(session, GOOGLE_BOOKS_URL.format(id=second_id))
    return metadata


async def main():
    async with CachedSession(cache=cache) as session:
        books_metadata = await asyncio.gather(*(task(session, isbn) for isbn in isbn_list))

if __name__ == "__main__":
    asyncio.run(main(), debug=True)

I'd really appreciate help finding out why this error occurs. I thought of giving other backends a try, since this seems to be specific to the SQLite backend, but I'm hesitant because the other solutions seem to require more initial setup. I'm also wondering if it's possible to have a persistent cache with the Redis backend. I've never used Redis, and I'm curious whether it's worth trying instead of SQLite, since cache persistence is a requirement for my project.

Many thanks!

DELETE ME

EDIT: created this on the wrong repo. 🤦 This is what happens when you keep switching back and forth between two similar projects!

Refactor CacheClient to take a cache backend instance instead of init kwargs

Idea from #16.

Instead of top-level __init__() kwargs for cache settings, I think it would be better for CacheClient to take a cache backend instance. For example, instead of this:

session = CachedSession(backend='sqlite', expire_after=120, include_headers=True)

Do this:

cache = SQLiteBackend(expire_after=120, include_headers=True)
session = CacheClient(cache=cache)

This would improve usage when used as a mixin class instead of a direct subclass of ClientSession, since it would only add a single __init__() kwarg instead of 8+ kwargs.

Add general cache integration tests with httpbin container

There are currently a few things broken on httpbin.org, so I wouldn't want to rely on that for automated tests. The Docker container works just fine, though (both locally and in GitHub Actions), and would be very handy for integration tests.

Use cattrs to optimize response serialization

Since we're already using attrs, there would be several potential benefits from using cattrs, which is basically a fancier version of attrs.asdict(). It uses attrs definitions to recursively unstructure classes into dicts and builtin types, and vice versa. This would make both serialization and deserialization more efficient, and open up the possibility for other serialization formats besides just pickle (like JSON and BSON) since it does most of the work needed.

CachedResponse is unpicklable when `history` is non-null

When history exists for a ClientResponse object, the corresponding CachedResponse cannot be pickled when saving to the backend, since CachedResponse.history is being saved as a generator. I think the intention was to save it as a tuple, but the current syntax used represents a generator expression. Even if it's modified to become a tuple, it's still unpicklable since history will be a tuple of coroutines unless we await the recursive calls. Will try to get a PR out to fix this.
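The fix described would look roughly like this: await each recursive call and collect the results into a plain tuple, so the attribute contains picklable CachedResponse objects (a sketch of the described fix, not the final patch):

# Inside CachedResponse.from_client_response() (sketch):
if client_response.history:
    # Awaiting each recursive call yields CachedResponse objects; building a
    # tuple (not a generator) makes the attribute picklable
    response.history = tuple(
        [await cls.from_client_response(r) for r in client_response.history]
    )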

Add GridFS backend

I doubt anyone has a need for this backend yet, but I ported it over from requests-cache anyway. The main use case is for responses that can be potentially very large (>16MB). Currently, GridFSCache has a couple of missing functions, and the operations are blocking. It looks like Motor includes support for GridFS, so it shouldn't be too much work to finish this up if/when someone needs it.

Rewrite storage interface + implementations to be async-compatible

Currently, all the storage classes use a dict-like interface (collections.abc.MutableMapping). Unfortunately, there is no syntax support in python for async dict operations, e.g.:

class AsyncDict(MutableMapping):
    async def __getitem__(self, key):
        ...

    async def __setitem__(self, key, value):
        ...

my_dict = AsyncDict()
await my_dict['key'] = 'value'
await my_dict['key']

So, this will require a new base storage interface that uses regular methods rather than dict operations.

Ideally, it would be good to merge the storage base class with BaseCache, if there is an elegant way to do it. Currently, they are separate so that BaseCache can have separate references to a key -> response collection and a redirect_key -> key collection.
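A method-based interface might look like the following sketch (method names are illustrative, not the final API):

class BaseStorage:
    """Base class for async storage operations (illustrative names)"""

    async def read(self, key):
        raise NotImplementedError

    async def write(self, key, value):
        raise NotImplementedError

    async def delete(self, key):
        raise NotImplementedError

# Usage: `await storage.write('key', 'value')` instead of `storage['key'] = 'value'`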

Add argument to use a tempfile for SQLite db

It can sometimes be useful to use a tempfile as a SQLite db, if you only care about short-term persistence, but beyond the scope of a single session object (i.e., where the default non-persistent cache isn't useful). This is easy enough to do manually, for example in the unit tests:
https://github.com/JWCook/aiohttp-client-cache/blob/e805e276acaa06ef22b62a262cc26f99125e6a41/test/conftest.py#L52-L58

But it would be convenient to have an argument to SQLiteBackend that would do this for you.
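For reference, the manual version amounts to something like this (a sketch along the lines of the linked conftest code):

from tempfile import NamedTemporaryFile
from aiohttp_client_cache import SQLiteBackend

# Create a tempfile that outlives any single session, and point the cache at it
with NamedTemporaryFile(suffix='.sqlite', delete=False) as f:
    cache = SQLiteBackend(cache_name=f.name)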

Add Docker-Compose config for storage backends

In addition to unit tests, it would be useful to have a docker-compose.yml to spin up instances of Redis, MongoDB, and DynamoDB for manual testing purposes (basically a lazy, non-automated integration test). This would also help users who would like to quickly test out one or more of these backends locally.
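A minimal compose file might look like the sketch below (standard images and default ports; not necessarily the project's actual config):

version: '3'
services:
  redis:
    image: redis
    ports:
      - '6379:6379'
  mongo:
    image: mongo
    ports:
      - '27017:27017'
  dynamodb:
    image: amazon/dynamodb-local
    ports:
      - '8000:8000'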

Error in aiohttp_client_cache on Linux

The problem

The following error is thrown when trying to run my application on Linux. The application runs fine on Windows.

Traceback (most recent call last):
  File "dex.py", line 32, in <module>
    from aiohttp_client_cache import CachedSession, SQLiteBackend
  File "/home/ben/.local/lib/python3.6/site-packages/aiohttp_client_cache/__init__.py", line 5, in <module>
    from aiohttp_client_cache.backends import *
  File "/home/ben/.local/lib/python3.6/site-packages/aiohttp_client_cache/backends/__init__.py", line 5, in <module>
    from aiohttp_client_cache.backends.base import (  # noqa: F401
  File "/home/ben/.local/lib/python3.6/site-packages/aiohttp_client_cache/backends/base.py", line 14, in <module>
    from aiohttp_client_cache.response import AnyResponse, CachedResponse
  File "/home/ben/.local/lib/python3.6/site-packages/aiohttp_client_cache/response.py", line 38, in <module>
    LinkMultiDict = MultiDictProxy[MultiDictProxy[Union[str, URL]]]
TypeError: 'type' object is not subscriptable

Expected behavior

Behaviour is consistent with the Windows environment.

Steps to reproduce the behavior

Try running the following code in Ubuntu Linux:

#import gevent
#from gevent import monkey
#monkey.patch_all(thread=False)
#import grequests
import json
import kivy
from kivy.app import App
from kivy.uix.floatlayout import FloatLayout
from kivy.uix.gridlayout import GridLayout
from kivy.lang import Builder
from kivy.uix.screenmanager import ScreenManager, Screen
from kivy.base import runTouchApp
from kivy.uix.button import Button
from kivy.uix.behaviors import ButtonBehavior
from kivy.uix.label import Label
from kivy.uix.image import AsyncImage
from kivy.core.window import Window
from kivy.uix.scrollview import ScrollView
from kivy.clock import Clock
from kivy.network.urlrequest import UrlRequest
Clock.max_iteration = 152
from kivy.animation import Animation
import time
import certifi
import os
import threading
from functools import partial
import aiohttp
import asyncio
#import requests_cache
from aiohttp_client_cache import CachedSession, SQLiteBackend

BASE_URL = "https://pokeapi.co/api/v2/pokemon/"
NUM_POKES = 152
os.environ['SSL_CERT_FILE'] = certifi.where()
# requests_cache.install_cache('poke_cache', backend='sqlite')  # requests_cache import is commented out above; CachedSession below handles caching
start_time = time.time()

def exception_handler(r, e):
	print('failed ', r.url, '\n')
	res = r.send().response
	return res

async def query_api(urls):
	now = time.ctime(int(time.time()))
	result = []
	#async with aiohttp.ClientSession() as session:
	async with CachedSession(cache=SQLiteBackend('poke_cache')) as session:
		for u in urls:
			async with session.get(u) as resp:
				pokemon = await resp.json()
				result.append(pokemon)

	print("--- %s seconds ---" % (time.time() - start_time))
	return(result)


def success(result, lst, *args):
	result.append(lst.result)

def failure(req, result):
	print(result.resp_status)

def make_list():
	urls  = []
	for p in range(NUM_POKES):
		if p != 0:
			url = '{0}{1}'.format(BASE_URL, str(p))
			urls.append(url)
	return urls


def get_pokemon_id(data):
	pokeId = data
	if pokeId  < 10:
		pokeId = '{0}{1}'.format('00',str(pokeId))
	elif pokeId > 9 and pokeId < 100:
		pokeId = str(0) + str(pokeId)
		
	return str(pokeId)

def get_pokemon_types(data):
	allTypes = data
	types = []
	
	for r in allTypes:
		types.append(r['type']['name'])
	
	if len(types) < 2:
		pokeType = types[0]
	else:
		pokeType = str(types[0] + "/" + types[1])
	
	return pokeType

def get_pokemon_abilities(data):
	allAbilities = data
	abilities  = []
				
	for a in allAbilities:
		abilities.append(a['ability']['name'])
	
	if len(abilities) == 1:
		pokeAbility =  abilities[0]
	elif len(abilities) == 2:
		pokeAbility = (abilities[0] + "/" + abilities[1])
	else:
		pokeAbility = (abilities[0] + "/" + abilities[1] + "/" + abilities[2])
	
	return pokeAbility


def return_pokemon(self):
	pokes = []
	for i in range(NUM_POKES):
		if i != 0:
			pokes.append(i)
		
	return pokes
    

class MenuScreen(Screen):
    Builder.load_string("""
<MenuScreen>:
    BoxLayout:
        Button:
            text: 'Go to dex'
            on_press: root.manager.current = 'Title'
        Button:
            text: 'Quit'
		""")

sm =  ScreenManager()
sm.add_widget(MenuScreen())

class MyGrid(Screen):
	
	def __init__(self, **kwargs):
		
		super(MyGrid, self).__init__(**kwargs)
		self.name = 'Title'
		t1 = threading.Thread(target=self.loading)
		t2 = threading.Thread(target=self.doAll)
		t1.start()
		t2.start()
		t1.join()

		
	def loading(self, **args):
		
		self.lbl = Label(text='loading')
		self.add_widget(self.lbl)
		
	
	def doAll(self, **args):
		
		self.scroller = ScrollView()
		
		self.outside = FloatLayout()
		self.outside.size_hint_y = 30
		
		self.add_widget(self.scroller)
		self.scroller.add_widget(self.outside)
		
		self.inside = GridLayout()
		self.inside.cols = 2
		self.outside.add_widget(self.inside)
		
		self.inner  = GridLayout()
		self.inner.cols = 2
		self.outside.add_widget(self.inner)
		
		urls = make_list()
		pokemon = asyncio.run(query_api(urls))
		
		
		for p in pokemon:
			if p is not None:
				#time.sleep(0.05)
				#data = json.loads(p)
				data = p
				
				self.pokegrid  = GridLayout()
				self.pokegrid.cols = 2
				self.pokegrid.size_hint = (1.9, 1.9)
				
				self.inside.add_widget(Button(size_hint=(0.3,0.3), background_color=(51,23,186,1)))
				self.inside.add_widget(Button(background_color=(255,0,0,1)))
				
				toSplit = str(data['sprites']['other']['official-artwork'])
				start = toSplit.index(':') + 3
				end = toSplit.index('.png') + 4
				path = str(toSplit[start:end])

				self.inner.add_widget(AsyncImage(source=str(path), size_hint_x=0.5))
				self.inner.add_widget(self.pokegrid)
				
				self.pokegrid.add_widget(Label(font_size=30, text=str(get_pokemon_id(data['id']))))
				self.pokegrid.add_widget(Label(font_size=50, text=str(data['name'])))											
				self.pokegrid.add_widget(Label(font_size=30, text=str(get_pokemon_types(data['types']))))
				self.pokegrid.add_widget(Label(font_size=30, text=str(get_pokemon_abilities(data['abilities']))))
		
		self.remove_widget(self.lbl)

sm.add_widget(MyGrid())

class MyMainApp(App):
	def build(self):
		return (sm)
		#return(MyGrid())

if __name__ == "__main__":
	MyMainApp().run()

Workarounds

Is there an existing workaround for this issue?
It works on Windows, but I'm trying to build for Android through Linux.

Environment

  • aiohttp-client-cache version: 0.4.0
  • Python version: 3.6.9
  • Platform: Ubuntu 18.04.5

Add option to set expiration for an individual request

This is a feature recently added to requests-cache that I'd like to add here. This will also do some of the prep work required for #30 and #32.

This will be used via an expire_after param added to _request().

Thankfully, aiohttp.ClientSession passes along variadic **kwargs from request, get, put, etc. to _request, so this won't require adding wrapper functions for all of those. E.g., CachedSession.get(url, expire_after=60) will work without adding a wrapper for ClientSession.get().

Add option to use Cache-Control:max-age header to set expiration time

This behavior would be enabled with an extra boolean option to CacheBackend, something like cache_control=True. The max-age value would then be used in CacheBackend.save_response() to set CachedResponse.expires.

If there is no max-age value in the response headers, it could fall back to using CacheBackend.expire_after, if specified.
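The core of that logic is small; a sketch of how max-age might be read from response headers, with a hypothetical helper name:

from datetime import datetime, timedelta

def get_expiration(headers, default_expire_after=None):
    """Hypothetical helper: get an expiration time from Cache-Control,
    falling back to a default number of seconds"""
    for directive in headers.get('Cache-Control', '').split(','):
        directive = directive.strip()
        if directive.startswith('max-age='):
            max_age = int(directive.split('=')[1])
            return datetime.utcnow() + timedelta(seconds=max_age)
    if default_expire_after is not None:
        return datetime.utcnow() + timedelta(seconds=default_expire_after)
    return None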

Performance testing

Each storage backend should have some simple performance testing done, e.g. with the stdlib profile module and maybe a memory profiler.

For now, I mostly just want to look for any obvious, stupid mistakes. For example, if an operation contains blocking code when it shouldn't, or performs significantly worse than its requests-cache equivalent.
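A rough starting point with the stdlib profiler, hammering a single cached URL so that most of the time goes to cache reads (a sketch, not an actual benchmark suite):

import asyncio
import cProfile

from aiohttp_client_cache import CachedSession, SQLiteBackend

async def exercise_cache(n_requests=100):
    # After the first request, the rest should be served from the cache
    async with CachedSession(cache=SQLiteBackend('profile_cache')) as session:
        for _ in range(n_requests):
            await session.get('https://httpbin.org/get')

cProfile.run('asyncio.run(exercise_cache())', sort='cumtime')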

Redis transparent config

Please help with setting up Redis with extra db arguments. RedisBackend accepts **kwargs, but its superclass (CacheBackend) does not, so extra Redis configuration can't be passed through via kwargs to RedisCache and the create_redis_pool function.
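For context, the kind of configuration being attempted looks roughly like this; the kwargs shown are aioredis create_redis_pool() arguments, and whether they actually get forwarded is exactly the problem being reported:

# Sketch: extra Redis connection args passed as kwargs (currently not
# forwarded past CacheBackend, per this issue)
cache = RedisBackend(
    cache_name='demo_cache',
    address='redis://localhost:6379',  # create_redis_pool() argument
    db=2,                              # create_redis_pool() argument: select a non-default db
)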

Add a filesystem backend

This backend would store each response object as a file on the local filesystem, which is a useful option for particularly large responses. This should be doable with aiofiles. Related issue here: requests-cache/requests-cache#30

This would also benefit from a human-readable/editable serialization format (like JSON), as a possible future option.
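The core read/write path could be as simple as this sketch with aiofiles (file layout and helper names are illustrative):

import aiofiles

async def write_response(cache_dir, key, serialized_response):
    """Illustrative helper: store one serialized response as its own file"""
    async with aiofiles.open(f'{cache_dir}/{key}', 'wb') as f:
        await f.write(serialized_response)

async def read_response(cache_dir, key):
    async with aiofiles.open(f'{cache_dir}/{key}', 'rb') as f:
        return await f.read()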

More thorough integration tests

Since all the backend classes implement the same API, it would be worthwhile to make some base integration tests that would then be run for all backends. Currently they are more or less just copy-pasted. These could use some more test coverage as well.

See related issue: requests-cache/requests-cache#212

Publish aiohttp-client-cache on conda-forge

This would be useful to have available to install with the Conda package manager. All dependencies are now on conda-forge as well, so this should be straightforward to add.

Refactor CacheClient to be usable as a mixin class

Idea from #16.

If there are additional mixins you want to use with aiohttp.ClientSession, it would be convenient (and cleaner) to use CacheClient as a mixin, for example:

class CustomSession(RetryMixin, CacheMixin, ClientSession):
    """Session class with retry + caching features"""

Specific URL raises error from chardet

The problem

When attempting to retrieve a result from the following URL with the following snippet:

session = CachedSession(cache=SQLiteBackend('demo_cache'))
url = "https://news.search.yahoo.com/search;_ylt=AwrXnCKM_wFfoTAA8HbQtDMD;_ylu=X3oDMTEza3NiY3RnBGNvbG8DZ3ExBHBvcwMxBHZ0aWQDBHNlYwNwYWdpbmF0aW9u?p=test&nojs=1&ei=UTF-8&b=21&pz=10&bct=0&xargs=0"
await session.get(url)

The following error is raised:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-20-5ba3f9a63552> in <module>
----> 1 await session.get(url)

c:\users\james\appdata\local\programs\python\python38\lib\site-packages\forge\_revision.py in inner(*args, **kwargs)
    320                 # pylint: disable=E1102, not-callable
    321                 mapped = inner.__mapper__(*args, **kwargs)
--> 322                 return await callable(*mapped.args, **mapped.kwargs)
    323         else:
    324             @functools.wraps(callable)  # type: ignore

c:\users\james\appdata\local\programs\python\python38\lib\site-packages\aiohttp_client_cache\session.py in _request(self, method, str_or_url, **kwargs)
     33             new_response = await super()._request(method, str_or_url, **kwargs)  # type: ignore
     34             await new_response.read()
---> 35             await self.cache.save_response(new_response, actions)
     36             return set_response_defaults(new_response)
     37 

c:\users\james\appdata\local\programs\python\python38\lib\site-packages\aiohttp_client_cache\backends\base.py in save_response(self, response, actions)
    158 
    159         logger.debug(f'Saving response for key: {actions.key}')
--> 160         cached_response = await CachedResponse.from_client_response(response, actions.expires)
    161         await self.responses.write(actions.key, cached_response)
    162 

c:\users\james\appdata\local\programs\python\python38\lib\site-packages\aiohttp_client_cache\response.py in from_client_response(cls, client_response, expires)
     89         if client_response.history:
     90             response.history = (
---> 91                 *[await cls.from_client_response(r) for r in client_response.history],
     92             )
     93         return response

c:\users\james\appdata\local\programs\python\python38\lib\site-packages\aiohttp_client_cache\response.py in <listcomp>(.0)
     89         if client_response.history:
     90             response.history = (
---> 91                 *[await cls.from_client_response(r) for r in client_response.history],
     92             )
     93         return response

c:\users\james\appdata\local\programs\python\python38\lib\site-packages\aiohttp_client_cache\response.py in from_client_response(cls, client_response, expires)
     83         # The encoding may be unset even if the response has been read
     84         try:
---> 85             response.encoding = client_response.get_encoding()
     86         except RuntimeError:
     87             pass

c:\users\james\appdata\local\programs\python\python38\lib\site-packages\aiohttp\client_reqrep.py in get_encoding(self)
    997                 encoding = 'utf-8'
    998             else:
--> 999                 encoding = chardet.detect(self._body)['encoding']
   1000         if not encoding:
   1001             encoding = 'utf-8'

c:\users\james\appdata\local\programs\python\python38\lib\site-packages\cchardet\__init__.py in detect(msg)
     13         }
     14     """
---> 15     encoding, confidence = _cchardet.detect_with_confidence(msg)
     16     if isinstance(encoding, bytes):
     17         encoding = encoding.decode()

src\cchardet\_cchardet.pyx in cchardet._cchardet.detect_with_confidence()

TypeError: object of type 'NoneType' has no len()

Expected behavior

Successfully retrieve the page

Steps to reproduce the behavior

Covered prior

Workarounds

Patching the lib to check whether _body is set, and returning utf-8 if not. Not a good solution, really.
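For reference, the workaround amounts to a monkey-patch along these lines (a sketch of the described workaround, not a recommended fix):

from aiohttp import ClientResponse

_original_get_encoding = ClientResponse.get_encoding

def _safe_get_encoding(self):
    # Work around chardet failing on an unset body: default to utf-8
    if self._body is None:
        return 'utf-8'
    return _original_get_encoding(self)

ClientResponse.get_encoding = _safe_get_encoding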

Environment

  • aiohttp-client-cache version: 0.4.1
  • Python version: 3.8.5
  • Platform: win10

Add `links` property to CachedResponse

The links property of aiohttp's ClientResponse is missing from CachedResponse. This causes AttributeError: 'CachedResponse' object has no attribute 'links' exceptions when trying to use it. The information is still available directly through the Link header, but for parity it would be nice to expose it through the links property.
The aiohttp implementation is complicated enough that it would be great to reuse it directly, but since it returns another MultiDict, I'm not sure how best to do that.
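Until that lands, one fallback is to parse the raw Link header directly; a rough sketch that ignores some RFC 8288 edge cases:

import re

def parse_links(link_header):
    """Parse a Link header value into {rel: url} (ignores extra params)"""
    links = {}
    for match in re.finditer(r'<([^>]+)>\s*;\s*rel="?([^";]+)"?', link_header or ''):
        url, rel = match.groups()
        links[rel] = url
    return links

# parse_links('<https://example.com/?page=2>; rel="next"')
# -> {'next': 'https://example.com/?page=2'}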

Dynamically include backend connection kwargs

I would like to explicitly include keyword arguments for all backend connection functions in the documentation, as well as function help text. To do this without a lot of copy-pasting, Python-forge can be used to dynamically add these to the appropriate function signatures; for example, sqlite3.connect() kwargs will be added to SQLiteBackend.__init__().

Currently, combining function signatures when more than one function has variadic **kwargs causes some errors, for example:

SyntaxError: 'connection=None' of kind 'POSITIONAL_OR_KEYWORD' follows '**kwargs' of kind 'VAR_KEYWORD'

So a few adjustments are needed to work around this.

Rebase and squash commits from requests-cache (pre-fork)

By this point, there's almost no remaining code from the original requests-cache, and since forking I've also rewritten a large portion of requests-cache with improvements from this library. I'd like to rebase the main branch to squash those commits, so as to not artificially inflate the commit history.

Original contributors to requests-cache are linked in CONTRIBUTORS.md, and a copy of the original license is still included here.

For anyone who currently has a fork of this repo (and it looks like there are only a couple), you'll need to do a git reset --hard upstream/main, or delete it and create a new fork, if you prefer.

Still creating network traffic when using cached responses

So I have tried to cache the data that my Python script requests, but it doesn't seem to be working. Wireshark and Task Manager both say that Python is still requesting data even though the URL is the same. Here is my old code:

async with aiohttp.ClientSession() as session:
    async with session.get(leaderboard_url) as res:
        leaderboard_json = await res.json(encoding='utf-8-sig')

and this is how I have dropped in this project:

async with aiohttp_client_cache.CachedSession() as session:
    async with session.get(leaderboard_url) as res:
        leaderboard_json = await res.json(encoding='utf-8-sig')

Here is what I have imported: import aiohttp_client_cache

I also started the cache with session = aiohttp_client_cache.CachedSession().

Do you know why this is not caching my requests? I have tried using SQLite instead of memory, but that doesn't seem to work either.

Make tests easier to run without Docker

This is relevant for downstream packaging (e.g., if this eventually gets packaged as an RPM or DEB). See discussion in requests-cache/requests-cache#221

Main changes:

  • Include tests in sdist
  • Add optional support for using pytest-httpbin instead of httpbin container
  • If backend services are not set up, make backend integration tests fail instead of skipping
  • If backend services are not set up, make backend integration tests fail quickly instead of hanging

Add feature to enable caching only for specified hosts

See requests-cache/requests-cache#161.

It would be useful to have an easy way to enable the cache for some hosts and not others. This is currently possible using a custom filter_fn (as noted in the linked issue). This seems like a common enough pattern (and something I would use), so I'd like to add an allowable_hosts parameter to CachedSession to simplify this.

My current use case would only need simple host matching (exact matches), but someone out there might reasonably want regex matching support (e.g., allowable_hosts=['*.amazon.com']). That could be a separate issue.

Unlike allowable_codes and allowable_methods, this could potentially be a long list, and in some cases it may be easier to specify a blacklist (ignored_hosts) instead of a whitelist. Adding that as an additional parameter could be done either with this issue or in a separate one. For reference, the current filter_fn workaround is sketched below.
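A sketch of host filtering with filter_fn (exact signature may differ from the library's API):

from urllib.parse import urlsplit

from aiohttp_client_cache import SQLiteBackend

CACHED_HOSTS = {'pokeapi.co', 'www.googleapis.com'}

def cache_by_host(response):
    """Only cache responses from an allowed set of hosts"""
    return urlsplit(str(response.url)).netloc in CACHED_HOSTS

cache = SQLiteBackend('demo_cache', filter_fn=cache_by_host)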

Set absolute expiration time on response objects

This is useful to make the cache behave a bit more like a browser cache. Currently, the expiration time is not persisted, and depends on the expire_after param that the CacheBackend was initialized with.
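In other words, the absolute deadline would be computed once at save time and stored with the response, rather than re-derived from current settings at read time (a sketch of the idea):

from datetime import datetime, timedelta

# At save time: compute an absolute expiration and persist it with the response
expires = datetime.utcnow() + timedelta(hours=1)

# At read time: freshness no longer depends on whatever expire_after the
# backend happens to be configured with now
is_expired = datetime.utcnow() > expires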

Add session monkey-patching features

requests-cache comes with some features to globally patch requests Sessions with install_cache(). It's preferable to avoid this and use CachedSession directly instead, but that may not always be feasible, and I'm sure someone will have a use case for the install_cache() approach.
