
rush's Introduction

rush: A library for throttles


This library is a small collection of algorithms that can be reused when throttling user interactions with a resource (e.g., an API).

This library strives to let any limiter and backing store be used together without you having to worry about their compatibility.

Installation

pip install rush
pipenv install rush

Features

  • A basic periodic interval rate limiter - N accesses per period of time. An example would be the GitHub API that limits authenticated users to 5,000 requests per hour.
  • A leaky bucket rate limiter based on the Generic Cell Ratelimiting Algorithm (a.k.a. GCRA).
  • A Redis storage backend for rate limit results so that users can have state persisted across machines and application restarts.
  • An in-memory dictionary storage backend for quick prototyping and testing.
  • Type annotations built into the library, verified with mypy, and distributed to users.
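
Of the two limiter styles listed above, GCRA is the less familiar one. As a rough, self-contained illustration of the idea (a toy sketch, not rush's actual implementation): each key tracks a "theoretical arrival time" (TAT), and a request is allowed only when granting it would not push the TAT too far ahead of the clock.

```python
import time


class SimpleGCRA:
    """Toy GCRA limiter: `limit` requests per `period` seconds."""

    def __init__(self, limit, period):
        self.emission_interval = period / limit  # "cost" of one request, in seconds
        self.burst_allowance = period            # how far TAT may run ahead of now
        self.tat = {}                            # key -> theoretical arrival time

    def check(self, key, now=None):
        now = time.monotonic() if now is None else now
        tat = max(self.tat.get(key, now), now)
        if tat - now > self.burst_allowance - self.emission_interval:
            return False  # limited: TAT has run too far ahead of the clock
        self.tat[key] = tat + self.emission_interval
        return True  # allowed


limiter = SimpleGCRA(limit=2, period=60)
print(limiter.check("k", now=0.0))  # True
print(limiter.check("k", now=0.0))  # True
print(limiter.check("k", now=0.0))  # False: two per minute already used
```

Unlike a fixed window, GCRA spreads requests out: after a burst, capacity drips back at one request per emission interval.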

Quality

  • 100% test coverage
  • Code auto-formatted by Black (CI will check if formatting wasn't run prior to push)
  • Commit messages following a uniform Kernel-like style
  • Flake8, pylint, mypy, and bandit linting
  • Complete type annotations
  • Complete documentation linted by doclint and strictly compiled by Sphinx

Contributing

  • All contributors are expected to read and follow our Code of Conduct.
  • To reduce the initial friction of submitting a pull request:

    • Please run tox prior to submitting your pull request.
    • After a commit, please run tox -e commitlint.
  • To make it easier to support you, please gather the following information prior to filing an issue:

    • How you installed rush and the versions of its dependencies (if you're using the Redis store, please include rfc3986 and redis version information).
    • The stores and limiters you are using.
    • An example that reproduces your problem

rush's People

Contributors

dependabot-preview[bot], dependabot[bot], judahrand, sigmavirus24, singingwolfboy


rush's Issues

[feature] CacheTTL Store option

Perhaps it would be nice to offer a slightly more robust in-memory store? I say more robust because one concern with using a simple dictionary is that it could grow unbounded in memory if the number of keys got out of control. The cachetools TTLCache implementation is actually a TLRU implementation, so that could be a good thing to use.

https://github.com/tkem/cachetools/
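
As a stdlib-only sketch of the concern (hypothetical code, not a proposed API), a store can bound its memory by stamping each entry with an expiry and evicting lazily on read; cachetools' TTLCache does this, and more, out of the box:

```python
import time


class ExpiringDict:
    """Dict-like store that drops entries older than `ttl` seconds,
    bounding memory even when the key space grows without limit."""

    def __init__(self, ttl):
        self.ttl = ttl
        self._data = {}  # key -> (expiry_timestamp, value)

    def set(self, key, value, now=None):
        now = time.monotonic() if now is None else now
        self._data[key] = (now + self.ttl, value)

    def get(self, key, now=None):
        now = time.monotonic() if now is None else now
        entry = self._data.get(key)
        if entry is None:
            return None
        expiry, value = entry
        if expiry <= now:
            del self._data[key]  # lazily evict expired entries on read
            return None
        return value
```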

[feature] Decorators

Would there be interest in a decorator class (perhaps ThrottleDecorator) that would let a user limit calls to a function directly? I'm thinking of something similar to this. It shouldn't be too hard to implement, and I think it would make this project useful to a much wider range of users.
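
For the sake of discussion, a minimal decorator along these lines might look like the following (the `limiter.check(key) -> bool` interface here is hypothetical and simplified; rush's Throttle.check actually returns a result object with limited and retry_after attributes):

```python
import functools
import time


def throttled(limiter, key):
    """Decorator that blocks until the limiter allows the call.

    `limiter` is any object with a check(key) -> bool method
    (a simplified, hypothetical interface for illustration).
    """
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # Spin with a short sleep until the limiter lets us through.
            while not limiter.check(key):
                time.sleep(0.01)  # naive backoff; a real version would use retry_after
            return func(*args, **kwargs)
        return wrapper
    return decorator
```

A real ThrottleDecorator would presumably sleep for the limiter's reported retry_after rather than a fixed interval.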

Incorrect behavior with GenericCellRatelimiter

I've tried using GenericCellRatelimiter in a simple fashion to throttle my client calls.

Here's some pytest code that reproduces the problem:

import time
from collections import defaultdict
from datetime import datetime, timezone

import gevent
import pytest
from rush.limiters.gcra import GenericCellRatelimiter
from rush.limiters.periodic import PeriodicLimiter
from rush.quota import Quota
from rush.stores.dictionary import DictionaryStore
from rush.throttle import Throttle


@pytest.mark.parametrize('overflow_size', (
    pytest.param(1, id='1 extra item'),
    pytest.param(5, id='5 extra items'),
))
@pytest.mark.parametrize('limiter_class', (PeriodicLimiter, GenericCellRatelimiter))
def test_rate_limit(limiter_class, overflow_size):
    call_seconds = defaultdict(lambda: 0)

    rate_per_sec = 30

    throttle = Throttle(
        rate=Quota.per_second(rate_per_sec),
        limiter=limiter_class(DictionaryStore()),
    )

    def ensure_we_are_not_calling_too_fast(key):
        while True:
            result = throttle.check(key, 1)
            if not result.limited:
                break
            time.sleep(result.retry_after.total_seconds())

    def make_request():
        ensure_we_are_not_calling_too_fast('my_request')
        second = datetime.now(timezone.utc).replace(microsecond=0)
        call_seconds[second] += 1
        time.sleep(0)  # instead of requesting, we'll just yield (as network I/O would in e.g. gevent)

    gevent.joinall([
        gevent.spawn(make_request) for _ in range(rate_per_sec + overflow_size)
    ])

    expected_secs = 2
    assert len(call_seconds) >= expected_secs, f'expected calls to span at least {expected_secs} seconds'

Am I holding it wrong?

[bug] GCRA race conditions

Hi there,

I've been reading through your implementation of GCRA and it occurred to me that there is a large potential for race conditions emerging.

On line 28 of gcra.py you get the value of the key and then proceed to calculate whether or not to rate limit the request. It isn't until line 75 that you update the key. Clearly, the computation between lines 28 and 75 doesn't take zero time, so there is potential for another RateLimiter to have already updated the key and accepted a new request in the time it took to work out what to do.

Another very good implementation of GCRA is the throttled project in Go (I had started porting it to Python before I found this project!). Their solution is to perform an atomic compare-and-swap operation when they update the key. If the key has already changed then they try again until either they succeed at updating the key, discover they now need to rate limit, or reach a maximum number of retries. They basically just loop over the rate_limit function until their CAS succeeds.

This seems like a good compromise to me, as it does not need a long 'lock' on any Store but still ensures that up-to-date information is always used. The only downside is that a compare_and_swap function would need to be added to the BaseStore and all implementations.

This is something I am happy to look at.
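
The compare-and-swap retry loop described above can be sketched with an in-memory store (compare_and_swap is the hypothetical method name proposed for BaseStore; a Redis-backed version would need WATCH/MULTI or a Lua script for atomicity):

```python
import threading


class CASStore:
    """In-memory store with an atomic compare-and-swap
    (method name hypothetical, per the proposal above)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def compare_and_swap(self, key, expected, new):
        with self._lock:
            if self._data.get(key) != expected:
                return False  # another writer won the race; caller must retry
            self._data[key] = new
            return True


def rate_limit_with_cas(store, key, compute_new_state, max_retries=10):
    """Retry loop: re-read, recompute, and CAS until the swap succeeds.

    Returns True when the request should be rate limited.
    """
    for _ in range(max_retries):
        old = store.get(key)
        new, limited = compute_new_state(old)
        if limited:
            return True  # rate limited; no state change needed
        if store.compare_and_swap(key, old, new):
            return False  # our update won; request allowed
    return True  # too many conflicts; fail closed and treat as limited
```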

redis_gcra retry times

Hi, I am having issues with the retry timers via the redis_gcra implementation.

How you installed rush and the versions of its dependencies:
rush was installed via pip install rush[redis]
rfc3986 v2.0.0
redis v4.3.4

What stores and limiters are you using?
The redis store and the redis_gcra limiter.

An example that reproduces your problem
For example, consider the following basic test code:

from rush.limiters import redis_gcra
from rush.stores import redis
from rush import quota
from rush import throttle

quota_per_min = quota.Quota.per_minute(2)

limiter = redis_gcra.GenericCellRatelimiter(
    store=redis.RedisStore(REDIS_CONN)
)

throttle_min = throttle.Throttle(rate=quota_per_min, limiter=limiter)

results = throttle_min.check(key="mins", quantity=1)

print(results)

When run 3 times in a row, the following RateLimitResult objects are printed:

RateLimitResult(limit=2, limited=False, remaining=1, reset_after=datetime.timedelta(seconds=1800), retry_after=datetime.timedelta(days=-1, seconds=86399))

RateLimitResult(limit=2, limited=False, remaining=0, reset_after=datetime.timedelta(seconds=3597, microseconds=697241), retry_after=datetime.timedelta(days=-1, seconds=86399))

RateLimitResult(limit=2, limited=True, remaining=0, reset_after=datetime.timedelta(seconds=3595, microseconds=418555), retry_after=datetime.timedelta(seconds=1795, microseconds=418555))

Therefore, after 3 attempted calls in a minute (with a limit of 2 per minute), the third call suggests a retry time of ~1800 seconds. I don't understand this behavior, as that is far larger than the ideal window for 2 calls per minute. Furthermore, if I use hours instead of minutes, the buffer is even more egregious (on the order of days).

[feature] MemCache Store option

Could be a nice alternative to Redis.

We could also use this interesting-looking library, which is another pure-Python project.

Let me know if you think this would be a worthwhile addition.

Integrations

I think my first attempt at an example project (for Flask) highlighted how rough it can be to integrate anything into some of the web frameworks if you're not intimately familiar with them.

I'd be open to assembling some folks to maintain a bunch of integrations with

  • Flask
  • Klein/Twisted/etc
  • Falcon
  • Django/DRF
  • etc.

[feature] TTL on keys

Would you be interested in an implementation of time to live (TTL) on keys in the data store? Redis, for example, has this functionality, and it is probably sensible to use it to avoid a build-up of redundant key/value pairs.

Perhaps something similar to the throttled project in Go?

Again this is something I'm happy to put in a PR for if you're likely to review and accept it.
