
django-ratelimit's People

Contributors

adamchainz, andrew-chen-wang, benjaoming, bobkarreman, carljm, claudep, cordery, danmoz, dsanders11, edevil, ezheidtmann, feanil, jaap3, jacebrowning, jasonkeene, jphalip, jsocol, jwhitlock, louis-haddrell, mbaechtold, mfelsche, pablocastellano, pabluk, pmac, rbdcti, rehandalal, robinedwards, sobolevn, whs, willkg

django-ratelimit's Issues

Is it possible to increment rate limit on one view and consume on others?

I'm looking to use django-ratelimit to increment a rate limit when a user POSTs to a view, and show a notice if the limit has been exceeded, on either a POST or a GET. Is this feature supported?

My goal would be something like so:

from django.views.generic import FormView
from ratelimit.mixins import RatelimitMixin

class ExampleView(RatelimitMixin, FormView):
    ratelimit_key = 'ip'
    ratelimit_method = 'POST'
    ratelimit_rate = '3/m'

    def get_context_data(self, **kwargs):
        context = super(ExampleView, self).get_context_data(**kwargs)
        context['limited'] = self.request.limited
        return context

However, this raises an AttributeError when the view processes a GET, since the limited attribute is only set on POST requests. I could change ratelimit_method to ALL, but I don't want to increment the counter for GETs.
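For the AttributeError specifically, a getattr fallback avoids the crash. This is only a sketch of that fallback (the pattern other issues here also mention), not a new feature:

from django.views.generic import FormView
from ratelimit.mixins import RatelimitMixin


class ExampleView(RatelimitMixin, FormView):
    ratelimit_key = 'ip'
    ratelimit_method = 'POST'
    ratelimit_rate = '3/m'

    def get_context_data(self, **kwargs):
        context = super(ExampleView, self).get_context_data(**kwargs)
        # request.limited is only set when the ratelimit machinery runs
        # (POSTs in this configuration), so fall back to False on GETs.
        context['limited'] = getattr(self.request, 'limited', False)
        return context

Note that this only avoids the crash: on a GET the flag is simply False. Actually checking the limit on GET without incrementing would need something like is_ratelimited(..., increment=False), which a later issue here reports as unreliable in some versions.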

A new release?

Now that Django 1.9 is out, it'd be nice to have an official release that supports it, so that we don't have to pull from the repo in order to use it with Django 1.9. Any way that can happen?

Rate-limiting all views

This is a feature request / solicitation for advice.

I was looking for something like this, except I don't necessarily want to apply it per view but to my whole site. I didn't see that this is possible with django-ratelimit, but it doesn't seem like it would be a difficult extension. Is there a reason this isn't an option? Is it a bad idea?
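One way to approximate this today is a small middleware that calls the is_ratelimited helper on every request, so the whole site shares one counter. This is only a sketch, not a documented feature: the group name, key, and rate below are placeholders, and the helper arguments follow the middleware example quoted in a later issue here.

from ratelimit.exceptions import Ratelimited
from ratelimit.utils import is_ratelimited


class SiteWideRatelimitMiddleware(object):
    # Count every request against a single site-wide limit.

    def process_request(self, request):
        # 'sitewide' and '100/m' are illustrative values, not defaults.
        if is_ratelimited(request, group='sitewide', key='ip',
                          rate='100/m', method='ALL', increment=True):
            raise Ratelimited()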

Observing a harsher rate limit than defined

I'm using the following settings with the ratelimit decorator to allow 5 requests per second.

@ratelimit(ip=True, block=True, method=None, rate='5/s')

The log below shows the elapsed time in seconds. As I go from the 38s mark to the 40s mark, requests start getting blocked. According to the log, my request rate never exceeds 2 or 3 per second. Why are requests getting blocked if the rate limit should allow 5 per second?

Secondly, once I start getting a 403, I have to wait 1 second (or whatever period was set in the rate) to resume. Is it possible to ignore requests once a 403 is raised, until the next successful request goes through (something like what issue #11 talks about)?

P.S. Using Django's default cache backend i.e. 'django.core.cache.backends.locmem.LocMemCache'

[30/Aug/2014 00:34:38] "GET /my_site/ HTTP/1.1" 200 412
[30/Aug/2014 00:34:38] "GET /my_site/ HTTP/1.1" 200 294
[30/Aug/2014 00:34:39] "GET /my_site/ HTTP/1.1" 200 269
[30/Aug/2014 00:34:39] "GET /my_site/ HTTP/1.1" 200 269
[30/Aug/2014 00:34:39] "GET /my_site/ HTTP/1.1" 200 269

Forbidden (Permission denied): /my_site/
[30/Aug/2014 00:34:40] "GET /my_site/ HTTP/1.1" 403 22

Forbidden (Permission denied): /my_site/
[30/Aug/2014 00:34:40] "GET /my_site/ HTTP/1.1" 403 22

Forbidden (Permission denied): /my_site/
[30/Aug/2014 00:34:41] "GET /my_site/ HTTP/1.1" 403 22

Forbidden (Permission denied): /my_site/
[30/Aug/2014 00:34:41] "GET /my_site/ HTTP/1.1" 403 22

Forbidden (Permission denied): /my_site/
[30/Aug/2014 00:34:42] "GET /my_site/ HTTP/1.1" 403 22

New feature: reset rate limit counters

I'd like to set certain rate limits on my views and then show a captcha if they are violated (something like Stack Overflow's human verification). When I can ascertain that it's a human on the other end, I'd like to reset the counters, i.e. start counting afresh for that IP or key. Where would be a good place to modify the current code to achieve that?

Per-user ratelimits

One very common need is to customize ratelimits per user. E.g., an automated internal "user" may get 100x the requests of a free user, while a "pro" user might have negotiated for 20x and a researcher may get 10x. Callable rates make this possible, but you don't want to have to either a) hard-code users into source, or b) update and deploy every time a user changes some state or deal.

In my head, this looks something like:

# ratelimit/models.py
from django.conf import settings
from django.db import models


class Ratelimit(models.Model):
    group = models.CharField(max_length=255, db_index=True)
    user = models.ForeignKey(settings.AUTH_USER_MODEL, null=True)  # One option for "default"
    rate = models.CharField(max_length=32)

    @classmethod
    def get(cls, group, user=None):
        # Use the cache here if possible.
        try:
            return cls.objects.get(group=group, user=user)
        except cls.DoesNotExist:
            return cls.objects.get(group=group, user=None)

# ratelimit/utils.py

def per_user(group, request):
    return Ratelimit.get(group, request.user).rate  # Handle unauth, too

There are two ways I see of handling defaults: either in the DB with a null user, as above, or in definitions, e.g.:

# in settings
RATELIMIT_GROUPS = {
    'mygroup': {
        'default_rate': '100/h',
        # ...
    },
}

# at call-site
@ratelimit(key='user', rate=per_user, default_rate='100/h')
def myview(request):
    pass

In that case, we'd let per_user return some sentinel value, since None still means "no limit", and then fall back to the default. Not keeping the default in the DB sounds like a nicer idea to me, even if it means relying more heavily on configuration in settings.

Ratelimit POST only if successful

Rate-limiting POST methods is a simple way of controlling POSTs to a given URL (e.g. a register page). However, if a user fails to validate the form correctly (broken email, mismatched passwords), this still increments the ratelimit counter. It would be really neat if you could check whether the POST was successful or not, and rate-limit only if it was.

is_ratelimited is broken if group is None and fn is None

The subject says it all. The docs don't indicate that one or the other is required. It will crash when it tries to look up the module on fn, which is None.

if group is None:
    if hasattr(fn, '__self__'):
        parts = fn.__module__, fn.__self__.__class__.__name__, fn.__name__
    else:
        parts = (fn.__module__, fn.__name__)
    group = '.'.join(parts)

Does not work with override_settings

Ratelimit does not appear to work with override_settings in tests:

def test_settings(self):
    self.assertFalse(settings.RATELIMIT_ENABLE)

@override_settings(RATELIMIT_ENABLE=True)
def test_settings_overridden(self):
    self.assertTrue(settings.RATELIMIT_ENABLE)
    ... ratelimited view here fails to limit ...

This is a pain for testing purposes, where I need ratelimit enabled at times, and disabled in others.
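A plausible cause (an assumption here, not verified against the source) is that RATELIMIT_ENABLE is read once at import time, so @override_settings in a test never reaches the already-cached value. Reading the flag lazily at call time would make the override visible, e.g.:

from django.conf import settings


def ratelimit_enabled():
    # Evaluate on every call so @override_settings(RATELIMIT_ENABLE=...)
    # in tests is honored, instead of caching the value at import time.
    return getattr(settings, 'RATELIMIT_ENABLE', True)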

Configure via settings/groups

One benefit of the group= kwarg I identified in #48 is that it acts as a natural key that can be used to define at least default values for the decorator elsewhere, i.e. in settings. E.g.:

# settings.py
RATELIMIT_GROUPS = {
    'mygroup': {
        'key': 'ip',
        'rate': '100/h',
        'method': 'POST',
        'block': True,
    },
    'some.mod.view': {
        'key': 'user-or-ip',
        'rate': 'some.mod.view_rate',
    }
}

# some/mod.py
@ratelimit()
def view(request):
    pass

@ratelimit(group='mygroup')
def someview(request):
    pass

def someotherview(request):
    if is_ratelimited(request, group='mygroup'):
        # This gets much easier.
        pass

The setting would override the defaults but could be overridden by the call site, so the precedence is:

  1. call site (either @ratelimit decorator or is_ratelimited helper)
  2. RATELIMIT_GROUPS setting
  3. ratelimit's defaults.

It makes it much, much easier to do a few things:

  • Update a shared ratelimit everywhere
  • Confidently use a shared ratelimit in multiple contexts
  • Temporarily disable ratelimits with fewer touch points

The way counters are constructed, overriding any of the values in the decorator would cause the group= to count separately, but that's true now, so it's probably something that just needs better documentation.

Always add `limited` attr

Right now, checking for the limited attr has to be done with getattr(request, 'limited', False). That's just silly.

Should blank values be allowed?

If a key value of the type post:<keyname> or get:<keyname> is an empty string, it is still counted towards the limit. Imagine a login form where you type your username and press enter in a hurry: you would reach the blocked page instead of seeing an error saying that the password field shouldn't be empty.

Perhaps there should be an argument on the ratelimit decorator to either count empty values towards the limit or ignore them.
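One hedged workaround, assuming the installed version supports the skip_if argument referenced in a later issue here: skip counting entirely when the relevant field is blank, so empty submissions fail validation without consuming the limit. The view and field names below are illustrative.

from ratelimit.decorators import ratelimit


def password_blank(request):
    # Don't count attempts where the password field was left empty.
    return not request.POST.get('password', '').strip()


@ratelimit(ip=True, field='username', rate='5/m', block=True,
           skip_if=password_blank)
def login_view(request):
    ...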

is_ratelimited increment=True does not seem to work

I tested with decorator usage and it worked as expected (with the exception of issue #73). However using the helper function in middleware as documented:

    def process_request(self, request):
        if is_ratelimited(request, group="all", key='user_or_ip', rate='2/m', method='ALL', increment=True):
            log.warn("Request is over rate-limit: %s", request)
        else:
            log.warn("Just checked")

Does not appear to work in that the "over limit" message is never displayed, despite increment=True.

I believe my caching is working fine since, as I stated, the decorator behaves properly. I am moving forward using the decorator, so it's not urgent for me.

ImportError: No module named importlib

Django 1.9 no longer provides django.utils.importlib.import_module:

from django.utils.importlib import import_module

django.utils.importlib is a compatibility library for when Python 2.6 was still supported. It has been obsolete since Django 1.7, which dropped support for Python 2.6, and is removed in 1.9 per the deprecation cycle.

Use Python's import_module function instead:

from importlib import import_module
The reason you can import it from django.utils.module_loading is that importlib.import_module is imported into that module; module_loading does not define the function itself.

Since django.utils.module_loading.import_module is not part of the public API, it can be removed at any time if it is no longer used - even in a minor version upgrade.

Source - https://stackoverflow.com/questions/32761566/django-1-9-importerror-for-import-module

group name isn't correct for CBVs

Inside the is_ratelimited function, the value of group is always django.utils.decorators.bound_func instead of the actual function name.

This is how I'm using the ratelimit decorator:

rl = method_decorator(ratelimit(key='post:username', rate='1/s'))


class Expensive():
    @rl
    def dispatch(self, *args, **kwargs):
        ...
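A hedged workaround until the generated name is fixed: pass an explicit group so the bound_func name is never used. This assumes the decorator accepts a group argument like the is_ratelimited helper does elsewhere in these issues; the group string is just an arbitrary label.

from django.utils.decorators import method_decorator
from django.views.generic import View
from ratelimit.decorators import ratelimit

rl = method_decorator(ratelimit(group='myapp.Expensive.dispatch',
                                key='post:username', rate='1/s'))


class Expensive(View):
    @rl
    def dispatch(self, *args, **kwargs):
        return super(Expensive, self).dispatch(*args, **kwargs)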

Counter reset function is needed

When protecting site data against automated scraping, it's important to let normal users keep working with the site by entering a captcha.
It would be nice to have a function that resets the visit counter once a user passes the captcha test.
Thanks!

How is the period (expiry/timeout) information set in the cache?

In a previous version (0.4.0), the cache expiry was set explicitly like this in helpers.py

cache.set_many(counts, timeout=timeout)

where timeout was derived by splitting the rate into count & period.

In the current version (0.6.0), the file utils.py still splits the rate into count (limit) & period, but the period information is not conveyed to the cache.

Cache TIMEOUT should be higher than the largest period specified in rate

Django specifies a default cache TIMEOUT of 300 seconds. If a ratelimit decorator has a rate whose period exceeds this number, e.g. 5/h (5 requests per hour), would the key expire after TIMEOUT and eventually be overwritten, depending on the cache expiration policy? I tried to re-create this scenario using backends.locmem.LocMemCache; however, after the expiry time the key was still accessible. Perhaps it's marked as stale and is not reliable. I'm not sure how other backends will behave.

It would be nice if django-ratelimit could look at settings.CACHES and warn if TIMEOUT is less than the largest period specified in any rate.
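For example, if the longest period used in any rate is an hour, the cache that ratelimit uses could be given a TIMEOUT comfortably above it. The values below are illustrative, not project recommendations:

# settings.py
CACHES = {
    'default': {
        'BACKEND': 'django.core.cache.backends.locmem.LocMemCache',
        # Keep counters at least as long as the longest rate period in
        # use (1 hour here), with some headroom.
        'TIMEOUT': 2 * 60 * 60,
    },
}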

key callable only takes one argument but docs say two

Actually, it looks like a callable passed in directly is only passed the request, but a dotted string that resolves to a callable is passed the group as well. I thought the docs were wrong, but maybe the real issue is that a direct callable is only being passed a single argument. Perhaps, for backward compatibility, you could accept an optional second parameter and shuffle the arguments if it remains None.

I should have a PR shortly with a recommendation.

Is it possible to limit based on ('ip' and 'request.user.username')

I want to set a ratelimit per IP per logged-in user for views that are accessible to logged-in users only. To explain it further: if the limit is 5/m, then user-a and user-b can each make 5 requests per minute from the same IP address. Maybe it's possible with the keys argument, since the docs say, for example, to use an authenticated user ID or an unauthenticated IP address, but I can't figure out how.
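A sketch of a custom key callable that combines the two. The exact arguments a key callable receives vary by version (see the "key callable only takes one argument" issue above), so treat the signature as an assumption:

from ratelimit.decorators import ratelimit


def user_and_ip(request):
    # One counter per (user, client IP) pair, so two users behind the
    # same IP address do not share a limit.
    return '%s:%s' % (request.user.pk, request.META['REMOTE_ADDR'])


@ratelimit(key=user_and_ip, rate='5/m', block=True)
def my_view(request):
    ...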

Any way to whitelist/temporarily remove or alter ratelimit?

Say, for example, we limit account signups to 20 per hour per IP, but then a few months down the road we hold a conference that provokes, say, 100 signups within the hour. I know cache.clear() can be a crude way to reset the ratelimit restrictions, but is there any way to tell this software, say, "For the next hour, don't limit IP nnn.nn.nn.nnn"? Or limit them to 1000/hr?
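Assuming the installed version supports callable rates (the per-user ratelimits issue above suggests it does), a hedged sketch of a whitelist is a rate callable that returns None, i.e. no limit, for known addresses. The IP and rates below are placeholders:

from ratelimit.decorators import ratelimit

SIGNUP_WHITELIST = {'203.0.113.7'}  # e.g. the conference venue's address


def signup_rate(group, request):
    # None disables limiting for this request; everyone else keeps 20/h.
    if request.META['REMOTE_ADDR'] in SIGNUP_WHITELIST:
        return None
    return '20/h'


@ratelimit(key='ip', rate=signup_rate, block=True)
def signup(request):
    ...

Keeping the whitelist in a setting or in the cache would let it be changed for an hour without a deploy.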

Deprecated importlib module and deprecated get_cache

Django's importlib module will be removed with the next version (1.9).
Also, 'get_cache' is deprecated in favor of 'caches'.

Please replace in utils.py:

from django.utils.importlib import import_module

With the Python importlib:

from importlib import import_module

You can modify the get_cache import like this, so it works with all Django versions:

try:
    from django.core.cache import caches
except ImportError:
    from django.core.cache import get_cache as caches

Then load the cache like so:

cache = caches(cache_name) if callable(caches) else caches[cache_name]

Wonderful tool by the way! Thank you :-)

ratelimit broken for Django 1.9

Django 1.9 removes django.utils.importlib, since importlib has been in the standard library since Python 2.7. This was a planned deprecation starting in Django 1.7. See the related PR #96 for a suggested resolution.

Finish documentation from #86

In #86, we decided that more documentation was necessary:

Otherwise, I agree with the TODOs in that list, which are all documentation:

  1. put a .. warning:: on the common keys list
  2. update the description of that behavior to be accurate, and probably add a prose note about empty values
  3. add a section to the security chapter on trusting GET/POST/header data

Ratelimits for 1/h appear to allow 2, but no more

Using the helper as I don't want to increment all requests, only ones where a successful submission has taken place:

from django.views.generic import FormView
from ratelimit.exceptions import Ratelimited
from ratelimit.utils import is_ratelimited


class RateLimitedFormView(FormView):
    ratelimit_key = 'ip'
    ratelimit_block = True
    ratelimit_rate = '1/h'
    ratelimit_group = None

    def dispatch(self, *args, **kwargs):
        ratelimited = is_ratelimited(request=self.request,
                                     group=self.ratelimit_group,
                                     key=self.ratelimit_key,
                                     rate=self.ratelimit_rate,
                                     increment=False)
        if ratelimited and self.ratelimit_block:
            raise Ratelimited()
        return super(RateLimitedFormView, self).dispatch(*args, **kwargs)


class RegistrationView(RateLimitedFormView):
    template_name = 'accounts/register.html'
    form_class = EmailUserCreationForm
    ratelimit_group = 'registration'

    def form_valid(self, form):
        # Saves the form and does login here ... [ snip ]
        # Calls is_ratelimited here to increment the counter
        is_ratelimited(request=self.request, group=self.ratelimit_group, key=self.ratelimit_key,
                       rate=self.ratelimit_rate, increment=True)
        return super(RegistrationView, self).form_valid(form)

One would expect the above to increment the counter by 1 for every form_valid call, and therefore get ratelimited at the second dispatch(). However, this does not happen:

{'': {}}
{'': {u':1:rl:bd2e7d391ec4f6cc09024ab9b8f38395': '\x80\x02K\x01.'}}
{'': {u':1:rl:bd2e7d391ec4f6cc09024ab9b8f38395': '\x80\x02K\x02.'}}
{'': {u':1:rl:bd2e7d391ec4f6cc09024ab9b8f38395': '\x80\x02K\x02.'}}

The above are the four lines of output from a test, printing the contents of the cache on each subsequent request to the RegistrationView. At round one, it adds the cache key, but at round two, the dispatch override returns False from is_ratelimited, even though the rate is 1/h, and then allows a second request to go through, before ratelimiting on the third.

Why is this?

Feature Request: combine key with specific values

Sometimes it is necessary to check several keys together when throttling requests, e.g.:

if request.POST.get('username') == 'admin' and request.POST.get('password') != '':
    forward_to_captcha()

It would be useful to (i) combine multiple keys and (ii) rate-limit only for specific key values.

Limit per parameter

I've a URL:

mydomain.com/contactme?email='[email protected]'

@ratelimit(rate='5/m', block=True, field='email', method=['GET'])
def contactme(request):
    ...

When I simply hit

mydomain.com/contactme

why is it blocking after 5 attempts per minute? There is no 'email' parameter. I only need to block when the same value of the email parameter is seen more than 5 times per minute.

Rate limit based on session ID

I could not find any discussion in the issues about supporting the session ID for ratelimiting. Wouldn't it be a more suitable parameter than the IP address? This would be especially useful for visitors that share a common IP address, e.g. a corporation.
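A sketch of what this could look like with a custom key callable (assuming callables receive the request, per the callable-arguments issue above). Note that a client can discard its session cookie to get a fresh key, so this is weaker against deliberate abuse than an IP-based key:

from ratelimit.decorators import ratelimit


def session_or_ip(request):
    # Prefer the session key when the visitor already has one; fall back
    # to the client IP for visitors without a session yet.
    return request.session.session_key or request.META['REMOTE_ADDR']


@ratelimit(key=session_or_ip, rate='10/m', block=True)
def my_view(request):
    ...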

support multiple limits

The docs suggest using the keys argument to support multiple limits:

@ratelimit(keys=lambda x: 'min', rate='1/m')
@ratelimit(keys=lambda x: 'hour', rate='10/h')
@ratelimit(keys=lambda x: 'day', rate='50/d')
def post(request):
    # Stack them.
    # Note: once a decorator limits the request, the ones after
    # won't count the request for limiting.
    return HttpResponse()

Without that keys argument, all three limits use the same cache key, and therefore whichever gets cleared first clears them all.

It'd be nice if stacking ratelimits worked better with cache keys.

Rate limiting countdown resets on failed retries

If I set my rate to 1/h, try once successfully, retry a second time and get throttled, I have to wait an hour for my cache to clear to make another successful attempt.

If I don't wait an hour, but instead keep retrying every 45 minutes, my cache will never clear due to the period not being adjusted in https://github.com/jsocol/django-ratelimit/blob/master/ratelimit/backends/cachebe.py#L32

I've created an alternate caching project (https://github.com/bradbeattie/django-cache-throttle/blob/master/cache_throttle/utils.py) with a throttling mechanism you might be interested in copying. Instead of storing the number of attempts, it stores how "tired" it is of seeing the key and when the key was last seen. It can then calculate how much of the key's stamina should be regenerated based on the time difference.

It should be doable to modify cachebe to have this behaviour if you're interested. :)
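A minimal sketch of that idea (not the linked project's actual code; the function name and parameters are made up): store a fatigue value plus a last-seen timestamp, drain the fatigue over time, and block while it exceeds a cap.

import time

from django.core.cache import cache


def over_limit(key, cost=1.0, drain_per_second=1.0 / 3600, cap=5.0):
    # Leaky-bucket style check: fatigue decays continuously instead of
    # resetting in fixed windows, so a blocked client recovers gradually
    # rather than restarting the whole period on every retry.
    fatigue, last_seen = cache.get(key, (0.0, time.time()))
    now = time.time()
    fatigue = max(0.0, fatigue - (now - last_seen) * drain_per_second)
    fatigue += cost
    cache.set(key, (fatigue, now), None)  # timeout=None: keep until evicted
    return fatigue > cap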

Increment optionally

I'm wondering what the developers think of an 'optional increment' flag for the ratelimit decorator. This would have the advantage of letting one flexibly either increment or merely check the ratelimit, in a similar manner to django-brake.

The underlying code exists (there is an increment flag on is_ratelimited); is the idea that, if required, you just roll your own version of the decorator?

django.utils.importlib deprecated

Getting warning:

C:\...\ratelimit\utils.py:9: RemovedInDjango19Warning:
django.utils.importlib will be removed in Django 1.9.
    from django.utils.importlib import import_module

My understanding is that this should just be changed to from importlib import import_module

Relevant docs link

Stacking decorators shouldn't have a common key

In the example of stacking rate limits, ip=True by default. This adds an additional common key to all 3 decorators (e.g. ip:127.0.0.1), which increments at thrice the rate. Shouldn't you set ip=False explicitly in the example below? The test doesn't catch the error because the specified rate is hit anyway after the first request. If you increased the rate of the first decorator to, say, 3/m, the test would fail at the second request.

Example:

@ratelimit(keys=lambda x: 'min', rate='1/m')
@ratelimit(keys=lambda x: 'hour', rate='10/h')
@ratelimit(keys=lambda x: 'day', rate='50/d')

Update: Is the end user expected to pass keys that are unique per IP? E.g. in the above, shouldn't the key be something like keys=lambda x: 'min-' + <ip_addr>, rate='1/m'?

Rate per second doesn't work

If I set @ratelimit(block=True, method=['GET','POST'], rate='1/m') it blocks, but if I set @ratelimit(block=True, method=['GET','POST'], rate='59/s') it doesn't block. (rate='60/s' doesn't work either.)

Simpler and more powerful @ratelimit decorator

I have a new, hopefully better, idea of how ratelimit can work internally, so I want to write it down here and get @willkg's opinion and sanity check (and also hear if anyone else has comments) before I dive into it.

Problems

Multiple cache keys per decorator

Right now, each @ratelimit decorator can create several keys, all of which are then treated with the same expiration (and see below about that). Except, if you stack decorators, they may generate the same key (e.g. for the IP address) and then things just stop making any sense at all.

So what we're doing is updating counters but in a really non-intuitive way, and we break stacking, which seems to be a natural way people try to use the library. It also costs us atomicity because we can't use cache.incr.

Good actors can get stuck in sliding windows

Because we push back the TTL on every increment, once a client gets ratelimited, they are stuck until they wait the full period to reset, so if the limit is 1/h and they wait 59 minutes, they then have to wait another hour, not just one minute, because they jumped the gun a little.

Per-method (or group) is a pain

Implementing per-method or method-group rate limits would require something like keys=lambda r: 'group' + r.META['REMOTE_ADDR'] everywhere.

Solution

This is a big, backwards-incompatible change. Fortunately, it's pre-1.0, so whatever. This would, hopefully, be a step toward something we'd call 1.0.

One counter per decorator

The biggest change: each decorator should result in a single counter (cache key, whatever). So

def user_or_ip(r):
    u = r.user
    return u.get_username() if u.is_authenticated() else r.META['REMOTE_ADDR']

@ratelimit(key=user_or_ip, rate='100/h')
def some_view(request):

Would result in exactly one counter, named using a combination of the method name, the key value, and the rate (and probably the current time, but hang on), probably MD5ed together to get a name.

Then, if you wanted to do, say, a burst limit, you could do:

@ratelimit(key=user_or_ip, rate='10/s')
@ratelimit(key=user_or_ip, rate='100/h')
def some_view(request):

Since the rate is part of the key, these two get incremented independently. And if the next method had:

@ratelimit(key=user_or_ip, rate='10/s')
@ratelimit(key=user_or_ip, rate='100/h')
def another_view(request):

the default behavior would be to ratelimit these views independently.

Fixed windows

I've come around to the view that each attempt shouldn't completely reset the clock on the timer. We should be creating windows. The window needs to be staggered somehow by key, so that we don't, for example, open all the flood gates every hour on the hour. (We can definitely skip staggering that for seconds and possibly for minutes.)

So, for example, if the rate is 100/h then we'd do something like:

import time
import zlib


def k(value, period=3600):
    # 'value' is the key value; crc32 staggers the window start per key.
    ts = int(time.time())
    m = ts - (ts % period) + (zlib.crc32(value) % period)
    if m < ts:
        return m + period
    return m

Then we append this value to the counter name/cache key. The value should be different for every key and should change every period, but it's staggered within the period according to the wall clock.

We can even somehow return this to the view to allow it to send Retry-After if we want.

The new signature

It should be simpler, and better reflect that each decorator is an individual counter, while still providing shortcuts for common use cases:

@ratelimit(
    group=None,
    key=None,
    rate=None,
    method=['POST'],
    block=False)

Most of this should be straightforward, but:

  • group defaults to the dotted name of the method (e.g. myapp.views.myview). That limits each view individually, but you can set it to, e.g. group='myviewgroup' to count a number of views together.
  • rate works as now ('X/Yu', where X and Y are integers and u is a unit from {s,m,h,d}), or rate is a callable (or a dotted path to a callable) that is passed the group and the request and returns either a rate string or a tuple: (limit, period-in-seconds). I think this is a better method of handling skip_if, because the callable could return None for "no limit" (or 0 for "never allow"). And it opens up a whole new thing that would be, I think, very useful (see below).
  • key is one of a few well-known strings, a callable, or a dotted path to a callable. Callables would get the group and the request. Well-known strings would include at least:
    • 'ip' - request.META['REMOTE_ADDR']
    • get:X - request.GET['X']
    • post:X - request.POST['X']
    • field:X - d = request.POST if request.method == 'POST' else request.GET; d['X']
    • header:what-ever - request.META['HTTP_WHAT_EVER']
    • user - request.user
    • user_or_ip - request.user if request.user.is_authenticated() else request.META['REMOTE_ADDR'] (very common use case, nice to have)
  • method works as now, a method or a list of methods, or None for all
  • block works as now, True to raise a Ratelimited exception, False to annotate the request.

Generating the counter name/cache key

We combine all of this to get a key that we increment:

cache_key = PREFIX + md5(group + rate + key_value + window)

We don't worry about expiring it. We just do limited = cache.incr(cache_key) > limit and call it good. The values age out of the LRU.
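A minimal sketch of that construction (the prefix and helper name are illustrative, not the library's actual code):

import hashlib

CACHE_PREFIX = 'rl:'


def make_cache_key(group, rate, key_value, window):
    # One counter per (group, rate, key value, window) combination.
    parts = '%s%s%s%d' % (group, rate, key_value, window)
    return CACHE_PREFIX + hashlib.md5(parts.encode('utf-8')).hexdigest()

One practical wrinkle: Django's cache.incr raises ValueError for a missing key on most backends, so the first hit in a window would need a cache.add(cache_key, 0) (or equivalent) before incrementing.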

Current use cases

Login forms

A very common form to protect right now is a login form, which can be done with one decorator:

# old
@ratelimit(ip=True, field='username', rate='5/m')

# new
@ratelimit(key='ip', rate='10/m')
@ratelimit(key='field:username', rate='5/m')

You'll need two decorators to provide the same (one-IP/many-users and many-IPs/one-user) protections, but you get more control. If you expect users to be behind NAT, you can allow a higher single-IP rate, while still preventing dictionary attacks against a username.

New stuff

This opens up some cool stuff. Stacking now works as intended/expected, like in the burst rate examples above. But there's more that could happen in subsequent versions:

Callable rates

A pretty trivial use case for callable rates is customizing them by user or user type.

def get_rate_limit(group, request):
    if request.user.is_authenticated():
        return '1000/h'
    return '100/h'

A cool thing to build in would be per-user limiting, e.g.:

from django.conf import settings
from django.db import models


class UserRateLimit(models.Model):
    group = models.CharField(max_length=255, db_index=True)
    user = models.ForeignKey(settings.AUTH_USER_MODEL, null=True)  # 'null' for default
    rate = models.CharField(max_length=32)

    @classmethod
    def get_for_user(cls, group, request):
        user = request.user
        if not user.is_authenticated():
            rate = cls.objects.get(group=group, user=None)
            return rate.rate
        try:
            rate = cls.objects.get(group=group, user=user)
            return rate.rate
        except cls.DoesNotExist:
            default = cls.objects.get(group=group, user=None)
            return default.rate

(Or something, maybe better supporting anonymous users with user=0 or similar. And of course with caching.)

Definitions in settings

Once group is a thing, it becomes easy to do something like this:

# settings.py
RATE_LIMIT_CONFIG = {
    '*': {  # global default
        'key': 'user_or_ip',
        'rate': '10/m',
    },
    'some_group': {
        'key': 'ip',
        'rate': '100/m',
    },
}

The decorator (or helper methods) could pull these settings unless they're specifically overridden by the invocation.

That makes it less error-prone to use a helper like is_ratelimited and also to change things (like temporarily removing limits) in settings without messing with actual source modules.
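A rough sketch of the lookup the decorator or helper could do. The setting name comes from the proposal above; the merge logic just illustrates the stated precedence (call site over group config over the global '*' default):

from django.conf import settings


def resolve_ratelimit_config(group, **call_site_overrides):
    config = dict(settings.RATE_LIMIT_CONFIG.get('*', {}))
    config.update(settings.RATE_LIMIT_CONFIG.get(group, {}))
    # Anything passed explicitly at the call site wins.
    config.update({k: v for k, v in call_site_overrides.items()
                   if v is not None})
    return config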

Intermittent CI fail with ratelimit

We're using ratelimit for our login page:

@ratelimit(key='ip', rate=settings.LOGIN_LIMIT)
@ratelimit(key='post:username', rate=settings.LOGIN_LIMIT)
def login(request):
    ...

(in settings LOGIN_LIMIT = '4/m')

And we test that the rate limit is working in tests:

    @override_settings(RATELIMIT_ENABLE=True)
    def test_login_form_captcha(self):
        """
        Test repeated login attempts yield a captcha.
        """
        # check that captcha is not here initially
        response = self.client.get(self.login_url)
        self.assertEqual(response.status_code, 200)
        self.assertNotIn('captcha', response.content)
        # try to login many times
        wrong_credentials = {'username': '[email protected]', 'password': 'wrong'}
        for _ in range(3):
            response = self.client.post(self.login_url, wrong_credentials)
            self.assertEqual(response.status_code, 200)
            self.assertFalse(response.context['login_form'].is_valid())
            self.assertNotContains(response, 'captcha')
        # one time more and captcha appears
        response = self.client.post(self.login_url, wrong_credentials)
        self.assertEqual(response.status_code, 200)
        self.assertFalse(response.context['login_form'].is_valid())
        self.assertContains(response, 'captcha')

This always works locally and works about 95% of the time on CI, but occasionally fails with

AssertionError: Couldn't find 'captcha' in response

I'm aware there's a good chance this could be a weird error unique to our system, but I thought it was worth asking whether there's any pointer as to what could be causing ratelimit not to kick in.

Question on example in docs

I'm looking at the usage examples in the docs.

The one below says "if the same username OR IP is used ...". Shouldn't this be just the username (no IP address)?

@ratelimit(key='post:username', rate='5/m', method=['GET', 'POST'])
def login(request):
    # If the same username OR IP is used >5 times/min, this will be True.
    # The `username` value will come from GET or POST, determined by the
    # request method.
    was_limited = getattr(request, 'limited', False)
    return HttpResponse()

ip ratelimiting when using a proxy like cloudflare

If you are using the 'ip' ratelimiting key behind a proxy like Cloudflare, REMOTE_ADDR will always be the same IP address (the proxy's), which could be disastrous.

A simple Cloudflare-specific solution would be for users of this library to create their own callable for the key, like this:

def get_client_ip(request):
    return request.META.get('HTTP_CF_CONNECTING_IP') or request.META['REMOTE_ADDR']


@ratelimit(key=get_client_ip, rate='10/m')
def dummy_view(request):
    # view code in here
    ...

However, I wonder if there's a more general solution for using X-FORWARDED-FOR or if that's too easily spoofed? A change in utils.py like:

def _ip(request):
    return (request.META['HTTP_X_FORWARDED_FOR'].split(',')[-1]
            if request.META.get('HTTP_X_FORWARDED_FOR') else request.META['REMOTE_ADDR'])

def user_or_ip(request):
    return str(request.user.pk) if request.user.is_authenticated() else _ip(request)

_SIMPLE_KEYS = {
    'ip': _ip,
    'user': lambda r: str(r.user.pk),
    'user_or_ip': user_or_ip,
}

Does anyone know what risks there could be in using x-forwarded-for?

Database error

After installing django-ratelimit I'm getting this error:

sql 
u'SELECT cache_key, value, expires FROM "cache_django" WHERE cache_key = \':1:rld740ef2ea296561a2c63c04605639082\''
...
query   
u'SELECT cache_key, value, expires FROM "cache_django" WHERE cache_key = %s'
self    
<django.db.backends.postgresql_psycopg2.base.CursorWrapper object at 0x1112ef550>
args    
[u':1:rld740ef2ea296561a2c63c04605639082']

InternalError('current transaction is aborted, commands ignored until end of transaction block\n',)

Installed via:

pip install django-ratelimit

Using Django 1.5 with settings.py:

CACHES = {
    'default': {
        'BACKEND': 'django.core.cache.backends.db.DatabaseCache',
        'LOCATION': 'cache_django', # create this db table with python manage.py createcachetable cache_django
    }
}

The table 'cache_django' exists in the DB. If I type the following in a Postgres client (note: without escaping the cache_key single quotes):

SELECT cache_key, value, expires FROM "cache_django" WHERE cache_key = '1:rl:d740ef2ea296561a2c63c04605639082'

I don't get any error, but if I type (escaping the single quotes):

SELECT cache_key, value, expires FROM "cache_django" WHERE cache_key = \':1:rld740ef2ea296561a2c63c04605639082\'

I get a syntax error

For some field data, Memcache backend raises MemcachedKeyCharacterError: Control characters not allowed

Pull request to follow.

The stack trace ends with this:
File "/var/www/sowink/so_wink/vendor/src/django-ratelimit/ratelimit/decorators.py" in _wrapped

  1.             _backend.count(request, ip, field, period)
    
    File "/var/www/sowink/so_wink/vendor/src/django-ratelimit/ratelimit/backends/cachebe.py" in count
  2.     counters.update(cache.get_many(counters.keys()))
    
    File "/var/www/sowink/so_wink/vendor/src/django/django/core/cache/backends/memcached.py" in get_many
  3.     ret = self._cache.get_multi(new_keys)
    
    File "/var/www/sowink/so_wink/vendor/packages/python-memcached/memcache.py" in get_multi
  4.     server_keys, prefixed_to_orig_key = self._map_and_prefix_keys(keys, key_prefix)
    
    File "/var/www/sowink/so_wink/vendor/packages/python-memcached/memcache.py" in _map_and_prefix_keys
  5.         self.check_key(str_orig_key, key_extra_len=key_extra_len)
    
    File "/var/www/sowink/so_wink/vendor/packages/python-memcached/memcache.py" in check_key
  6.                         "Control characters not allowed")
    

Exception Type: MemcachedKeyCharacterError at /messages/new
Exception Value: Control characters not allowed

django-ratelimit does not check the limit when calling is_ratelimited(increment=False)

There is a bug in the is_ratelimited function. When it is called just to check whether a view/request should be blocked based on previous requests, this can be done with:

limited = is_ratelimited(request, increment=False)

However, this doesn't seem to work.
Looking at the implementation of is_ratelimited, it becomes apparent that in this case the limited variable will always be False:

Function implementation:

def is_ratelimited(request, increment=False, ip=True, method=['POST'],
                   field=None, rate='5/m', keys=None):
    count, period = _split_rate(rate)
    cache = getattr(settings, 'RATELIMIT_USE_CACHE', 'default')
    cache = get_cache(cache)

    request.limited = getattr(request, 'limited', False)
    if (not request.limited and increment and RATELIMIT_ENABLE and
            _method_match(request, method)):
        _keys = _get_keys(request, ip, field, keys)
        counts = _incr(cache, _keys, period)
        if any([c > count for c in counts.values()]):
            request.limited = True
    return request.limited

Flow:
The request object doesn't (initially) have a limited attribute, so the getattr call always yields False. Because increment is False, the whole if block is skipped, request.limited is never updated, and the return value is always False.
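A hedged sketch of the kind of fix implied here: when increment is False, still read the existing counters and compare them against the limit instead of skipping the check. The _get helper is hypothetical (a read-only counterpart to _incr); the other names mirror the snippet above.

def is_ratelimited(request, increment=False, ip=True, method=['POST'],
                   field=None, rate='5/m', keys=None):
    count, period = _split_rate(rate)
    cache = get_cache(getattr(settings, 'RATELIMIT_USE_CACHE', 'default'))

    request.limited = getattr(request, 'limited', False)
    if (not request.limited and RATELIMIT_ENABLE and
            _method_match(request, method)):
        _keys = _get_keys(request, ip, field, keys)
        if increment:
            counts = _incr(cache, _keys, period)
        else:
            counts = _get(cache, _keys)  # hypothetical read-only helper
        if any(c > count for c in counts.values()):
            request.limited = True
    return request.limited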
