
Comments (15)

aeneasr commented on June 18, 2024

Thank you for the report! Can you pinpoint which version introduced this regression? It would make the search for the regression much easier!

from oathkeeper.

Withel commented on June 18, 2024

I'm not sure I understand you correctly. As I wrote in the description, we upgraded from 0.39.4 to 0.40.7. Or do you mean something else?


marcinfigiel commented on June 18, 2024

It's worth mentioning that this is our second attempt at upgrading to 0.40.x. The first time, we tried 0.40.6 and saw the same effect.


aeneasr commented on June 18, 2024

Since there are a couple of versions between 0.39.4 and 0.40.6, I wanted to know whether you can pinpoint exactly which version introduced the issue. That would make it easier to find the root cause.


marcinfigiel commented on June 18, 2024

> Since there are a couple of versions between 39.4 and 40.6 I wanted to know if you specifically can pin point which version exactly introduced the issue, making it easier to find the root cause

Unfortunately not, we've only tried these two versions :(


Withel commented on June 18, 2024

@aeneasr we've managed to pinpoint the exact version that introduced this issue: the regression appears between v0.39.4 and v0.40.0. I hope this helps. Please let me know if you need anything else.


aeneasr commented on June 18, 2024

I think your max_cost is just way too low. I see that we also use a misleading configuration in ristretto. The default max_cost is set to 100000000 which I think is way too high.

Try playing around with that value to see if it has an impact.


aeneasr commented on June 18, 2024

Fix for some of the config values: 2373057


aeneasr commented on June 18, 2024

Basically, before we were using the internal cost, which I think is the key length plus the cost function. Since your max_cost is around 100, the cache probably ran out of space after one or two keys, so it is constantly evicting your values.

The fixes ignore the internal cost, so you actually get 1 token = 1 cost instead of 1 token = 1 cost + cost of key.
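
To make the accounting argument above concrete, here is a minimal toy model, not ristretto's actual implementation. The class name `CostBoundedCache`, the oldest-first eviction, and the per-entry overhead of 56 are all assumptions chosen for illustration; only the max_cost of 100 comes from this thread. It suggests why a small budget combined with a hidden per-entry charge can leave room for a single entry, so every lookup misses.

```python
from collections import OrderedDict


class CostBoundedCache:
    """Toy cost-bounded cache with oldest-first eviction (illustrative only)."""

    def __init__(self, max_cost, per_entry_overhead=0):
        self.max_cost = max_cost
        # Hypothetical stand-in for an "internal cost" added to every entry.
        self.per_entry_overhead = per_entry_overhead
        self.used = 0
        self.entries = OrderedDict()  # key -> (value, charged_cost)

    def set(self, key, value, cost=1):
        charged = cost + self.per_entry_overhead
        if charged > self.max_cost:
            return False  # the entry can never fit
        if key in self.entries:
            self.used -= self.entries.pop(key)[1]
        # Evict the oldest entries until the new one fits in the budget.
        while self.used + charged > self.max_cost:
            _, (_, old_cost) = self.entries.popitem(last=False)
            self.used -= old_cost
        self.entries[key] = (value, charged)
        self.used += charged
        return True

    def get(self, key):
        item = self.entries.get(key)
        return None if item is None else item[0]


def hit_ratio(cache, keys, rounds=3):
    """Replay the same keys several times and return hits / lookups."""
    hits = total = 0
    for _ in range(rounds):
        for key in keys:
            total += 1
            if cache.get(key) is not None:
                hits += 1
            else:
                cache.set(key, "introspection-result")
    return hits / total


tokens = [f"token-{i}" for i in range(50)]
with_overhead = hit_ratio(CostBoundedCache(100, per_entry_overhead=56), tokens)
without_overhead = hit_ratio(CostBoundedCache(100, per_entry_overhead=0), tokens)
print(with_overhead, without_overhead)
```

In this model, charging the overhead means each entry costs 57, so only one of the 50 tokens fits at a time and the hit ratio is zero. With the overhead ignored, all 50 tokens fit in the budget of 100 and every lookup after the first round is a hit.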


Withel commented on June 18, 2024

Soooo, with the help of @Demonsthere, we checked the following values of max_cost:

~15:05 CET: 100M (default) -> 300-400 req/m with initial spike 
~15:35 CET: 50M -> 300-400 req/m
~15:55 CET: 5M -> for a second dropped to ~270 req/m, then increased to ~300-400 req/m

And here's the graph:
image

I hope this helps. For the time being we will go back to 0.39.4 and wait for further updates. Please let us know if you need anything else.


aeneasr commented on June 18, 2024

So what you're saying is that it doesn't have an effect?


Withel commented on June 18, 2024

So, I must admit that I made a mistake, which I realised just after posting my last comment. Unfortunately, I didn't keep an eye on the pods after deployment, and it turned out that they were not restarting after max_cost was changed. As far as I know, oathkeeper does not have configuration hot reload, so I will have to redo those experiments. I'm not sure when I'll be able to do that since I'm out of the office right now, but I'll do it as soon as possible. Apologies for that.


aeneasr commented on June 18, 2024

Hot reloading only works for things that can be changed at runtime. Caches, unfortunately, are large memory objects that are allocated at process start and cannot be changed at runtime.
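
The distinction can be sketched as follows. This is a hypothetical illustration, not Oathkeeper's reload code: the function `apply_reload` and the key names `log.level` and `cache.max_cost` are invented for the example. It shows a reload that applies runtime-changeable settings immediately and only reports settings, such as a cache size, that take effect after a restart.

```python
# Hypothetical set of keys whose values are fixed when the process starts.
RESTART_REQUIRED = {"cache.max_cost"}


def apply_reload(current, updated):
    """Return (effective_config, keys_that_need_a_restart)."""
    effective = dict(current)
    pending = []
    for key, value in updated.items():
        if current.get(key) == value:
            continue  # unchanged, nothing to do
        if key in RESTART_REQUIRED:
            pending.append(key)  # keep the old value until the next start
        else:
            effective[key] = value  # safe to swap while running
    return effective, sorted(pending)


running = {"log.level": "info", "cache.max_cost": 100}
edited = {"log.level": "debug", "cache.max_cost": 5_000_000}
effective, pending = apply_reload(running, edited)
print(effective)
print(pending)
```

Under these assumptions the new log level is applied at once, while the cache keeps its original size and the changed key is listed as pending, which matches the situation where pods that were not restarted kept running with the old cache size.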


Withel commented on June 18, 2024

So, I have redone the tests (this time making sure the pods restarted correctly after each update), and here are the results. The values are the same as before:

Before - 0.39.4 with max_cost set to 100
9:30 CET - 0.40.7 with max_cost set to 100
~9:55 CET - 0.40.7 with max_cost removed (set to 100M, default)
~10:20 CET - 0.40.7 with max_cost set to 50M
~10:50 CET - 0.40.7 with max_cost set to 5M

Results (courtesy of @tricky42):
image

Also, we suspected that the oathkeeper containers might be running out of memory, but we confirmed that this is not the case:
image


aeneasr commented on June 18, 2024

OK so increasing the cache size fixes the problem?

