Comments (11)

yangminzhu commented on July 20, 2024

cc @lambdai

from envoy.

lambdai commented on July 20, 2024

I also want to add that L4 RBAC is likely to work with a TLS connection, which is CPU intensive (1 ms of CPU time).
The delayed deny, as a starting point for back-pressure propagation, could greatly reduce CPU usage at both Envoy and Envoy's downstream.

kyessenov commented on July 20, 2024

Wouldn't it be better to apply pressure earlier, e.g. by not reading bytes and starting TLS handshakes when there's a flood of connections? A delayed deny would mean Envoy has to maintain the memory structures for the connection when we'd want to shed them quickly.

yangminzhu commented on July 20, 2024

Wouldn't it be better to apply pressure earlier, e.g. by not reading bytes and starting TLS handshakes when there's a flood of connections? A delayed deny would mean Envoy has to maintain the memory structures for the connection when we'd want to shed them quickly.

@kyessenov I think we will need both. The delayed deny in RBAC is specific to connections that are closed due to a permission error, and it will be more effective at reducing CPU usage on Envoy in some situations. For example, some gRPC clients retry in a busy for-loop when the connection is closed by RBAC, which creates a significant number of new connections on Envoy (e.g. 400 per second per client) and drives up CPU usage. We are not really worried about memory, as it doesn't look to be an issue in either case.

A delayed deny will naturally slow down how fast the client retries, which significantly reduces CPU usage because far fewer connections are created and closed at the same time.
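
The effect on retry rate can be sketched with some rough arithmetic. This is only an illustration under stated assumptions: the client retries sequentially (one connection at a time), the 2.5 ms cycle time is back-derived from the 400 connections per second figure above, and the 1 second delay is a hypothetical value, not a proposed default. The function name is made up for this example.

```python
def max_retry_rate(cycle_seconds: float, deny_delay_seconds: float) -> float:
    """Upper bound on new connections per second from one sequentially retrying client.

    cycle_seconds: assumed time for one connect + handshake + deny + close round.
    deny_delay_seconds: hypothetical time the server holds the connection before closing.
    """
    total = cycle_seconds + deny_delay_seconds
    if total <= 0:
        raise ValueError("total cycle time must be positive")
    return 1.0 / total


# Immediate close: roughly 400 new connections per second per client.
baseline = max_retry_rate(0.0025, 0.0)

# Hypothetical 1 s delayed deny: just under 1 new connection per second per client.
delayed = max_retry_rate(0.0025, 1.0)

print(f"immediate close: {baseline:.1f} conn/s, delayed deny: {delayed:.2f} conn/s")
```

Since each new connection may also involve a TLS handshake, the handshake CPU cost would shrink roughly in proportion to the connection rate under these assumptions.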

kyessenov commented on July 20, 2024

SG, although this principle of delayed close should probably be applied uniformly: TLS handshake failures, protocol errors, etc. all fall in the same error domain. It might be better to handle it at the listener or the HCM level.

CC @yanavlasov

wufanqqfsc commented on July 20, 2024

"When the RBAC policy evaluation result is DENY. The RBAC network filter will close the TCP connection immediately. This doesn't handle very well for some clients that just retry with a new connection at a very high rate, and that could overload the Envoy proxy to high CPU usage."

1. What is the real reason for the high CPU usage on the Envoy proxy? Is it just handling the DENY for some clients and closing the TCP connection? As you said, some clients don't handle this well and just retry with a new connection. So can we consider this a client issue?

2. From Envoy's point of view, maybe a protection policy or connection limit should be applied to the same client, rather than only adding a connection close delay.

lambdai commented on July 20, 2024

@wufanqqfsc The TLS handshake is CPU intensive for both client and server. Imagine you have a service behind Envoy with huge fan-in.

It is a client issue, but you don't necessarily have full control of the clients.

wufanqqfsc commented on July 20, 2024

@wufanqqfsc The TLS handshake is CPU intensive for both client and server. Imagine you have a service behind Envoy with huge fan-in.

It is a client issue, but you don't necessarily have full control of the clients.

Yes, so I mean that maybe a connection limit policy could be applied per client, to stop the same client from repeatedly retrying connections that Envoy then closes. For example, we could limit the connection frequency for a client if RBAC has failed for it during some time window. Anyway, adding a delayed deny may help, but it also costs memory to keep the connection context.

yangminzhu commented on July 20, 2024

Yes, so I mean that maybe a connection limit policy could be applied per client, to stop the same client from repeatedly retrying connections that Envoy then closes. For example, we could limit the connection frequency for a client if RBAC has failed for it during some time window. Anyway, adding a delayed deny may help, but it also costs memory to keep the connection context.

Yeah, we can definitely enable different protections at multiple layers and places depending on the actual situation; it's not a one-size-fits-all solution. The memory cost should be very small and the benefit (as compared to not having the delayed deny) is well worth it.

github-actions commented on July 20, 2024

This issue has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed in the next 7 days unless it is tagged "help wanted" or "no stalebot" or other activity occurs. Thank you for your contributions.

github-actions commented on July 20, 2024

This issue has been automatically closed because it has not had activity in the last 37 days. If this issue is still valid, please ping a maintainer and ask them to label it as "help wanted" or "no stalebot". Thank you for your contributions.
