Comments (15)
I have a beginning of something there with this approach. [1]
According to the first tests, it behaves as expected.
https://github.com/clj-commons/aleph/pull/641/files?diff=unified&w=1
from aleph.
(I will come with more examples and some pinpoints on Netty later today/in the evening)
from aleph.
GRPC is also a likely good inspiration, see https://github.com/grpc/grpc-java/blob/master/netty/src/main/java/io/grpc/netty/NettyServer.java#L344-L371
from aleph.
looks like you guys are well underway
Don't look at me, I stop my servers with Ctrl+C, the way god intended 😁
from aleph.
Oh this is great! The default timeouts are particularly egregious when running integration tests. This actually made me remove the netty/wait-for-close
call at some point. Glad to hear that there is a way to parameterize this.
from aleph.
While those parameters are now configurable, they will work as expected when you are trying to decrease them, not increase them.
Here is an example:
(defn handler [{:keys [body]}]
(let [res (s/reduce (fn [acc _]
(prn acc)
(Thread/sleep 1000)
(inc acc))
0 body)]
{:body (str @res)}))
(def server (http/start-server
#'handler
{:port 9999
:raw-stream? true
:shutdown-timeout 200
:shutdown-quiet-period 100}))
Then upload a huge file
curl localhost:9999 --data-binary @/tmp/100GiB
Wait five seconds
1
2
3
4
5
Stop the server
(.close server)
It's stopped after 5 seconds (instead of 200 you could expect)
6
7
8
9
10
Connection is closed client side
curl: (55) Send failure: Broken pipe
But this is the expected behaviour of Netty according to this comment : netty/netty#3706 (comment)
cc @pyr
Closing the issue as it has been addressed by #639
from aleph.
While those parameters are now configurable, they will work as expected when you are trying to decrease them, not increase them.
Hm when reading your example scenario, it appears that it's rather the other way around, i.e. decreasing isn't possible? 🤔
from aleph.
Oh this is great! The default timeouts are particularly egregious when running integration tests. This actually made me remove the netty/wait-for-close call at some point. Glad to hear that there is a way to parameterize this.
With the PR that just got merged, you will be able to solve your problem by specifying a shutdown-timeout
(which defaults to 15 seconds). This parameter will guarantee your EventLoopGroup
will be stopped (netty/wait-for-close
) after at most X seconds.
This totally works the way you expect.
By default
(def server (http/start-server
#'handler
{:port 9999
:raw-stream? true}))
(do (.close server)
(netty/wait-for-close server))
Even without any traffic, it might take a few seconds for netty/wait-for-close
to return.
With 0 second
(def server (http/start-server
#'handler
{:port 9999
:raw-stream? true
:shutdown-quiet-period 0 ;; shutdown-quiet-period should be less than shutdown-timeout
:shutdown-timeout 0}))
(do (.close server)
(netty/wait-for-close server))
It will be closed instantly.
With 200 second
(def server (http/start-server
#'handler
{:port 9999
:raw-stream? true
:shutdown-quiet-period 200 ;; shutdown-quiet-period should be less than shutdown-timeout
:shutdown-timeout 200}))
(do (.close server)
(netty/wait-for-close server))
This will guarantee the netty/wait-for-close
will return after 200 seconds.
All good!
Now, the issue is with the shutdown-quiet-period
which, according the documentation :
Unlike shutdown(), graceful shutdown ensures that no tasks are submitted for 'the quiet period' (usually a couple seconds) before it shuts itself down. If a task is submitted during the quiet period, it is guaranteed to be accepted and the quiet period will start over.
This is actually confusing... and can be translated in different ways.
One way is what just got merged :
optional period in seconds for which new connections will still be serviced after a scheduled shutdown via
java.io.Closeable#close
. Defaults to 2 seconds.
But according to netty/netty#3706 (comment), this understanding is totally inaccurate.
The EventLoopGroup
will continue to accept tasks! But all the EventExecutor
will be closed as soon as shutdownGracefully
has been called. [1]
@pyr shout on the issue to understand what should be done to support what we could call a graceful shutdown (continue serving current requests) [2]
If we have no answer from Norman, we should probably have a look at the other Netty based HTTP servers on the Java ecosystem (Vertx, Micronaut, Undertow, Akka?)
I will reopen the issue, we should at least revisit the docstring associated to the shutdown-quiet-period
.
[1] : https://github.com/netty/netty/blob/4.1/transport-classes-epoll/src/main/java/io/netty/channel/epoll/EpollEventLoop.java#L420
[2] netty/netty#3706 (comment)
from aleph.
Quickly, here are the pointers for VertX [1] and Micronaut [2].
[1] : https://github.com/eclipse-vertx/vert.x/blob/11ab144725e551be591118831377101015e1392c/src/main/java/io/vertx/core/net/impl/VertxEventLoopGroup.java#L100-L107
[2] : https://github.com/micronaut-projects/micronaut-core/blob/231cd469962c88f7b457b5b09e539b5655e568e5/http-server-netty/src/main/java/io/micronaut/http/server/netty/NettyHttpServer.java#L589-L619
from aleph.
The call to ServerListener::shutdown
in GRPC seems indicate it's the right inspiration: https://github.com/grpc/grpc-java/blob/master/core/src/main/java/io/grpc/internal/ServerListener.java#L33-L37
from aleph.
Anyway, this issue can be closed either whether:
- We change the docstring of
shutdown-quiet-period
to match Netty behaviour - We implement exactly what the docstring mentions
I can't see any reason why aleph users would expect 1. I would really strive for 2.
from aleph.
Just for completeness, there is one situation where it's working the way you would expect it => if you don't leave the EventLoop
.
One obvious way is to pass executor :none
:
(defn handler [{:keys [body]}]
(let [res (s/reduce (fn [acc v]
(prn (.getName (Thread/currentThread)))
(Thread/sleep 1000) ;; blocking the EventLoop !!!
(inc acc)) 0 body)]
(d/chain' res (fn [n]
{:body (str n)}))))
(def server (http/start-server #'handler
{:raw-stream? true
:port 9999
:executor :none}))
(do (.close server)
(netty/wait-for-close server))
Output
"aleph-netty-server-event-pool-2"
"aleph-netty-server-event-pool-2"
"aleph-netty-server-event-pool-2"
"aleph-netty-server-event-pool-2"
"aleph-netty-server-event-pool-2"
"aleph-netty-server-event-pool-2"
"aleph-netty-server-event-pool-2"
"aleph-netty-server-event-pool-2"
"aleph-netty-server-event-pool-2"
"aleph-netty-server-event-pool-2"
"aleph-netty-server-event-pool-2"
"aleph-netty-server-event-pool-2"
The other way is to pass a custom executor which defines :onto? false
(defn handler [{:keys [body]}]
(let [res (s/reduce (fn [acc v]
(prn (.getName (Thread/currentThread)))
(Thread/sleep 1000) ;; blocking the EventLoop !!!
(inc acc)) 0 body)]
(d/chain' res (fn [n]
{:body (str n)}))))
(def executor (flow/utilization-executor 0.9 512
{:onto? false}))
(def server (http/start-server #'handler
{:raw-stream? true
:port 9999
:executor executor}))
(do (.close server)
(netty/wait-for-close server))
Output
"aleph-netty-server-event-pool-2"
"aleph-netty-server-event-pool-2"
"aleph-netty-server-event-pool-2"
"aleph-netty-server-event-pool-2"
"aleph-netty-server-event-pool-2"
"aleph-netty-server-event-pool-2"
"aleph-netty-server-event-pool-2"
"aleph-netty-server-event-pool-2"
"aleph-netty-server-event-pool-2"
"aleph-netty-server-event-pool-2"
"aleph-netty-server-event-pool-2"
"aleph-netty-server-event-pool-2"
from aleph.
Thinking loudly...
Maybe we should track the active Channels
by using a ChannelGroup
[1].
Before calling .shutdownGracefully
, we wait for the ChannelGroup
to be empty as it removes
closed channels automatically.
Simplify shutdown process with ChannelGroup
If both ServerChannels and non-ServerChannels exist in the same ChannelGroup, any requested I/O operations on the group are performed for the ServerChannels first and then for the others.
This rule is very useful when you shut down a server in one shot:
This seems to be exactly what we need.
Someone wrote a guide about how to write a chat with Netty which broadly describes the idea. [2]
WDYT?
[1] : https://netty.io/4.0/api/io/netty/channel/group/ChannelGroup.html
[2] : https://github.com/pyr/net/blob/a2120bf61b6fd480707ea6b217545b3d640288d0/examples/guides/chat.md?plain=1#L32-L34
from aleph.
With the PR that just got merged, you will be able to solve your problem by specifying a
shutdown-timeout
(which defaults to 15 seconds). This parameter will guarantee yourEventLoopGroup
will be stopped (netty/wait-for-close
) after at most X seconds.
Just for the record: I can confirm that setting both new options to 0 works as adverstised 👍 Thanks!
I haven't fully wrapped my head around the remainig issue but looks like you guys are well underway 😄
from aleph.
Addressed by : #642
from aleph.
Related Issues (20)
- Improve wrap-future
- Netty resource leak detector too unreliable? HOT 20
- Missing aleph.utils.PluggableDnsAddressResolverGroup HOT 3
- TCP server: Spurious NPE on bogus TLS connection establishment HOT 1
- TCP server invokes handler when connection closes prematurely during TLS handshake HOT 13
- `aleph.udp-test/test-echo` broken on macs HOT 1
- Look into intermittently failing slow client test HOT 11
- Update examples and docs
- Add CONTRIBUTING.md HOT 2
- Increase TCP SSL Buffer HOT 9
- Wrong number of args (5) passed to: manifold.executor/thread-factory HOT 3
- `test-failed-ssl-handshake` broken on Java 8 HOT 5
- The client ignores content type when using byte arrays in multipart data HOT 3
- Impossible to set custom file name when using memory data in a multipart upload
- "Possibly confusing dependencies found" regression in 0.6.1 HOT 5
- Add support for `wrap-decompression` middleware HOT 9
- Please bump netty versions because of CVE-2023-34462 HOT 1
- Enable endpoint identification (aka hostname verification) for TLS clients by default? HOT 9
- Look into new Ring websocket support for Aleph HOT 2
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from aleph.