Comments (16)
It's not clear in your description why the second channel is erroneously closing. If it closes for a non-normal reason, you will necessarily receive error logs about the reason. Do you have any relevant logs for BChannel? We link to the tracker because our presence is necessarily tied to the tracker. If it goes down, we should also fails and when we recover and the client reconnects, the presence will be re-tracked.
from phoenix_pubsub.
The channels are closing because the Application is being stopped. The issue is that they're closing with different reasons - AChannel is exiting with reason :shutdown, whereas BChannel exits with reason :killed.
I believe this is because (please correct me if I'm wrong), when you create a channel using this guide, by default it isn't part of a supervision tree. So when you stop the application, the processes are just killed. However, when you link the channel process to a presence process, the presence process is part of a supervision tree. So when the supervisor stops the presence process with reason :shutdown, that causes the channel process to also get stopped with reason :shutdown.
By itself this isn't the worst thing, but it causes unintuitive behavior where tracking presence in a channel will cause the client not to reconnect if the Application is restarted.
The way I repro-ed this was:
- Create a client that connects to both AChannel and BChannel
- Run the server with
iex -S mix
- Set up traces on the AChannel and BChannel processes (I just used the observer)
- In iex, run
Application.stop(:myapp)
- The trace logs will show
(This is AChannel's process - ie. linked to a presence tracker)
13:45:07:849561 (<0.747.0>) in_exiting 0
13:45:07:849589 (<0.747.0>) exit shutdown
13:45:07:849603 (<0.747.0>) out_exited 0
(This is BChannel's process - no presence)
13:45:12:854318 (<0.744.0>) in_exiting 0
13:45:12:854329 (<0.744.0>) exit killed
13:45:12:854332 (<0.744.0>) out_exited 0
from phoenix_pubsub.
Thanks for following up, that has been very well detailed.
If we are going to fix this, we will likely do it by making BChannel's exist with shutdown reason too. Therefore, could you let us know why is your application being restarted?
from phoenix_pubsub.
For deployment I build releases using distillery. We're not doing hot upgrades, so to deploy I just untar the new release and run:
bin/myapp restart
This cleanly stops the application and then starts it with the new code. My interim solution to this problem is to just kill
the beam process and then run bin/myapp start
Incidentally - the behavior I'd prefer is for AChannel to exit with :killed :) That way after the restart the client will reconnect to the channel. I can see why that doesn't make sense in a clean shutdown, I just wonder how many people already rely on that reconnect behavior and would get thrown off by it not reconnecting.
Perhaps an alternate solution is to have the Presence process monitor the channel process instead of linking. Then if the channel process dies, the presence process can also kill itself, but if the presence process does the supervisor can restart it and it can monitor the channel process again?
from phoenix_pubsub.
Maybe the answer is to force a client reconnect unless the channel exits with normal reason.
from phoenix_pubsub.
I put together what I think is the least invasive approach that makes sense. You can see the change here: hanrelan/phoenix@2a3b38e
Essentially I treat {:shutdown, :closed}
and {:shutdown, :left}
as special cases which don't send phx_error
, whereas all other :shutdown
types do. I think this makes sense because you want those shutdowns to propagate over links (eg. to end the presence process), so changing those to :normal
would break in other places. All tests still pass with this change.
Happy to submit this as a PR in a topic branch if you think this makes sense.
from phoenix_pubsub.
Maybe the answer is to force a client reconnect unless the channel exits with normal reason.
This should already happen today. If the channel exits ungracefully, the client will auto rejoin with expo back off. @hanrelan are you using the phoenix.js channels client?
from phoenix_pubsub.
I am using the phoenix.js channels client
I think Jose meant :normal
in the sense of the signal. What's happening is that this code:
https://github.com/phoenixframework/phoenix/blob/master/lib/phoenix/socket/transport.ex#L288
causes the server to send a phx_close
event to the client when the process receives the :shutdown
command. Once the client receive phx_close, it doesn't try to reconnect.
However, :shutdown
is used by supervisors to stop processes, so in this case during an application restart it's stopping the Presence process which is propagating the :shutdown
to the channel process. This results in the client not reconnecting if the server is restarted (in the sense of a release restart).
The diff I sent before changes it so only explicitly closing the channel (via the client's leave()
function or the server's close
) is considered a clean shutdown. All other shutdowns - including one enforced by a supervisor or via a link - is considered unclean and the client will try to reconnect.
This keeps the behavior between processes that are linked to a presence and processes that aren't linked to a presence the same with respect to shutdowns and reconnect attemps.
Happy to discuss on Slack as well if that's easier, I'm @hanrelan in the phoenix channel
from phoenix_pubsub.
Ah I see. We could change the behavior of the phx_close event to differentiate :normal
and :shutdown
, but I'm unsure what makes the most sense. For example, in your channel code today, you could do:
{:stop, {:shutdown, :normal}, socket}
What should happen in that case? We won't be able to determine the difference between this graceful exit and a "graceful restart" where the client should re-establish a connection.
from phoenix_pubsub.
@chrismccord I think we should consider it the same as a link. Which means that we will break on everything that is not :normal
. I will push the changes to Phoenix master today. :)
from phoenix_pubsub.
So I attempted a fix at this and we use {:shutdown, ...}
for clean shutdown in many cases. I am not sure what is the alternative. We could special case :shutdown
but that may be confusing.
from phoenix_pubsub.
yes, that is my concern that {:shutdown, :normal}
is a clean shutdown, which we would need to propagate to the client as a phx_error, which is wrong. A pre app shutdown task might be the way to go, that forcefully kills transport pids, shuts down the transport, then issues the app shutdown. That way restarts would leave clients in rejoin mode? I see no other way around it.
from phoenix_pubsub.
Another thought, but obviously you guys are much more familiar with this than I am so feel free to ignore if this makes no sense.
What if instead of linking to the channel process, Phoenix.Tracker
just set up a monitor of the channel process instead. Then if the channel process dies, the Tracker can just stop itself.
The other case to deal with is if the Tracker process errors out for some reason, you don't want presence to silently stop working for a channel process. My understanding is a fuzzier here, but maybe this could be resolved by using a transient
restart strategy in the presence process' supervisor so it gets restarted if it errors?
from phoenix_pubsub.
@hanrelan the tracker trap exits exactly because we don't want the tracker to crash if a "client" crashes. But we need to link to solve your last paragraph otherwise there is no way for the "presence client" to be brought down if the presence server crashes.
The client could monitor and raise but links are more appropriate here. I believe @chrismccord is working on a solution though. :)
from phoenix_pubsub.
Ah got it I misunderstood the purpose of the link. Thanks for the explanation and the quick response to this issue @chrismccord and @josevalim!
from phoenix_pubsub.
This has been fixed on phoenix master and will ship with the 1.3 release. Check the PR commit details for details but tldr; is your app restarts will be seen as errors by the client and they will reconnect. Cheers!
from phoenix_pubsub.
Related Issues (20)
- Optimize pool_size of 0 for PG2 to allow publisher only operation
- Latest Version of Phoenix.PubSub in Hex HOT 1
- broadcast_from!/4 is undefined or private HOT 1
- Network traffic up for 20 minutes after a server restart
- Release v1.1.3 suppressing warnings HOT 1
- Where is version 2.0? HOT 1
- It may be helpful for users, if parallel option can be provided when starting Registry. Without it partitions in registry executes sequentially. HOT 1
- unknown registry: XXX.PubSub HOT 4
- RFC: Provide a way to mass update presences HOT 1
- topic forced to be string HOT 1
- Race condition causing data inconsistency when nodes are coming up
- Does Phoenix.PubSub itself support subscribing to wildcard topics? HOT 1
- PubSub registry fails to start or race condition when testing? HOT 3
- [Feature Request] Be able to transform a module into a Pub.Sub HOT 6
- Unsubscribe all HOT 2
- v2.1 and v2.0 are incompatible
- Odd Jason Encoding error after new release HOT 2
- Broken Build Status link
- Presence list keep growing when using Presence.update HOT 3
- Presence stops working after ~1 week HOT 5
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from phoenix_pubsub.