Comments (35)
I tried to investigate this and in my tests I found an interesting hint: changing the "serializer" changes the memory footprint by a lot.
Since both RedisChannelLayer and RedisPubSubChannelLayer use msgpack, I tried overriding the serialization in the first one with the standard-library json module, and I got a very different memory profile:
```python
import json
import random

from channels_redis.core import RedisChannelLayer as _RedisChannelLayer


class RedisChannelLayer(_RedisChannelLayer):
    ### Serialization ###

    def serialize(self, message):
        """
        Serializes message to a byte string.
        """
        message = json.dumps(message).encode("utf-8")
        if self.crypter:
            message = self.crypter.encrypt(message)
        # As we use a sorted set to expire messages we need to
        # guarantee uniqueness, with 12 random bytes.
        random_prefix = random.getrandbits(8 * 12).to_bytes(12, "big")
        return random_prefix + message

    def deserialize(self, message):
        """
        Deserializes from a byte string.
        """
        # Remove the random prefix
        message = message[12:]
        if self.crypter:
            message = self.crypter.decrypt(message, self.expiry + 10)
        return json.loads(message.decode("utf-8"))
```
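For anyone who wants to check the prefix handling in isolation, the same steps round-trip cleanly with the encryption branch left out. This is a standalone sketch of the logic, not the channels API:

```python
import json
import random

def serialize(message):
    # JSON-encode, then prepend 12 random bytes so entries in the
    # sorted set used for expiry stay unique (same trick as above).
    payload = json.dumps(message).encode("utf-8")
    prefix = random.getrandbits(8 * 12).to_bytes(12, "big")
    return prefix + payload

def deserialize(data):
    # Strip the 12-byte random prefix before decoding.
    return json.loads(data[12:].decode("utf-8"))

message = {"type": "chat.message", "text": "hello"}
assert deserialize(serialize(message)) == message
```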
As you can see, memory in the JSON test returns to a "normal" level (there is still some memory that was not released, but much less than with msgpack).
I tested this on Python 3.10 inside an Alpine Docker container.
It also seems there is a problem with msgpack and Python 3.12: msgpack/msgpack-python#612
I would like to have more time to perform further tests and learn how to better use the memory profiler; for the moment I hope this may help someone else find a solution.
from channels.
Hey @mitgr81! No updates yet.
We're using garbage collection on every message or new connection. This has helped a bit, but the memory still slowly increases and hits the max in about a week. We usually deploy and restart automatically the machines 2-3 times a week, which temporarily fixes the issue.
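For reference, collecting on every message can be throttled so it doesn't dominate CPU time. A stdlib-only sketch of that mitigation (ThrottledCollector is a made-up helper name, not part of channels):

```python
import gc
import time

class ThrottledCollector:
    """Run gc.collect() at most once per `interval` seconds, so
    collecting on every message doesn't dominate CPU time."""

    def __init__(self, interval=30.0):
        self.interval = interval
        self._last = None  # monotonic timestamp of the last collection

    def maybe_collect(self):
        now = time.monotonic()
        if self._last is None or now - self._last >= self.interval:
            self._last = now
            # gc.collect() returns the number of unreachable objects found
            return gc.collect()
        return None

collector = ThrottledCollector(interval=30.0)
# call collector.maybe_collect() from the message / connect handlers
```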
I hope the Channels team can look into this to see if it's a general problem with memory leaks. cc @carltongibson
Hey @bigfootjon!
Running this repository: https://github.com/cacosandon/django-channels-memory-leak, you'll notice the memory leaks.
If you remove the sending of large messages, then the problem disappears unless you open/close connections fast enough to make the memory go up again. It's literally the basic setup of Django Channels, so I don't know what else I should remove.
I think the next step is to go deeper into the Django Channels source code and start modifying things there. I don't have much time for that right now, so we've mitigated the issue by monitoring and restarting our servers (for now).
Cross-linking #1948, which is a long-standing known memory leak in channels.
Sure! I'll try to find time today to prepare a report from `memray --leaks` for each protocol server, and to test the PubSub layer. I'll get back to you soon, thanks!
So I tried multiple combinations. All the HTML reports from memray are here:
reports.zip
Below are screenshots from them.
First, I tried with the Redis channel layer (not PubSub) to look for memory leaks.

With uvicorn:

```shell
PYTHONMALLOC=malloc memray run --force -o output.bin -m uvicorn core.asgi:application
memray flamegraph output.bin --force --leaks
```

So, the leaks report includes memory that was never released back, but I don't know how to interpret it correctly. It seemed like AuthMiddleware was leaking, but after removing it the results are almost the same.
redis-channels-uvicorn-leaks.html
Here is the screenshot of the uvicorn leaks without AuthMiddleware:
redis-channels-uvicorn-without-authmiddleware-leaks.html
Then I tried with daphne:

```shell
PYTHONMALLOC=malloc memray run --force -o output.bin -m daphne core.asgi:application
memray flamegraph output.bin --force --leaks
```

redis-channels-daphne-leaks.html
The interesting part is that hypercorn showed no memory leaks (or maybe memray just isn't working here?):

```shell
PYTHONMALLOC=malloc memray run --force -o output.bin -m hypercorn core.asgi:application
memray flamegraph output.bin --force --leaks
```

redis-channels-hypercorn-leaks.html
Then I tried with garbage collection for uvicorn and daphne. Same story for both.

```shell
memray run --force -o output.bin -m uvicorn core.asgi:application
memray flamegraph output.bin --force
```

redis-channels-uvicorn-gccollect.html

```shell
memray run --force -o output.bin -m daphne core.asgi:application
memray flamegraph output.bin --force
```

redis-channels-daphne-gccollect.html
And finally I tried with PubSub for uvicorn and daphne:

```shell
memray run --force -o output.bin -m uvicorn core.asgi:application
memray flamegraph output.bin --force
```

```shell
memray run --force -o output.bin -m daphne core.asgi:application
memray flamegraph output.bin --force
```
Just in case, I also removed all the @profile decorators above functions, so the memory leaks were not affected by the memory-profiler library.
I hope all these reports help in understanding the constant memory increase.
Right now I am trying to move my application to hypercorn so I can test it on staging, but websocket messages arrive empty. If I manage to solve it, I'll post the results here!
Hi @cacosandon, are you using uvicorn or daphne in production? Or hypercorn?
Hey! uvicorn for now.
@carltongibson Already rebased both the channels and channels-redis PRs. These are the updated dependencies:

```
Django==5.0.2
channels @ git+https://github.com/fosterseth/channels.git@clean_channels
channels-redis @ git+https://github.com/fosterseth/channels_redis.git@clean_channels
uvicorn[standard]==0.20.0
memory-profiler==0.61.0
memray==1.12.0
```
and these are the results for Uvicorn with PubSub:
It seems the problem persists. I believe @sevdog's investigation around the serializer is likely the root cause, given how generic the problem is: it happens with or without PubSub, and regardless of uvicorn, daphne, or hypercorn, even with a minimal example.
I can test with other settings later. Let me know!
@cacosandon OK, thanks for trying it.
As to root cause, I still need to get a minimal reproduce nailed down here, but yes maybe...
We're getting closer to it, I suppose.
Does the same thing happen with other protocol servers, such as hypercorn and Daphne?
I've tested daphne and hypercorn alongside uvicorn. All three show a similar pattern of memory usage, increasing steadily up to around 160 MiB. Even past that, they continue to consume more memory indefinitely, as monitored by memory-profiler.
The interesting thing is that while uvicorn shows a continuous rise in memory usage on the memray graph, the graphs for daphne and hypercorn are flat at 80 MiB. This discrepancy makes it unclear which tool provides more reliable data.
Here are the commands I used for each:
- Uvicorn: `memray run --force -o output.bin -m uvicorn core.asgi:application`
- Daphne: `python -m memray run -o output.bin --force ./manage.py runserver`
- Hypercorn: `memray run --force -o output.bin -m hypercorn core.asgi:application`
And can you see from any of the tools, memray perhaps, which objects are consuming the memory?
(I'd expect a `gc.collect()` to help here, TBH.)
@cacosandon Also, can you try with the PubSub layer, and see if the results are different there? Thanks.
I've managed to make hypercorn work!
For some reason, websocket messages that were bytes-only were being sent as `{"text": None, "bytes": ...}` under hypercorn only, so AsyncWebsocketConsumer always called the text handler.
Added a PR for that: #2097
```diff
 async def websocket_receive(self, message):
     """
     Called when a WebSocket frame is received. Decodes it and passes it
     to receive().
     """
-    if "text" in message:
+    if "text" in message and message["text"] is not None:
         await self.receive(text_data=message["text"])
     else:
         await self.receive(bytes_data=message["bytes"])
```
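For illustration, the corrected guard can be exercised outside channels with plain callables standing in for the consumer's handlers (`dispatch` and the lambdas here are hypothetical, mirroring the patched logic):

```python
def dispatch(message, on_text, on_bytes):
    # A frame dict may carry both keys with the unused one set to None
    # (hypercorn does this), so "text" being present isn't enough:
    # only a non-None "text" means a text frame.
    if "text" in message and message["text"] is not None:
        return on_text(message["text"])
    return on_bytes(message["bytes"])

# hypercorn-style bytes frame: "text" key present but None
result = dispatch(
    {"text": None, "bytes": b"\x01\x02"},
    on_text=lambda text: ("text", text),
    on_bytes=lambda data: ("bytes", data),
)
assert result == ("bytes", b"\x01\x02")
```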
Testing now in staging!
There is still a memory leak in my application with hypercorn.
It seems that memray just doesn't work with it, because memory-profiler does show a constant, non-stop increase with every protocol server.
Hi @cacosandon,
Looking at the uploaded reports, e.g. redis-pubsub-daphe, the memory usage rises and then stabilises.
The redis-channels-uvicorn-leaks report peaks at 168 MB, then falls to 151 MB.
Hey @carltongibson, thank you for taking a look.
Yep, but if you zoom in on redis-pubsub-daphe (click on the graph), the memory increase just decelerates. I think the first rise is just normal memory usage, and after that you see the leak.
On the other hand, redis-channels-uvicorn-leaks experiences memory drops at intervals due to the PYTHONMALLOC=malloc flag; however, the overall memory usage continues to increase. If you examine each drop, you'll notice that the memory level after each fall is higher than before, without stopping.
@carltongibson, do you have any clue about what's happening? Or what else can I try? I'm willing to try anything!
@cacosandon Given that you report it happening with the pub sub layer and different servers, not really. You need to identify where the leak is happening. Then it's possible to say something.
@carltongibson all my samples are from using RedisChannelLayer or RedisPubSubChannelLayer, with uvicorn, daphne, or hypercorn, on the tutorial example. My app has the problem too, but I think it's a generalized problem.
Some things I've noticed:
- Memory increases constantly when there are large messages (>0.5 MiB)
- Memory increases constantly when there are multiple connects/disconnects (every handshake adds memory)
- The memory leak is not present using InMemoryChannelLayer
- Using explicit `del` and `gc.collect()` decelerates the increase of memory, but the leak is still present
- Creating large objects in Django views does not leak memory (every request kind of cleans up)

I don't know how nobody else is having this problem. Maybe they just don't send large messages.
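One more stdlib angle that may help localize the growth: comparing tracemalloc snapshots taken a few thousand messages apart, grouped by source line. This sketch uses a deliberately leaky list as a synthetic stand-in, just to show the workflow, not any channels internals:

```python
import tracemalloc

tracemalloc.start(10)  # keep up to 10 frames of traceback per allocation

retained = []  # synthetic stand-in for whatever is holding message memory

def handle_message(payload):
    retained.append(payload * 2)  # deliberately keeps a reference (the "leak")

before = tracemalloc.take_snapshot()
for _ in range(1000):
    handle_message(b"x" * 1024)
after = tracemalloc.take_snapshot()

# Allocations that survived between the snapshots, grouped by source line;
# a real leak shows the same line growing across successive comparisons.
for stat in after.compare_to(before, "lineno")[:3]:
    print(stat)
```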
Hi @cacosandon, are you able to identify where the leak is happening? Alas, I haven't had time to dig into this further for you. Without that it's difficult to say too much.
If you can identify a more concrete issue, there's a good chance we can resolve it.
@carltongibson no :( that's actually the thing I'm struggling with: finding the memory leak.
I really tried every tool to detect it, but nothing noticeable or strange shows up in the reports.
> I don't know how nobody else is having this problem. Maybe they just don't send large messages

I wouldn't assume that. I've been silently watching and hoping you find more than I did when I looked. We had some success changing servers from daphne to uvicorn. We're still seeing some leakiness, but have resolved to using tools to monitor memory and restart services.
Here are some other things I've watched:
- ansible/awx#7720
- #1181
- https://github.com/yuriymironov96/django-channels-leak (from django/daphne#373)
@mitgr81 what tools do you use to monitor and restart? For now I would love to implement that.
Will take a look at those resources!
> @mitgr81 what tools do you use to monitor and restart? For now I would love to implement that.
We're rocking a bespoke monitor for docker containers. It's pretty simple; essentially we label each container with a valid restart time and a memory limit (among other rules); and the "container keeper" looks for them.
@cacosandon - Just curious if you've had any more luck than I have on this.
We're running into the same issue. A Daphne process used up over 50 GB of RAM on our server before it crashed.
@cacosandon: What are the variables you haven't changed? It sounds like you've swapped everything out (including your application's business logic) and the problem still exists, which is troubling.
Have you tried simplifying the code down until the problem doesn't exist? AIUI from this thread the channel-layer concept seems to be the cause, but have you tried to stub out the channel-layer code in various ways to see where the problem originates? (if it's not the channel-layer, then the same principle applies: just keep axing code until you've got the simplest program possible that still repros the problem)
(investing in a test harness that artificially generates problematic conditions might aid in discovering the problem by speeding up the testing cycle, if you haven't already done so)
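A harness along those lines could be as small as this asyncio sketch. Everything named here (`churn`, `StubConsumer`) is hypothetical scaffolding to be swapped for the real consumer under test, not channels API:

```python
import asyncio

async def churn(app, cycles=100, payload=b"x" * (512 * 1024)):
    """Hammer synthetic connect / large-message / disconnect cycles.
    `app` is anything with async connect/receive/disconnect methods."""
    for _ in range(cycles):
        await app.connect()
        await app.receive(payload)
        await app.disconnect()

class StubConsumer:
    # Minimal stand-in; swap in the real consumer (or channel layer) under test
    # and watch process RSS while the loop runs.
    def __init__(self):
        self.handled = 0

    async def connect(self):
        pass

    async def receive(self, data):
        self.handled += 1

    async def disconnect(self):
        pass

stub = StubConsumer()
asyncio.run(churn(stub, cycles=50))
print(stub.handled)  # 50
```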
If you find some time to investigate, I think removing code from channels is the right approach. If the memory charts aren't doing it, then the opposite (finding ways NOT to allocate memory) is the only path forward.
Thanks @acu192. You're absolutely right. There's an unresolved chain of thought, and a likely fix, sat there (for life reasons on my part, I suppose).
@cacosandon if you could test the linked PRs and feedback, that would help greatly.
Hey!
Yes, I would love to help. I tried them, but they raised some errors, surely because they're outdated and need a rebase. Will ping him in the PR!
@cacosandon do note there are two related PRs: one for channels and one for channels-redis. You'll need to apply them both.