Comments (9)
In this pcap file you can see the association requests and responses from UPF 10.0.0.11, which previously failed. You can see that the SMF continues sending heartbeats to UPF 10.0.0.12 and tries to re-associate the failed UPF. However, although the association requests are correctly processed by the UPF, the responses are not processed by the SMF, and there is no re-association.
starved_association.pcap.zip
Hint: a re-association attempt starts at packet 2113, for example. I did not change the timeout and retry value for the association, meaning there are 3 attempts with a timeout of 3s each.
I let this process run for 10 minutes and there was no re-association.
from free5gc.
Here is a screenshot of the logs (I have adapted the log messages, so this is not the standard output):
This is also a better pcap file showing just the relevant association messages on the N4 interface. The heartbeat at that time is stopped completely on both UPFs, so the SMF really only sends the association setup messages. No UEs are connected.
You can see that the retry mechanism indeed seems to work: a retry is sent every 3 seconds. However, the SMF does not receive the responses from the transaction's EventChannel.
starved_association_5.pcap.zip
To me, it seems like the transaction channel is closed or read by some other transaction instead.
from free5gc.
I think I just found the bug: it is this operation (the send to the transaction's EventChannel) that blocks indefinitely. No error is raised, however, so the PFCP server's receive goroutine simply hangs.
I would suggest to fix this problem using a timer like this:
    select {
    case tx.EventChannel <- pfcp.ReceiveEvent{
        Type:       pfcp.ReceiveEventTypeValidResponse,
        RemoteAddr: msg.RemoteAddr,
        RcvMsg:     pfcpMsg,
    }:
    case <-time.After(2 * time.Second):
        logger.PfcpLog.Errorf("Worker %d timed out on sending to EventChannel", id)
    }
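This allows the goroutine to continue if the send cannot complete. Maybe the problem is that the transaction stops reading from the channel (for example because it has already timed out) before the response is written to it; I cannot say for sure. Note that a send on a closed Go channel panics rather than blocks, so a missing reader seems the more likely cause. In any case, this fixes my problem.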
I am going to create a PR with this fix.
from free5gc.
See free5gc/pfcp#16 for a fix to my issue :)
from free5gc.
Hi @LaumiH,
I am interested in this issue, but I have encountered some problems while attempting to reproduce the scenario. Are both UPFs designated as PSA-UPF, or is there one PSA-UPF and one I-UPF? Does each UPF connect to a separate SMF, or do both UPFs connect to a single SMF?
Thanks for your assistance
from free5gc.
Hi @tzuchiehhh,
Both UPFs are PSAs; I have no I-UPF in my setup. I have a single SMF.
Essentially, I found that the heartbeat implementation as a whole lacks accuracy and is influenced a lot by the session management process. I have a fix for this that introduces worker threads in the PFCP server of the SMF to handle incoming PFCP messages. Also, I noticed that the SMF and UPF do not use the same PFCP implementation, which is something I don't understand yet. Maybe this has historic reasons.
I think the parallel processing in the PFCP server of the SMF mainly causes the issue. Maybe the code is not thread-safe.
from free5gc.
"Also, I noticed that the SMF and UPF do not use the same PFCP implementation, which is something I don't understand yet. Maybe this has historic reasons."
Yes, the UPF has been refactored. We are going to refactor the SMF in the same way as the UPF and replace pfcp with go-pfcp.
from free5gc.
Hi, @LaumiH,
I configured the heartbeat interval to 10ms in the SMF config file and set NumOfResend = 0, ResendRequestTimeOutPeriod = 10 and ResendResponseTimeOutPeriod = 10 in the pfcp transaction.go. However, the repeated association failures did not occur. I also noticed that there is no "new node" message on the UPF. Is there anything I might be missing?
Thanks for your assistance
from free5gc.
Hi,
I investigated more into the problem. In fact, I did make changes to the heartbeat behavior previously (and forgot about them ...), because I noticed its inaccuracy. I am sending heartbeats in a separate thread, decoupled from receiving the actual reply from the UPF.
The way it is implemented currently, a new heartbeat is only sent after receiving a reply from the UPF, which causes the heartbeat interval to fluctuate above the configured interval, e.g. 12ms instead of 10ms.
You have no "new node" message on the UPF because this was something I added for debugging purposes.
If you are currently refactoring the PFCP behavior of the SMF anyway, I might wait before opening a PR with my changes/proposal, so as not to duplicate work in the end :)
Best,
LaumiH
from free5gc.