
Comments (9)

LaumiH commented on June 2, 2024

In this pcap file you can see the association requests and responses for UPF 10.0.0.11, which previously failed. The SMF continues sending heartbeats to UPF 10.0.0.12 and tries to re-associate the failed UPF. However, although the UPF processes the association requests correctly, the SMF does not process the responses, so there is no re-association.
starved_association.pcap.zip

Hint: a re-association attempt starts at packet 2113, for example. I did not change the timeout and retry values for the association, meaning there are 3 attempts with a timeout of 3s each.

I let this process run for 10 minutes and there was no re-association.
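
For readers who want to picture the retry behavior described above (3 attempts, one timeout per attempt), here is a minimal sketch. It is not the free5gc code: `requestWithRetry`, `demoTimeout`, and the `resp` channel are hypothetical names, and the timeout is shortened from 3s so the example finishes quickly.

```go
package main

import (
	"fmt"
	"time"
)

// demoTimeout is shortened for illustration; the thread mentions 3s per attempt.
const demoTimeout = 20 * time.Millisecond

// requestWithRetry calls send up to attempts times and waits up to timeout
// for a response after each send. It reports whether a response arrived.
// All names here are hypothetical, not taken from the free5gc sources.
func requestWithRetry(attempts int, timeout time.Duration, send func(), resp <-chan struct{}) bool {
	for i := 0; i < attempts; i++ {
		send()
		timer := time.NewTimer(timeout)
		select {
		case <-resp:
			timer.Stop()
			return true
		case <-timer.C:
			// No response in time; continue with the next attempt.
		}
	}
	return false
}

func main() {
	sent := 0
	silent := make(chan struct{}) // nobody ever answers on this channel
	ok := requestWithRetry(3, demoTimeout, func() { sent++ }, silent)
	fmt.Println("associated:", ok, "requests sent:", sent)
}
```

If the response never reaches the waiting side, as in the capture, every attempt runs into its timeout even though the peer answered each request.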

from free5gc.

LaumiH commented on June 2, 2024

Here is a screenshot of the logs (I have adapted the log messages, so this is not the standard output):
[screenshot: starved_association_5]

Here is also a better pcap file showing just the relevant association messages on the N4 interface. The heartbeat is stopped completely on both UPFs at that time, so the SMF really only sends the association setup messages. No UEs are connected.
You can see that the retry mechanism does seem to work: a retry is sent every 3 seconds. However, the SMF does not receive the responses from the transaction's EventChannel.
starved_association_5.pcap.zip

To me, it seems like the transaction channel is closed or read by some other transaction instead.


LaumiH commented on June 2, 2024

I think I just found the bug: it is this operation that blocks indefinitely. No error is raised, however, so the PFCP server's receive goroutine simply stays blocked.

I would suggest fixing this problem with a timer, like this:

select {
case tx.EventChannel <- pfcp.ReceiveEvent{
	Type:       pfcp.ReceiveEventTypeValidResponse,
	RemoteAddr: msg.RemoteAddr,
	RcvMsg:     pfcpMsg,
}:
case <-time.After(2 * time.Second):
	logger.PfcpLog.Errorf("Worker %d timed out on sending to EventChannel", id)
}

This allows the goroutine to continue when the send cannot complete. Maybe the problem is that the channel is closed in the middle of writing to it; I cannot say for sure. But this fixes my problem.
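
The idea can be tried in isolation with the following sketch. It assumes a stand-in `ReceiveEvent` type and a hypothetical helper `trySend`; neither is the real pfcp package API. It uses `time.NewTimer` with `Stop` instead of `time.After` so the timer is released as soon as the send succeeds. Note that in Go a send on a closed channel panics rather than blocks, so a timeout like this covers the case where nobody receives from the channel, not the closed-channel case.

```go
package main

import (
	"fmt"
	"time"
)

// ReceiveEvent is a hypothetical stand-in for pfcp.ReceiveEvent.
type ReceiveEvent struct {
	Type string
}

// demoTimeout is an illustrative value, shorter than the 2s used in the thread.
const demoTimeout = 50 * time.Millisecond

// trySend attempts to deliver ev on ch and gives up after timeout.
// It reports whether the event was delivered.
func trySend(ch chan<- ReceiveEvent, ev ReceiveEvent, timeout time.Duration) bool {
	timer := time.NewTimer(timeout)
	defer timer.Stop()
	select {
	case ch <- ev:
		return true
	case <-timer.C:
		return false
	}
}

func main() {
	ev := ReceiveEvent{Type: "ValidResponse"}

	// Nobody reads from this unbuffered channel, so a plain send
	// would block forever; trySend returns after the timeout instead.
	abandoned := make(chan ReceiveEvent)
	fmt.Println("abandoned channel delivered:", trySend(abandoned, ev, demoTimeout))

	// A channel with free buffer space accepts the event immediately.
	buffered := make(chan ReceiveEvent, 1)
	fmt.Println("buffered channel delivered:", trySend(buffered, ev, demoTimeout))
}
```

The trade-off is that a timed-out event is dropped, so the caller should at least log it, as the snippet above does.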

I am going to create a PR with this fix.


LaumiH commented on June 2, 2024

See free5gc/pfcp#16 for a fix to my issue :)


tzuchiehhh commented on June 2, 2024

Hi @LaumiH,

I am interested in this issue, but I have encountered some problems while attempting to reproduce the scenario. Are both UPFs designated as PSA-UPF, or is there one PSA-UPF and one I-UPF? Does each UPF connect to a separate SMF, or do both UPFs connect to a single SMF?

Thanks for your assistance


LaumiH commented on June 2, 2024

Hi @tzuchiehhh,

Both UPFs are PSAs; I have no I-UPF in my setup. I have a single SMF.

Essentially, I found that the heartbeat implementation as a whole lacks accuracy and is heavily influenced by the session management process. I have a fix for this that introduces worker threads in the SMF's PFCP server to handle incoming PFCP messages. Also, I noticed that the SMF and UPF do not use the same PFCP implementation, which is something I don't understand yet. Maybe this has historic reasons.

I think the parallel processing in the PFCP server of the SMF is the main cause of the issue. Maybe the code is not thread-safe.
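
As a rough illustration of the worker approach and of the thread-safety concern, here is a minimal sketch of a worker pool in Go. It is not the fix mentioned above: `Message`, `processAll`, and the shared counter are hypothetical and only show that state touched by several workers needs a lock.

```go
package main

import (
	"fmt"
	"sync"
)

// Message is a hypothetical stand-in for a received PFCP message.
type Message struct {
	Seq int
}

// processAll distributes msgs over the given number of worker goroutines
// and returns how many messages were handled.
func processAll(msgs []Message, workers int, handle func(Message)) int {
	queue := make(chan Message)
	var wg sync.WaitGroup
	var mu sync.Mutex // guards handled, which all workers update
	handled := 0

	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for m := range queue {
				handle(m)
				mu.Lock()
				handled++
				mu.Unlock()
			}
		}()
	}

	for _, m := range msgs {
		queue <- m
	}
	close(queue) // lets the workers' range loops end
	wg.Wait()
	return handled
}

func main() {
	msgs := []Message{{Seq: 1}, {Seq: 2}, {Seq: 3}, {Seq: 4}}
	n := processAll(msgs, 2, func(m Message) {
		fmt.Println("handling message", m.Seq)
	})
	fmt.Println("handled:", n)
}
```

Without the mutex, the increments of `handled` would be a data race, which is the kind of problem the Go race detector (`go run -race`) can help find.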


tim-ywliu commented on June 2, 2024

@LaumiH

"Also, I noticed that the SMF and UPF do not use the same PFCP implementation, which is something I don't understand yet. Maybe this has historic reasons."

Yes, the UPF has been refactored. We're going to refactor the SMF in the same way and replace pfcp with go-pfcp.


tzuchiehhh commented on June 2, 2024

Hi, @LaumiH,
I configured the heartbeat interval to 10ms in the SMF config file and set NumOfResend = 0, ResendRequestTimeOutPeriod = 10, and ResendResponseTimeOutPeriod = 10 in the pfcp transaction.go. However, the repeated association failures did not happen. I've also noticed that there is no "new node" message on the UPF. Is there anything I might be missing?

[screenshots: transaction, smf, upf]

Thanks for your assistance


LaumiH commented on June 2, 2024

Hi,

I looked further into the problem. In fact, I had previously made changes to the heartbeat behavior (and forgot about them ...) because I noticed its inaccuracy. I am sending heartbeats in a separate thread, decoupled from receiving the actual reply from the UPF.
As currently implemented, a new heartbeat is only sent after a reply from the UPF has been received, which causes the actual heartbeat interval to drift above the configured one, e.g. 12ms instead of 10ms.
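
To make the difference concrete, here is a minimal sketch of a sender driven by a `time.Ticker` that does not wait for replies. It is not the actual change described above: `sendHeartbeats`, `demoInterval`, and the simulated slow peer are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"time"
)

// demoInterval mirrors the 10ms interval mentioned in the thread.
const demoInterval = 10 * time.Millisecond

// sendHeartbeats issues count heartbeats, one per tick. Each send runs in
// its own goroutine, so a slow reply cannot delay the next heartbeat.
func sendHeartbeats(interval time.Duration, count int, send func(seq int)) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for seq := 1; seq <= count; seq++ {
		<-ticker.C
		go send(seq) // do not wait for the reply before the next tick
	}
}

func main() {
	const count = 5
	replies := make(chan int, count)
	start := time.Now()

	sendHeartbeats(demoInterval, count, func(seq int) {
		time.Sleep(3 * demoInterval) // simulated slow reply from the peer
		replies <- seq
	})
	fmt.Println("all requests sent after about", time.Since(start).Round(demoInterval))

	// Replies are collected independently of the send schedule.
	for i := 0; i < count; i++ {
		fmt.Println("reply for heartbeat", <-replies)
	}
}
```

With this structure the send times follow the ticker, while reply handling and failure detection have to be done separately, for example by tracking outstanding sequence numbers.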

You have no "new node" message in the UPF because this was something I added for debugging purposes.

If you are currently refactoring the PFCP behavior of the SMF anyway, I might wait before opening a PR with my changes/proposal, so as not to duplicate work in the end :)

Best,
LaumiH

