
Comments (12)

mdavidsaver commented on June 9, 2024

Would you be able to provide a packet capture of (at least) the TCP part of this exchange? Preferably with wireshark?

from pvxs.

mdavidsaver commented on June 9, 2024
    epicsThreadGetCPUs() -> 7

Unrelated to the issue reported. What kind of system has an odd number of CPU cores/hyperthreads? Is this some kind of VM?

karlosp commented on June 9, 2024

Yes, I am running this in VirtualBox, and I intentionally assign one core fewer than I have so that commands like make -j $(nproc) do not entirely "kill" my laptop.

I hope this Wireshark log will help.

I had CSS running with the PV formula pva://topic1, and then I ran ./example/O.linux-x86_64/mailbox topic1
mailbox-topic1.zip

Maybe not relevant, but this is an error from CSS:

2021-01-12T09:19:40.304+01 SEVERE [Thread 1] org.csstudio.logging.PluginLogListener (logging) - Unhandled event loop exception
java.lang.NullPointerException
	at org.diirt.support.pva.PVAChannelHandler.getProperties(PVAChannelHandler.java:314)
	at org.csstudio.diag.pvmanager.probe.DetailsPanel.setChannelProperties(DetailsPanel.java:214)
	at org.csstudio.diag.pvmanager.probe.DetailsPanel$1$1.run(DetailsPanel.java:194)
	at org.eclipse.swt.widgets.RunnableLock.run(RunnableLock.java:40)
	at org.eclipse.swt.widgets.Synchronizer.runAsyncMessages(Synchronizer.java:185)
	at org.eclipse.swt.widgets.Display.runAsyncMessages(Display.java:5026)
	at org.eclipse.swt.widgets.Display.readAndDispatch(Display.java:4582)
	at org.eclipse.e4.ui.internal.workbench.swt.PartRenderingEngine$5.run(PartRenderingEngine.java:1173)
	at org.eclipse.core.databinding.observable.Realm.runWithDefault(Realm.java:338)
	at org.eclipse.e4.ui.internal.workbench.swt.PartRenderingEngine.run(PartRenderingEngine.java:1062)
	at org.eclipse.e4.ui.internal.workbench.E4Workbench.createAndRunUI(E4Workbench.java:155)
	at org.eclipse.ui.internal.Workbench.lambda$3(Workbench.java:644)
	at org.eclipse.core.databinding.observable.Realm.runWithDefault(Realm.java:338)
	at org.eclipse.ui.internal.Workbench.createAndRunWorkbench(Workbench.java:566)
	at org.eclipse.ui.PlatformUI.createAndRunWorkbench(PlatformUI.java:150)
	at org.csstudio.utility.product.Workbench.runWorkbench(Workbench.java:99)
	at org.csstudio.startup.application.Application.startApplication(Application.java:265)
	at org.csstudio.startup.application.Application.start(Application.java:119)
	at org.csstudio.iter.css.product.ITERApplication.start(ITERApplication.java:120)
	at org.eclipse.equinox.internal.app.EclipseAppHandle.run(EclipseAppHandle.java:203)
	at org.eclipse.core.runtime.internal.adaptor.EclipseAppLauncher.runApplication(EclipseAppLauncher.java:137)
	at org.eclipse.core.runtime.internal.adaptor.EclipseAppLauncher.start(EclipseAppLauncher.java:107)
	at org.eclipse.core.runtime.adaptor.EclipseStarter.run(EclipseStarter.java:400)
	at org.eclipse.core.runtime.adaptor.EclipseStarter.run(EclipseStarter.java:255)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.base/java.lang.reflect.Method.invoke(Method.java:566)
	at org.eclipse.equinox.launcher.Main.invokeFramework(Main.java:661)
	at org.eclipse.equinox.launcher.Main.basicRun(Main.java:597)
	at org.eclipse.equinox.launcher.Main.run(Main.java:1476)
	at org.eclipse.equinox.launcher.Main.main(Main.java:1449)

mdavidsaver commented on June 9, 2024

I hope this Wireshark log will help.

It looks like you captured only the UDP (search) traffic. The relevant part is the TCP traffic. I've added a section on packet capture to the documentation. Please let me know if this is helpful (and correct).

mdavidsaver commented on June 9, 2024

I may have an idea of what is going wrong. Can you re-test with the master branch (at e9ce808)? If this doesn't fix the issue, I've also added some more detail to the error message which will hopefully give some further clue.

karlosp commented on June 9, 2024

I can confirm that the issue is fixed now.

I do not know whether it is related, but I noticed an error in Log Messages in CSS while running the same example as described in my first post. It pops up exactly every 60 s.

2021-01-13T12:17:29.262+01 WARNING [Thread 188] org.epics.pvaccess.impl.remote.codec.AbstractCodec (processHeader) - Invalid header received from client /10.0.2.15:59504, disconnecting...

I started capturing data a few seconds before the event and stopped about a second after the event.
Invalid header received from client.pcapng.gz

Maybe another issue should be opened for this?

mdavidsaver commented on June 9, 2024

I can confirm that the issue is fixed now.

Good.

Invalid header received from client /10.0.2.15:59504, disconnecting...

I think this error message is itself in error. It indicates a protocol framing error. Based on your last packet capture, and some local tests, I think the actual cause is that the server is timing out and closing the connection.

I can see an unacknowledged CMD_ECHO from the client, and ~200 µs later the server RSTs the connection. I guess this abnormal close somehow isn't handled properly in pvAccessJava, and maybe junk in the RX buffer is being processed?
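
To make "protocol framing error" concrete, here is a minimal sketch of the kind of check a receiver applies to each 8-byte PVA message header. The layout and command codes are written from my reading of the pvAccess protocol specification (magic byte 0xCA, then version, flags, command, and a 32-bit payload size), and the function name parse_pva_header is hypothetical. It is not the pvAccessJava or PVXS code.

```python
import struct

# Assumed constants, taken from my reading of the pvAccess protocol spec.
PVA_MAGIC = 0xCA       # first byte of every message header
CMD_ECHO = 0x02        # application-level keep-alive
CMD_GET_FIELD = 0x11   # introspection request


def parse_pva_header(buf: bytes) -> dict:
    """Decode one 8-byte header, or raise ValueError on a framing error."""
    if len(buf) < 8:
        raise ValueError("need 8 bytes for a header, got %d" % len(buf))
    magic, version, flags, command = struct.unpack_from("4B", buf, 0)
    if magic != PVA_MAGIC:
        # Roughly the situation a receiver would report as "Invalid header".
        raise ValueError("invalid header: magic byte is 0x%02X" % magic)
    big_endian = bool(flags & 0x80)  # flag bit 7 selects the byte order
    (payload_size,) = struct.unpack_from(">I" if big_endian else "<I", buf, 4)
    return {
        "version": version,
        "from_server": bool(flags & 0x40),  # flag bit 6 is the direction
        "command": command,
        "payload_size": payload_size,
    }


# A little-endian CMD_ECHO from a client, with an empty payload.
echo = bytes([PVA_MAGIC, 2, 0x00, CMD_ECHO]) + (0).to_bytes(4, "little")
print(parse_pva_header(echo))

# Leftover junk in a receive buffer fails the magic-byte check.
try:
    parse_pva_header(b"\x00\x01\x02\x03\x04\x05\x06\x07")
except ValueError as exc:
    print("rejected:", exc)
```

If stale bytes were parsed after the connection reset, a check like this would fail even though the peer never sent a malformed message, which would fit the guess above.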

If I set export PVXS_LOG=*=DEBUG (or WARN) for the mailbox server, I see e.g.

2021-01-13T09:58:59.610581953 WARN pvxs.tcp.io Client 192.168.210.1:55892 connection timeout

I don't see this every time though.

The long story of inactivity timeouts with pvAccessCPP is laid out in epics-base/pvAccessCPP#139. The short story is that originally C++ clients were not sending CMD_ECHO, and C++ servers would never time out. I tried to address this with epics-base/pvAccessCPP#144.

I knew that pvAccessJava clients were sending CMD_ECHO, but it looks like I misinterpreted the meaning of the timeout configuration parameter. pvAccessJava clients send an echo every 30 seconds and time out after 60 seconds, while pvAccessCPP (and now PVXS) servers time out after 30 seconds.

So with a C++ server and a Java client, there is a tight race between the client sending CMD_ECHO and the server timing out. On my laptop it seems that the client echo won often enough that I didn't notice this at the time. I do sometimes see the "Invalid header" message now though.

I guess the only reasonable course of action is to increase the timeout in pvAccessCPP and PVXS from 30 seconds to 60, while leaving the echo interval at 15 seconds?
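
The race can be illustrated with a deliberately simplified model. It assumes the connection is otherwise idle, so the longest silence the server sees is the client's echo interval plus whatever scheduling or network jitter delays one echo relative to the previous one. The function names and the jitter values are hypothetical and chosen only for illustration; this is not how either implementation tracks its timers.

```python
def longest_silence(echo_interval_s: float, jitter_s: float) -> float:
    """Gap between two consecutive echoes as seen by the server."""
    return echo_interval_s + jitter_s


def server_closes(echo_interval_s: float, server_timeout_s: float,
                  jitter_s: float) -> bool:
    """True if the server's inactivity timer expires before the next echo."""
    return longest_silence(echo_interval_s, jitter_s) >= server_timeout_s


JAVA_ECHO_INTERVAL_S = 30.0  # pvAccessJava client echo period quoted above

for timeout_s in (30.0, 60.0):
    for jitter_s in (-0.05, 0.05):
        outcome = ("server times out"
                   if server_closes(JAVA_ECHO_INTERVAL_S, timeout_s, jitter_s)
                   else "echo arrives in time")
        print(f"timeout={timeout_s:.0f}s jitter={jitter_s:+.2f}s -> {outcome}")
```

With a 30 second timeout the outcome flips on the sign of a few milliseconds of jitter, which matches the "sometimes" behaviour described above. With a 60 second timeout the same jitter leaves a 30 second margin.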

@kasemir fyi.

mdavidsaver commented on June 9, 2024

6861f03 increases the inactivity timeout to 40 seconds. A future change will make this configurable.

karlosp commented on June 9, 2024

@mdavidsaver thanks for your quick response and detailed explanations.

With the latest commit, the Invalid header received warning no longer shows up.

Should we tag the latest commit with 0.1.1?
Or at least the commit which fixed the original error.

mdavidsaver commented on June 9, 2024

Should we tag the latest commit with 0.1.1?

Since you didn't find a third issue today, sure!

mdavidsaver commented on June 9, 2024
2021-01-14 18:34:59.061 SEVERE [Thread 1] org.csstudio.logging.PluginLogListener (logging) - Unhandled event loop exception
java.lang.NullPointerException
        at org.epics.pvaccess.client.impl.remote.ChannelImpl.getRemoteAddress(ChannelImpl.java:558)
        at org.diirt.support.pva.PVAChannelHandler.getProperties(PVAChannelHandler.java:313)
        at org.csstudio.diag.pvmanager.probe.DetailsPanel.setChannelProperties(DetailsPanel.java:214)
...

Also, I was seeing, and continue to see, a log message similar to #13 (comment). So I don't think it is related to the issue with processing of CMD_GET_FIELD requests (aka Introspect). Thinking about null is what led me to 0356eee though.

mdavidsaver commented on June 9, 2024

In https://github.com/mdavidsaver/pvxs/releases/tag/0.1.1
