Comments (26)

rockowitz commented on September 25, 2024

Please execute ddcutil detect --verbose --trace ddc --trace i2c and submit the output as an attachment.

It looks like you're using the newly open sourced Nvidia driver. Is that correct?

from ddcutil.

KeyofBlueS commented on September 25, 2024

No, that is not correct. I'm using the proprietary legacy driver, version 340.108.

Here it is:
ddcutil_0.9.9_detect.txt
ddcutil_1.3.0_detect.txt

Thanks

rockowitz commented on September 25, 2024

As a first step, and on a hunch, I've modified the i2c writer function to allocate a structure on the heap instead of using the stack, as is done in the reader function. Please build from branch 1.4.0-dev and execute ddcutil detect --verbose --trace i2c. Thanks.

KeyofBlueS commented on September 25, 2024

Ok done 👍

ddcutil_1.4.0-dev_detect.txt

KeyofBlueS commented on September 25, 2024

I don't know if it helps, but I've also built the following branches:
1.0.0-release
1.1.0-release
1.2.0-release
1.2.1-release
1.2.2-release
1.2.3-dev

All of them work up to 1.2.2-release; starting from 1.2.3-dev, ddcutil has this issue.

UPDATE: More precisely, starting from commit f6c72c6.

rockowitz commented on September 25, 2024

Background: Driver i2c-dev has two interfaces: an ioctl() interface and a higher level interface using read() and write(). Internally, the read()/write() interface maps to the same code as the ioctl() interface. However, using the read()/write() interface can lead to EBUSY errors because of the way that the i2c slave address is "predeclared" outside of the read()/write() calls.
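The two i2c-dev interfaces described above can be sketched as follows. This is a minimal illustration in Python (ddcutil itself is written in C), not ddcutil's actual code. The function names and the ctypes class names are hypothetical. The ioctl request numbers and structure layouts are taken from linux/i2c-dev.h and linux/i2c.h as I understand them, so check them against your kernel headers. The EDID address x50 and the one-byte offset write match the kernel trace later in this thread.

```python
import ctypes
import fcntl
import os

# Request numbers and flag as defined in <linux/i2c-dev.h> and <linux/i2c.h>.
I2C_SLAVE = 0x0703  # declare the slave address for later read()/write() calls
I2C_RDWR = 0x0707   # combined transfer; the address travels inside each message
I2C_M_RD = 0x0001   # message flag: read from the slave


class I2cMsg(ctypes.Structure):
    """ctypes mirror of struct i2c_msg (hypothetical class name)."""
    _fields_ = [
        ("addr", ctypes.c_uint16),
        ("flags", ctypes.c_uint16),
        ("len", ctypes.c_uint16),
        ("buf", ctypes.POINTER(ctypes.c_uint8)),
    ]


class I2cRdwrIoctlData(ctypes.Structure):
    """ctypes mirror of struct i2c_rdwr_ioctl_data (hypothetical class name)."""
    _fields_ = [
        ("msgs", ctypes.POINTER(I2cMsg)),
        ("nmsgs", ctypes.c_uint32),
    ]


def read_edid_file_io(fd, addr=0x50, length=128):
    """read()/write() interface: the address is set out of band first."""
    fcntl.ioctl(fd, I2C_SLAVE, addr)  # this is the step that can fail with EBUSY
    os.write(fd, b"\x00")             # select EDID offset 0
    return os.read(fd, length)


def build_edid_transfer(addr=0x50, length=128):
    """Build the write-offset and read messages for a single I2C_RDWR call."""
    offset = (ctypes.c_uint8 * 1)(0x00)
    data = (ctypes.c_uint8 * length)()
    msgs = (I2cMsg * 2)()
    msgs[0].addr = addr
    msgs[0].flags = 0
    msgs[0].len = 1
    msgs[0].buf = ctypes.cast(offset, ctypes.POINTER(ctypes.c_uint8))
    msgs[1].addr = addr
    msgs[1].flags = I2C_M_RD
    msgs[1].len = length
    msgs[1].buf = ctypes.cast(data, ctypes.POINTER(ctypes.c_uint8))
    request = I2cRdwrIoctlData(ctypes.cast(msgs, ctypes.POINTER(I2cMsg)), 2)
    # Return the buffers too, so they stay alive while the request is in use.
    return request, data, (offset, msgs)


def read_edid_ioctl(fd, addr=0x50, length=128):
    """ioctl() interface: no separate address step, so no EBUSY from it."""
    request, data, _keepalive = build_edid_transfer(addr, length)
    fcntl.ioctl(fd, I2C_RDWR, request)
    return bytes(data)
```

Either function would be called with a file descriptor from something like os.open("/dev/i2c-1", os.O_RDWR). The bus number is only an example, and access normally requires appropriate permissions.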

Release 1.3.0 eliminates all use of the read()/write() interface. Previously the EDID was read using the read()/write() interface.

There's something about the proprietary nvidia driver that doesn't like the arguments passed on the ioctl() call. I've added debug code and also moved buffers from the stack to the heap as a guess that that's the locus of the problem.

Please build from 1.4.0-dev again and run ddcutil detect --verbose --trace i2c once more. If that doesn't solve or at least identify the problem, I'll need to install an Nvidia card on a test system, along with the proprietary driver, to understand what is happening. There isn't enough time to do that along with everything else that needs to happen before I leave for vacation Wednesday, which is unfortunate because I do regard this as a significant bug. Thank you for reporting it and for your help in diagnosing it.

KeyofBlueS commented on September 25, 2024

Sadly the issue isn't solved. I hope this at least helps to identify the problem:
ddcutil_1.4.0-dev_detect(2).txt

...if not, yes, it may be a significant bug, but there is no rush. In any case I'll be here to help if you need it.

Thanks.

rockowitz commented on September 25, 2024

Another thing to try, if you would, again with 1.4.0-dev.

Enable kernel tracing as per this article. Then execute ddcutil detect --verbose --trace i2c as usual, and send both the ddcutil trace output and the kernel debug output. Thanks.

KeyofBlueS commented on September 25, 2024

Sure, here they are:
ddcutil_1.4.0-dev_detect(3).txt
i2c_kernel_trace.txt

rockowitz commented on September 25, 2024

The kernel trace is as expected, indicating that the data structures passed on the ioctl() calls are correct.

So why is the nvidia driver returning -EINVAL? The slave address is specified in the ioctl(I2C_RDWR) calls. ioctl(I2C_SLAVE_FORCE) is a way to specify the slave address "out of band" when using the i2c-dev write()/read() interface, since there's no way to include it on write()/read(). It should have no effect when using the ioctl(I2C_RDWR) interface. However, it may be that the nvidia driver requires it. So as a test, the latest changes in branch 1.4.0-dev add ioctl(I2C_SLAVE_FORCE) calls.

Please build from the current 1.4.0-dev branch and execute ddcutil detect --verbose --trcfunc i2c_ioctl_writer --trcfunc i2c_ioctl_reader. Thank you.

KeyofBlueS commented on September 25, 2024

For the sake of completeness, I want to give you the output of both version 1.4.0-dev and 1.2.2 (as well as the i2c kernel traces).
I'm not an expert, but I want to point out a difference in the i2c kernel trace between 1.4.0-dev and 1.2.2, specifically the value of f:

i2c_kernel_trace_(ddcutil_1.4.0-dev), the value is f=0200:

#           TASK-PID     CPU#  |||||  TIMESTAMP  FUNCTION
#              | |         |   |||||     |         |
         ddcutil-50583   [000] .....  1013.525510: i2c_write: i2c-1 #0 a=050 f=0200 l=1 [00]
         ddcutil-50583   [000] .....  1013.525514: i2c_result: i2c-1 n=1 ret=-22
.........................................................

i2c_kernel_trace_(ddcutil_1.2.2), the value is f=0000:

#           TASK-PID     CPU#  |||||  TIMESTAMP  FUNCTION
#              | |         |   |||||     |         |
         ddcutil-62269   [003] .....  1358.013381: i2c_write: i2c-1 #0 a=050 f=0000 l=1 [00]
         ddcutil-62269   [003] .N...  1358.013894: i2c_result: i2c-1 n=1 ret=1
.........................................................

Is this normal?

ddcutil_1.4.0-dev_detect(4).txt
i2c_kernel_trace_(ddcutil_1.4.0-dev).txt

ddcutil_1.2.2_detect.txt
i2c_kernel_trace_(ddcutil_1.2.2).txt

Thanks

rockowitz commented on September 25, 2024

@KeyofBlueS @stephvnm

Your observation hit the nail on the head. f is the flags halfword in the ioctl call. x0200 is flag I2C_M_DMA_SAFE. It is set within i2c-dev when handling the ioctl call, and is used only within kernel space (i.e. ddcutil does not set it). read()/write() take a different code path, and the flag does not appear to be set in that case. I've looked at the Nvidia driver code in GitHub repo NVIDIA/open-gpu-kernel-modules, and depending on how it is compiled, the nvidia driver may reject calls with the flag set and return -EINVAL.
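To make the f= and ret= fields in the kernel trace easier to read, here is a small decoding sketch in Python (used for illustration only). The helper names are hypothetical. The flag table is copied from linux/i2c.h as I recall it, so it is worth checking against the header for your kernel.

```python
import errno
import re

# Message flags from <linux/i2c.h>; I2C_M_DMA_SAFE is kernel-internal.
I2C_MSG_FLAGS = {
    0x0001: "I2C_M_RD",
    0x0010: "I2C_M_TEN",
    0x0200: "I2C_M_DMA_SAFE",
    0x0400: "I2C_M_RECV_LEN",
    0x0800: "I2C_M_NO_RD_ACK",
    0x1000: "I2C_M_IGNORE_NAK",
    0x2000: "I2C_M_REV_DIR_ADDR",
    0x4000: "I2C_M_NOSTART",
    0x8000: "I2C_M_STOP",
}

_FLAGS_RE = re.compile(r"\bf=([0-9a-fA-F]{4})\b")
_RET_RE = re.compile(r"\bret=(-?\d+)\b")


def decode_flags(value):
    """Return the names of all flag bits set in a flags halfword."""
    return [name for bit, name in sorted(I2C_MSG_FLAGS.items()) if value & bit]


def explain_trace_line(line):
    """Decode the f= field of an i2c_write/i2c_read line, or the ret= field
    of an i2c_result line. Returns None if neither field is present."""
    match = _FLAGS_RE.search(line)
    if match:
        return decode_flags(int(match.group(1), 16))
    match = _RET_RE.search(line)
    if match:
        ret = int(match.group(1))
        if ret < 0:
            # Negative results are negated errno values.
            return errno.errorcode.get(-ret, "unknown errno")
        return "OK"
    return None
```

Applied to the two 1.4.0-dev trace lines quoted earlier, the first decodes to the single flag I2C_M_DMA_SAFE and the second to EINVAL. The 1.2.2 lines decode to no flags and OK.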

So there's a bug in the drivers, and probably some finger pointing. i2c-dev assumes that the video driver can handle the flag, and amdgpu, nouveau, etc. do, but nvidia does not.

The problem arose in ddcutil 1.3.0 because I changed the code that reads the EDID from using read()/write() to using ioctl(), and it was not caught for the release candidates.

I'm leaving on vacation momentarily, and while I'll have email and web access, I won't be able to work on the code base. So for now, just use release 1.2.2.

Thank you @KeyofBlueS for all your help in diagnosing the bug.

KeyofBlueS commented on September 25, 2024

Sure, enjoy your well deserved vacation, have a good time and thank you for all your work.

Regards.

rockowitz commented on September 25, 2024

@KeyofBlueS, @stephvnm Branch 1.3.3-dev reverts the method for reading the EDID to the way it was done in 1.2.2. Please build from this version and let me know if it resolves the problem. Thank you.

KeyofBlueS commented on September 25, 2024

Hi and welcome back!

With 1.3.3-dev, displays are detected but report "DDC communication failed".
The relevant bus is i2c-1. Disregard i2c-4, it's a TV.

edit: SORRY I'VE MESSED WITH THE LOG, please redownload it!

ddcutil_1.3.3_dev_detect.txt

rockowitz commented on September 25, 2024

@KeyofBlueS , @stephvnm I've re-installed the general code from 1.2.2 that uses read() and write() for i2c communication. (The amount of code that had to go back in was painful.)

On the latest commit of branch 1.3.3-dev, ddcutil by default still uses ioctl() for talking to slave address x37. If utility option --f1 is specified, read()/write() is used instead. Let me know if DDC communication works with --f1.

KeyofBlueS commented on September 25, 2024

No, sadly not. From the log it seems ioctl is still used for address x37 even with --f1 option.

ddcutil_1.3.3_dev_detect(2).txt

rockowitz commented on September 25, 2024

My oops. I missed one spot (function i2c_detect_x37()) that had to be modified. I've uploaded the latest change to 1.3.3-dev.

KeyofBlueS commented on September 25, 2024

Sorry, but it still gives "DDC communication failed".

ddcutil_1.3.3_dev_detect(3).txt

rockowitz commented on September 25, 2024

Well, now I'm baffled.

Are you building the nvidia driver using DKMS, or using a pre-built copy? If the former, can you determine which copy of i2c.h it is using and whether constant I2C_M_DMA_SAFE is defined?

What nvidia packages are installed?

Please execute sudo ddcutil interrogate and send the output. Thanks.

KeyofBlueS commented on September 25, 2024

With DKMS.

For i2c.h, whatever comes with the kernel headers, I guess (I'm currently on 5.18.5). I've attached the two i2c.h files found inside the kernel headers:

i2c.h(1).txt
i2c.h(2).txt

In the first one I see:
#define I2C_M_DMA_SAFE 0x0200 /* use only in kernel space */
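Checking a header for this constant can also be scripted. The sketch below is a hypothetical Python helper (illustration only, not part of ddcutil or the nvidia build) that scans the text of an i2c.h file and reports the value of I2C_M_DMA_SAFE if the define is present. Which header file the DKMS build actually picks up is not determined here, so the caller has to supply the right file's contents.

```python
import re

# Matches "#define I2C_M_DMA_SAFE <hex or decimal literal>" at line start.
_DEFINE_RE = re.compile(
    r"^\s*#\s*define\s+I2C_M_DMA_SAFE\s+(0[xX][0-9a-fA-F]+|[1-9]\d*|0)\b",
    re.MULTILINE,
)


def find_dma_safe_define(header_text):
    """Return the numeric value of I2C_M_DMA_SAFE, or None if not defined."""
    match = _DEFINE_RE.search(header_text)
    if match is None:
        return None
    # Base 0 lets int() accept both hexadecimal and decimal literals.
    return int(match.group(1), 0)
```

Feeding it the define line quoted above yields 0x0200. A header without the define yields None.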

nvidia packages installed:

glx-alternative-nvidia/unstable,now 1.2.1 amd64 [installed,automatic]
libegl1-nvidia-legacy-340xx/unstable,now 340.108-15 amd64 [installed,automatic]
libegl1-nvidia-legacy-340xx/unstable,now 340.108-15 i386 [installed,automatic]
libgl1-nvidia-legacy-340xx-glx/unstable,now 340.108-15 amd64 [installed,automatic]
libgl1-nvidia-legacy-340xx-glx/unstable,now 340.108-15 i386 [installed]
libgles1-nvidia-legacy-340xx/unstable,now 340.108-15 amd64 [installed,automatic]
libgles1-nvidia-legacy-340xx/unstable,now 340.108-15 i386 [installed]
libgles2-nvidia-legacy-340xx/unstable,now 340.108-15 amd64 [installed,automatic]
libgles2-nvidia-legacy-340xx/unstable,now 340.108-15 i386 [installed]
libnvidia-legacy-340xx-cfg1/unstable,now 340.108-15 amd64 [installed,automatic]
libnvidia-legacy-340xx-cfg1/unstable,now 340.108-15 i386 [installed,automatic]
libnvidia-legacy-340xx-cuda1-i386/unstable,now 340.108-15 i386 [installed,automatic]
libnvidia-legacy-340xx-cuda1/unstable,now 340.108-15 amd64 [installed,automatic]
libnvidia-legacy-340xx-cuda1/unstable,now 340.108-15 i386 [installed,automatic]
libnvidia-legacy-340xx-eglcore/unstable,now 340.108-15 amd64 [installed,automatic]
libnvidia-legacy-340xx-eglcore/unstable,now 340.108-15 i386 [installed,automatic]
libnvidia-legacy-340xx-encode1/unstable,now 340.108-15 amd64 [installed]
libnvidia-legacy-340xx-encode1/unstable,now 340.108-15 i386 [installed,automatic]
libnvidia-legacy-340xx-glcore/unstable,now 340.108-15 amd64 [installed,automatic]
libnvidia-legacy-340xx-glcore/unstable,now 340.108-15 i386 [installed,automatic]
libnvidia-legacy-340xx-ml1/unstable,now 340.108-15 amd64 [installed,automatic]
libnvidia-legacy-340xx-nvcuvid1/unstable,now 340.108-15 amd64 [installed,automatic]
libnvidia-legacy-340xx-nvcuvid1/unstable,now 340.108-15 i386 [installed,automatic]
nvidia-installer-cleanup/unstable,now 20220217+1 amd64 [installed,automatic]
nvidia-kernel-common/unstable,now 20220217+1 amd64 [installed,automatic]
nvidia-legacy-340xx-alternative/unstable,now 340.108-15 amd64 [installed,automatic]
nvidia-legacy-340xx-driver-bin/unstable,now 340.108-15 amd64 [installed,automatic]
nvidia-legacy-340xx-driver-libs/unstable,now 340.108-15 amd64 [installed,automatic]
nvidia-legacy-340xx-driver-libs/unstable,now 340.108-15 i386 [installed]
nvidia-legacy-340xx-driver/unstable,now 340.108-15 amd64 [installed]
nvidia-legacy-340xx-kernel-dkms/unstable,now 340.108-15 amd64 [installed]
nvidia-legacy-340xx-kernel-support/unstable,now 340.108-15 amd64 [installed,automatic]
nvidia-legacy-340xx-smi/unstable,now 340.108-15 amd64 [installed,automatic]
nvidia-legacy-340xx-vdpau-driver/unstable,now 340.108-15 amd64 [installed,automatic]
nvidia-modprobe/unstable,now 515.48.07-1 amd64 [installed,automatic]
nvidia-persistenced/unstable,now 470.129.06-1 amd64 [installed,automatic]
nvidia-settings-legacy-340xx/unstable,now 340.108-6 amd64 [installed,automatic]
nvidia-support/unstable,now 20220217+1 amd64 [installed,automatic]
xserver-xorg-video-nvidia-legacy-340xx/unstable,now 340.108-15 amd64 [installed]

ddcutil_1.3.3_dev_interrogate.txt

i2c_kernel_trace_(ddcutil_1.3.3-dev).txt

KeyofBlueS commented on September 25, 2024

I've compiled 1.3.3-dev with your recent changes (5b18ee3 da77d55 351913d) and it's now working with the --f1 option 👍

rockowitz commented on September 25, 2024

I've updated the proof of concept "--f1" code in branch 1.3.3-dev to something with a coherent user interface. The changes required were extensive.

To recap, ddcutil has to navigate between Scylla and Charybdis in its calls into driver i2c-dev.

  • Using the ioctl() interface to read and write the I2C bus has the advantage of avoiding EBUSY errors. While in principle they can have many causes, in fact they have only been observed when using the read()/write() interface. In particular, they commonly occur when driver ddcci is loaded. The EBUSY errors can be addressed using option --force-slave-address, but this may affect other users of the I2C bus.
  • The ioctl() interface avoids (most/all) EBUSY errors, but has the disadvantage that, depending on how the proprietary nvidia driver has been built, it may trigger an incompatibility (aka bug) between driver i2c-dev and nvidia. All access then fails with an EINVAL error. I can see the if test both in the interface code that must be compiled by DKMS for the driver and in the corresponding file in open-gpu-kernel-modules. The ioctl() interface cannot be used in this context.
  • The read()/write() interface has the advantage that it works with the nvidia driver, but EBUSY errors are possible.

I've added two options, --use-file-io and --use-ioctl-io. The former causes ddcutil to use the write()/read() interface; the latter causes the ioctl() interface to be used. If neither is specified, the ioctl() interface is used. However, if the nvidia/i2c-dev bug is encountered, ddcutil switches to the write()/read() interface. The --f1 option no longer works. (FWIW, I'm not particularly keen on the option names. Suggestions welcome.)

Please exercise the updated 1.3.3-dev branch. It should work for you both when --use-file-io is specified and when neither option is specified. As I said, the changes required were extensive. No doubt some bugs remain.
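The selection and fallback behavior described above can be modeled roughly as follows. This is a hedged Python sketch, not ddcutil's implementation (which is in C). The class name, the mode strings, and the two injected callables are all hypothetical, and it assumes the nvidia/i2c-dev bug shows up as an OSError with errno EINVAL.

```python
import errno


class I2cIoStrategy:
    """Chooses between an ioctl-based and a file-io-based transfer function.

    ioctl_io and file_io are caller-supplied callables (hypothetical) that
    perform one transfer and raise OSError on failure.
    """

    def __init__(self, ioctl_io, file_io, prefer="ioctl"):
        if prefer not in ("ioctl", "file"):
            raise ValueError("prefer must be 'ioctl' or 'file'")
        self._ioctl_io = ioctl_io
        self._file_io = file_io
        self.mode = prefer

    def transfer(self, *args):
        if self.mode == "ioctl":
            try:
                return self._ioctl_io(*args)
            except OSError as exc:
                if exc.errno != errno.EINVAL:
                    raise  # unrelated failure: do not mask it
                # Assumed symptom of the nvidia/i2c-dev incompatibility:
                # switch to file io and stay there for later transfers.
                self.mode = "file"
        return self._file_io(*args)
```

Making the switch sticky avoids paying for a failed ioctl() on every later transfer. Whether ddcutil remembers the switch in the same way is not stated in this thread.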

KeyofBlueS commented on September 25, 2024

Confirmed: detect is working as expected.
With no option passed, or with --use-ioctl-io, it tries the ioctl() interface, then falls back to the write()/read() interface.
With option --use-file-io, it uses the write()/read() interface directly.

So I guess there is no way to use the ioctl() interface with problematic (nvidia) drivers, except maybe by forcibly removing the I2C_M_DMA_SAFE flag.

P.S. The option names are OK 👍

rockowitz commented on September 25, 2024

Thank you for the quick testing.

Modifying the i2c-dev driver to not set the bit would be a bit of work unless you regularly build the kernel. What would probably be easier is to modify two of the nvidia driver files that DKMS compiles by adding the define for I2C_M_DMA_SAFE. Here's a link to a bug report that I just posted on developer.nvidia.com, which contains most of the relevant line numbers. Grepping the nvidia code will find you the rest.

rockowitz commented on September 25, 2024

The workaround for the Nvidia bug is included in release 1.4.1.
