
Comments (11)

stefano-garzarella avatar stefano-garzarella commented on June 13, 2024 1

I think we need to split the issue into 2 smaller issues and address them separately:

  1. Fix the potential UB in this crate
  2. Extend the specification to use packed or an explicit padding

For 1, we should provide a fix that can work with any QEMU/VMM, including old ones.
I don't understand why it is still UB if we just add the padding here and set it to 0 every time we read from the socket.

For 2, we need to update the spec and add a feature for that. I'm not sure if it is worth it, but I can't see how we can change that (adding packed, or adding a payload) without breaking compatibility.

from vhost.

Ablu avatar Ablu commented on June 13, 2024

Fix qemu (and frontend) adding an explicit padding field guaranteeing that it will be initialized to 0:

I still do not understand why we need this to be 0 initialized. At least from a Rust perspective we should be fine... We will read "any" padding, but it should not be undefined behaviour. When reading from volatile memory we only care that we write valid values to the type. If the padding gets fixed to an int type, any value is valid.

So IMHO we can just add the padding field and are fine...


stefano-garzarella avatar stefano-garzarella commented on June 13, 2024

Fix qemu (and frontend) adding an explicit padding field guaranteeing that it will be initialized to 0:

I still do not understand why we need this to be 0 initialized. At least from a Rust perspective we should be fine... We will read "any" padding, but it should not be undefined behaviour. When reading from volatile memory we only care that we write valid values to the type. If the padding gets fixed to an int type, any value is valid.

So IMHO we can just add the padding field and are fine...

Yep, I agree on that.
In any case, I don't think we should ask QEMU to set it to 0 just because in Rust it might be UB. At most we have to do that ourselves: when we read the bytes from the socket, we can just overwrite them with 0.


germag avatar germag commented on June 13, 2024

Fix qemu (and frontend) adding an explicit padding field guaranteeing that it will be initialized to 0:

I still do not understand why we need this to be 0 initialized. At least from a Rust perspective we should be fine... We will read "any" padding, but it should not be undefined behaviour. When reading from volatile memory we only care that we write valid values to the type. If the padding gets fixed to an int type, any value is valid.
So IMHO we can just add the padding field and are fine...

Yep, I agree on that. In any case, I don't think we should ask QEMU to set it to 0 just because in Rust it might be UB. At most we have to do that ourselves: when we read the bytes from the socket, we can just overwrite them with 0.

Why not? It's not a performance-critical message (I guess), and it's not only Rust but also C++ (although it has special rules for char); the C standard is fuzzier on that aspect.


germag avatar germag commented on June 13, 2024

Fix qemu (and frontend) adding an explicit padding field guaranteeing that it will be initialized to 0:

I still do not understand why we need this to be 0 initialized. At least from a Rust perspective we should be fine... We will read "any" padding, but it should not be undefined behaviour.

This is not correct, it definitely is UB. Adding the explicit padding will probably work with the current rustc implementation on this specific architecture, but that is not guaranteed by the language: for a Rust type, reading uninit memory is instant UB, even if you cheat the compiler.

When reading from volatile memory we only care that we write valid values to the type. If the padding gets fixed to an int type, any value is valid.

uninit (i.e., undef/poison) is not a valid value for any type, but volatile memory has nothing to do with uninit memory

So IMHO we can just add the padding field and are fine...


germag avatar germag commented on June 13, 2024

Fix qemu (and frontend) adding an explicit padding field guaranteeing that it will be initialized to 0:

I still do not understand why we need this to be 0 initialized. At least from a Rust perspective we should be fine... We will read "any" padding, but it should not be undefined behaviour. When reading from volatile memory we only care that we write valid values to the type. If the padding gets fixed to an int type, any value is valid.
So IMHO we can just add the padding field and are fine...

Yep, I agree on that. In any case, I don't think we should ask QEMU to set it to 0 just because in Rust it might be UB. At most we have to do that ourselves: when we read the bytes from the socket, we can just overwrite them with 0.

Why not? It's not a performance-critical message (I guess), and it's not only Rust but also C++ (although it has special rules for char); the C standard is fuzzier on that aspect.

Maybe we can just forget about qemu(*) and add the explicit padding only in this crate's implementation of the frontend. But the backend will always skip the padding, so we would need 2 definitions of the struct, with and without the padding. I'm afraid the code will not be pretty :|

(*) I still think that qemu should be fixed by making that struct packed


Ablu avatar Ablu commented on June 13, 2024

This is not correct, it definitely is UB. Adding the explicit padding will probably work with the current rustc implementation on this specific architecture, but that is not guaranteed by the language: for a Rust type, reading uninit memory is instant UB, even if you cheat the compiler.

I do not think this is true. I understand that there is undefined behaviour if a struct is not fully assigned. I also understand that we need to add the padding. But we do not need to mandate a value. Regardless of whether the padding was initialized when it was put into shared memory, we DO initialize it in Rust. Maybe it was not initialized in C (that's a problem when QEMU accesses it, of course), but we do not care. We read size_of::<Struct>() bytes into the struct. Assuming the struct is all integers and the padding is fixed, I do not see how those bytes would cause undefined behaviour. The struct got completely initialized. The padding value may be anything, of course (just like any other field), but it is not undefined behaviour.

Or stated differently: As long as whatever implements ByteValued is packed and only contains integer fields, the ByteValued abstraction should be safe.

So: Could you elaborate where you think the struct may not be fully assigned? 🤔
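
A minimal sketch of the argument above, not the crate's actual code. The struct name `InflightExplicit`, its field layout, and the helper `from_bytes` are assumptions made for illustration. It shows an integer-only `repr(C)` struct with an explicit padding field, a compile-time check that no implicit padding remains, and a read that copies `size_of::<Struct>()` bytes into it, so the padding ends up holding whatever the peer sent:

```rust
use std::mem::size_of;

/// Hypothetical message layout; names and field order are assumptions.
#[repr(C)]
#[derive(Clone, Copy, Debug, Default, PartialEq)]
pub struct InflightExplicit {
    pub mmap_size: u64,
    pub mmap_offset: u64,
    pub num_queues: u16,
    pub queue_size: u16,
    /// Explicit padding: an ordinary integer field, so any bit pattern is valid.
    pub _padding: u32,
}

// Compile-time check: the struct size equals the sum of its field sizes,
// i.e. the compiler inserted no implicit padding.
const _: () = assert!(size_of::<InflightExplicit>() == 8 + 8 + 2 + 2 + 4);

/// Build the struct from a fully initialized byte buffer.
pub fn from_bytes(buf: &[u8; 24]) -> InflightExplicit {
    // SAFETY: `buf` holds exactly size_of::<InflightExplicit>() initialized
    // bytes, every field is an integer (all bit patterns are valid), and
    // `read_unaligned` places no alignment requirement on the source pointer.
    unsafe { std::ptr::read_unaligned(buf.as_ptr().cast::<InflightExplicit>()) }
}
```

With this layout the padding is just another integer field: its value is arbitrary, but every byte of the struct comes from an initialized buffer.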


germag avatar germag commented on June 13, 2024

I think we need to split the issue into 2 smaller issues and address them separately:

1. Fix the potential UB in this crate

2. Extend the specification to use packed or an explicit padding

For 1, we should provide a fix that can work with any QEMU/VMM, including old ones. I don't understand why it is still UB if we just add the padding here and set it to 0 every time we read from the socket.

Maybe I didn't explain myself correctly: it's UB only if there is uninit memory, whether the padding is implicit or explicit. Reading 4 bytes as MaybeUninit<u32> is not UB.

For 2, we need to update the spec and add a feature for that. I'm not sure if it is worth it, but I can't see how we can change that (adding packed, or adding a payload) without breaking compatibility.

I agree
I probably got ahead of myself when talking about qemu, sorry for the confusion.
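
For illustration, a small sketch of the `MaybeUninit<u32>` idea. The struct `InflightRaw`, its fields, and both methods are hypothetical names, not part of the crate. The padding is declared as `MaybeUninit<u32>`, so it may legitimately be left uninitialized, and it has to be written before it is ever read as a plain `u32`:

```rust
use std::mem::{size_of, MaybeUninit};

/// Hypothetical layout; names and fields are assumptions for this sketch.
#[repr(C)]
#[derive(Clone, Copy)]
pub struct InflightRaw {
    pub mmap_size: u64,
    pub mmap_offset: u64,
    pub num_queues: u16,
    pub queue_size: u16,
    /// The padding may hold uninitialized memory without that being UB.
    pub _padding: MaybeUninit<u32>,
}

// The explicit padding field brings the size to 8 + 8 + 2 + 2 + 4 bytes.
const _: () = assert!(size_of::<InflightRaw>() == 24);

impl InflightRaw {
    /// Construct a message while leaving the padding uninitialized.
    pub fn new(mmap_size: u64, mmap_offset: u64, num_queues: u16, queue_size: u16) -> Self {
        Self {
            mmap_size,
            mmap_offset,
            num_queues,
            queue_size,
            _padding: MaybeUninit::uninit(),
        }
    }

    /// Give the padding a defined value; only after this is it sound to
    /// read it as a `u32`.
    pub fn zero_padding(&mut self) -> u32 {
        *self._padding.write(0)
    }
}
```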


stefano-garzarella avatar stefano-garzarella commented on June 13, 2024

For 1, we should provide a fix that can work with any QEMU/VMM, including old ones. I don't understand why it is still UB if we just add the padding here and set it to 0 every time we read from the socket.

Maybe I didn't explain myself correctly: it's UB only if there is uninit memory, whether the padding is implicit or explicit. Reading 4 bytes as MaybeUninit<u32> is not UB.

The thing that's not clear to me is why only the padding has to be MaybeUninit<u32> and not all the fields at this point.

It is also unclear to me why it's not UB if the remote application sending the bytes sets the padding to 0, while it is UB if it puts random values there.
In the end we're passing a buffer to the read() syscall, and the kernel initializes the padding to some value anyway. I still don't understand why it changes anything whether those bytes are 0 or something else. What the kernel or the remote application does is not under rustc's control.
But if that is the case, I think we can't rely on the remote peer to keep our code free of UB. So IMHO we should not trust the remote application (or the compiler and kernel, for that matter), and if there is something that could be UB, we need to handle it in our read() wrapper.
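
A rough sketch of what such a read() wrapper could look like, under stated assumptions: the message is 24 bytes with the padding in the last 4, fields use native byte order, and `Inflight`, `read_inflight` and the helper functions are hypothetical names rather than the crate's API. The wrapper reads into a zero-initialized buffer, overwrites the padding bytes with 0 whatever the peer sent, and decodes the fields from the buffer:

```rust
use std::io::{self, Read};

const MSG_SIZE: usize = 24; // assumed wire size, padding included
const PADDING_START: usize = 20; // assumed offset of the 4 padding bytes

/// Hypothetical decoded message; names are assumptions for this sketch.
#[derive(Debug, PartialEq)]
pub struct Inflight {
    pub mmap_size: u64,
    pub mmap_offset: u64,
    pub num_queues: u16,
    pub queue_size: u16,
    pub padding: u32,
}

fn u64_at(buf: &[u8], off: usize) -> u64 {
    let mut b = [0u8; 8];
    b.copy_from_slice(&buf[off..off + 8]);
    u64::from_ne_bytes(b)
}

fn u16_at(buf: &[u8], off: usize) -> u16 {
    let mut b = [0u8; 2];
    b.copy_from_slice(&buf[off..off + 2]);
    u16::from_ne_bytes(b)
}

fn u32_at(buf: &[u8], off: usize) -> u32 {
    let mut b = [0u8; 4];
    b.copy_from_slice(&buf[off..off + 4]);
    u32::from_ne_bytes(b)
}

/// Read one message and force the padding to 0, ignoring the peer's value.
pub fn read_inflight<R: Read>(src: &mut R) -> io::Result<Inflight> {
    let mut buf = [0u8; MSG_SIZE]; // buffer starts fully initialized
    src.read_exact(&mut buf)?;
    buf[PADDING_START..].fill(0); // do not trust the remote padding
    Ok(Inflight {
        mmap_size: u64_at(&buf, 0),
        mmap_offset: u64_at(&buf, 8),
        num_queues: u16_at(&buf, 16),
        queue_size: u16_at(&buf, 18),
        padding: u32_at(&buf, PADDING_START),
    })
}
```

This version uses no `unsafe` at all, at the cost of decoding each field by hand.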


Ablu avatar Ablu commented on June 13, 2024

The thing that's not clear to me is why only the padding has to be MaybeUninit<u32> and not all the fields at this point.

It is also unclear to me why it's not UB if the remote application sending the bytes sets the padding to 0, while it is UB if it puts random values there.

This is the same concern that I have, and it is what @germag and I have been haggling about :)

Because in the end we're passing a buffer to the read() syscall and the kernel anyway initializes the padding to some values.

This is true for what we read from sockets. But it should not really matter. The story should be the same for reading from shared memory. We may read arbitrary data, but we never leave anything uninitialized.

Ultimately, the bug that we have is that VhostUserInflight is not packed (which the safety contract of ByteValued mandates). Of course fixing that may have implications on backwards-compatibility. I think a quick-fix may be to mark it as packed and add a manual padding int. That is a major bump since currently all members are public, but it should preserve current behavior overall.

The best fix - as you suggested - is of course to fix the ambiguity in the spec.
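
The quick fix could look roughly like the sketch below. `InflightPacked`, its field layout, and `as_bytes` are illustrative assumptions, not the crate's definitions. Marking the struct `packed` rules out implicit padding, and the manual `u32` keeps the size at 24 bytes:

```rust
use std::mem::{align_of, size_of};

/// Hypothetical packed variant with a manual padding field.
#[repr(C, packed)]
#[derive(Clone, Copy, Default)]
pub struct InflightPacked {
    pub mmap_size: u64,
    pub mmap_offset: u64,
    pub num_queues: u16,
    pub queue_size: u16,
    /// Manual padding that keeps the overall size at 24 bytes.
    pub _padding: u32,
}

// A packed struct has alignment 1 and no padding bytes anywhere.
const _: () = assert!(size_of::<InflightPacked>() == 24);
const _: () = assert!(align_of::<InflightPacked>() == 1);

/// View the message as raw bytes, e.g. before writing it to a socket.
pub fn as_bytes(msg: &InflightPacked) -> &[u8] {
    // SAFETY: the struct is packed and integer-only, so all
    // size_of::<InflightPacked>() bytes are initialized, and u8 has no
    // alignment requirement.
    unsafe {
        std::slice::from_raw_parts(
            (msg as *const InflightPacked).cast::<u8>(),
            size_of::<InflightPacked>(),
        )
    }
}
```

One consequence of `packed` is that taking a reference to a field is rejected by the compiler, so callers have to copy fields out by value, for example `let n = msg.num_queues;`.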


Ablu avatar Ablu commented on June 13, 2024

rust-vmm/vm-memory#246 would help with these kinds of issues...

