
vhost's Introduction

vhost

The vhost workspace hosts libraries related to the vhost and vhost-user protocols. It currently consists of the following crates:

  • vhost -> A pure Rust library for vDPA, vhost, and vhost-user.
  • vhost-user-backend -> A framework for implementing vhost-user backend services.

vhost's People

Contributors

ablu, aesteve-rh, alyssais, andreeaflorescu, arronwy, bjzhjing, catdum, dependabot-preview[bot], dependabot[bot], dgreid, eryugey, gaelan, germag, harshanavkis, jiangliu, jynnantonix, keiichiw, lauralt, likebreath, liubogithub, obviyus, rbradford, roypat, shadowcurse, slp, stefano-garzarella, stsquad, vireshk, wllenyj, yangzhon


vhost's Issues

Support for VHOST_USER_GPU_SET_SOCKET, and the protocol on this socket

Hi,
@dorindabassey and I are working on a vhost-user-gpu implementation; see rust-vmm/vhost-device#598.

We aim to be compatible with QEMU's displays, which means supporting the VHOST_USER_GPU_SET_SOCKET message and the protocol spoken on that socket.
For specification see: https://www.qemu.org/docs/master/interop/vhost-user-gpu.html.

The VHOST_USER_GPU_SET_SOCKET message is currently defined in the FrontendReq enum:

GPU_SET_SOCKET = 33,

but is not being handled here:

/// Main entry point for serving requests from the frontend communication channel.
///
/// Receive and handle one incoming request message from the frontend. The caller needs to:
/// - serialize calls to this function
/// - decide what to do when an error happens
/// - optionally recover from failure
pub fn handle_request(&mut self) -> Result<()> {

The protocol on this socket is nearly the same as the normal vhost-user protocol, and the header structure is exactly the same as the existing VhostUserMsgHeader. The problem is the flags: only the REPLY flag is specified, and not even a VERSION flag is used.

I would like to implement this protocol, but my main problem is that I don't know whether adding new device-specific protocol code to this project is acceptable. I see that there are FS-specific things, but they might be removed (see #213).

I can think of three solutions with different levels of genericness:

  • creating a GpuBackend struct and passing it to a set_gpu_socket callback when VHOST_USER_GPU_SET_SOCKET is received, in a similar fashion to the existing Backend and set_backend_req_fd callback (a sketch follows after this list):

    Ok(FrontendReq::SET_BACKEND_REQ_FD) => {
        self.check_proto_feature(VhostUserProtocolFeatures::BACKEND_REQ)?;
        self.check_request_size(&hdr, size, hdr.get_size() as usize)?;
        let res = self.set_backend_req_fd(files);
        self.send_ack_message(&hdr, res)?;
    }

    fn set_backend_req_fd(&mut self, files: Option<Vec<File>>) -> Result<()> {
        let file = take_single_file(files).ok_or(Error::InvalidMessage)?;
        // SAFETY: Safe because we have ownership of the files that were
        // checked when received. We have to trust that they are Unix sockets
        // since we have no way to check this. If not, it will fail later.
        let sock = unsafe { UnixStream::from_raw_fd(file.into_raw_fd()) };
        let backend = Backend::from_stream(sock);
        self.backend.set_backend_req_fd(backend);
        Ok(())
    }

  • having a set_gpu_socket but passing it a UnixStream, letting the user parse the messages

    • here it would be nice to make some utilities from vhost-user public, such as Endpoint, so there is no code duplication
  • not handling VHOST_USER_GPU_SET_SOCKET in this project at all, but providing a fully generic way to handle custom messages from the frontend, i.e. messages that are not handled by the handle_request referenced above.
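A minimal sketch of the first option, mirroring the SET_BACKEND_REQ_FD handling quoted above (GpuBackend, GpuBackend::from_stream, and set_gpu_socket are names proposed in this issue, not existing API):

fn set_gpu_socket(&mut self, files: Option<Vec<File>>) -> Result<()> {
    let file = take_single_file(files).ok_or(Error::InvalidMessage)?;
    // SAFETY: we own the received descriptor; as with SET_BACKEND_REQ_FD,
    // we have to trust that it is a Unix socket.
    let sock = unsafe { UnixStream::from_raw_fd(file.into_raw_fd()) };
    let gpu_backend = GpuBackend::from_stream(sock);
    self.backend.set_gpu_socket(gpu_backend);
    Ok(())
}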

I am not sure what this project's stance is on device-specific and/or non-standard features; any suggestions?

Potential regression for vDPA support

While upgrading Cloud Hypervisor's dependencies on various rust-vmm crates [1], our CI reported that the vDPA integration tests (which use in-kernel vDPA device simulators) are failing with the latest vhost release, v0.7.0 [2].

I found the breaking commit is 7e76859, which essentially switches to using the host virtual address (HVA) instead of the guest physical address (GPA) with the VHOST_SET_VRING_ADDR ioctl. Considering the in-kernel vDPA device simulators were working fine with GPA, I wonder if this commit introduces a regression in vDPA support.

[1] cloud-hypervisor/cloud-hypervisor#5451
[2] https://cloud-hypervisor-jenkins.westus.cloudapp.azure.com/blue/organizations/jenkins/cloud-hypervisor/detail/PR-5451/6/pipeline/169

/cc @sboeuf @rbradford

Missing Input Parameter Validation

From @y-x41

While reviewing the vhost crate, it was found that the function VhostUserHandler::set_vring_base() lacks validation of the input parameter index, which is used to index an internal vector and could potentially lead to an out-of-bounds access resulting in an application panic. The listing below depicts the vulnerable code segment.

impl<S, V, B> VhostUserBackendReqHandlerMut for VhostUserHandler<S, V, B>
where
    S: VhostUserBackend<V, B>,
    V: VringT<GM<B>>,
    B: NewBitmap + Clone,
{

    // [...]

    fn set_vring_base(&mut self, index: u32, base: u32) -> VhostUserResult<()> {
        let event_idx: bool = (self.acked_features & (1 << VIRTIO_RING_F_EVENT_IDX)) != 0;

        self.vrings[index as usize].set_queue_next_avail(base as u16);
        self.vrings[index as usize].set_queue_event_idx(event_idx);
        self.backend.set_event_idx(event_idx);

        Ok(())
    }
}

X41 advises performing thorough input parameter validation to prevent potential out-of-bounds reads/writes.
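A minimal sketch of such validation (the error variant chosen here is an assumption):

fn set_vring_base(&mut self, index: u32, base: u32) -> VhostUserResult<()> {
    // Reject out-of-range queue indices before touching self.vrings.
    if index as usize >= self.vrings.len() {
        return Err(VhostUserError::InvalidParam);
    }

    let event_idx: bool = (self.acked_features & (1 << VIRTIO_RING_F_EVENT_IDX)) != 0;

    self.vrings[index as usize].set_queue_next_avail(base as u16);
    self.vrings[index as usize].set_queue_event_idx(event_idx);
    self.backend.set_event_idx(event_idx);

    Ok(())
}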

Vring addresses are not inside guest memory range if placed behind a vIOMMU

I think there's a limitation in supporting the vIOMMU use case because of the following lines:

When an IOTLB is involved, vhost in the kernel considers desc_table_addr, used_ring_addr, and avail_ring_addr to be IOVAs, meaning each could be either a GPA or a GVA. If it's a GPA, checking that the address is within the guest memory range is fine, but if it's a GVA, that check doesn't make any sense.

Could we introduce a boolean or flag to let VhostKernBackend know about this use case, so the incorrect check can be avoided?

@stefano-garzarella I identified this issue while implementing vDPA for Cloud Hypervisor, so I was thinking an alternative might be to implement is_valid() on VhostKernVdpa, overriding the default implementation (a rough illustration follows). WDYT?
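A rough illustration of that alternative (the trait surface and field names here are assumptions based on the issue text, not verified against the crate):

impl<AS: GuestAddressSpace> VhostKernBackend for VhostKernVdpa<AS> {
    type AS = AS;

    fn mem(&self) -> &Self::AS {
        &self.mem
    }

    // Override the default: ring addresses are IOVAs when a vIOMMU is in
    // place, so only validate queue geometry, not guest memory ranges.
    fn is_valid(&self, config_data: &VringConfigData) -> bool {
        config_data.queue_size <= config_data.queue_max_size
            && config_data.queue_size.is_power_of_two()
    }
}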

/cc @jiangliu

Regenerate vhost_binding.rs and move custom changes in another file

In PR #185 we fixed some new clippy warnings, but one of them is related to __IncompleteArrayField, which should be auto-generated by bindgen.

We used an old bindgen version that also generated Copy/Clone implementations, but it looks like newer versions removed that: rust-lang/rust-bindgen#1431

@jiangliu agreed to regenerate it with a new version of bindgen.
We also modified the generated vhost_binding.rs file a bit, so it may be better to move the new structs into another file so we can update the bindings easily next time.

Check non-default feature code and tests in vhost-ci/clippy-x86

The current clippy-x86 builder in vhost-ci doesn't cover code behind non-default features (e.g. vhost-kern, vhost-user, etc.), so we have some clippy errors there now, which will be fixed by #23.
To improve code quality, it would be nice if we could pass more flags to clippy there, e.g. cargo clippy --all-targets --all-features.

Inconsistent licensing

Cargo.toml says "Apache-2.0 OR BSD-3-Clause", but many files contain lines like the following, which suggest that only the Apache-2.0 license applies:

// SPDX-License-Identifier: Apache-2.0

It's therefore unclear whether using this code under the terms of only the BSD-3-Clause license is permitted. If the intention is to use BSD-3-Clause for code originally taken from crosvm but Apache-2.0 for new code (as Cloud Hypervisor does), then Cargo.toml should say "Apache-2.0 AND BSD-3-Clause".

ByteValued implementation for VhostUserInflight is UB

The VhostUserInflight struct implements ByteValued, but it has 4 bytes of trailing padding, so reading that uninitialized memory is UB, making the API unsound:

/// Memory region descriptor used as payload for the SET_INFLIGHT_FD and
/// GET_INFLIGHT_FD requests.
#[repr(C)]
#[derive(Copy, Clone, Default)]
pub struct VhostUserInflight {
    /// Size of the area to track inflight I/O.
    pub mmap_size: u64,
    /// Offset of this area from the start of the supplied file descriptor.
    pub mmap_offset: u64,
    /// Number of virtqueues.
    pub num_queues: u16,
    /// Size of virtqueues.
    pub queue_size: u16,
}
...
unsafe impl ByteValued for VhostUserInflight {}

The problem is basically reading that uninitialized padding as &[u8]: reading uninitialized memory is UB, and even creating a reference to uninitialized memory is UB.
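A minimal illustration of where the padding leaks (a sketch; it assumes the public vhost::vhost_user::message path and vm_memory's ByteValued::as_slice):

use vhost::vhost_user::message::VhostUserInflight;
use vm_memory::ByteValued;

fn main() {
    // repr(C) rounds the 20 payload bytes (8 + 8 + 2 + 2) up to the struct's
    // u64 alignment, leaving 4 bytes of trailing padding.
    assert_eq!(std::mem::size_of::<VhostUserInflight>(), 24);

    let inflight = VhostUserInflight::default();
    // ByteValued::as_slice() exposes all 24 bytes, padding included; this is
    // where the uninitialized read happens.
    let _bytes: &[u8] = inflight.as_slice();
}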

Possible solution

Although the right thing to do would be to fix QEMU and make this struct packed, it would be difficult to do so without breaking backward compatibility.

Perhaps we can fix this while keeping backwards compatibility: make VhostUserInflight #[repr(C, packed)] and, after reading a VhostUserInflight, read and discard the trailing MaybeUninit<u32>. We could add a new method Endpoint<R>::skip(len: usize) for this, but we would need to do the skipping at the beginning of BackendReqHandler::handle_request():

pub fn handle_request(&mut self) -> Result<()> {
    // ...
    let (size, buf) = match hdr.get_size() {
        0 => (0, vec![0u8; 0]),
        len => {
            // The UB happens inside recv_data().
            let (size2, rbuf) = self.main_sock.recv_data(len as usize)?;
            if size2 != len as usize {
                return Err(Error::InvalidMessage);
            }
            (size2, rbuf)
        }
    };
    // ...
}
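A minimal sketch of the proposed helper (Endpoint::skip does not exist yet; this reuses the existing recv_data path, and the error variant is an assumption):

impl<R: Req> Endpoint<R> {
    /// Read and discard `len` bytes from the socket, e.g. trailing padding
    /// that must not be interpreted.
    pub fn skip(&mut self, len: usize) -> Result<()> {
        let (nread, _discarded) = self.recv_data(len)?;
        if nread != len {
            return Err(Error::PartialMessage);
        }
        Ok(())
    }
}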

Alternatively, fix QEMU (and other frontends) by adding an explicit padding field and guaranteeing that it is initialized to zero:

typedef struct VhostUserInflight {
    uint64_t mmap_size;
    uint64_t mmap_offset;
    uint16_t num_queues;
    uint16_t queue_size;
    uint32_t padding; // MUST be zero initialized
} VhostUserInflight;

But this forces us to have two VhostUserInflight definitions, one for the backend and one for the frontend, because the frontend must continue sending the unpacked version with padding.

This issue is to continue the discussion started on #208

Add CODEOWNERS file

Please add a CODEOWNERS file with the details of the maintainers, using the format outlined in https://help.github.com/en/articles/about-code-owners.

Most projects can simply use the wildcard syntax, e.g.

* @owner1 @owner2

Not only does this provide a way to see who is responsible for this repository, the owners will also automatically be notified of incoming PRs for review.

Remove non-standard BackendReq types

The following back-end request types are not part of the vhost-user protocol and should be removed:

pub enum BackendReq {
    ...
    /// Virtio-fs draft: map file content into the window.
    FS_MAP = 6,
    /// Virtio-fs draft: unmap file content from the window.
    FS_UNMAP = 7,
    /// Virtio-fs draft: sync file content.
    FS_SYNC = 8,
    /// Virtio-fs draft: perform a read/write from an fd directly to GPA.
    FS_IO = 9,
    ...
}

These are not in the vhost-user standard, which is confusing in itself, and they even conflict with values that are defined in the standard (see "Back-end message types"), such as VHOST_USER_BACKEND_SHARED_OBJECT_ADD, VHOST_USER_BACKEND_SHARED_OBJECT_REMOVE, and VHOST_USER_BACKEND_SHARED_OBJECT_LOOKUP.
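Concretely, the clash looks like this (values as listed in the spec's "Back-end message types" table; worth re-checking against the latest spec revision):

// vhost-user spec, "Back-end message types":
// VHOST_USER_BACKEND_SHARED_OBJECT_ADD    = 6  // collides with FS_MAP
// VHOST_USER_BACKEND_SHARED_OBJECT_REMOVE = 7  // collides with FS_UNMAP
// VHOST_USER_BACKEND_SHARED_OBJECT_LOOKUP = 8  // collides with FS_SYNC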

This issue is to continue the discussion started on #205

Release vhost-v0.11.0 and vhost-user-backend-v0.14.0

We recently added several new features, mostly related to live migration (internal state migration, POSTCOPY support, SET_LOG_BASE message support), so I think we can release a new version of both crates soon.

Comments/objections?

Clarification of return type of `handle_event`

VhostUserBackendMut::handle_event has a bool return type that is not documented. Drilling into the code reveals:

                // handle_event() returns true if an event is received from the exit event fd.

[1]

However, it looks like this is already handled at [2]. Are there still valid use cases where an implementor should return true there? Or should we drop the bool from VhostUserBackend and VhostUserBackendMut? (See the sketch after the links below.)

@jiangliu: Maybe you can comment on this?

[1] https://github.com/rust-vmm/vhost/blame/40006d0b3981504320ae66e165c9fd36ce9cfb18/crates/vhost-user-backend/src/event_loop.rs#L196
[2] https://github.com/rust-vmm/vhost/blame/40006d0b3981504320ae66e165c9fd36ce9cfb18/crates/vhost-user-backend/src/event_loop.rs#L207
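The proposed simplification would look roughly like this (signatures reconstructed from the crate's event loop code; treat them as a sketch):

// Before: the bool signals "exit event received", which the event loop
// already detects on its own.
fn handle_event(&mut self, device_event: u16, evset: EventSet, vrings: &[V], thread_id: usize) -> io::Result<bool>;

// After: no return value beyond success/failure.
fn handle_event(&mut self, device_event: u16, evset: EventSet, vrings: &[V], thread_id: usize) -> io::Result<()>;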

Goal for the crate

Looking for feedback and guidance here.
The README from this crate says:

A crate to implement the vHost ioctl interfaces and vHost-user protocol.

Do we really want to combine the vhost part, which is about defining traits usable by both vhost and vhost-user devices, with the vhost-user protocol part?
I think the vhost-user protocol could live in its own crate, since it could be reused by a vhost-user daemon; this would not force such a daemon to pull in the vhost/vhost-user traits code.

Mac build support

Unable to build on macOS Monterey 12.2.1. Any chance to get around this?

➜  virtiofsd git:(main) cargo build --release
   Compiling vhost v0.3.0
   Compiling futures-executor v0.3.19
   Compiling structopt v0.3.26
error[E0432]: unresolved import `vmm_sys_util::eventfd`
  --> /Users/myuser/.cargo/registry/src/github.com-1ecc6299db9ec823/vhost-0.3.0/src/backend.rs:16:19
   |
16 | use vmm_sys_util::eventfd::EventFd;
   |                   ^^^^^^^ could not find `eventfd` in `vmm_sys_util`

error[E0432]: unresolved import `vmm_sys_util::sock_ctrl_msg`
  --> /Users/myuser/.cargo/registry/src/github.com-1ecc6299db9ec823/vhost-0.3.0/src/vhost_user/connection.rs:18:19
   |
18 | use vmm_sys_util::sock_ctrl_msg::ScmSocket;
   |                   ^^^^^^^^^^^^^ could not find `sock_ctrl_msg` in `vmm_sys_util`

   Compiling futures v0.3.19
error[E0599]: no method named `send_with_fds` found for struct `UnixStream` in the current scope
   --> /Users/myuser/.cargo/registry/src/github.com-1ecc6299db9ec823/vhost-0.3.0/src/vhost_user/connection.rs:142:19
    |
142 |         self.sock.send_with_fds(iovs, rfds).map_err(Into::into)
    |                   ^^^^^^^^^^^^^ method not found in `UnixStream`

error[E0599]: no method named `recv_with_fds` found for struct `UnixStream` in the current scope
   --> /Users/myuser/.cargo/registry/src/github.com-1ecc6299db9ec823/vhost-0.3.0/src/vhost_user/connection.rs:317:45
    |
317 |         let (bytes, _) = unsafe { self.sock.recv_with_fds(&mut iovs, &mut [])? };
    |                                             ^^^^^^^^^^^^^ method not found in `UnixStream`

error[E0599]: no method named `recv_with_fds` found for struct `UnixStream` in the current scope
   --> /Users/myuser/.cargo/registry/src/github.com-1ecc6299db9ec823/vhost-0.3.0/src/vhost_user/connection.rs:349:38
    |
349 |         let (bytes, fds) = self.sock.recv_with_fds(iovs, &mut fd_array)?;
    |                                      ^^^^^^^^^^^^^ method not found in `UnixStream`

Some errors have detailed explanations: E0432, E0599.
For more information about an error, try `rustc --explain E0432`.
error: could not compile `vhost` due to 5 previous errors
warning: build failed, waiting for other jobs to finish...
error: build failed

vhost-user: Implement VHOST_USER_PROTOCOL_F_CONFIGURE_MEM_SLOTS

(This issue was originally created by @dagrh at cloud-hypervisor/cloud-hypervisor#1460.)

The vhost-user protocol has a limit of ~8 memory slots it can handle; that's an issue when you add lots of DIMMs, e.g. in Kata when you have lots of containers in a pod (related to kata-containers/runtime#2795).
QEMU recently got support for VHOST_USER_PROTOCOL_F_CONFIGURE_MEM_SLOTS, which does away with the limit on the slots.
We need the same feature here.
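For reference, the feature involves the following additions (values taken from the vhost-user spec; double-check them against the current spec text):

// Protocol feature bit:
// VHOST_USER_PROTOCOL_F_CONFIGURE_MEM_SLOTS = 15
//
// New frontend messages:
// GET_MAX_MEM_SLOTS = 36, // ask how many regions the backend supports
// ADD_MEM_REG       = 37, // register a single memory region
// REM_MEM_REG       = 38, // unregister a single memory region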

The easiest way to trigger it is to start a vhost_user_fs daemon and then connect a QEMU with:

-m 2G,maxmem=16G,slots=16 \
-object memory-backend-memfd,id=mem1,size=256M,share=on -device pc-dimm,id=dimm1,memdev=mem1 \
-object memory-backend-memfd,id=mem2,size=256M,share=on -device pc-dimm,id=dimm2,memdev=mem2 \
-object memory-backend-memfd,id=mem3,size=256M,share=on -device pc-dimm,id=dimm3,memdev=mem3 \
-object memory-backend-memfd,id=mem4,size=256M,share=on -device pc-dimm,id=dimm4,memdev=mem4 \
-object memory-backend-memfd,id=mem5,size=256M,share=on -device pc-dimm,id=dimm5,memdev=mem5 \
-object memory-backend-memfd,id=mem6,size=256M,share=on -device pc-dimm,id=dimm6,memdev=mem6 \
-object memory-backend-memfd,id=mem7,size=256M,share=on -device pc-dimm,id=dimm7,memdev=mem7 \
-object memory-backend-memfd,id=mem8,size=256M,share=on -device pc-dimm,id=dimm8,memdev=mem8

vhost_kern: add vhost-net support

Hi,

vhost-net support is missing from the vhost_kern module here.

I'd like to make a pull request later. For better network performance, we have added vhost-net support in our private repository, and it has been working well for a while.
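A minimal sketch of what such support could look like, modeled on the existing vhost_kern backends (the trait surface implied here is an assumption to verify against the vhost_kern module):

use std::fs::{File, OpenOptions};
use std::os::unix::io::{AsRawFd, RawFd};

use vm_memory::GuestAddressSpace;

/// Hypothetical vhost-net handle: open /dev/vhost-net and expose the fd so
/// the generic vhost_kern ioctl helpers can drive it.
pub struct Net<AS: GuestAddressSpace> {
    fd: File,
    mem: AS,
}

impl<AS: GuestAddressSpace> Net<AS> {
    pub fn new(mem: AS) -> std::io::Result<Self> {
        let fd = OpenOptions::new()
            .read(true)
            .write(true)
            .open("/dev/vhost-net")?;
        Ok(Net { fd, mem })
    }

    pub fn mem(&self) -> &AS {
        &self.mem
    }
}

impl<AS: GuestAddressSpace> AsRawFd for Net<AS> {
    fn as_raw_fd(&self) -> RawFd {
        self.fd.as_raw_fd()
    }
}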

vhost: Migration of vhost_user device using ovs_dpdk as backend fails using Cloud Hypervisor on Ubuntu 22.04 host

We have noticed an issue in vhost while migrating VMs using Cloud Hypervisor. We are attempting to migrate a Docker container from Ubuntu 20.04 to Ubuntu 22.04 and hit this issue. Details about our attempts are in cloud-hypervisor/cloud-hypervisor#5877.

The live_migration::live_migration_sequential::test_live_migration_ovs_dpdk test from https://github.com/cloud-hypervisor/cloud-hypervisor/blob/main/tests/integration.rs#L9616 gets stuck when running on an Ubuntu 22.04 host.

On further debugging, it is stuck while sending the SET_LOG_BASE request to the backend at https://github.com/rust-vmm/vhost/blob/main/vhost/src/vhost_user/connection.rs#L538; the call never returns (neither error nor success).

A backtrace is attached for further details.

Steps to reproduce:

Install openvswitch-switch-dpdk and run:

$  sudo modprobe openvswitch
$  echo 6144 | sudo tee /proc/sys/vm/nr_hugepages
$  sudo service openvswitch-switch start
$  sudo ovs-vsctl init
$  sudo ovs-vsctl set Open_vSwitch . other_config:dpdk-init=true
$  sudo service openvswitch-switch restart
$  sudo ovs-vsctl add-br ovsbr0 -- set bridge ovsbr0 datapath_type=netdev
$  sudo ovs-vsctl add-port ovsbr0 vhost-user2 -- set Interface vhost-user2 type=dpdkvhostuserclient options:vhost-server-path=/tmp/dpdkvhostclient2
$  sudo ip link set up dev ovsbr0
$  sudo service openvswitch-switch restart

Clone and build Cloud Hypervisor:

$ git clone [email protected]:cloud-hypervisor/cloud-hypervisor.git
$ cd cloud-hypervisor
$ cargo build
$ sudo setcap cap_net_admin+ep ./target/debug/cloud-hypervisor

Build a custom kernel as described here: https://github.com/cloud-hypervisor/cloud-hypervisor/tree/main#custom-kernel-and-disk-image
Get the guest image from here: https://cloud-hypervisor.azureedge.net/focal-server-cloudimg-amd64-custom-20210609-0.qcow2
and follow these steps: https://github.com/cloud-hypervisor/cloud-hypervisor/tree/main#disk-image

Once you have the custom kernel and guest image, run the commands below from different terminals:

$ target/debug/cloud-hypervisor --api-socket /tmp/cloud-hypervisor.sock \
    --cpus boot=2 --memory size=0,shared=on \
    --memory-zone id=mem0,size=1G,shared=on,host_numa_node=0 \
    --kernel <path to custom kernel>/vmlinux \
    --cmdline "root=/dev/vda1 console=hvc0 rw systemd.journald.forward_to_console=1" \
    --disk path=<path to focal>/focal-server-cloudimg-amd64-custom-20210609-0.raw path=/tmp/ubuntu-cloudinit.img \
    --net tap=,mac=12:34:56:78:90:01,ip=192.168.1.1,mask=255.255.255.0 vhost_user=true,socket=/tmp/dpdkvhostclient2,num_queues=2,queue_size=256,vhost_mode=server \
    -v

$ target/debug/cloud-hypervisor --api-socket /tmp/cloud-hypervisor.sock.dest -v

$ target/debug/ch-remote --api-socket /tmp/cloud-hypervisor.sock.dest receive-migration unix:/tmp/live-migration.sock

$ target/debug/ch-remote --api-socket /tmp/cloud-hypervisor.sock send-migration unix:/tmp/live-migration.sock

vhost_user: Use File instead of RawFd in VhostUserSlaveReqHandler trait

Currently, some methods in the VhostUserSlaveReqHandler trait, such as set_vring_kick, take RawFds as arguments. However, this design doesn't allow a trait implementor to know whether the given FDs are valid and who owns them, which means it cannot safely call from_raw_fd() on a given FD.

So it'd be nice if the VhostUserSlaveReqHandler trait were updated to pass around Files instead of RawFds; the ownership of the underlying FD would then become clearer.
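Roughly, the change would look like this (trait fragments simplified for illustration; only set_vring_kick is shown):

use std::fs::File;
use std::os::unix::io::RawFd;

// Before: a RawFd whose ownership is unclear; calling from_raw_fd() on it
// is unsafe without knowing who owns the descriptor.
fn set_vring_kick(&mut self, index: u8, fd: Option<RawFd>) -> Result<()>;

// After: an owned File, so the implementor can safely take ownership and
// convert it (e.g. into an EventFd or UnixStream).
fn set_vring_kick(&mut self, index: u8, file: Option<File>) -> Result<()>;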

Add support for Inflight I/O tracking

The vhost-user implementation is missing the inflight I/O tracking feature, which is needed to recover fully from a backend crash or restart while the VM/guest is running.
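For reference, the relevant protocol pieces (values per the vhost-user spec, consistent with GPU_SET_SOCKET = 33 quoted earlier; verify against the spec):

// Protocol feature bit:
// VHOST_USER_PROTOCOL_F_INFLIGHT_SHMFD = 12
//
// Frontend messages:
// GET_INFLIGHT_FD = 31, // backend allocates and returns the tracking area
// SET_INFLIGHT_FD = 32, // frontend hands the area back after a restart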

Offset in VHOST_USER_GET_CONFIG/SET_CONFIG mismatch with qemu

The vhost crate considers offsets between the config offset and the config length to be valid:

} else if self.offset < 0x100 {

I believe the literal 0x100 is the config offset.

QEMU, on the other hand, considers offsets from 0 to the config length valid and assumes the offset uses CONFIG_OFFSET as a base.
See get_config in vhost-user.c: it sets the config offset to 0 and the length to the maximum to cover the entire area.
https://source.codeaurora.org/quic/qemu/qemu/tree/hw/virtio/vhost-user.c?h=qemu.org/master#n2068

This mismatch causes QEMU's GET_CONFIG message to be considered invalid by the vhost crate.
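To illustrate the two interpretations (the constants and function names below are made up for the sketch):

const CONFIG_OFFSET: u32 = 0x100; // start of the device config space

// vhost crate: offset is treated as absolute, so anything below
// CONFIG_OFFSET is rejected.
fn vhost_crate_accepts(offset: u32) -> bool {
    offset >= CONFIG_OFFSET
}

// QEMU: offset is relative to CONFIG_OFFSET, so it starts at 0.
fn qemu_accepts(offset: u32, size: u32, config_len: u32) -> bool {
    offset < config_len && size <= config_len - offset
}

fn main() {
    // A QEMU GET_CONFIG with offset 0 passes QEMU's own check but is
    // rejected by the vhost crate:
    assert!(qemu_accepts(0, 8, 256));
    assert!(!vhost_crate_accepts(0));
}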

I didn't find anything in the spec indicating which is correct:
https://qemu.readthedocs.io/en/latest/interop/vhost-user.html

Deferring to QEMU's behavior seems like the best option for compatibility, but it might break existing users.
