
nemu's Introduction

Archived

NEMU is no longer active and is archived. Cloud Hypervisor is the successor https://github.com/cloud-hypervisor/cloud-hypervisor

NEMU, a cloud hypervisor

NEMU is an open source hypervisor specifically built and designed to run modern cloud workloads on modern 64-bit Intel and ARM CPUs.

Rationale

Modern guest operating systems that host cloud workloads run on virtual hardware platforms that do not require any legacy hardware. Additionally, modern CPUs used in data centers have advanced virtualization features that eliminate the need for most CPU emulation.

There are currently no open source hypervisor solutions with a clear and narrow focus on running cloud-specific workloads on modern CPUs. All available solutions have evolved over time and try to be fairly generic. They attempt to support a wide range of virtual hardware architectures and to run on hardware with varying degrees of virtualization support. This results in a need to provide a large set of legacy platforms and device models, requiring CPU, device and platform emulation. As a consequence, they are built on top of large and complex code bases.

NEMU, on the other hand, aims to leverage KVM and to focus exclusively on running modern, cloud native workloads on a limited set of hardware architectures and platforms. It assumes fairly recent CPUs and KVM, allowing for the elimination of most emulation logic.

This allows for a smaller code base, lower complexity and a reduced attack surface compared to existing solutions. It also leaves more room for cloud-specific optimizations and for building a more performant hypervisor for the cloud. Reducing the size and complexity of the code allows for easier review, fuzz testing, modularization and future innovation.

QEMU base

QEMU is the current de facto standard open source cloud hypervisor. It has a rich set of features that have been developed and tested over time, including live migration, PCI, memory, NVDIMM and CPU hotplug, VFIO, mediated device passthrough and vhost-user. QEMU has also been the code base on which significant effort and innovation has been invested to create multiple performant I/O models.

It also carries very broad support for legacy features, platforms and devices, and is capable of running on a large number of hardware platforms. It also allows for cross-platform emulation. One of its fundamental goals is to be as generic as possible, to run on a large set of hardware and to host a diversity of workloads. Emulation support needed to be built into the code because hardware lacked critical virtualization features.

QEMU allows for build-time configuration of some of its rich feature set. However, a large amount of the code base cannot be compiled out, as the emulated platforms assume that certain legacy devices are always present. QEMU also carries abstractions within the code to support all of these legacy features.

NEMU

NEMU is based on QEMU and leverages its rich feature set, but with a much narrower focus. It builds on the performant, robust and stable QEMU codebase without supporting the myriad of features, platforms and hardware that are not relevant for the cloud.

The goal of NEMU is to retain the absolute minimal subset of the QEMU codebase required for the feature set described below. The QEMU code base will also be simplified to reduce the number of generic abstractions.

Requirements

NEMU provides a PCI virtio platform with support for VFIO-based direct device assignment and mediated device assignment. It also aims to retain support for live migration, vhost-user and build-time configurable device hotplug for PCI, memory, NVDIMM and CPU. NEMU will need to emulate a small subset of features, including the PCI host bridge.

NEMU also introduces a new QEMU x86-64 machine type: virt. It is a purely virtual platform that does not try to emulate any existing x86 chipset or legacy bus (ISA, SMBus, etc.) and offloads as many features as possible to KVM. This is a similar approach to the already existing AArch64 virt machine type, and NEMU will only support these two virt machine types.
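As a rough illustration, a minimal virt invocation can be assembled as an argument list. This is only a sketch: the firmware and image paths are hypothetical placeholders, and the flags mirror the `-machine virt,accel=kvm` invocations quoted in the issues later on this page.

```python
# Sketch: assemble a minimal NEMU/virt command line as an argument list.
# Paths are hypothetical; the flags mirror invocations quoted elsewhere
# on this page (KVM only, UEFI firmware, virtio devices, no defaults).
def virt_cmdline(firmware, image):
    return [
        "qemu-system-x86_64",
        "-machine", "virt,accel=kvm,kernel_irqchip",  # the new virt machine type
        "-bios", firmware,                            # UEFI (OVMF), no legacy BIOS
        "-cpu", "host",
        "-m", "512",
        "-nographic", "-nodefaults",
        "-device", "virtio-blk-pci,drive=image",
        "-drive", f"if=none,id=image,file={image}",
    ]

cmd = virt_cmdline("OVMF.fd", "disk.img")
print(" ".join(cmd))
```

Keeping the arguments as a list (rather than a single shell string) avoids quoting mistakes when launching the hypervisor from a test harness.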

Below is a list of QEMU features that NEMU will retain and add.

High Level

  • KVM and KVM only based
  • Minimal emulation
  • Low latency
  • Low memory footprint
  • Low complexity
  • Small attack surface
  • 64-bit support only
  • Optional and build time configurable CPU, memory, PCI and NVDIMM hotplug
  • Machine to machine migration

Architectures

NEMU only supports two 64-bit CPU architectures:

  • x86-64
  • AArch64

Guest OS

  • 64-bit Linux

Guest Platforms

  • virt (x86-64) QEMU x86-64 virtual machine
  • virt (AArch64) QEMU AArch64 virtual machine

Host Platforms

  • Linux

Firmware and boot

  • UEFI
  • ACPI
    • Hardware Reduced ACPI
    • Optional hotplug support
      • CPU
      • Memory
      • NVDIMM
      • PCI devices
      • VFIO
      • vhost-user

Boot methods

  • UEFI boot

Memory

  • QEMU allocated memory
  • File mapped memory
  • Huge pages
  • Memory pinning

Devices

Models

  • virtio
    • blk
    • console
    • crypto
    • pci-net
    • rng-pci
    • scsi
      • virtio
      • vhost
    • 9pfs
    • vhost-user-scsi
    • vhost-user-net
    • vhost-user-blk
    • vhost-vsock-pci
  • vfio
    • network
    • mediated device
    • storage
    • rdma
  • NVDIMM
  • TPM
    • vTPM
    • Host TPM passthrough
  • SCSI controller
  • PCI controller (pci-lite)

Block

  • cdrom
  • nvme
  • ceph/rbd

Guest Image Formats

  • QCOW2
  • RAW
  • VHD

Migration

  • Network based over TLS
  • File based (Local migration)

Monitoring

  • QMP
  • QAPI
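QMP commands are JSON objects sent over the monitor socket, each with an `execute` key and optional `arguments`. The sketch below constructs such a message; the `device_add` parameters (`virtio-net-pci`, `id=net1`) are taken from the hotplug examples in the issues further down and are only illustrative.

```python
import json

# Sketch of the QMP wire format: each command is a JSON object with
# "execute" and optional "arguments". The device parameters here are
# illustrative, matching the hotplug examples quoted in the issues.
def qmp_command(name, **arguments):
    msg = {"execute": name}
    if arguments:
        msg["arguments"] = arguments
    return json.dumps(msg)

hotplug = qmp_command("device_add", driver="virtio-net-pci", id="net1")
print(hotplug)
```

A real client would also read the server's `{"QMP": ...}` greeting and send `qmp_capabilities` before issuing commands.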

To be discussed

  • 64-bit Windows Server (headless)
  • qboot
  • Graphic Console
  • virtio-block-crypto
  • QEMU client support as modules
    • iscsi
    • nbd
    • nfs
    • gluster
  • RDMA live migration
  • SLIRP
  • Guest agent

nemu's People

Contributors

afaerber, agraf, aliguori, aurel32, avikivity, balrog-kun, berrange, blueswirl, bonzini, dagrh, dgibson, ebblake, edgarigl, ehabkost, elmarco, gkurz, huth, jan-kiszka, jnsnow, kevmw, kraxel, mstsirkin, philmd, pm215, rth7680, stefanharh, stsquad, stweil, vivier, xanclic


nemu's Issues

SMBIOS for virt

@sameo @rbradford @yangzhon @mcastelino
Two key points in conclusion.

  1. Do we need to enable CONFIG_SMBIOS for the virt platform in order to use DMI?
  2. If yes, there are 2 sub-issues.
    1. CONFIG_SMBIOS=y is set in default-configs/x86_64-softmmu.mak, but when hw/i386/fw.c fw_build_smbios() checks it, it does not behave as expected. So NEMU does not build SMBIOS tables for the guest and dmesg shows [ 0.000000] DMI not present or invalid.
    2. After working around 1, only a q35 guest shows the correct DMI in dmesg; virt still shows [ 0.000000] DMI not present or invalid.

Just a few questions

(You don't have to be able to answer all questions to reply)

  1. Will 32-bit OSes ever be supported?

  2. Can Ceph be used as NEMU's block storage? KVM/QEMU supports it.

  3. Would it be possible to run OSes other than Windows and Linux? Or will NEMU reject them?

  4. Is anyone using this in production? Can it be?

Several questions about nemu's development

Hello all,

Here are some questions about nemu's development status.

  1. The README says nemu is KVM-only, so why does the code base still contain TCG code?
  2. Is the removal of unrelated architectures and files finished, or still in progress? Do you
    still accept PRs that remove such unrelated things?
  3. Where does the discussion about more concrete features happen? For example, I don't know why VNC
    is not enabled by default.

Thanks,
Li Qiang

Q35 ACPI warnings

[    0.016000] ACPI BIOS Error (bug): Failure creating [\_GPE._HID], AE_ALREADY_EXISTS (20180313/dswload2-316)
[    0.021017] ACPI Error: AE_ALREADY_EXISTS, During name lookup/catalog (20180313/psobject-221)
[    0.024016] ACPI BIOS Error (bug): Failure creating [\_GPE._E04], AE_ALREADY_EXISTS (20180313/dswload2-316)
[    0.029016] ACPI Error: AE_ALREADY_EXISTS, During name lookup/catalog (20180313/psobject-221)
[    0.033016] ACPI BIOS Error (bug): Could not resolve [\_SB.NVDR], AE_NOT_FOUND (20180313/psargs-330)
[    0.037016] ACPI Error: Ignore error and continue table load (20180313/psobject-604)

Observed on serial port after rebooting.

Seen on the topic/virt-x86 branch. We need to check whether this is still present in the 3.0 version and, if so, whether it is also present in upstream 3.0.

NVDIMM support

We need to implement NVDIMM support for cases that require the image to be passed in as a fake NVDIMM.

metrics: Compare virt(x86) and pc machine types

I am starting this issue as a thread to discuss and agree on how we should properly compare virt and pc, to show that we improve various metrics such as binary size, memory footprint, boot time, ...

What matters?
We need a list of metrics where we think virt should make a difference.

  • vm exits: from the meeting this morning, we discussed VM exits, which might occur less frequently since virt emulates fewer devices than pc.

  • binary size: the work is being done on the branch topic/virt-x86 right now, which means that no code reduction has been applied. We need to start reducing the code on this virt-x86 branch if we want to be able to assess the gain regarding binary size.

  • boot time: we might see some boot time improvement since fewer devices are emulated, but this is hypothetical and may actually depend on how the hypervisor is used.

  • memory footprint: I don't think we have done any work to reduce in-memory copies. This may be something we want to do in the future, but I don't think we can expect a real memory footprint improvement with what nemu/virt has right now.

  • Anything else ?

How to measure?
For every metric, we should make sure to agree on how it should be assessed. That way it will be easy to share the data across the team and externally, if we can provide a way for everybody to check for themselves.

nats: Add test for PCI hotplug through ACPI

In order to test the hotplug of PCI devices through ACPI, here is what we need:

Start a VM with OVMF and virt machine type:

sudo ./x86_64-softmmu/qemu-system-x86_64 \
    -bios $HOME/workloads/OVMF.fd \
    -nographic      -nodefaults      -L . \
    -machine virt,accel=kvm,kernel_irqchip,nvdimm \
    -smp sockets=1,cpus=4,cores=2,maxcpus=8 -cpu host \
    -m 2G,slots=2,maxmem=16G \
    -device virtio-blk-pci,drive=image -drive if=none,id=image,file=$HOME/workloads/clear-24350-cloud.img \
    -device virtio-serial-pci,id=virtio-serial0 -device virtconsole,chardev=charconsole0,id=console0 -chardev stdio,id=charconsole0 \
    -netdev user,id=mynet0 -device virtio-net-pci,netdev=mynet0 \
    -monitor telnet:127.0.0.1:55555,server,nowait

Check the list of PCI devices:

root@clr-6ea72c07eef64c41b1492fa8faf5aff2 ~ # ls -la /sys/bus/pci/devices/
total 0
drwxr-xr-x 2 root root 0 Aug 24 18:25 .
drwxr-xr-x 5 root root 0 Aug 24 18:25 ..
lrwxrwxrwx 1 root root 0 Aug 24 18:25 0000:00:00.0 -> ../../../devices/pci0000:00/0000:00:00.0
lrwxrwxrwx 1 root root 0 Aug 24 18:25 0000:00:01.0 -> ../../../devices/pci0000:00/0000:00:01.0
lrwxrwxrwx 1 root root 0 Aug 24 18:25 0000:00:02.0 -> ../../../devices/pci0000:00/0000:00:02.0
lrwxrwxrwx 1 root root 0 Aug 24 18:25 0000:00:03.0 -> ../../../devices/pci0000:00/0000:00:03.0

Use QMP to add a PCI device to the main bus:

(qemu) device_add virtio-net-pci,id=net1

Now, if we check sysfs (or lspci), here is the new entry:

lrwxrwxrwx 1 root root 0 Aug 24 18:25 0000:00:04.0 -> ../../../devices/pci0000:00/0000:00:04.0

We also need to check PCI unplug, using QMP:

(qemu) device_del net1

and verify that we get back to the original list of devices.
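The before/after comparison described above reduces to a set difference over the sysfs device listing. This is a sketch only: the device addresses are taken from the listing in this issue, and a real NATS test would fetch them from the guest (e.g. over SSH or the console) rather than hard-code them.

```python
# Sketch: verify PCI hotplug/unplug by diffing the contents of
# /sys/bus/pci/devices/ before and after the QMP command. Addresses
# below are from the listing in this issue; a real test would read
# them from the guest instead of hard-coding them.
def device_diff(before, after):
    before, after = set(before), set(after)
    return sorted(after - before), sorted(before - after)

before = ["0000:00:00.0", "0000:00:01.0", "0000:00:02.0", "0000:00:03.0"]
after = before + ["0000:00:04.0"]  # state after device_add virtio-net-pci,id=net1

added, removed = device_diff(before, after)
assert added == ["0000:00:04.0"] and removed == []

# After device_del net1, diffing against the original list should be empty:
assert device_diff(before, before) == ([], [])
```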

Memory hotplug interacts badly with NVDIMM

Memory hotplug does not work with OVMF (0.2)

Testing with Clear Linux https://download.clearlinux.org/releases/24380/clear/clear-24380-cloud.img.xz (has @sboeuf kernel changes)

When hotplugging 1G dimm:

(qemu) object_add memory-backend-ram,id=mem1,size=1G
(qemu) device_add pc-dimm,id=dimm1,memdev=mem1

I see the following in dmesg:

[  676.541766] Block size [0x8000000] unaligned hotplug range: start 0x100200000, size 0x40000000
[  676.541770] acpi PNP0C80:01: add_memory failed
[  676.542916] acpi PNP0C80:01: acpi_memory_enable_device() error
[  676.543412] acpi PNP0C80:01: Enumeration failure

CC: @sboeuf @yangzhon @sameo

qemu command line is:

/home/rob/build-x86_64/x86_64-softmmu/qemu-system-x86_64 -machine virt,accel=kvm,kernel_irqchip,nvdimm -pidfile qemu.pid -monitor telnet:127.0.0.1:55555,server,nowait -bios /home/rob/src/edk2/Build/OvmfX64/DEBUG_GCC5/FV/OVMF.fd -smp 2,cores=1,threads=1,sockets=2,maxcpus=32 -m 512,slots=4,maxmem=16384M -cpu host -nographic -no-user-config -nodefaults -daemonize -drive file=testvm.img,if=none,id=drive-virtio-disk0,format=qcow2 -device virtio-blk-pci,scsi=off,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -device sysbus-debugcon,iobase=0x402,chardev=debugcon -chardev file,path=/tmp/debug-log,id=debugcon -device virtio-blk-pci,drive=cloud -drive if=none,id=cloud,file=seed.img,format=raw -netdev user,id=mynet0,hostfwd=tcp::2222-:22,hostname=nemuvm -device virtio-net-pci,netdev=mynet0 -drive file=testscsi.img,if=none,id=drive-virtio-disk1,format=raw -device virtio-scsi-pci,id=virtio-disk1 -object memory-backend-file,id=mem0,share,mem-path=testnvdimm.img,size=51200 -device nvdimm,memdev=mem0,id=nv0 -device virtio-serial-pci,id=virtio-serial0 -device virtconsole,chardev=charconsole0,id=console0 -chardev socket,id=charconsole0,path=console.sock,server,nowait -netdev tap,fd=3,id=hostnet0,vhost=on,vhostfd=4 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=1a:a2:11:c5:a5:11 -device virtio-rng-pci,rng=rng0 -object rng-random,filename=/dev/random,id=rng0 -device virtio-balloon-pci -object cryptodev-backend-builtin,id=cryptodev0 -device virtio-crypto-pci,id=crypto0,cryptodev=cryptodev0
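The "unaligned hotplug range" error in the dmesg above comes from the kernel requiring hotplugged memory ranges to be aligned to its memory block size (0x8000000, i.e. 128 MiB, in that log). The numbers below are taken directly from the log; the check itself is a plain modulo and is only a sketch of the kernel's constraint.

```python
# Sketch: the kernel rejects hotplugged memory whose start or size is
# not aligned to the memory block size (0x8000000 in the dmesg above).
def hotplug_aligned(start, size, block_size=0x8000000):
    return start % block_size == 0 and size % block_size == 0

# The failing range from the dmesg: start 0x100200000, size 0x40000000.
print(hotplug_aligned(0x100200000, 0x40000000))  # False: start is not 128 MiB aligned

# A block-aligned start would pass the check:
print(hotplug_aligned(0x100000000, 0x40000000))  # True
```

The NVDIMM in this configuration ends at an address that is not block-aligned, which pushes the hotplugged DIMM's start address off alignment, hence the interaction.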

Documentation

Hi there,

I don't see any specific documentation on building the project and getting started with it. Am I just not finding it? Do we need to work on getting it in there?

Thanks,
Mohammed

DSDT warnings

When dumping our virt generated DSDT table, we get the following warnings:

iasl -tc ~/dsdt-nemu-virt-memhp.dsl 

Intel ACPI Component Architecture
ASL+ Optimizing Compiler/Disassembler version 20180629
Copyright (c) 2000 - 2018 Intel Corporation

/home/samuel/dsdt-nemu-virt-memhp.dsl    140:             CreateDWordField (MR64, \_SB.MHPC.MCRS._Y00._MIN, MINL)  // _MIN: Minimum Base Address
Warning  3128 -                                                           ResourceTag larger than Field ^  (Size mismatch, Tag: 64 bits, Field: 32 bits)

/home/samuel/dsdt-nemu-virt-memhp.dsl    142:             CreateDWordField (MR64, \_SB.MHPC.MCRS._Y00._LEN, LENL)  // _LEN: Length
Warning  3128 -                                                           ResourceTag larger than Field ^  (Size mismatch, Tag: 64 bits, Field: 32 bits)

/home/samuel/dsdt-nemu-virt-memhp.dsl    144:             CreateDWordField (MR64, \_SB.MHPC.MCRS._Y00._MAX, MAXL)  // _MAX: Maximum Base Address
Warning  3128 -                                                           ResourceTag larger than Field ^  (Size mismatch, Tag: 64 bits, Field: 32 bits)

/home/samuel/dsdt-nemu-virt-memhp.dsl    198:         Method (MOST, 4, NotSerialized)
Remark   2146 -                   Method Argument is never used ^  (Arg3)

/home/samuel/dsdt-nemu-virt-memhp.dsl    207:         Method (MEJ0, 2, NotSerialized)
Remark   2146 -                   Method Argument is never used ^  (Arg1)

/home/samuel/dsdt-nemu-virt-memhp.dsl    236:                 Return (MOST (_UID, Arg0, Arg1, Arg2))
Warning  3104 -               Reserved method should not return a value ^  (_OST)

/home/samuel/dsdt-nemu-virt-memhp.dsl    236:                 Return (MOST (_UID, Arg0, Arg1, Arg2))
Error    6080 -                          Called method returns no value ^ 

/home/samuel/dsdt-nemu-virt-memhp.dsl    241:                 Return (MEJ0 (_UID, Arg0))
Warning  3104 -               Reserved method should not return a value ^  (_EJ0)

/home/samuel/dsdt-nemu-virt-memhp.dsl    241:                 Return (MEJ0 (_UID, Arg0))
Error    6080 -                          Called method returns no value ^ 

/home/samuel/dsdt-nemu-virt-memhp.dsl    266:                 Return (MOST (_UID, Arg0, Arg1, Arg2))
Warning  3104 -               Reserved method should not return a value ^  (_OST)

/home/samuel/dsdt-nemu-virt-memhp.dsl    266:                 Return (MOST (_UID, Arg0, Arg1, Arg2))
Error    6080 -                          Called method returns no value ^ 

/home/samuel/dsdt-nemu-virt-memhp.dsl    271:                 Return (MEJ0 (_UID, Arg0))
Warning  3104 -               Reserved method should not return a value ^  (_EJ0)

/home/samuel/dsdt-nemu-virt-memhp.dsl    271:                 Return (MEJ0 (_UID, Arg0))
Error    6080 -                          Called method returns no value ^ 

/home/samuel/dsdt-nemu-virt-memhp.dsl    296:                 Return (MOST (_UID, Arg0, Arg1, Arg2))
Warning  3104 -               Reserved method should not return a value ^  (_OST)

/home/samuel/dsdt-nemu-virt-memhp.dsl    296:                 Return (MOST (_UID, Arg0, Arg1, Arg2))
Error    6080 -                          Called method returns no value ^ 

/home/samuel/dsdt-nemu-virt-memhp.dsl    301:                 Return (MEJ0 (_UID, Arg0))
Warning  3104 -               Reserved method should not return a value ^  (_EJ0)

/home/samuel/dsdt-nemu-virt-memhp.dsl    301:                 Return (MEJ0 (_UID, Arg0))
Error    6080 -                          Called method returns no value ^ 

/home/samuel/dsdt-nemu-virt-memhp.dsl    326:                 Return (MOST (_UID, Arg0, Arg1, Arg2))
Warning  3104 -               Reserved method should not return a value ^  (_OST)

/home/samuel/dsdt-nemu-virt-memhp.dsl    326:                 Return (MOST (_UID, Arg0, Arg1, Arg2))
Error    6080 -                          Called method returns no value ^ 

/home/samuel/dsdt-nemu-virt-memhp.dsl    331:                 Return (MEJ0 (_UID, Arg0))
Warning  3104 -               Reserved method should not return a value ^  (_EJ0)

/home/samuel/dsdt-nemu-virt-memhp.dsl    331:                 Return (MEJ0 (_UID, Arg0))
Error    6080 -                          Called method returns no value ^ 

/home/samuel/dsdt-nemu-virt-memhp.dsl    489:             Method (COST, 4, Serialized)
Remark   2146 -                       Method Argument is never used ^  (Arg3)

ASL Input:     /home/samuel/dsdt-nemu-virt-memhp.dsl - 864 lines, 28504 bytes, 370 keywords
Hex Dump:      /home/samuel/dsdt-nemu-virt-memhp.hex - 34290 bytes

Compilation complete. 8 Errors, 11 Warnings, 3 Remarks, 42 Optimizations

Optimise Jenkins build

Things to consider / research:

  • Increase parallelisation by bumping CPU count
  • Move from nproc / 2 to something closer to nproc (may need to reduce VCPUs in tests - currently 2)
  • Analyse I/O usage - try to build from tmpfs?
  • Split more tests to a separate worker system
  • Make /tmp a tmpfs (by modifying the template image) - /tmp is used for the root disks for the NATS VMs

update README

Is this a fork of QEMU? Why? It should have a unique description in the README.

Implement direct kernel boot for i386/virt

Since the i386 virt machine type presents a minimal hardware list, we should be able to do direct kernel boot without much complication.
This should be an opt-in choice, same as the nofw option from pc-lite.

NEMU fails to start

I tried to start NEMU with the following command, but hit some errors.

[root@duhy nemu]# qemu-system-x86_64 -m 2048M -smp 4 -drive file=qemu-3.0.qcow2,cache=none,if=virtio -machine virt,accel=kvm,kernel_irqchip -net nic -net tap,ifname=tap0,script=no,downscript=no
**
ERROR:qom/object.c:542:object_new_with_type: assertion failed: (type != NULL)
Aborted (core dumped)

[root@duhy nemu]# qemu-system-x86_64 -machine help
Supported machines are:
....
virt QEMU 3.0 i386 Virtual Machine (alias of virt-3.0)
virt-3.0 QEMU 3.0 i386 Virtual Machine
virt-2.12 QEMU 2.12 i386 Virtual Machine
isapc ISA-only PC
none empty machine

PXE booting?

I'm building out a decent sized set of private clouds, where every VM will PXE boot for a new image or a redirection to an existing image on disk. Are there plans to bring over the PXE booting support from QEMU?

Initial CI

We need to build a basic but automated CI for NEMU. Initial requirements:

  • PR gating: Use pullapprove with 2 ACKs from the nemu-write team before a PR can be merged.
  • Minimal CI: Use semaphoreci for automatically running our minimal_ci.sh script on each PR.

automatic-removal branch fails to compile

I encountered some compile errors:

  CC      crypto/tlssession.o
  CC      crypto/secret.o
make: *** No rule to make target `crypto/random-platform.o', needed by `scsi/qemu-pr-helper'.  Stop.
make: *** Waiting for unfinished jobs....

I'm on a CentOS machine. What is your build environment? Or where can I get a binary directly?

PCI hotplug through ACPI

We need to investigate how much it costs (from an emulation standpoint) to enable PCI hotplug through ACPI for virt.
This will help us compare against other potential options and see how much we can save with them.

virt pulls PC headers

The i386/virt machine type still pulls in some of the pc machine type headers, showing that parts of the generic QEMU code still depend on pc definitions.

Investigate Clear Linux SSH Reboot flakes

The NATS reboot test flakes out when running with Clear Linux due to SSH connections repeatedly getting EOF (over a period of 50 seconds). This does not happen with Xenial.

Port OVMF to i386/virt

We want to be able to do direct kernel boot as well as OVMF based boot.
Although most of the changes should happen on the OVMF code base itself, the port will be directly influenced by the new machine type definition and implementation.

Memory hot removal fails

@rbradford @sameo @mcastelino @yangzhon @sboeuf
Unplugging memory does not work on either 3.0.0 or 2.12.0

(qemu) object_add memory-backend-ram,id=mem1,size=1G
object_add memory-backend-ram,id=mem1,size=1G
(qemu) device_add pc-dimm,id=dimm1,memdev=mem1
device_add pc-dimm,id=dimm1,memdev=mem1
(qemu) device_del dimm1
device_del dimm1

[   46.696367] Offlined Pages 32768
[   46.700822] Offlined Pages 32768
[   46.705073] Offlined Pages 32768
[   46.709303] Offlined Pages 32768
[   46.712478] Offlined Pages 32768
[   46.715805] memory memory32: Offline failed.
sudo /home/liujing/work/code/nemu/x86_64-softmmu/qemu-system-x86_64 \
-bios /home/liujing/work/image/OVMF.fd \
        -nographic -nodefaults -L . \
        -machine virt,accel=kvm,kernel_irqchip,nonvdimm -cpu host -m size=1G,slots=2,maxmem=3G \
        -smp sockets=1,cpus=4,cores=2,maxcpus=8 \
        -device virtio-blk-pci,drive=image -drive if=none,id=image,file=/home/liujing/work/image/clear-24690-kvm.img \
        -device virtio-serial-pci,id=virtio-serial0 -device virtconsole,chardev=charconsole0,id=console0 -chardev stdio,id=charconsole0 \
        -monitor telnet:127.0.0.1:5000,server,nowait \

libvirt support

Hi there,

This is a pretty exciting project and I'm trying to see if we can add coverage to get OpenStack running/tested on top of it. However, OpenStack deploys VMs using libvirt. Is there a recommended or documented way of using nemu with libvirt? Can we just drop in the file?

Thanks,
Mohammed

bzImage loading fails on the pc/xen/q35 platforms

@sameo, @rbradford
Since we added the new struct AcpiConfiguration for ACPI usage, which is initialized late in pc_machine_done(), some members of AcpiConfiguration are not yet initialized, which makes QEMU core dump while loading a bzImage.

By the way, Xen uses the pc machine type and calls xen_load_linux(pcms); at that point the AcpiConfiguration has not been initialized.

So we need to refactor this struct and the related load_linux_bzimage() functions.

virt build fails if the lbz2 development packages are installed

As reported through issue #40, the virt build fails when the lbz2 packages are installed:

 GEN     trace/generated-helpers.c
  CC      x86_64_virt-softmmu/trace/control-target.o
  CC      x86_64_virt-softmmu/gdbstub-xml.o
  CC      x86_64_virt-softmmu/trace/generated-helpers.o
  LINK    x86_64_virt-softmmu/qemu-system-x86_64_virt
../block/dmg-bz2.o: In function `dmg_bz2_init':
/home/wkozaczuk/projects/nemu/block/dmg-bz2.c:59: undefined reference to `dmg_uncompress_bz2'
/home/wkozaczuk/projects/nemu/block/dmg-bz2.c:60: undefined reference to `dmg_uncompress_bz2'
collect2: error: ld returned 1 exit status
Makefile:199: recipe for target 'qemu-system-x86_64_virt' failed
make[1]: *** [qemu-system-x86_64_virt] Error 1
Makefile:481: recipe for target 'subdir-x86_64_virt-softmmu' failed
make: *** [subdir-x86_64_virt-softmmu] Error 2

That's because we disable dmg, but since the lbz2 devel packages are installed, we end up with a config-host.mak containing CONFIG_BZIP2=y.
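A quick way to spot this kind of mismatch is to parse config-host.mak and flag options that leaked into the build. The sketch below uses hypothetical sample content; a real check would read the generated file from the build directory, and `CONFIG_DMG` here is only an illustrative name for the disabled feature.

```python
# Sketch: parse config-host.mak-style "KEY=value" lines and detect a
# stray CONFIG_BZIP2=y. The sample text is hypothetical; a real check
# would read the file generated by configure in the build directory.
def enabled_options(mak_text):
    opts = {}
    for line in mak_text.splitlines():
        line = line.strip()
        if "=" in line and not line.startswith("#"):
            key, _, value = line.partition("=")
            opts[key.strip()] = value.strip()
    return opts

sample = "CONFIG_KVM=y\nCONFIG_BZIP2=y\n# comment line\n"
opts = enabled_options(sample)
assert opts.get("CONFIG_BZIP2") == "y"  # bz2 support leaked into the build
```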

New virtual machine type for x86

Rationale

We want to define a minimal, legacy-free and emulation-free x86_64 machine type for QEMU, based on virtual hardware. Existing machine types (pc and q35) try to emulate actual chipsets (PIIX and Q35 respectively), pulling in requirements for emulating various pieces of hardware and buses (PCI, ISA, MCH, LPC, etc.).
This increases the hypervisor's attack surface together with its implementation complexity for no good technical reason, as typical workloads have no direct dependencies on such hardware being defined or even available.

Instead, defining a new machine type based on top-down requirements could simplify the overall hypervisor implementation by only providing strictly virtualized and non emulated resources (CPU, memory, interrupts, timers and generic I/O devices) to cloud bound workloads. The overall hardware platform this machine type exposes is virtual and only provides the minimal set of features and devices modern cloud workloads would need.

There are already a few existing implementations of the virtual hardware platform approach: QEMU's ARM virt machine type, kvmtool and Hyper-V generation 2 VMs.

Description

This virtual hardware machine type has the following characteristics:

  • ACPI
    • Single hardware enumeration method.
    • ACPI Hardware reduced mode only.
    • Reset method, no i8042 emulation
  • Guest firmware
    • No legacy BIOS
    • UEFI (OVMF) and qboot
  • Virtio based
    • virtio-pci by default
    • virtio-scsi
    • virtio-nvme
  • Direct device assignment
    • vfio-pci
  • Interrupt controller
    • In-kernel and split implementations supported
  • Reduced PIC support
    • For early boot, real mode support, timer only
  • Clock source
    • kvm-clock
  • No RTC
    • Either emulate the couple of RTC CMOS IO ports, or fix OVMF.
  • No PCI bus
    • PCI segments only, no PCI bridges
  • No north or south bridge emulation (MCH, LPC, PCH)
  • No ISA
    • PIO basic serial port support on UART ports, for early debugging purposes. Output only.
  • Reduced CPU models support
    • Anything that supports x2APIC and flexpriority

TODO

  • Implement a virt machine type under hw/i386 for upstream QEMU
    • Simplify OVMF to support the new machine type
  • Upstream i386/virt
  • Apply code reduction scripts to QEMU 2.12 with only i386/virt and arm/virt as the supported machine types
    • Build a new NEMU branch based on the above code reduced QEMU

The size of the nemu code

The README says that nemu will reduce the size of the code. I measured the nemu code and found that the LOC of nemu seems larger than the qemu code. From a developer's perspective, I don't see any benefit; we still need to read a large amount of code. Are there plans to do more work on reducing the size of the nemu code?

Missing NATS tests for object properties

We are missing NATS tests for verifying that we are carrying the right driver/object default properties.
For example we want to make sure that the apic-common vapic property is always turned off but we're not testing this.

Ideally this should be done in 2 steps:

  • Add an object property getter API to govmm
  • Define an array of (object, property, value) tuples to be checked from NATS
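The proposed tuple table can be sketched as a plain data-driven checker. This is only an outline of the idea: the getter here is a stand-in dict, since the govmm property getter API mentioned above does not exist yet, and the `apic-common`/`vapic` entry is the example from the issue text.

```python
# Sketch of the proposed NATS check: an array of (object, property,
# expected value) tuples verified against a property getter. The getter
# is a stand-in; the real test would use the govmm getter API once it
# exists.
CHECKS = [
    ("apic-common", "vapic", False),  # example from the issue text
]

def check_properties(get_property, checks=CHECKS):
    failures = []
    for obj, prop, expected in checks:
        actual = get_property(obj, prop)
        if actual != expected:
            failures.append((obj, prop, expected, actual))
    return failures

# Simulate the getter with a dict of known default properties:
fake_props = {("apic-common", "vapic"): False}
assert check_properties(lambda o, p: fake_props[(o, p)]) == []
```

Keeping the expected values in one table makes it cheap to extend the check as new default properties need pinning.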

NEMU boot time is longer than qemu

I start Ubuntu Cloud 16.04 with NEMU, but the boot time is 19s, even longer than qemu-2.9. It takes about 13s from starting the NEMU command to printing 'loop: module loaded'.

/tmp/debug_log shows
it takes about 13s to print "MpInitChangeApLoopCallback() done!" after printing "FSOpen: Open '\EFI\BOOT\grubx64.efi' Success"

This is the NEMU start command:

/home/duhy/workSpace/nemu/nemu/build-x86-64/x86_64-softmmu/qemu-system-x86_64
-bios /home/duhy/workSpace/nemu/OVMF.fd
-nographic -nodefaults -L . -net none
-machine virt,accel=kvm,kernel_irqchip
-cpu host -m 512,slots=4,maxmem=16950M
-smp 2
-device sysbus-debugcon,iobase=0x402,chardev=debugcon
-chardev file,path=/tmp/debug-log,id=debugcon
-device sysbus-debugcon,iobase=0x3f8,chardev=serialcon
-chardev file,path=/tmp/serial-log,id=serialcon
-device virtio-blk-pci,drive=image
-drive if=none,id=image,file=/home/duhy/workSpace/nemu/xenial-server-cloudimg-amd64-uefi1.img
-device virtio-blk-pci,drive=seed
-drive if=none,id=seed,file=/tmp/seed.img,format=raw
-device virtio-serial-pci,id=virtio-serial0
-device virtconsole,chardev=charconsole0,id=console0
-chardev stdio,id=charconsole0
-netdev tap,id=mynet0,ifname=tap0,script=no,downscript=no
-device virtio-net-pci,netdev=mynet0

This is the qemu start command:

/usr/libexec/qemu-kvm
-nographic -net none
-machine pc,accel=kvm,kernel_irqchip
-cpu host -m 512,slots=4,maxmem=16950M
-smp 2
-device virtio-blk-pci,drive=image
-drive if=none,id=image,file=/home/duhy/workSpace/nemu/xenial-server-cloudimg-amd64-uefi1.img
-device virtio-blk-pci,drive=seed
-drive if=none,id=seed,file=/tmp/seed.img,format=raw
-netdev tap,id=mynet0,ifname=tap0,script=no,downscript=no
-device virtio-net-pci,netdev=mynet0

Device passthrough support for i386/virt

We want to be able to do device passthrough with the i386/virt machine type.
For that we will use vfio in the host and assign host PCI devices as PCI devices in the guest as well.

This opens one fundamental question: what level of PCI emulation do we need from i386/virt? Is a PCI host bridge + root port enough? Are PCI segments enough for the guest OS to actually discover new PCI devices?

virt: acpi_nvdimm_state in AcpiConfiguration

The AcpiConfiguration includes acpi_nvdimm_state.
However, the other hotplug states live in VirtAcpiState:

typedef struct VirtAcpiState {
    ...
    AcpiCpuHotplug cpuhp;
    CPUHotplugState cpuhp_state;
    MemHotplugState memhp_state;
    ...
} VirtAcpiState;

We should try to move it into VirtAcpiState if possible.

/cc @sboeuf

virtio support for i386/virt

We want the i386 virt platform to support virtio devices.
As we want to keep the PCI dependency to a minimum, we should aim at exposing virtio devices to the guest through MMIO.

TBD: If we are to support minimal PCI emulation (host bridge + root port), can we go with virtio-pci instead?
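For reference, exposing virtio over MMIO roughly looks like the sketch below. This is hedged: it assumes a machine type that instantiates virtio-mmio transports, and the base address and IRQ are illustrative, not real i386/virt values. Without ACPI or device-tree description of the transports, the Linux guest has to be told where they live via the virtio_mmio.device=<size>@<base>:<irq> kernel parameter.

```shell
# Hedged sketch: a virtio block device over MMIO instead of PCI.
# Assumes the machine type provides virtio-mmio transports; the
# address 0xd0000000 and IRQ 5 are examples only.
qemu-system-x86_64 \
    -machine virt,accel=kvm \
    -kernel bzImage \
    -append 'console=ttyS0 virtio_mmio.device=4K@0xd0000000:5' \
    -device virtio-blk-device,drive=image \
    -drive if=none,id=image,file=disk.img,format=raw
```

The trade-off against minimal virtio-pci is essentially discoverability (PCI enumeration) versus dropping the PCI emulation entirely.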

3.0.0 breaks pci windows

@sameo @mcastelino @rbradford @sboeuf @yangzhon
dmesg shows that BARs for cold-plugged devices get assigned by the firmware rather than by the kernel.
And if we add a pcie-root-port when launching the guest, it also fails with "no space".


[    0.021357] pnp: PnP ACPI init
[    0.021386] pnp: PnP ACPI: found 0 devices
[    0.022620] pci_bus 0000:00: max bus depth: 0 pci_try_num: 1
[    0.022623] pci 0000:00:01.0: BAR 4: no space for [mem size 0x00004000 64bit pref]
[    0.022625] pci 0000:00:01.0: BAR 4: trying firmware assignment [mem 0x800000000-0x800003fff 64bit pref]
[    0.022627] pci 0000:00:01.0: BAR 4: assigned [mem 0x800000000-0x800003fff 64bit pref]
[    0.023374] pci 0000:00:02.0: BAR 4: no space for [mem size 0x00004000 64bit pref]
[    0.023375] pci 0000:00:02.0: BAR 4: trying firmware assignment [mem 0x800004000-0x800007fff 64bit pref]
[    0.023377] pci 0000:00:02.0: BAR 4: assigned [mem 0x800004000-0x800007fff 64bit pref]
[    0.024108] pci 0000:00:01.0: BAR 1: no space for [mem size 0x00001000]
[    0.024109] pci 0000:00:01.0: BAR 1: trying firmware assignment [mem 0x90001000-0x90001fff]
[    0.024110] pci 0000:00:01.0: BAR 1: assigned [mem 0x90001000-0x90001fff]
[    0.024122] pci 0000:00:02.0: BAR 1: no space for [mem size 0x00001000]
[    0.024123] pci 0000:00:02.0: BAR 1: trying firmware assignment [mem 0x90000000-0x90000fff]
[    0.024124] pci 0000:00:02.0: BAR 1: assigned [mem 0x90000000-0x90000fff]
[    0.024133] pci 0000:00:01.0: BAR 0: no space for [io  size 0x0040]
[    0.024134] pci 0000:00:01.0: BAR 0: trying firmware assignment [io  0xc040-0xc07f]
[    0.024135] pci 0000:00:01.0: BAR 0: assigned [io  0xc040-0xc07f]
[    0.024145] pci 0000:00:02.0: BAR 0: no space for [io  size 0x0040]
[    0.024146] pci 0000:00:02.0: BAR 0: trying firmware assignment [io  0xc000-0xc03f]
[    0.024146] pci 0000:00:02.0: BAR 0: assigned [io  0xc000-0xc03f]

root@clr-19bebbac82a145bb8c96500c89dbfd8a ~ # lspci
00:00.0 Host bridge: Red Hat, Inc. QEMU PCIe Host bridge
00:01.0 SCSI storage controller: Red Hat, Inc. Virtio block device
00:02.0 Communication controller: Red Hat, Inc. Virtio console
00:06.0 PCI bridge: Red Hat, Inc. QEMU PCIe Root port

00:06.0 PCI bridge: Red Hat, Inc. QEMU PCIe Root port (prog-if 00 [Normal decode])
        Physical Slot: 6
        Flags: bus master, fast devsel, latency 0, IRQ -2147483648
        Memory at 90200000 (32-bit, non-prefetchable) [size=4K]
        Bus: primary=00, secondary=01, subordinate=01, sec-latency=0
        I/O behind bridge: None
        Memory behind bridge: None
        Prefetchable memory behind bridge: None
        Capabilities: [54] Express Root Port (Slot+), MSI 00
        Capabilities: [48] MSI-X: Enable- Count=1 Masked-
        Capabilities: [40] Subsystem: Red Hat, Inc. Device 0000
        Capabilities: [100] Advanced Error Reporting
        Kernel driver in use: pcieport
        Kernel modules: shpchp
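One way to experiment with the window sizing from the command line is to ask the root port to reserve bridge windows explicitly, using the resource-reservation properties of QEMU's generic pcie-root-port. A hedged sketch; the sizes are arbitrary examples, not recommended values:

```shell
# Hedged sketch: reserving I/O and memory windows on a root port so the
# guest has address space to assign BARs behind it. Reservation sizes
# are arbitrary examples.
qemu-system-x86_64 \
    -machine virt,accel=kvm \
    -device pcie-root-port,id=rp0,bus=pcie.0,chassis=1,addr=6.0,io-reserve=4K,mem-reserve=2M,pref64-reserve=32M
```

If the "no space" messages disappear with reservations in place, the bug is in how the platform sizes its PCI windows rather than in the kernel's assignment logic.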

Minimal Linux guest boot with i386/virt

Implement a minimal i386/virt machine type that boots a Linux guest kernel up to userspace.

Devices to be added to the machine type:

  • CPUs
  • Memory
  • KVM clock
  • KVM IRQ chip
  • fw_cfg
  • e820
  • PIC?
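As a rough illustration, booting such a machine type could look like the sketch below. This is hedged: the paths, the console choice, and the use of virtio-serial for output are examples, not project defaults.

```shell
# Hedged sketch: direct kernel boot on a minimal machine type with no
# legacy devices. bzImage, disk.img and root=/dev/vda are placeholders.
qemu-system-x86_64 \
    -machine virt,accel=kvm,kernel_irqchip \
    -cpu host -smp 2 -m 512 \
    -kernel bzImage \
    -append 'console=hvc0 root=/dev/vda rw' \
    -device virtio-blk-pci,drive=image \
    -drive if=none,id=image,file=disk.img,format=raw \
    -device virtio-serial-pci,id=virtio-serial0 \
    -device virtconsole,chardev=charconsole0,id=console0 \
    -chardev stdio,id=charconsole0 \
    -nographic -nodefaults -net none
```

Reaching a shell prompt with only the devices listed above would validate the minimal machine type.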

The hw configurable impacts other platform builds

@sameo @rbradford

I am disabling the APIC and IOAPIC device models since the IRQ handling will be offloaded to the kernel. When I did a full build with the command below:
./configurable

There are many build errors during all-platform builds: some built-in files have been made configurable, so we need to add these CONFIG_XXX options to all platforms. This is the reason why I added a hw-common.mak before. If we do not use this file, we need to add each CONFIG_XXX to each XXX-softmmu.mak file, because we are not sure whether each platform will need this CONFIG_XXX. Thanks!

NATs: Verify the different combinations of hotplug using virt+qboot and virt+OVMF

Goal

QEMU supports multiple methods to perform device hotplug.

  • ACPI hotplug on PCI bus 0 (pcie.0 on virt or pci.0 on pc)
  • SHPC Based hotplug
  • PCIe Native hotplug

We need to verify that all of these hotplug methods work with virt with different firmwares (qboot and OVMF).

QEMU also supports virtual (virtio) device hotplug and physical PCI (VFIO) device hotplug.
Note: VFIO hotplug may not be testable in NATs unless the VM has an exposed virtual IOMMU, so allow a local run of NATs to test VFIO and make it optional for the CI.

Kernel Requirements

The following config options have to be enabled:

CONFIG_HOTPLUG_PCI=y
CONFIG_HOTPLUG_PCI_PCIE=y
CONFIG_HOTPLUG_PCI_ACPI=y
CONFIG_HOTPLUG_PCI_SHPC=y

QEMU command line

Using virt here as it supports all three types of hotplug:

qemu-system-x86_64 \
     -bios ./qboot/bios.bin \
     -kernel ./linux/arch/x86_64/boot/bzImage -append 'console=hvc0 root=/dev/pmem0p3 rw rootfstype=ext4 data=ordered rcupdate.rcu_expedited=1 tsc=reliable no_timer_check noapictimer' \
     -device nvdimm,id=nvdimm1,memdev=bootmem -object memory-backend-file,id=bootmem,share=on,mem-path=./clear.img,size=9161408512 \
     -nographic \
     -nodefaults \
     -L . \
     -net none \
     -machine virt,accel=kvm,kernel_irqchip,nvdimm \
     -smp 4 \
     -m 1024,slots=10,maxmem=16384M \
     -monitor telnet:127.0.0.1:55555,server,nowait \
     -device virtio-serial-pci,id=virtio-serial0 -device virtconsole,chardev=charconsole0,id=console0 -chardev stdio,id=charconsole0 \
     -device pcie-root-port,id=rp50,bus=pcie.0,chassis=1,addr=5.0,multifunction=on,pref64-reserve=32M \
     -device pcie-root-port,id=rp51,bus=pcie.0,chassis=2,addr=5.1 \
     -device pcie-pci-bridge,id=br0,addr=10.0

Based on where you hotplug the device, you get different hotplug paths:

  • Hotplug to bus 0 -> ACPI hotplug
  • Hotplug to the pcie-pci-bridge -> SHPC hotplug
  • Hotplug to the root port -> PCIe native hotplug

Test Hotplug

Connect to the QEMU monitor and run the appropriate hotplug commands:

    nc -N 127.0.0.1 55555

ACPI Hotplug

       device_add virtio-net-pci
       device_add vfio-pci,host=b3:00.0

SHPC pcie-pci-bridge hotplug:

      device_add virtio-net-pci,bus=br0,addr=02.0
      device_add vfio-pci,host=b3:00.0,bus=br0,addr=03.0

PCIe Native HP

      device_add virtio-net-pci,bus=rp51
      device_add vfio-pci,host=b3:00.0,bus=rp50

Verification

lspci inside the VM will show the newly added devices.
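A sketch of checking from inside the guest, assuming a virtio-net device was just hotplugged (the grep pattern is an example; if the device does not show up immediately, a manual PCI bus rescan can be triggered through sysfs, which requires root):

```shell
# Hedged sketch: verifying a hotplugged device from inside the guest.
lspci | grep -i "Virtio network"
# If it is missing, force a rescan of the PCI bus and check again.
echo 1 > /sys/bus/pci/rescan
lspci | grep -i "Virtio network"
```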

A few questions about firmware boot time

1. Based on my testing results, UEFI OVMF boot seems a little slow (~800ms delay) compared with SeaBIOS, the default BIOS of QEMU. Have you done any tuning for OVMF fast boot, or is there any plan for the future?
2. I noticed that qboot is very fast, but does it only support booting into a Linux kernel? For Linux distributions, do we have a fast-boot firmware to support them?
3. SeaBIOS does support a UEFI CSM. Why don't we enable SeaBIOS as the default firmware? Any concerns?
