Qu Wenruo
2021-Dec-14 08:16 UTC
Libvirt on little.BIG ARM systems unable to start guest if no cpuset is provided
On 2021/12/14 15:53, Michal Prívozník wrote:
> On 12/14/21 01:41, Qu Wenruo wrote:
>>
>> On 2021/12/14 00:49, Marc Zyngier wrote:
>>> On Mon, 13 Dec 2021 16:06:14 +0000,
>>> Peter Maydell <peter.maydell at linaro.org> wrote:
>>>>
>>>> KVM on big.little setups is a kernel-level question really; I've
>>>> cc'd the kvmarm list.
>>>
>>> Thanks Peter for throwing us under the big-little bus! ;-)
>>>
>>>>
>>>> On Mon, 13 Dec 2021 at 15:02, Qu Wenruo <quwenruo.btrfs at gmx.com> wrote:
>>>>>
>>>>> On 2021/12/13 21:17, Michal Prívozník wrote:
>>>>>> On 12/11/21 02:58, Qu Wenruo wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> Recently I got my libvirt setup on both RK3399 (RockPro64) and RPI
>>>>>>> CM4, with upstream kernels.
>>>>>>>
>>>>>>> For RPI CM4 its mostly smooth sail, but on RK3399 due to its
>>>>>>> little.BIG setup (core 0-3 are 4x A55 cores, and core 4-5 are 2x
>>>>>>> A72 cores), it brings quite some troubles for VMs.
>>>>>>>
>>>>>>> In short, without proper cpuset to bind the VM to either all A72
>>>>>>> cores or all A55 cores, the VM will mostly fail to boot.
>>>
>>> s/A55/A53/. There were thankfully no A72+A55 ever produced (just the
>>> though of it makes me sick).
>>>
>>>>>>>
>>>>>>> Currently the working xml is:
>>>>>>>
>>>>>>>     <vcpu placement='static' cpuset='4-5'>2</vcpu>
>>>>>>>     <cpu mode='host-passthrough' check='none'/>
>>>>>>>
>>>>>>> But even with vcpupin, pinning each vcpu to each physical core, VM
>>>>>>> will mostly fail to start up due to vcpu initialization failed with
>>>>>>> -EINVAL.
>>>
>>> Disclaimer: I know nothing about libvirt (and no, I don't want to
>>> know! ;-).
>>>
>>> However, for things to be reliable, you need to taskset the whole QEMU
>>> process to the CPU type you intend to use.
>>
>> Yep, that's what I'm doing.
>>
>>> That's because, AFAICT,
>>> QEMU will snapshot the system registers outside of the vcpu threads,
>>> and attempt to use the result to configure the actual vcpu threads. If
>>> they happen to run on different CPU types, the sysregs will differ in
>>> incompatible ways and an error will be returned. This may or may not
>>> be a bug, I don't know (I see it as a feature).
>>
>> Then this brings another question.
>>
>> If we can pin each vCPU to each physical core (both little and big),
>> then as long as the registers are per-vCPU based, it should be able to
>> pass both big and little cores to the VM.
>>
>> Yeah, I totally understand this screw up the scheduling, but that's at
>> least what (some insane) users want (just like me).
>>
>>>
>>> If you are annoyed with this behaviour, you can always use a different
>>> VMM that won't care about such difference (crosvm or kvmtool, to name
>>> a few).
>>
>> Sounds pretty interesting, a new world but without libvirt...
>>
>>> However, the guest will be able to observe the migration from
>>> one cpu type to another. This may or may not affect your guest's
>>> behaviour.
>>
>> Not sure if it's possible to pin each vCPU thread to each core, but let
>> me try.
>>
>
> Sure it is, for instance:
>
> <cputune>
>   <vcpupin vcpu="0" cpuset="1-4,^2"/>
>   <vcpupin vcpu="1" cpuset="0,1"/>
>   <vcpupin vcpu="2" cpuset="2,3"/>
>   <vcpupin vcpu="3" cpuset="0,4"/>
>   <emulatorpin cpuset="1-3"/>
>   <iothreadpin iothread="1" cpuset="5,6"/>
>   <iothreadpin iothread="2" cpuset="7,8"/>
> </cputune>

That's what I have already tried before.
I pinned vcpu 0-6 to physical core 0-6, and still no reliable boot up.

And that's why I'm asking here.

Thanks,
Qu

>
> pins vCPU#0 onto host CPUs 1-4, excluding 2; vCPU#1 onto host CPUs 0-1
> and so on. You can also pin emulator (QEMU) and its iothreads. It's
> documented here:
>
> https://libvirt.org/formatdomain.html#cpu-tuning
>
> Michal
>
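
For readers unfamiliar with the cpuset notation in the quoted example ("1-4,^2" meaning host CPUs 1-4 excluding 2), here is a minimal Python sketch of how such a string expands to a set of CPU numbers. The function name parse_cpuset is hypothetical and this is not libvirt's own parser; in particular, accepting an excluded range such as "^2-3" is a simplification made here, not something the thread confirms libvirt supports.

```python
def parse_cpuset(spec):
    """Expand a cpuset string such as "1-4,^2" into a set of CPU numbers.

    Illustrative only: a leading "^" marks an item as excluded, and
    exclusions are applied after all inclusions have been collected.
    """
    included, excluded = set(), set()
    for item in spec.split(","):
        item = item.strip()
        if not item:
            continue
        target = included
        if item.startswith("^"):
            target = excluded
            item = item[1:]
        # "a-b" is an inclusive range; a bare number is a single CPU.
        first, sep, last = item.partition("-")
        lo = int(first)
        hi = int(last) if sep else lo
        target.update(range(lo, hi + 1))
    return included - excluded


# The four vcpupin entries from the example above.
for vcpu, spec in enumerate(["1-4,^2", "0,1", "2,3", "0,4"]):
    print(f"vCPU#{vcpu} -> host CPUs {sorted(parse_cpuset(spec))}")
```

Note that every vCPU in that example may land on a different subset of host CPUs, which on a heterogeneous SoC says nothing about where the rest of the QEMU process runs.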
Marc Zyngier
2021-Dec-14 09:52 UTC
Libvirt on little.BIG ARM systems unable to start guest if no cpuset is provided
On Tue, 14 Dec 2021 08:16:40 +0000,
Qu Wenruo <quwenruo.btrfs at gmx.com> wrote:
>
> On 2021/12/14 15:53, Michal Prívozník wrote:

[...]

> > Sure it is, for instance:
> >
> > <cputune>
> >   <vcpupin vcpu="0" cpuset="1-4,^2"/>
> >   <vcpupin vcpu="1" cpuset="0,1"/>
> >   <vcpupin vcpu="2" cpuset="2,3"/>
> >   <vcpupin vcpu="3" cpuset="0,4"/>
> >   <emulatorpin cpuset="1-3"/>
> >   <iothreadpin iothread="1" cpuset="5,6"/>
> >   <iothreadpin iothread="2" cpuset="7,8"/>
> > </cputune>
>
> That's what I have already tried before.
> I pinned vcpu 0-6 to physical core 0-6, and still no reliable boot up.
>
> And that's why I'm asking here.

You are still missing the point of how QEMU works:

- QEMU creates a dummy VM with a single vcpu. This can happen on *any*
  CPU.
- It snapshots the sysregs for this vcpu, and keeps them for later
- It then destroys this VM
- QEMU then creates the full VM, with all the vcpus
- Each vcpu gets initialised with the state saved earlier.

If any vcpu is initialised on a physical CPU of a different type from
the one that has been used for the dummy VM, you lose, as we cannot
restore some of the registers such as MIDR_EL1 (and other registers
that KVM considers as invariant).

To fix this, you need to change QEMU's notion of a template VM, or
change KVM's notion of invariant registers. The former is quite hard,
and the latter breaks a ton of things for guests, such as errata
workarounds.

The best workaround is to taskset the QEMU process (and I really mean
the process, not individual threads) to a homogeneous set of CPUs and
be done with it.

	M.

-- 
Without deviation from the norm, progress is not possible.
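
The workaround described above (restrict the whole process, not individual threads, to one CPU type) can be sketched in Python roughly as follows. This is an illustration under stated assumptions, not something from the thread: the helper names group_cpus_by_type and exec_pinned are hypothetical, the sample text imitates the "CPU implementer" / "CPU part" fields that arm64 kernels print in /proc/cpuinfo, and the layout (CPUs 0-3 Cortex-A53, part 0xd03; CPUs 4-5 Cortex-A72, part 0xd08) is assumed to match the RK3399 discussed here.

```python
import os
from collections import defaultdict


def group_cpus_by_type(cpuinfo_text):
    """Group logical CPU numbers by (implementer, part) from arm64 cpuinfo text."""
    groups = defaultdict(set)
    cpu = implementer = None
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank separator lines between CPU entries
        key, value = key.strip(), value.strip()
        if key == "processor":
            cpu, implementer = int(value), None
        elif key == "CPU implementer":
            implementer = value
        elif key == "CPU part" and cpu is not None:
            groups[(implementer, value)].add(cpu)
    return dict(groups)


def exec_pinned(cpus, argv):
    """Restrict the calling process to `cpus`, then replace it with argv.

    The affinity mask is inherited across exec and by threads created
    afterwards, which is the same effect as launching under taskset.
    """
    os.sched_setaffinity(0, cpus)
    os.execvp(argv[0], argv)


# Assumed RK3399-like layout, used instead of the real /proc/cpuinfo.
SAMPLE_CPUINFO = "\n\n".join(
    f"processor\t: {n}\nCPU implementer\t: 0x41\nCPU part\t: {part}"
    for n, part in [(0, "0xd03"), (1, "0xd03"), (2, "0xd03"),
                    (3, "0xd03"), (4, "0xd08"), (5, "0xd08")]
)

groups = group_cpus_by_type(SAMPLE_CPUINFO)
big_cores = groups[("0x41", "0xd08")]
print(sorted(big_cores))
```

On a real host one would pass the contents of /proc/cpuinfo instead of the sample and then call something like exec_pinned(big_cores, [...]) with the actual QEMU command line. Because the mask is set before exec, the dummy VM used for the sysreg snapshot and all later vcpu threads start out confined to the same CPU type.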