Hello,

We run OpenStack Stein on ARM, with nova-compute (using libvirt as the
virt driver) running on the ARM hosts. We found that when a VM is built
with several disks (backed by Ceph RBD) on an ARM host, the VM cannot
attach all of the disks correctly. For example, when built with six
disks, the VM may end up with only three attached. No obvious error can
be found in the nova-compute or libvirt logs. Comparing aarch64 and x86,
we find that when a disk is detached, the dmesg output of the VM's OS is
different. Maybe the pciehp parameter is different?

Has anyone met this problem? Any suggestions?

x86:
Nothing at all

aarch64:
Sep 29 15:28:55 * kernel: pciehp 0000:00:01.5:pcie004: Slot(0-5): Attention button pressed
Sep 29 15:28:55 * kernel: pciehp 0000:00:01.5:pcie004: Slot(0-5): Powering off due to button press
Sep 29 15:29:00 * kernel: pciehp 0000:00:01.5:pcie004: Slot(0-5): Attention button pressed
Sep 29 15:29:00 * kernel: pciehp 0000:00:01.5:pcie004: Slot(0-5): Button cancel
Sep 29 15:29:00 * kernel: pciehp 0000:00:01.5:pcie004: Slot(0-5): Action canceled due to button press
Sep 29 15:29:07 * kernel: pciehp 0000:00:01.5:pcie004: Slot(0-5): Attention button pressed
Sep 29 15:29:07 * kernel: pciehp 0000:00:01.5:pcie004: Slot(0-5): Powering off due to button press
Sep 29 15:29:13 * kernel: pciehp 0000:00:01.5:pcie004: Slot(0-5): Link Up
Sep 29 15:29:16 * kernel: pciehp 0000:00:01.5:pcie004: Failed to check link status
Sep 29 15:29:18 * kernel: pciehp 0000:00:01.6:pcie004: Slot(0-6): Attention button pressed
Sep 29 15:29:18 * kernel: pciehp 0000:00:01.6:pcie004: Slot(0-6): Powering off due to button press
Sep 29 15:29:23 * kernel: pciehp 0000:00:01.6:pcie004: Slot(0-6): Attention button pressed
Sep 29 15:29:23 * kernel: pciehp 0000:00:01.6:pcie004: Slot(0-6): Button cancel
Sep 29 15:29:23 * kernel: pciehp 0000:00:01.6:pcie004: Slot(0-6): Action canceled due to button press
Sep 29 15:29:30 * kernel: pciehp 0000:00:01.6:pcie004: Slot(0-6): Attention button pressed
Sep 29 15:29:30 * kernel: pciehp 0000:00:01.6:pcie004: Slot(0-6): Powering off due to button press
Sep 29 15:29:36 * kernel: pciehp 0000:00:01.6:pcie004: Slot(0-6): Link Up
Sep 29 15:29:39 * kernel: pciehp 0000:00:01.6:pcie004: Failed to check link status
Sep 29 15:29:39 * kernel: pciehp 0000:00:01.7:pcie004: Slot(0-7): Attention button pressed
Sep 29 15:29:39 * kernel: pciehp 0000:00:01.7:pcie004: Slot(0-7): Powering off due to button press
Sep 29 15:29:45 * kernel: pciehp 0000:00:01.7:pcie004: Slot(0-7): Attention button pressed
Sep 29 15:29:45 * kernel: pciehp 0000:00:01.7:pcie004: Slot(0-7): Button cancel
Sep 29 15:29:45 * kernel: pciehp 0000:00:01.7:pcie004: Slot(0-7): Action canceled due to button press
Sep 29 15:29:52 * kernel: pciehp 0000:00:01.7:pcie004: Slot(0-7): Attention button pressed
Sep 29 15:29:52 * kernel: pciehp 0000:00:01.7:pcie004: Slot(0-7): Powering off due to button press
Sep 29 15:29:58 * kernel: pciehp 0000:00:01.7:pcie004: Slot(0-7): Link Up
Sep 29 15:30:01 * kernel: pciehp 0000:00:02.0:pcie004: Slot(0-8): Attention button pressed
Sep 29 15:30:01 * kernel: pciehp 0000:00:02.0:pcie004: Slot(0-8): Powering off due to button press

Hello?

On Fri, Oct 8, 2021 at 4:54 PM Jaze Lee <jazeltq at gmail.com> wrote:
> Hello,
> We run OpenStack Stein on ARM. It runs nova-compute (using libvirt
> as the virt driver) on the ARM hosts.
> [...]

On Fri, Oct 8, 2021 at 5:16 PM Jaze Lee <jazeltq at gmail.com> wrote:
> Hello,
> We run OpenStack Stein on ARM. It runs nova-compute (using libvirt
> as the virt driver) on the ARM hosts. We found that when a VM is built
> with disks (backed by Ceph RBD) on ARM hosts, the VM cannot attach all
> of the disks correctly. For example, when built with six disks, the VM
> may attach only three. No obvious error can be found in nova-compute
> or libvirt. Comparing aarch64 and x86, we find that when a disk is
> detached, the dmesg output of the VM's OS is different.
> Maybe the pciehp parameter is different?

Please provide the versions of libvirt, qemu, openstack-nova and librbd1.

On Fri, Oct 08, 2021 at 04:54:37PM +0800, Jaze Lee wrote:
> Hello,
> We run OpenStack Stein on ARM. It runs nova-compute (using libvirt
> as the virt driver) on the ARM hosts. We found that when a VM is built
> with disks (backed by Ceph RBD) on ARM hosts, the VM cannot attach all
> of the disks correctly. For example, when built with six disks, the VM
> may attach only three. No obvious error can be found in nova-compute
> or libvirt. Comparing aarch64 and x86, we find that when a disk is
> detached, the dmesg output of the VM's OS is different.
> Maybe the pciehp parameter is different?
>
> Has anyone met this problem? Any suggestions?

I think you might have just run out of PCI ports available for
hotplug. Please try setting

  https://docs.openstack.org/nova/stein/configuration/config.html#libvirt.num_pcie_ports

to a reasonable value and see whether that helps.

Note that, since you're using libvirt through OpenStack and not
directly, you're more likely to find someone who's able to help you
out if you use the OpenStack support channels.

-- 
Andrea Bolognani / Red Hat / Virtualization
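
One way to check the "ran out of PCI ports" idea before changing num_pcie_ports (the option lives in the [libvirt] section of nova.conf) is to count how many pcie-root-port controllers in the guest's domain XML, as printed by virsh dumpxml, have nothing plugged into them. The following is only a minimal Python sketch and was not posted in this thread: the function name free_pcie_root_ports and the sample XML are made up for illustration. It assumes the usual libvirt convention that a device plugged into a root port has a PCI address whose bus number equals that controller's index.

```python
import xml.etree.ElementTree as ET
from typing import List

# Hypothetical, trimmed-down domain XML used only for illustration.
SAMPLE_XML = """
<domain type='kvm'>
  <devices>
    <controller type='pci' index='0' model='pcie-root'/>
    <controller type='pci' index='1' model='pcie-root-port'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x0'/>
    </controller>
    <controller type='pci' index='2' model='pcie-root-port'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x1'/>
    </controller>
    <controller type='pci' index='3' model='pcie-root-port'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x2'/>
    </controller>
    <disk type='network' device='disk'>
      <address type='pci' domain='0x0000' bus='0x01' slot='0x00' function='0x0'/>
    </disk>
    <interface type='bridge'>
      <address type='pci' domain='0x0000' bus='0x02' slot='0x00' function='0x0'/>
    </interface>
  </devices>
</domain>
"""


def free_pcie_root_ports(domain_xml: str) -> List[int]:
    """Return the indexes of pcie-root-port controllers with no device attached."""
    root = ET.fromstring(domain_xml)
    devices = root.find("devices")
    if devices is None:
        return []

    # Every pcie-root-port controller provides one bus, numbered by its index.
    ports = set()
    for ctrl in devices.findall("controller"):
        if ctrl.get("type") == "pci" and ctrl.get("model") == "pcie-root-port":
            ports.add(int(ctrl.get("index")))

    # Collect every bus number referenced by a PCI address in the domain.
    # The root ports themselves sit on bus 0, which is never a root-port index.
    used = set()
    for addr in devices.iter("address"):
        bus = addr.get("bus")
        if addr.get("type") == "pci" and bus is not None:
            used.add(int(bus, 0))  # accepts both '0x01' and '1'

    return sorted(ports - used)


if __name__ == "__main__":
    print("free pcie-root-ports:", free_pcie_root_ports(SAMPLE_XML))
```

If the list of free ports is shorter than the number of disks being hot-attached, that would be consistent with the explanation above, and raising num_pcie_ports is the thing to try. It does not by itself explain the pciehp messages in the guest.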