Nitesh Konkar
2016-Jul-15 20:55 UTC
[libvirt-users] NPIV storage pools do not map to same LUN units across hosts.
Link: http://wiki.libvirt.org/page/NPIV_in_libvirt
Topic: Virtual machine configuration change to use vHBA LUN

There is an NPIV storage pool defined on two hosts. The pool contains a total of 8 volumes, allocated from a storage device.

Source:

# virsh vol-list poolvhba0
 Name         Path
------------------------------------------------------------------------------
 unit:0:0:0   /dev/disk/by-id/wwn-0x6005076802818bda3000000000000366
 unit:0:0:1   /dev/disk/by-id/wwn-0x6005076802818bda3000000000000367
 unit:0:0:2   /dev/disk/by-id/wwn-0x6005076802818bda3000000000000368
 unit:0:0:3   /dev/disk/by-id/wwn-0x6005076802818bda3000000000000369
 unit:0:0:4   /dev/disk/by-id/wwn-0x6005076802818bda300000000000036a
 unit:0:0:5   /dev/disk/by-id/wwn-0x6005076802818bda3000000000000380
 unit:0:0:6   /dev/disk/by-id/wwn-0x6005076802818bda3000000000000381
 unit:0:0:7   /dev/disk/by-id/wwn-0x6005076802818bda3000000000000382

Destination:

# virsh vol-list poolvhba0
 Name         Path
------------------------------------------------------------------------------
 unit:0:0:0   /dev/disk/by-id/wwn-0x6005076802818bda3000000000000380
 unit:0:0:1   /dev/disk/by-id/wwn-0x6005076802818bda3000000000000381
 unit:0:0:2   /dev/disk/by-id/wwn-0x6005076802818bda3000000000000382
 unit:0:0:3   /dev/disk/by-id/wwn-0x6005076802818bda3000000000000367
 unit:0:0:4   /dev/disk/by-id/wwn-0x6005076802818bda3000000000000368
 unit:0:0:5   /dev/disk/by-id/wwn-0x6005076802818bda3000000000000366
 unit:0:0:6   /dev/disk/by-id/wwn-0x6005076802818bda300000000000036a
 unit:0:0:7   /dev/disk/by-id/wwn-0x6005076802818bda3000000000000369

As you can see in the above output, the same set of eight LUNs from the storage server has been mapped, but the order in which the LUNs are probed differs on each host, resulting in different unit names on the two hosts.
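To make the mismatch concrete, here is a minimal sketch that compares the two listings above. It only parses the text that `virsh vol-list` printed on each host; it does not call libvirt. The function names (parse_vol_list, mismatched_units, unit_on_destination) are made up for this illustration and are not part of libvirt or virsh.

```python
import re

# One data row of `virsh vol-list`: a unit name followed by a by-id WWN path.
ROW = re.compile(r"^\s*(unit:\d+:\d+:\d+)\s+(/dev/disk/by-id/wwn-0x[0-9a-f]+)\s*$")

PREFIX = "/dev/disk/by-id/wwn-0x6005076802818bda3000000000000"


def parse_vol_list(text):
    """Return {unit_name: path} from the text output of `virsh vol-list <pool>`."""
    volumes = {}
    for line in text.splitlines():
        match = ROW.match(line)
        if match:  # header and separator lines do not match and are skipped
            volumes[match.group(1)] = match.group(2)
    return volumes


def mismatched_units(source, destination):
    """Return the unit names whose path differs between the two hosts."""
    return sorted(u for u in source if destination.get(u) != source[u])


def unit_on_destination(source, destination, unit):
    """Return the destination unit name backed by the same WWN as `unit` on the source."""
    by_path = {path: name for name, path in destination.items()}
    return by_path.get(source[unit])


def listing(suffixes):
    """Rebuild a vol-list style listing from the last three WWN digits of each unit."""
    rows = [" Name         Path", "-" * 78]
    rows += [f" unit:0:0:{i}   {PREFIX}{s}" for i, s in enumerate(suffixes)]
    return "\n".join(rows)


# WWN suffixes in unit order, copied from the two listings in this mail.
SOURCE = listing(["366", "367", "368", "369", "36a", "380", "381", "382"])
DESTINATION = listing(["380", "381", "382", "367", "368", "366", "36a", "369"])

if __name__ == "__main__":
    src = parse_vol_list(SOURCE)
    dst = parse_vol_list(DESTINATION)
    print("units that differ:", mismatched_units(src, dst))
    for unit in sorted(src):
        print(unit, "on source is", unit_on_destination(src, dst, unit), "on destination")
```

With the listings from this mail, every one of the eight unit names points at a different WWN on the destination, and the LUN that is unit:0:0:0 on the source shows up as unit:0:0:5 on the destination.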
If the guest XML references its storage by "unit" number, is it safe to migrate such guests? The "unit" number is assigned by the driver according to the specific order in which it probes the storage, so the same unit name can refer to a different LUN on the destination host. The migrated guest is then mapped to the wrong LUNs and is given the wrong disks.

The problem is that the LUN numbers on the destination host and source host do not agree. For example, LUN 0 on source_host may be LUN 5 on destination_host. When the guest is given the wrong disk, it suffers fatal I/O errors, since the guest has no idea that its disks just changed out from under it. The migration does not take into account that the unit numbers do not match on the source and destination sides.

So, should libvirt make sure that guest domains reference NPIV pool volumes by their globally unique WWN instead of by "unit" number?

The guest XML references its storage by "unit" number. E.g.:

<disk type='volume' device='lun'>
  <driver name='qemu' type='raw' cache='none'/>
  <source pool='poolvhba0' volume='unit:0:0:0'/>
  <backingStore/>
  <target dev='vdb' bus='virtio'/>
  <alias name='virtio-disk1'/>
  <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/>
</disk>

I am planning to write a patch for it. Any comments on the above observation/approach would be appreciated.

Thanks,
Nitesh.
Nitesh Konkar
2016-Aug-04 07:03 UTC
Re: [libvirt-users] NPIV storage pools do not map to same LUN units across hosts.
Any comments on this observation?

On Sat, Jul 16, 2016 at 2:25 AM, Nitesh Konkar <niteshkonkar.libvirt@gmail.com> wrote:
> So, should libvirt make sure that the guest domains reference NPIV pool volumes by their
> globally-unique wwn instead of by "unit" numbers?