Devin Reade
2016-Mar-10 04:33 UTC
[libvirt-users] different uuids, but still "Attempt to migrate guest to same host" error
Background:
----------

I'm trying to debug a two-node pacemaker/corosync cluster where I want to
be able to do live migration of KVM/qemu VMs. Storage is backed by
dual-primary DRBD (yes, fencing is in place). When moving the VM between
nodes via 'pcs resource move RES NODENAME', the live migration fails,
although pacemaker will shut down the VM and restart it on the other node.

For the purpose of diagnosing things, on both nodes I've put SELinux into
permissive mode and disabled firewalld.

Interesting Bit:
---------------

Debugging a bit further, I put the VM into an unmanaged state (the pcs
command for that is at the end of this message) and then try with virsh,
from the node currently running the VM:

[root@node1 ~]# virsh migrate --live --verbose testvm qemu+ssh://node2/system
error: internal error: Attempt to migrate guest to the same host node1.example.tld

A quick google points toward uuid problems; however, the two nodes are,
afaict, working with different UUIDs. (Substantiating info is shown toward
the end.)

I thought that since `hostname` returns only the node name and not the
FQDN, there was perhaps internal qemu confusion about using the short node
name vs. the FQDN. However, fully qualifying it made no difference:

[root@node1 ~]# virsh migrate --live --verbose testvm qemu+ssh://node2.example.tld/system
error: internal error: Attempt to migrate guest to the same host node1.example.tld

Running virsh with a debug level of 1 doesn't reveal anything interesting
that I can see. Running libvirtd at that level shows that node2 is seeing
node1.example.tld in the emitted XML in qemuMigrationPrepareDirect, so I'm
assuming the wrong node name has been calculated somewhere prior to that.
(The exact debug settings I used are also at the end of this message.)

At this point I'm grasping at straws and looking for ideas. Does anyone
have a clue-bat?

Devin

Config Info Follows:
-------------------

CentOS Linux release 7.2.1511 (Core)
libvirt on both nodes is 1.2.17-13

[root@node1 ~]# virsh sysinfo | grep uuid
<entry name='uuid'>03DE0294-0480-05A4-B906-8E0700080009</entry>

[root@node2 ~]# virsh sysinfo | grep uuid
<entry name='uuid'>03DE0294-0480-05A4-B206-320700080009</entry>

[root@node1 ~]# dmidecode -s system-uuid
03DE0294-0480-05A4-B906-8E0700080009

[root@node2 ~]# dmidecode -s system-uuid
03DE0294-0480-05A4-B206-320700080009

[root@node1 ~]# fgrep uuid /etc/libvirt/libvirtd.conf | grep -v '#'
host_uuid = "875cb1a3-437c-4cb5-a3de-9789d0233e4b"

[root@node2 ~]# fgrep uuid /etc/libvirt/libvirtd.conf | grep -v '#'
host_uuid = "643c0ef4-bb46-4dc9-9f91-13dda8d9aa33"

[root@node2 ~]# pcs config show
...
 Resource: testvm (class=ocf provider=heartbeat type=VirtualDomain)
  Attributes: hypervisor=qemu:///system
              config=/cluster/config/libvirt/qemu/testvm.xml
              migration_transport=ssh
  Meta Attrs: allow-migrate=true is-managed=false
  Operations: start interval=0s timeout=120 (testvm-start-interval-0s)
              stop interval=0s timeout=240 (testvm-stop-interval-0s)
              monitor interval=10 timeout=30 (testvm-monitor-interval-10)
              migrate_from interval=0 timeout=60s (testvm-migrate_from-interval-0)
              migrate_to interval=0 timeout=120s (testvm-migrate_to-interval-0)
...

(The /cluster/config directory is a shared GlusterFS filesystem.)

[root@node1 ~]# cat /etc/hosts | grep -v localhost
192.168.10.8    node1.example.tld       node2
192.168.10.9    node2.example.tld       node2
192.168.11.8    node1hb.example.tld     node1hb
192.168.11.9    node2hb.example.tld     node2hb

(node1 and node2 are the "reachable" IPs and totem ring1. node1hb and
node2hb form a direct connection via crossover cable for DRBD and totem
ring0.)
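For completeness, the "unmanaged state" mentioned above just means
is-managed=false on the resource (visible in the pcs config output),
which pcs sets with:

[root@node1 ~]# pcs resource unmanage testvm

('pcs resource manage testvm' reverses it.)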
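Likewise, the debug settings referred to above; the log file path is
only where I chose to write it:

# client side:
[root@node1 ~]# virsh -d 1 migrate --live --verbose testvm qemu+ssh://node2/system

# daemon side, in /etc/libvirt/libvirtd.conf on both nodes:
log_level = 1
log_outputs = "1:file:/var/log/libvirt/libvirtd.log"

# then pick up the new settings:
[root@node1 ~]# systemctl restart libvirtd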
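If anyone suspects name resolution, I'm happy to post the output of the
usual checks from both nodes, along the lines of:

[root@node1 ~]# hostname; hostname -f
[root@node1 ~]# getent hosts node1 node2
[root@node1 ~]# getent hosts node1.example.tld node2.example.tld

(getent goes through nsswitch, so it sees the /etc/hosts entries above
as well as DNS.)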
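Finally, in case it's relevant: the host_uuid values above were set by
hand, one distinct value per node, along these lines (any uuidgen output
will do):

[root@node1 ~]# uuidgen
875cb1a3-437c-4cb5-a3de-9789d0233e4b
# paste the value into /etc/libvirt/libvirtd.conf as:
#   host_uuid = "875cb1a3-437c-4cb5-a3de-9789d0233e4b"
[root@node1 ~]# systemctl restart libvirtd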