Hi.
I have three Ubuntu Server 14.04 (Trusty) hosts running KVM.
Two of them are HP servers and one is a Dell. KVM guests run
fine on both brands, and live migration works between the two
HPs. But I get I/O errors on vda whenever I migrate to or from
the Dell server.
The disk images are on shared NFS storage, mounted the same way
on all three hosts:
nfs.sever:/kvm /var/lib/libvirt/images nfs auto,vers=3
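
Identical fstab entries do not guarantee identical mount options
in effect, so the /proc/mounts line for the share is worth
comparing across hosts. A minimal sketch of that comparison,
assuming each host's line is pasted in by hand; the helper names
and both sample lines are hypothetical, not output from these
servers:

```python
def nfs_options(mounts_line):
    """Return the mount options of one /proc/mounts line as a dict."""
    # /proc/mounts fields: device, mount point, fs type, options, dump, pass
    fields = mounts_line.split()
    options = {}
    for item in fields[3].split(","):
        key, _, value = item.partition("=")
        options[key] = value
    return options


def option_diff(a, b):
    """Map each option whose value differs to its (a, b) pair; None means absent."""
    return {k: (a.get(k), b.get(k))
            for k in sorted(set(a) | set(b))
            if a.get(k) != b.get(k)}


# Hypothetical sample lines, for illustration only.
hp_line = ("nfs.sever:/kvm /var/lib/libvirt/images nfs "
           "rw,vers=3,rsize=65536,wsize=65536,hard,proto=tcp 0 0")
dell_line = ("nfs.sever:/kvm /var/lib/libvirt/images nfs "
             "rw,vers=3,rsize=32768,wsize=32768,hard,proto=tcp 0 0")

print(option_diff(nfs_options(hp_line), nfs_options(dell_line)))
```

An empty result would mean the two hosts mount the share with the
same options.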
I checked the versions of all the packages to make sure they
are the same on every host. I got:
kernel: 3.13.0-43-generic #72-Ubuntu SMP x86_64
libvirt: 1.2.2-0ubuntu13.1.9
qemu-utils: 2.0.0+dfsg-2ubuntu1.10
qemu-kvm: 2.0.0+dfsg-2ubuntu1.10
I made sure the cache mode of the disk is set to none:
Disk bus: virtio
Cache mode: none
IO mode: default
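
The virt-manager setting can be cross-checked against the domain
XML that libvirt actually uses (for example the output of
`virsh dumpxml virtual` on each host). A rough sketch, assuming the
XML has been saved into a string; the function name, the sample
XML and the image path in it are hypothetical:

```python
import xml.etree.ElementTree as ET


def disk_cache_modes(domain_xml):
    """Map each disk target (e.g. 'vda') to its driver cache attribute."""
    root = ET.fromstring(domain_xml)
    modes = {}
    for disk in root.findall("./devices/disk"):
        target = disk.find("target")
        driver = disk.find("driver")
        if target is None:
            continue
        # The attribute is absent when no cache mode was set explicitly.
        cache = driver.get("cache") if driver is not None else None
        modes[target.get("dev")] = cache
    return modes


# Hypothetical, trimmed-down domain XML for illustration only.
sample_xml = """
<domain type='kvm'>
  <name>virtual</name>
  <devices>
    <disk type='file' device='disk'>
      <driver name='qemu' type='raw' cache='none'/>
      <source file='/var/lib/libvirt/images/virtual.img'/>
      <target dev='vda' bus='virtio'/>
    </disk>
  </devices>
</domain>
"""

print(disk_cache_modes(sample_xml))
```

A value of None for a disk would mean the cache attribute is not
set in the XML at all.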
I run this to do live migration:
virsh migrate --live virtual qemu+ssh://dellserver/system
I open two consoles with virt-manager, one on the source host
and one on the destination.
As soon as the migration starts I see I/O error messages in the
source console, and when it finishes I get them in the console
on the destination server as well. The guest's file system goes
read-only and I have to force it off.
end request I/O error, /dev/vda, sector 8790327
When I migrate to the other HP server the process runs fine.
I don't know what else to check. I wonder whether such
different hardware could be the problem.
These are the CPU flags in the HP server:
fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca
cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm
pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts
rep_good nopl xtopology nonstop_tsc aperfmperf pni dtes64
monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm dca sse4_1
sse4_2 popcnt lahf_lm dtherm tpr_shadow vnmi flexpriority
ept vpid
And these are the ones on the Dell server:
fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca
cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm
pbe syscall nx lm constant_tsc pebs bts nopl pni dtes64
monitor ds_cpl vmx est cid cx16 xtpr pdcm lahf_lm tpr_shadow
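
To make the gap between the two lists explicit, here is a small
sketch that computes the flags present on only one side. The flag
strings are transcribed from the two lists above; the variable
names are just for illustration:

```python
hp_flags = set("""
fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca
cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm
pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts
rep_good nopl xtopology nonstop_tsc aperfmperf pni dtes64
monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm dca sse4_1
sse4_2 popcnt lahf_lm dtherm tpr_shadow vnmi flexpriority
ept vpid
""".split())

dell_flags = set("""
fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca
cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm
pbe syscall nx lm constant_tsc pebs bts nopl pni dtes64
monitor ds_cpl vmx est cid cx16 xtpr pdcm lahf_lm tpr_shadow
""".split())

# Flags that exist on one host but not on the other.
hp_only = sorted(hp_flags - dell_flags)
dell_only = sorted(dell_flags - hp_flags)

print("HP only:  ", " ".join(hp_only))
print("Dell only:", " ".join(dell_only))
```

The HP side has a number of flags the Dell lacks (ssse3, sse4_1,
sse4_2, popcnt, ept and vpid among them), while the Dell only
adds cid.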
I checked the log files in /var/log/libvirt, but I can't see
any difference between the messages logged for an HP-to-HP
migration and those logged for an HP-to-Dell one.
I think I found something while checking SELinux. ls -Z and
getfattr return nothing, but ps -eZ shows something very
different on the Dell server.
This is in the HP server:
/usr/sbin/libvirtd 1034 ? 11:51:44 libvirtd
libvirt-09540b5d-82 701 ? 05:28:40 qemu-system-x86
unconfined 1 ? 00:01:00 init
On the Dell server there is an init process confined under an
LXC label, and there are also lxc-start processes:
/usr/sbin/libvirtd 1622 ? 05:07:07 libvirtd
libvirt-8a0f9087-32d... 29926 ? 00:00:01 qemu-system-x86
lxc-container-default 1774 ? 00:00:00 init
/usr/bin/lxc-start 1763 ? 00:00:00 lxc-start
So LXC is also installed on that server! Maybe that is
interfering with KVM. The qemu processes look fine to me, but
there is a chance the problem comes from there.
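
The comparison of the two ps -eZ listings can be sketched as
follows. The first column is the security label (on Ubuntu these
usually look like AppArmor profile names, which is an assumption
here); the helper name is hypothetical and the input is just the
lines quoted above:

```python
def labels_for(ps_output, command):
    """Return the security labels of all processes whose command matches."""
    labels = []
    for line in ps_output.strip().splitlines():
        fields = line.split()
        # Expected columns: LABEL PID TTY TIME CMD
        if len(fields) >= 5 and fields[-1] == command:
            labels.append(fields[0])
    return labels


hp_ps = """
/usr/sbin/libvirtd 1034 ? 11:51:44 libvirtd
libvirt-09540b5d-82 701 ? 05:28:40 qemu-system-x86
unconfined 1 ? 00:01:00 init
"""

dell_ps = """
/usr/sbin/libvirtd 1622 ? 05:07:07 libvirtd
libvirt-8a0f9087-32d... 29926 ? 00:00:01 qemu-system-x86
lxc-container-default 1774 ? 00:00:00 init
/usr/bin/lxc-start 1763 ? 00:00:00 lxc-start
"""

for host, output in (("HP", hp_ps), ("Dell", dell_ps)):
    print(host,
          "qemu:", labels_for(output, "qemu-system-x86"),
          "init:", labels_for(output, "init"))
```

On both hosts the qemu process runs under a libvirt-<uuid> label;
only the init label differs.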
I could move the LXC containers somewhere else, or I can keep
them there while trying to fix this issue. What do you advise
I do now?