We've seen the same behavior (random reboots) using iSCSI + OCFS2 + Xen 4.
Regards.
From: xen-users-bounces@lists.xensource.com
[mailto:xen-users-bounces@lists.xensource.com] On Behalf Of Benjamin Weaver
Sent: Thursday, August 18, 2011 3:32 AM
To: xen-users@lists.xensource.com
Subject: [Xen-users] (no subject)
Debian box running virtual machines blows up under large NFS network load
________________________________
I am running Debian Squeeze with Xen 4.0. I am running a stress test. I have
created 2 virtual machines, each with 512 MB of memory and 8 GB in size. I have
made one of the VMs an NFS server, sharing out a large file (4.6 GB). The other
virtual machine is an NFS client. The stress test consists of passing that big
file back and forth via an mv command executed on the client, which moves the
file between the NFS share directory and a local directory. The virtual
machines are stored on a remote SAN connected via iSCSI and formatted with
OCFS2.
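For anyone wanting to reproduce the test, the back-and-forth mv can be sketched roughly as below. This is only an illustration, not the exact procedure used: the function name `shuttle_file` is made up for this example, and it assumes the file starts in the local directory.

```python
import shutil
import time
from pathlib import Path


def shuttle_file(name, local_dir, share_dir, round_trips):
    """Move `name` from local_dir to share_dir and back, `round_trips` times.

    Assumes the file starts in local_dir. Returns one (direction, seconds)
    tuple per move so that slow or stalled transfers stand out.
    """
    local_dir, share_dir = Path(local_dir), Path(share_dir)
    timings = []
    for _ in range(round_trips):
        for direction, src, dst in (
            ("local->share", local_dir, share_dir),
            ("share->local", share_dir, local_dir),
        ):
            start = time.monotonic()
            # shutil.move falls back to copy + delete across filesystems,
            # which is what moving between a local disk and an NFS mount needs.
            shutil.move(str(src / name), str(dst / name))
            timings.append((direction, time.monotonic() - start))
    return timings
```

A call such as `shuttle_file("bigfile.img", "/srv/local", "/mnt/nfs_share", 5)` would do five round trips; the file name and both paths are placeholders, not the ones from the actual setup.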
It is true I have had better luck with some Ethernet cards than others.
One of the boxes, running Intel 1000Mb cards (1 to the SAN/OCFS2, 1 to the
outside world), runs the VMs and the stress test without problems.
But the other box, running a 1000Mb Realtek NIC to the outside world and a
100Mb Realtek NIC to the SAN, fails. The 100Mb NIC was dropping packets to the
SAN, so I changed the SAN NIC to the Realtek 1000Mb.
Now I do not drop packets (aside from a handful on the 2 vif interfaces at
startup).
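One way to keep an eye on the drop counters during a run is to read them from /proc/net/dev on the host. The sketch below parses that file's text and returns the receive and transmit drop counts per interface; the helper name `parse_drops` is invented for this example, and it assumes the usual Linux layout of 16 counters per interface.

```python
def parse_drops(proc_net_dev_text):
    """Return {interface: {"rx_drop": n, "tx_drop": n}} from /proc/net/dev text.

    Assumes the standard layout: two header lines, then one line per
    interface with 8 receive counters followed by 8 transmit counters.
    """
    drops = {}
    for line in proc_net_dev_text.splitlines()[2:]:
        if ":" not in line:
            continue
        iface, counters = line.split(":", 1)
        fields = counters.split()
        if len(fields) < 16:
            continue  # unexpected format, skip rather than guess
        drops[iface.strip()] = {
            "rx_drop": int(fields[3]),   # 4th receive column
            "tx_drop": int(fields[11]),  # 4th transmit column
        }
    return drops
```

Calling `parse_drops(open("/proc/net/dev").read())` before and after a transfer and comparing the two results shows whether any interface started dropping packets in between.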
And yet, about 1 out of 2 times I attempt to mv the file from the local
directory of the NFS client VM to the NFS share, the box running the VMs
reboots. It leaves no logs, and seldom even any messages on the screen. It just
blanks out and the next thing I know it is rebooting.
I have tried changing the MTU size, without success.
I have noticed that all--or nearly all--the reboots occur when I attempt to mv
the file BACK INTO the nfs shared directory.
I have begun testing with tcpdump, and notice that a large number of packets go
over with correct checksums; then, after a packet of unusually long length, all
show incorrect checksums (but I am new to tcpdump and may not be interpreting
the output correctly).
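For reference, the checksum that tcpdump verifies is the 16-bit one's-complement "Internet checksum" (RFC 1071). The sketch below computes it over raw bytes; it is a generic illustration only, and for TCP and UDP the real calculation also covers a pseudo-header that is not built here.

```python
def internet_checksum(data: bytes) -> int:
    """Return the RFC 1071 Internet checksum of `data` as a 16-bit integer."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]  # big-endian 16-bit words
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold carries back in
    return ~total & 0xFFFF
```

Running this over an IPv4 header with its checksum field zeroed gives the value that belongs in that field, and running it over a header with a correct checksum in place gives 0. Note also that when checksum offload is enabled, a capture taken on the sending host can show incorrect checksums on outgoing packets because the NIC fills them in later; whether that applies to this setup is not established.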
Any ideas why the host machine is rebooting, and how this could be fixed? Could
changing the size of the ring buffer make a difference? I read about this on a
couple of web pages.
_______________________________________________
Xen-users mailing list
Xen-users@lists.xensource.com
http://lists.xensource.com/xen-users