Fraser Campbell
2006-Feb-01 04:44 UTC
[Xen-users] Crash *always* after approximately 1024 migrations
Hi,

We are running Xen 3 guests on NFS. One of our stress tests involves repeated live migration of a domU between 2 Xen hosts. Here are the basic steps:

* start domU on hostA
* start script on hostB that will live migrate to hostA once it sees running domU
* start script on hostA that will live migrate to hostB once it sees running domU (this starts the sequence)

The test always fails at close to 1024 migrations away (i.e. the guest itself would have been migrated about 2048 times). I say close because it never seems to quite get to 1024: earlier runs crashed at 1018, 1020 and 1023, and tonight's run crashed at 1021. The domU can be under heavy load or no load, it makes no difference.

One of the hosts will freeze solid; the guest will have been migrated to the other host but is still in the paused state:

  Name          ID  Mem(MiB)  VCPUs  State   Time(s)
  Domain-0       0      1000      1  r-----  11728.2
  toxenc03    1022       512      1  --p---      0.0

I am hoping this is a known problem. If it is not, I will gather more information before and after each migration, document the exact sequence of events, and repeat twice to ensure some consistency. If there are any special points of interest that I should gather, please advise.

We are using beta releases of both SLES10 and FC5. Kernels have been both vendor supplied and custom compiled, and the behaviour is the same across the board. Server pairs were either 2x HP-DL380G4 or 1x IBM-x366 and 1x HP-DL380G4.

Thanks
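
For anyone trying to reproduce this, the per-host script described in the steps above can be sketched roughly as follows. This is a minimal Python sketch under stated assumptions, not the script actually used in the test: the function names, the polling interval and the `max_rounds` limit are hypothetical, and "running" is assumed to mean that the domU appears in `xm list` without the paused flag. It relies only on `xm list <domain>` and `xm migrate --live <domain> <host>`, and it parses the State column in the layout shown in the listing above.

```python
import subprocess
import time


def domu_state(domain, run=subprocess.run):
    """Return the State column of `xm list <domain>`, or None if the domU is absent."""
    result = run(["xm", "list", domain], capture_output=True, text=True)
    if result.returncode != 0:
        return None
    for line in result.stdout.splitlines():
        fields = line.split()
        # Assumed columns: Name ID Mem(MiB) VCPUs State Time(s)
        if len(fields) >= 6 and fields[0] == domain:
            return fields[4]
    return None


def domu_is_running(domain, run=subprocess.run):
    """Treat the domU as running if it is listed locally and not paused."""
    state = domu_state(domain, run)
    return state is not None and "p" not in state


def migrate_when_seen(domain, peer, max_rounds, poll_seconds=5.0,
                      run=subprocess.run, sleep=time.sleep):
    """Live migrate `domain` to `peer` each time it is seen running locally.

    Returns the number of migrations completed away from this host.
    """
    count = 0
    while count < max_rounds:
        if domu_is_running(domain, run):
            result = run(["xm", "migrate", "--live", domain, peer])
            if result.returncode != 0:
                # Stop on the first failed migration so the state can be inspected.
                break
            count += 1
            print(f"migration {count} away from this host -> {peer}", flush=True)
        sleep(poll_seconds)
    return count
```

Under these assumptions, hostA would call `migrate_when_seen("toxenc03", "hostB", max_rounds=1100)` and hostB the mirror image with `"hostA"` as the peer. The printed counter gives the "migrations away" figure quoted above, and the `run` and `sleep` parameters exist only so the loop can be exercised without a Xen host.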