Hi All,

I believe that I am suffering a similar problem to some other users of xVM on OpenSolaris. The observed problem is that Dom0 and all other domains hang (become unresponsive and never recover) when heavy disk load occurs in any domain. The problem occurs under 2 circumstances:

a) when a file is used to back a domU disk with high activity ie install service pack 3; or,
b) there is high disk activity in dom0 ie a dd from a zfs file to zvol.

The problem ceased for case (a) when a zvol was used to back the domU.

System details are as follows:

uname -a
SunOS ra 5.11 snv_90 i86pc i386 i86xpv

/boot/grub/menu.lst
title Solaris xVM with limits
findroot (pool_rpool,0,a)
kernel$ /boot/$ISADIR/xen.gz /boot/$ISADIR/xen.gz dom0_mem=2G dom0_max_vcpus=2
module$ /platform/i86xpv/kernel/$ISADIR/unix /platform/i86xpv/kernel/$ISADIR/unix -B $ZFS-BOOTFS
module$ /platform/i86pc/$ISADIR/boot_archive

# svccfg -s xvm/xend listprop
config application
config/dom0-cpus integer 0
config/enable-dump boolean true
config/stability astring Unstable
config/xend-relocation-address astring 127.0.0.1
config/xend-relocation-hosts-allow astring ^localhost$
config/xend-relocation-server boolean true
config/xend-unix-server boolean true
config/vncpasswd astring doyoureallythinkiwouldpostthistotheweb
config/vnc-listen astring 0.0.0.0
config/default-nic astring nfo0
config/dom0-min-mem integer 2000
xenstored dependency
xenstored/entities fmri svc:/system/xvm/store
xenstored/grouping astring require_all
xenstored/restart_on astring restart
xenstored/type astring service
general framework
general/entity_stability astring Unstable
general/single_instance boolean true
start method
start/exec astring "/lib/svc/method/xend %m"
start/timeout_seconds count 0
start/type astring method
stop method
stop/exec astring :kill
stop/timeout_seconds count 60
stop/type astring method
tm_common_name template
tm_common_name/C ustring "Hypervisor Control Daemon"
tm_man_xend template
tm_man_xend/manpath astring /usr/share/man
tm_man_xend/section astring 1M
tm_man_xend/title astring xend

tail -1 /etc/system
set zfs:zfs_arc_max = 0x10000000

This is probably just another data point following on from the discussion about not using dom0 for anything serious; however, the fixes prescribed don't seem to be entirely successful. Any further suggestions welcome.

Maurice Castro

PS. Thanks to David Edmonson for persisting with my network fault, the system is now usable.
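
For anyone comparing settings: the zfs_arc_max value in the /etc/system line above is a byte count written in hex. A minimal sketch of the conversion to MiB, assuming a shell whose arithmetic expansion accepts hex constants (bash and ksh93 do); arc_max_mib is a hypothetical helper name, not part of any Solaris tool.

```shell
# Convert a byte count (hex or decimal) to MiB with shell arithmetic.
# arc_max_mib is a made-up helper name used only for this illustration.
arc_max_mib() {
    echo $(( $1 / 1024 / 1024 ))
}

# Value taken from the /etc/system line quoted in the message.
arc_max_mib 0x10000000   # prints 256
```

So the ARC in this dom0 is capped at 256 MiB, against the 2G given to dom0 by dom0_mem.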
Maurice Castro wrote:
> Hi All,
> I believe that I am suffering a similar problem to some other users
> of xVM on OpenSolaris. The observed problem is that Dom0 and all other
> domains hang (become unresponsive and never recover) when heavy disk
> load occurs in any domain. The problem occurs under 2 circumstances:
>
> a) when a file is used to back a domU disk with high activity ie install
> service pack 3; or,
> b) there is high disk activity in dom0 ie a dd from a zfs file to zvol.
>
> The problem ceased for case (a) when a zvol was used to back the domU.

I have a very similar problem as well-- I see hangs under the same conditions (high disk activity in domU), but my domU *is* on a zvol.

My setup is almost identical to Maurice's, except I give 4G to dom0, and as I mentioned, my domU is on a zvol.

In my case, I see the hang when running a big imapsync job. My domU is running CentOS 5 and Zimbra. My dom0 is not used for anything other than management.

I was thinking this was hardware related, but it has happened on different physical servers that are both brand-new X4150s. It happens with both snv_89 and snv_90.

I'd love to figure out what's going on here-- this has stalled my project.

Thanks,
Eric