netz-haut - stephan seitz
2009-Sep-07 12:24 UTC
[Xen-users] iSCSI domU - introducing more stability
Hi there,

during peak load on some running domUs, I noticed random iSCSI "Reported LUNs data has changed" messages, which forced me to shut down the respective domU, log in to the target again, and run an fsck before starting the domU again.

This occurred on a 16-core machine with only about 14 domUs running. Spare memory is allocated to dom0 (about 40 GB). Each domU has its own iSCSI target. iSCSI is initiated by open-iscsi via a quad-port e1000e with offloading and mode 0 bonding (this was recently changed from 802.3ad to round-robin, as the switch manual advises against using LACP where possible). Dom0 is Debian Lenny amd64 (out-of-the-box setup).

Any ideas how to avoid iSCSI deadlocking? I *assume* it's caused by CPU saturation, as the load has gone above 16.0 on dom0, but I don't know how to throttle I/O...

Oh, using an iSCSI HBA instead of open-iscsi would need very good arguments, as this kind of hardware is not the cheapest...

Thanks in advance.

Regards,

--
Stephan Seitz
Senior System Administrator
*netz-haut* e.K.
multimediale kommunikation
zweierweg 22
97074 würzburg
phone: +49 931 2876247
fax: +49 931 2876248
web: http://www.netz-haut.de/
register court: amtsgericht würzburg, hra 5054

_______________________________________________
Xen-users mailing list
Xen-users@lists.xensource.com
http://lists.xensource.com/xen-users
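To put a load figure such as the 16.0 mentioned above in context, it can be normalised by the number of cores. The following is only a minimal sketch, assuming Python is available on dom0; the function name `load_per_core` is hypothetical and not part of any Xen or open-iscsi tool. Note that the Linux load average also counts tasks in uninterruptible I/O wait, so a value above 1.0 per core does not by itself prove CPU saturation.

```python
import os


def load_per_core(load_1min, cores):
    """Return the 1-minute load average divided by the number of cores."""
    if cores <= 0:
        raise ValueError("cores must be positive")
    return load_1min / cores


if __name__ == "__main__":
    # os.getloadavg() is only available on Unix-like systems.
    if hasattr(os, "getloadavg"):
        load_1min, _, _ = os.getloadavg()
        # On dom0 this is the number of CPUs visible to dom0, which may be
        # fewer than the physical cores if dom0 vCPUs are restricted.
        cores = os.cpu_count() or 1
        print(f"load per core: {load_per_core(load_1min, cores):.2f}")
```

Run on dom0 during a peak, this gives a rough first indication of whether the box is oversubscribed; per-process I/O wait would still need a separate look.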
Fajar A. Nugraha
2009-Sep-07 20:04 UTC
Re: [Xen-users] iSCSI domU - introducing more stability
On Mon, Sep 7, 2009 at 7:24 PM, netz-haut - stephan seitz <s.seitz@netz-haut.de> wrote:
> Hi there,
>
> during peak load on some running domU, I noticed random iSCSI "Reported LUNs data has changed" which forced me to shutdown the respective domU, re-login the target and do a fsck before starting domU again.

Which one logs in to the iSCSI targets: dom0 or the domUs?

What is your iSCSI target? Linux? An appliance? Is there any error message on the target?

--
Fajar
netz-haut - stephan seitz
2009-Sep-08 05:52 UTC
RE: [Xen-users] iSCSI domU - introducing more stability
> > during peak load on some running domU, I noticed random iSCSI "Reported LUNs data has changed" which forced me to shutdown the respective domU, re-login the target and do a fsck before starting domU again.
>
> Which one logs in to iscsi targets? Dom0 or domUs?
> What is your iscsi target? Linux? An appliance? Is there any error message on the target?

I adopted the block-iscsi script that was posted on the list a while ago. It logs in to the respective target on dom0 for each domU. Every domU gets at least one LUN with at least one root partition on it. Most domUs are configured identically with two block devices: one is a LUN with two partitions (ext3 root fs and low-priority swap); the other is located on local dom0 LVM (an LV with high-priority swap).

The iSCSI target is an iStor IntegraStor is325 appliance. The appliance does not treat the problem as serious. It shows "Reported LUNs data has changed" followed by a "Scheduled Transaction". This is marked as a warning, not as a failure.

The open-iscsi initiator seems to have more trouble. It never received (or initiated?) the scheduled transaction. It marks the LUN as a read-only device immediately. Depending on the workload, the domU obviously has bigger problems with a read-only root fs...

I've read about CPU saturation and iSCSI timeouts, but I'm pretty sure (from dom0's view) there's enough hardware headroom for I/O. From the domU's view, the CPU can and will be saturated by large block I/O. E.g. tar'ing some 800 GB will push the load on the domU up to about 40-50. Some voodoo which I've never seen on physical machines ;)
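One way to notice on dom0 that a LUN has been flipped to read-only, before the affected domU runs into trouble, is to poll the `ro` attribute that Linux exposes for each block device under `/sys/block`. This is only a sketch under assumptions: the helper name `read_only_devices` is hypothetical, and it assumes the condition described above is actually reflected in that sysfs flag. A filesystem remounted read-only inside the domU after I/O errors would not show up here, and genuinely read-only devices such as CD-ROM drives will be listed too.

```python
import os


def read_only_devices(sysfs_block="/sys/block"):
    """Return sorted names of block devices whose sysfs 'ro' flag is 1."""
    if not os.path.isdir(sysfs_block):
        return []
    result = []
    for name in sorted(os.listdir(sysfs_block)):
        ro_path = os.path.join(sysfs_block, name, "ro")
        try:
            with open(ro_path) as f:
                if f.read().strip() == "1":
                    result.append(name)
        except OSError:
            # No 'ro' attribute or not readable: skip this entry.
            continue
    return result


if __name__ == "__main__":
    for dev in read_only_devices():
        print(dev)
```

Called from cron or a monitoring check, the output could be compared against the devices backing each domU; mapping device names to domUs is left out because it depends on the local block-iscsi setup.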
Fajar A. Nugraha
2009-Sep-08 05:56 UTC
Re: [Xen-users] iSCSI domU - introducing more stability
On Tue, Sep 8, 2009 at 12:52 PM, netz-haut - stephan seitz <s.seitz@netz-haut.de> wrote:
> iSCSI target is a iStor IntegraStor is325 appliance. The appliance takes the problem not really serious. It shows "Reported LUNs data has changed" followed by a "Scheduled Transaction". This is marked as warning, not as failure.

So let me get this straight: even the iSCSI target (the appliance) itself reports that the LUN data has changed?

Isn't the open-iscsi initiator doing its job correctly, then? Since it detects that the target has changed, it refuses to write data to it to prevent corruption.

--
Fajar