Hi,

I'm using Xen 2.0-testing. I use an LVM volume for swap partitions and loop files for the OS filesystems. I am using the program stress (http://weather.ou.edu/~apw/projects/stress) to simulate different kinds of workloads. With the CPU (sqrt()), IO (sync()), and VM (malloc()) tests everything is fine: scheduling between the domains seems fair and domain0 is robust.

The problem begins when I use the HDD (write()) test inside a domU. As soon as it starts, dom0 freezes, and it stays that way until the stress program exits. The domU running stress and the other domUs remain responsive.

I can get a snapshot of what is going on in dom0 by looking at stats immediately after the stress program exits. What I note is that xenblkd and the loop device for the domain are using some CPU, about 5% each. The rest of the CPU is consumed by "wa" (time spent waiting for I/O).

I have tried this with dom0 and the domU on different processors and on the same processor. I have also used information from this list to tune the BVT scheduler to make dom0 warp and have a high priority (low MCU) compared to the domUs. I read that BVT has problems with I/O, so I tried the round robin scheduler as a test and saw the same behavior.

Now on to how to solve this...

1) Is it most likely that this is caused by how the schedulers work or is it due to using loop files?

2) Could opening the loop files with the O_DIRECT flag bring any performance benefit? dom0 on my system only has 128MB (it runs very little).

3) If the problem is with the schedulers, is there any possibility of a fix, or at least of tuning them so that dom0 doesn't lock up? It is very hard to log in and diagnose which domain might be causing the high I/O when you can't even connect to dom0.

--
Thanks,
Matt

-------------------------------------------------------
This SF.Net email is sponsored by: IntelliVIEW -- Interactive Reporting
Tool for open source databases. Create drag-&-drop reports. Save time
by over 75%! Publish reports on the web. Export to DOC, XLS, RTF, etc.
Download a FREE copy at http://www.intelliview.com/go/osdn_nl
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/xen-devel
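The "wa" figure mentioned above can also be sampled directly from /proc/stat on a 2.6 kernel, which may be easier to log than an interactive tool when dom0 is barely responsive. What follows is only a minimal sketch: the function names are made up for illustration, and it assumes the aggregate "cpu" line carries at least the user, nice, system, idle and iowait fields in that order.

```python
def parse_cpu_line(line):
    """Parse the aggregate 'cpu' line of /proc/stat into a list of tick counts."""
    fields = line.split()
    if not fields or fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    ticks = [int(x) for x in fields[1:]]
    if len(ticks) < 5:
        raise ValueError("no iowait field on this kernel")
    return ticks


def iowait_percent(before, after):
    """Percentage of elapsed CPU ticks spent in iowait between two samples."""
    # Only the first eight fields (user .. steal) are summed; later fields
    # (guest time on newer kernels) are already counted inside user/nice.
    deltas = [b - a for a, b in zip(before[:8], after[:8])]
    total = sum(deltas)
    if total == 0:
        return 0.0
    return 100.0 * deltas[4] / total  # index 4 is iowait


def sample(path="/proc/stat"):
    """Read one sample of the aggregate CPU counters (hypothetical helper)."""
    with open(path) as f:
        return parse_cpu_line(f.readline())
```

Taking one sample() before the stress run and one right after it, then passing both to iowait_percent(), gives the average over the whole run rather than a single snapshot.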
> 1) Is it most likely that this is caused by how the schedulers work or
> is it due to using loop files?

In my experience, it's due to using loop files. High I/O loads in the guests killed dom0's interactivity and I/O response time. Since you have LVM set up already for the swap files, it should be easy to verify this with an all-LVM configuration?
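For reference, an all-LVM test domain mostly comes down to changing the disk line in the domain's xm config file, which uses Python syntax. The fragment below is only a sketch: the volume group, logical volume names, image path and guest device names are all hypothetical, and the exact form of the phy: device string may depend on the setup.

```python
# Fragment of an xm domain config file (hypothetical names throughout).
#
# Loop-file backed root, LVM swap (the kind of setup that showed the problem):
# disk = ['file:/var/xen/domu1-root.img,sda1,w',
#         'phy:vg0/domu1-swap,sda2,w']
#
# All-LVM variant for comparison: both VBDs are exported as physical devices,
# so the loop driver in dom0 is no longer in the I/O path.
disk = ['phy:vg0/domu1-root,sda1,w',
        'phy:vg0/domu1-swap,sda2,w']
```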
On Fri, 2005-02-04 at 16:15 -0500, John L Griffin wrote:
> > 1) Is it most likely that this is caused by how the schedulers work or
> > is it due to using loop files?
>
> In my experience, it's due to using loop files. High I/O loads in the
> guests killed dom0's interactivity and I/O response time. Since you have
> LVM set up already for the swap files, it should be easy to verify this
> with an all-LVM configuration?

I thought of this just as I clicked send. Yes, it was easy for me to set up a test domain with an LVM backing. With this, dom0 churns along happily, using some CPU for xenblkd (normally 1-2%, with bursts up to 6%). 0% of CPU is in the "IO wait" state. Note that before I was running just "stress --hdd 1"; these numbers are from running "stress --cpu 8 --io 4 --vm 2 --vm-bytes 128M --hdd 4".

I knew the loop interface added some overhead, I just didn't realize it was that much. I had only counted on it adding a few % of CPU for the loop kernel thread.

Do any developers think opening the loop files with O_DIRECT is possible, and if so, would it help performance? I think the major problem is that read() and write() calls with O_DIRECT must be aligned to the page size of the architecture.
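To make the alignment constraint concrete, here is a small userspace sketch of an O_DIRECT write. It is not a change to the loop driver, which does its I/O inside the kernel; it only illustrates what a caller has to arrange. The function names are made up, and page-size alignment is taken as a conservative assumption: the actual requirement varies with kernel version and filesystem, and some filesystems reject O_DIRECT outright.

```python
import mmap
import os


def align_up(nbytes, alignment):
    """Round nbytes up to the next multiple of alignment."""
    return ((nbytes + alignment - 1) // alignment) * alignment


def write_direct(path, payload, alignment=mmap.PAGESIZE):
    """Write payload at offset 0 of path using O_DIRECT (Linux only).

    The payload is zero-padded to a multiple of `alignment`. Raises OSError
    (typically EINVAL) if the filesystem does not accept O_DIRECT.
    """
    length = max(alignment, align_up(len(payload), alignment))
    # Anonymous mmap memory starts on a page boundary, which takes care of
    # the buffer-address side of the alignment requirement.
    buf = mmap.mmap(-1, length)
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_DIRECT, 0o644)
    try:
        buf.write(payload)  # the rest of the buffer stays zero-filled
        return os.write(fd, buf)
    finally:
        os.close(fd)
        buf.close()
```

A call such as write_direct("/tmp/odirect-test.bin", b"hello") writes a whole page even for a five-byte payload, which is the padding cost the alignment rule implies for small writes.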