Hi,

I have the issue of my virtual machines becoming extremely slow when even
only one of them is creating a lot of I/O. Is there a way to prioritize
disk access? I can't seem to find any.

The Xen host in question is:

- Quad core Xeon X3430 @ 2.40GHz
- 3Ware 9650SE RAID6 array, Seagate 2 TB disks
- Xen 4.0.1-5.8 on Debian 6 (upgrade planned)
- 15 DomU's
- All VMs have noop as disk scheduler (cat /sys/block/xvda2/queue/scheduler)
- VMs are prioritized with 'xm sched-credit', but that doesn't help the disk much
- Dom-0 has significantly more credits (10000) because it needs to service I/Os
- Dom-0 doesn't do anything else
- All virtual disks are logical volumes, exposed to the VM through xen-blkfront

So, what can I do to improve disk performance or priority?

Regards,

Wiebe
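P.S. For completeness, the credit weights were set along these lines
("Domain-0" and the guest name here are only examples):

  xm sched-credit -d Domain-0 -w 10000   # Dom-0 gets by far the largest weight
  xm sched-credit -d some-domu -w 256    # guests get ordinary weights (256 is the default)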
> -----Original Message-----
> From: xen-users-bounces@lists.xen.org [mailto:xen-users-bounces@lists.xen.org] On Behalf Of Wiebe Cazemier
> Sent: 10 June 2013 09:11
> To: xen-users@lists.xen.org
> Subject: [Xen-users] Disk starvation between DomU's
>
> I have the issue of my virtual machines becoming extremely slow when even
> only one of them is creating a lot of I/O. Is there a way to prioritize
> disk access? I can't seem to find any.
>
> [...]
>
> So, what can I do to improve disk performance or priority?

Can you try using ionice to set the disk priority of the corresponding
tapdisk/qemu process?
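Something along these lines; the grep pattern and the chosen class/priority
are just examples, and which process actually serves the disk depends on
your backend:

  # find the tapdisk/qemu process serving the noisy domU's disk
  ps ax | grep -E 'tapdisk|qemu'
  # drop it to the lowest best-effort priority (class 2, prio 7)
  ionice -c 2 -n 7 -p <pid>

Note that the I/O classes are only honoured by the CFQ scheduler on the
device that does the actual I/O.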
On Mon, 2013-06-10 at 08:24 +0000, Thanos Makatos wrote:
> > So, what can I do to improve disk performance or priority?
>
> Can you try using ionice to set the disk priority of the corresponding
> tapdisk/qemu process?

....Or if using blkback the relevant kernel thread.

Ian.
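P.S. A rough sketch, assuming blkback and a domain ID of 9 purely as an
example (the kernel threads are named after the domain ID and virtual
device):

  ps ax | grep blkback        # threads show up as e.g. [blkback.9.xvda2]
  ionice -c 2 -n 0 -p <pid>   # adjust the I/O priority of that thread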
----- Original Message -----
> From: "Ian Campbell" <Ian.Campbell@citrix.com>
> To: "Thanos Makatos" <thanos.makatos@citrix.com>
> Cc: "Wiebe Cazemier" <wiebe@halfgaar.net>, xen-users@lists.xen.org
> Sent: Tuesday, 11 June, 2013 12:41:39 PM
> Subject: Re: [Xen-users] Disk starvation between DomU's
>
> ....Or if using blkback the relevant kernel thread.
>
> Ian.

That's what I ended up doing. After first giving a certain DomU "best
effort, 0", I have now put it in the real-time class, with prio 3. I can't
say I notice any 'real-time' performance; it still hangs occasionally.

Additionally, when I do the following on the virtual machine in question:

dd if=/dev/zero of=dummy bs=1M

I hardly see any disk activity on the Dom0 with iostat. I see the blkback
process popping up occasionally with a few kB/s, but I would expect tens of
MB per second. The file 'dummy' grows to several GB in a short while, so it
does write.

Why don't I see the traffic showing up in iostat on the Dom0?
> -----Original Message-----
> From: Wiebe Cazemier [mailto:wiebe@halfgaar.net]
> Sent: 17 June 2013 16:24
> To: Ian Campbell
> Cc: Thanos Makatos; xen-users@lists.xen.org
> Subject: Re: [Xen-users] Disk starvation between DomU's
>
> That's what I ended up doing. After first giving a certain DomU "best
> effort, 0", I have now put it in the real-time class, with prio 3. I
> can't say I notice any 'real-time' performance; it still hangs
> occasionally.

I'm not sure whether this will work. AFAIK the actual I/O is performed by
tapdisk/qemu, so could you experiment with that instead? Also, keep in mind
that there is CPU processing in the data path, so have a look at the dom0
CPU usage when executing the I/O test.

> Additionally, when I do the following on the virtual machine in question:
>
> dd if=/dev/zero of=dummy bs=1M
>
> I hardly see any disk activity on the Dom0 with iostat. I see the blkback
> process popping up occasionally with a few kB/s, but I would expect tens
> of MB per second. The file 'dummy' grows to several GB in a short while,
> so it does write.
>
> Why don't I see the traffic showing up in iostat on the Dom0?

This is inexplicable. Either you've found a bug, or there's something wrong
in the I/O test. Could you post more details? (E.g. total I/O performed,
domU memory size, dom0 memory size, average CPU usage, etc.)

What's the array's I/O scheduler? I think since it's a RAID controller the
"suggested" value is noop. If your backend is tapdisk, then CFQ *might* do
the trick, since each domU is served by a different tapdisk process (it may
be the same with qemu).
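To check it on dom0, and switch it if you want to experiment (/dev/sda is
just a placeholder for whatever device node the 3ware array appears as):

  cat /sys/block/sda/queue/scheduler          # the active scheduler is shown in brackets
  echo noop > /sys/block/sda/queue/scheduler  # takes effect immediately, not persistent across reboots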
----- Original Message -----
> From: "Thanos Makatos" <thanos.makatos@citrix.com>
> To: "Wiebe Cazemier" <wiebe@halfgaar.net>, "Ian Campbell" <Ian.Campbell@citrix.com>
> Cc: xen-users@lists.xen.org
> Sent: Wednesday, 19 June, 2013 10:53:52 AM
> Subject: RE: [Xen-users] Disk starvation between DomU's
>
> I'm not sure whether this will work. AFAIK the actual I/O is performed by
> tapdisk/qemu, so could you experiment with that instead? Also, keep in
> mind that there is CPU processing in the data path, so have a look at the
> dom0 CPU usage when executing the I/O test.

Tapdisk? I use the phy backend, with the DomU being on a logical volume. I
don't even have processes with tap or qemu in their name. As for the CPU
usage: see below.

> > Why don't I see the traffic showing up in iostat on the Dom0?
>
> This is inexplicable. Either you've found a bug, or there's something
> wrong in the I/O test. Could you post more details? (E.g. total I/O
> performed, domU memory size, dom0 memory size, average CPU usage, etc.)

I have a DomU with ID 9 in "xm list", and the processes "[blkback.9.xvda2]"
and "[blkback.9.xvda1]" have RT/3 priority.

The DomU has 2 GB of RAM, no swap, 800 MB free (without cache). The Dom0
has 512 MB of RAM (288 MB free without cache, 30 MB in use on swap); its
memory is limited with a boot parameter.

When I do this on the DomU:

dd if=/dev/zero of=bla2.img bs=1M count=1000
1000+0 records in
1000+0 records out
1048576000 bytes (1.0 GB) copied, 14.6285 s, 71.7 MB/s

I see [blkback.9.xvda2] popping up at the top of "iotop" on the Dom0,
hovering between 50 and 300 kB/s, nowhere near the 70 MB/s. There is hardly
any other process performing I/O. "iostat 2" does show a high blocks/s
count for its logical volume, dm-4.

The Dom0 uses about 30% CPU according to "xm top" while dd'ing. It has 4
cores available.

> What's the array's I/O scheduler? I think since it's a RAID controller
> the "suggested" value is noop. If your backend is tapdisk, then CFQ
> *might* do the trick, since each domU is served by a different tapdisk
> process (it may be the same with qemu).

The host has a 3Ware RAID6 array. Dom0 has CFQ, all DomU's have noop. Are
you saying that when using a hardware RAID, the Dom0 should use noop as
well?

Specs:
Debian 6
Linux 2.6.32-5-xen-amd64
xen-hypervisor-4.0-amd64: 4.0.1-5.8
CPU: Intel(R) Xeon(R) CPU X3430 @ 2.40GHz
16 GB RAM
> The DomU has 2 GB of RAM, no swap, 800 MB free (without cache). The Dom0
> has 512 MB of RAM (288 MB free without cache, 30 MB in use on swap); its
> memory is limited with a boot parameter.
>
> When I do this on the DomU:
>
> dd if=/dev/zero of=bla2.img bs=1M count=1000
> 1000+0 records in
> 1000+0 records out
> 1048576000 bytes (1.0 GB) copied, 14.6285 s, 71.7 MB/s

You're generating 1 GB of I/O while the domU's memory is 2 GB, so the
entire workload fits in the domU buffer cache. If you wait a bit longer
after the dd has finished you should see the 1 GB of I/O traffic in iostat;
you can make this happen sooner by executing "sync" right after the dd.

I'd suggest increasing the number of blocks written (at least 2x the RAM
size) and/or using oflag=direct in dd.

> > What's the array's I/O scheduler? I think since it's a RAID controller
> > the "suggested" value is noop. If your backend is tapdisk, then CFQ
> > *might* do the trick, since each domU is served by a different tapdisk
> > process (it may be the same with qemu).
>
> The host has a 3Ware RAID6 array. Dom0 has CFQ, all DomU's have noop. Are
> you saying that when using a hardware RAID, the Dom0 should use noop as
> well?

The RAID6 array should be present in dom0 as a block device, and IIUC on
top of it you've created logical volumes (one per domU), is this correct?
AFAIK the RAID controller provides sufficient scheduling, so it's usually
suggested to use noop; however, you could experiment with setting the I/O
scheduler of the RAID6 array block device to CFQ. I'm not sure what the I/O
scheduler of /dev/xvd* should be in each domU, but I suspect it's
irrelevant to the issue you're facing.
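To be concrete about the dd suggestion above (the count is just an example,
sized at roughly twice the domU's 2 GB of RAM):

  # bypass the domU page cache entirely
  dd if=/dev/zero of=bla2.img bs=1M count=4096 oflag=direct
  # or keep the cached write, but flush it before looking at iostat on dom0
  dd if=/dev/zero of=bla2.img bs=1M count=4096 && sync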
----- Original Message -----
> From: "Thanos Makatos" <thanos.makatos@citrix.com>
> To: "Wiebe Cazemier" <wiebe@halfgaar.net>
> Cc: xen-users@lists.xen.org
> Sent: Wednesday, 19 June, 2013 4:28:47 PM
> Subject: RE: [Xen-users] Disk starvation between DomU's
>
> You're generating 1 GB of I/O while the domU's memory is 2 GB, so the
> entire workload fits in the domU buffer cache. If you wait a bit longer
> after the dd has finished you should see the 1 GB of I/O traffic in
> iostat; you can make this happen sooner by executing "sync" right after
> the dd.
>
> I'd suggest increasing the number of blocks written (at least 2x the RAM
> size) and/or using oflag=direct in dd.

Hmm. That didn't make a difference, but something else did: I was looking
at the read column... A stupid mistake, but not as stupid as you might
think. My eye was drawn to the changing figures, and now I see that the
write I/O is always 0 according to iotop. When I do "iotop -oa"
(accumulate, leave out non-active processes), all the blkback processes
that appear accumulate 0 bytes written. I don't understand that...

> The RAID6 array should be present in dom0 as a block device, and IIUC on
> top of it you've created logical volumes (one per domU), is this correct?
> AFAIK the RAID controller provides sufficient scheduling, so it's usually
> suggested to use noop; however, you could experiment with setting the I/O
> scheduler of the RAID6 array block device to CFQ. I'm not sure what the
> I/O scheduler of /dev/xvd* should be in each domU, but I suspect it's
> irrelevant to the issue you're facing.

That's correct. Currently, the RAID array is on CFQ. It would seem weird to
me to change that to noop: the RAID controller might schedule, but it can't
receive instructions from the OS about what should have priority. I'll look
into it, though.

I do know that the recommended DomU scheduler is noop. It's also the
default for all my machines without configuring it. I guess they know
they're virtual.
> Hmm. That didn't make a difference, but something else did: I was looking
> at the read column... A stupid mistake, but not as stupid as you might
> think. My eye was drawn to the changing figures, and now I see that the
> write I/O is always 0 according to iotop. When I do "iotop -oa"
> (accumulate, leave out non-active processes), all the blkback processes
> that appear accumulate 0 bytes written. I don't understand that...

Can you try "iostat -x 1" instead of iotop?

> That's correct. Currently, the RAID array is on CFQ. It would seem weird
> to me to change that to noop: the RAID controller might schedule, but it
> can't receive instructions from the OS about what should have priority.
> I'll look into it, though.

The rule of thumb is to let the RAID controller do the scheduling;
otherwise the two schedulers may end up "competing" with each other. Of
course, this depends on the RAID controller, the I/O workload, etc., so it
may make no difference in your particular case.

> I do know that the recommended DomU scheduler is noop. It's also the
> default for all my machines without configuring it. I guess they know
> they're virtual.

Not necessarily: using CFQ inside a VM would still make sense if you want
to enforce I/O fairness among the applications running inside it, although
this could potentially lead to weird interactions with the OS's/controller's
I/O scheduler.
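For the iostat suggestion above, plain one-second extended samples should
be enough; watch the write columns and %util for the domU's logical volume
(dm-4 in your case) and for the array device underneath it:

  iostat -x 1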