Just looking for some feedback from other people who do this. I know it's not a good "backup" method, but "crash consistent" images have been very useful for me in disaster situations just to get the OS running quickly and then restore data from a data backup. My typical setup is to put the LV in snapshot mode while the guest is running, then dd the data to a backup file on an NFS mount point. The problem is that the VMs' performance gets pretty poor while the copy is happening. My guesses at why this was happening were:

1. dom0 having equal weight to the other 4 guests on the box and somehow hogging CPU time
2. lack of QoS on the I/O side / dom0 hogging I/O
3. process priorities in dom0
4. NFS overhead

For each of these items I tried to adjust things to see if it improved:

1. Tried increasing dom0's weight to 4x the other VMs'.
2. Saw Pasi mentioning dm-ioband a few times and think this might address I/O scheduling, but I haven't tried it yet.
3. Tried nice-ing the dd to the lowest priority and qemu-dm to the highest.
4. Changed the destination to a local disk.

Changing the things above didn't really seem to help, either alone or in combination. My setup is Xen 3.2 and Xen 4.0 on dual Nehalem processors, 24GB RAM, RAID 5+0 of WD RE3 1TB disks. The hardware in the boxes is quite good and there seems to be no noticeable difference between Xen versions. What I'd ideally like to accomplish is to take the backups with the least possible impact on the running VMs. I honestly don't care how long the backups take, but I want to avoid just throttling them to a fixed speed, because that seems inefficient/hacky. Can anyone share their experiences, both good and bad?

Thanks,
- chris
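For reference, the snapshot-then-dd workflow described above boils down to something like the sketch below. This is only an illustration of the approach, not the exact commands used in the setup above; the volume group, LV name, snapshot size and NFS destination path are all placeholders.

    # Sketch: crash-consistent backup of one guest LV (names/sizes are placeholders).
    VG=vg0
    LV=guest-disk
    SNAP=${LV}-backup
    DEST=/mnt/nfs-backups/${LV}-$(date +%Y%m%d).img

    # Copy-on-write snapshot taken while the guest keeps running; it only needs
    # enough space to absorb the writes made during the copy.
    lvcreate --snapshot --size 10G --name "$SNAP" "/dev/$VG/$LV"

    # Copy the frozen, crash-consistent image to the NFS mount point.
    dd if="/dev/$VG/$SNAP" of="$DEST" bs=1M

    # Drop the snapshot so it stops accumulating copy-on-write chunks.
    lvremove -f "/dev/$VG/$SNAP"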
I think this got missed during the mailing list downtime last weekend... I can't imagine no one has any input?

- chris

On Sat, Apr 17, 2010 at 2:53 PM, chris <tknchris@gmail.com> wrote:
> Just looking for some feedback from other people who do this. I know
> it's not a good "backup" method but "crash consistent" images have been
> very useful for me in disaster situations just to get OS running
> quickly then restore data from a data backup. [...]
Jeff Sturm
2010-Apr-23 19:34 UTC
RE: [Xen-users] Re: Snapshotting LVM backed guests from dom0
Chris,

Saw your original post, but hesitated to respond, since I'm not really an expert on either Linux block I/O or NFS. Anyway...

On Sat, Apr 17, 2010 at 2:53 PM, chris <tknchris@gmail.com> wrote:
> My typical setup is to put the LV in snapshot mode while guest is
> running then dd the data to a backup file which is on a NFS mount
> point. The thing that seems to be happening is that the VM's
> performance gets pretty poor during the time the copy is happening.

We see this all the time on Linux hosts. One process with heavy I/O can starve the others. I'm not quite sure why, but I suspect it has something to do with the unified buffer cache. When reading a large volume with "normal" I/O, buffer pages may quickly get replaced with pages that are never going to be read again, and your buffer cache hit ratio suffers. Every other process on the affected host that needs to do I/O may see longer latency as a result. With Xen, that includes any domU.

A quick fix that worked for us: direct I/O. Run your "dd" command with "iflag=direct" and/or "oflag=direct", if your version supports it (definitely works on CentOS 5.x, definitely *not* on CentOS 4.x). This bypasses the buffer cache completely and forces dd to read/write directly to the underlying disk device. Make sure you use an ample block size ("bs=64k" or larger) so the copy finishes in reasonable time.

Not sure if that'll work properly with NFS, however. (Having been badly burned by NFS numerous times, I tend not to use it on production hosts.) To copy disks from one host to another, we resort to tricks like piping over ssh (e.g. "dd if=<somefile> iflag=direct bs=256k | ssh <otherhost> 'dd of=<otherfile> oflag=direct bs=256k'"). These copies run slow, but steady. Importantly, they run with minimal impact on other processing going on at the time.

> 3. Tried nice-ing the dd to lowest priority and qemu-dm to highest

"nice" applies only to CPU scheduling and probably isn't helpful for this. You could try playing with ionice, which lets you override I/O scheduling priorities on a per-process basis.

Jeff
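Putting the two suggestions above together, a backup run from dom0 might look roughly like the following. This is a sketch, not a tested recipe: the device and destination paths are placeholders, iflag=/oflag=direct need a dd that supports them (per the CentOS note above), and the "idle" ionice class only has an effect when the disk is using the CFQ I/O scheduler.

    # Idle I/O class: the copy only gets disk time when nothing else wants it.
    # Direct I/O keeps the image out of dom0's page cache.
    # (oflag=direct on an NFS mount may or may not behave, per the caveat above.)
    ionice -c3 dd if=/dev/vg0/guest-disk-backup of=/mnt/nfs-backups/guest.img \
        iflag=direct oflag=direct bs=1M

    # The same idea pushed to another host over ssh instead of NFS:
    ionice -c3 dd if=/dev/vg0/guest-disk-backup iflag=direct bs=256k \
        | ssh backuphost 'dd of=/srv/backups/guest.img oflag=direct bs=256k'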
Nick Couchman
2010-Apr-23 19:37 UTC
[Xen-users] Re: Snapshotting LVM backed guests from dom0
> On Sat, Apr 17, 2010 at 2:53 PM, chris <tknchris@gmail.com> wrote:
>> For each of these items I tried to adjust things to see if it improved.
>>
>> 1. Tried increasing dom0 weight to 4x the other VM's.

Probably not going to help - if you increase the weight, you'll choke out your other domUs; if you decrease it, the domUs may also be affected, because network and disk I/O end up going through dom0 in the end anyway.

>> 2. Saw pasi mentioning dm-ioband a few times and think this might
>> address IO scheduling but haven't tried it yet.
>> 3. Tried nice-ing the dd to lowest priority and qemu-dm to highest

I would expect this to help some, but it may not be the only thing. Also, remember that network and disk I/O are still done through drivers in dom0, which means pushing qemu-dm to the highest priority really won't buy you anything. I would expect re-nicing dd to help some, though.

>> 4. Changing destination to a local disk

This indicates that the bottleneck is local and not the network. The next step would be to grab some Linux performance monitoring and debugging tools and figure out where your bottleneck is. Things like top, xentop, iostat, vmstat, and sar may be useful in determining which component is hitting its performance limit and needs to be tweaked or worked around.

-Nick
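For example, something along these lines run in dom0 while the copy is in progress should show fairly quickly whether a disk, CPU or memory limit is being hit. The flags and five-second intervals here are just one reasonable choice, not a prescribed procedure.

    # Per-domain CPU usage, logged in batch mode.
    xentop -b -d 5 > xentop.log &

    # Extended per-device disk stats: utilisation, queue size, await times.
    iostat -xk 5 > iostat.log &

    # Memory pressure, swap activity and run-queue length.
    vmstat 5 > vmstat.log &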
Thanks everyone for the tips. I will try experimenting with these over the weekend and let you know how much it helps, if any.

- chris

On Fri, Apr 23, 2010 at 3:37 PM, Nick Couchman <Nick.Couchman@seakr.com> wrote:
> The next step would be to grab some Linux performance monitoring and
> debugging tools and figure out where your bottleneck is. So, things
> like top, xentop, iostat, vmstat, and sar may be useful in determining
> what component is hitting its performance limit and needs to be
> tweaked or worked around.
>
> -Nick