Kenneth Kalmer
2006-Oct-17 22:32 UTC
[Xen-users] Yet another backup proposal (file based vbd's & lvm)
Greetings list,

This is yet another request for feedback on a proposed backup solution for a Xen environment, and I'll appreciate any and all scrutiny of what I want to attempt.

Key features of the Xen backups I'm after:

- Snapshot-like backups of domU's
- Avoiding LVM snapshots
- Near-zero downtime

First, a brief overview of how we currently use Xen. We have a Linux storage server (the NAS) attached to our network with two bonded gigabit Ethernet cards, and several dom0's running on the same network. On the NAS we have a directory containing all our domU's as file-based VBD's. Each dom0 runs its selected domU's directly from NFS. These domU's are configured to offer a minimal set of services and are really small. The only things of importance inside them are the configuration and user files for their particular services. All the data they need to deliver their services (WWW, POP, etc.) also resides on the NAS under different exports.

The problem is that the NAS is performing very poorly with this setup, so I intend to move the VM's from the NAS directly to the hardware of their dom0. Before I take this leap of faith, I need to make sure I have a viable backup solution in place.

From what I've gathered in the list archives and Google, LVM snapshots are frowned upon, rdiff-backup seems to get decent remarks, and dd-like backups are too slow and space consuming.

What I'd like to attempt to set up (based on the feedback I receive) is the following:

1. Pause the domU using "xm pause"
2. Sync the domU using "xm sysrq"
3. Use rdiff-backup to make a local backup of the file VBD
4. Resume the domU using "xm unpause"

After this I can rsync the backup directory to another server while the domU continues to run. This is based almost entirely on the xen-server-tools backup script by Christian Wieke from xmlvalidation.com.

I understand that the first backup will take a substantial amount of time, but afterwards it should not take that long, since the only things that change inside our domU's are config files and log files; everything else runs off the NAS.

We have all the domU's on NFS to ease migration in case of hardware failures, and I need to be able to restore a backup on another machine within minutes of any critical failure. I believe the above solution will work for file-based VBD's, albeit not the best or fastest solution around.

Any feedback would be duly appreciated.

--
Kenneth Kalmer
kenneth.kalmer@gmail.com

Folding@home stats
http://fah-web.stanford.edu/cgi-bin/main.py?qtype=userpage&username=kenneth%2Ekalmer

_______________________________________________
Xen-users mailing list
Xen-users@lists.xensource.com
http://lists.xensource.com/xen-users
Roger Lucas
2006-Oct-17 23:11 UTC
RE: [Xen-users] Yet another backup proposal (file based vbd's & lvm)
> From what I've gathered in the list archives and google, LVM snapshots
> are frowned upon, rdiff-backup seems to get decent remarks, and
> dd-like backups are too slow and space consuming.

?? LVM frowned upon ??

Can you clarify this? We use LVM here with Xen-3.0.2 and it works well. There are a lot of notes and discussions about using LVM as a block device for the DomU's. AFAIK, Xen+LVM is pretty widely used.

> What I'd like to attempt to setup (based on the feedback I receive) is
> the following.
>
> 1. Pause the domU using "xm pause"
> 2. Sync the domU using "xm sysrq"
> 3. Use rdiff-backup to make a local backup of the file VBD
> 4. Resume the domU using "xm unpause"

If you don't exclude LVM and instead use it to provide LVs as VBDs for the DomUs, then an alternative sequence of events would allow you to resume the DomU a lot faster:

1. Pause the domU using "xm pause"
2. Sync the domU using "xm sysrq"
3. Use LVM to snapshot the LV providing the VBD to the DomU
4. Resume the domU using "xm unpause"
5. Use rdiff-backup to make a local backup of the snapshot LV
6. Remove the snapshot

Moving the rdiff-backup from stage 3 to stage 5 will allow you to reduce the time between the pause and unpause actions. Running rdiff-backup on (for example) a 10GB block device image will still take some time even on a fast system. With 200MB/sec disk access, just scanning the block device for data changes will take more than 100 seconds: 50 secs to read the old backup image and 50 secs to read the new file VBD. You probably wouldn't want to freeze a running server for this long. In comparison, taking a snapshot of an LV should take less than a second, after which you can resume the server again.

BR,

Roger
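Roger's timing estimate can be reproduced with a few lines of shell arithmetic. This is only a sketch of the calculation in the message: it assumes "10GB" means 10,000 MB and that both reads run sequentially at the quoted 200MB/sec. The function name `scan_seconds` is made up for illustration.

```shell
#!/bin/bash
# scan_seconds SIZE_MB RATE_MB_PER_SEC
# Seconds needed for one sequential pass over an image, rounded up.
scan_seconds() {
    local size_mb=$1 rate_mb_s=$2
    # Integer ceiling division: a partial second counts as a full one.
    echo $(( (size_mb + rate_mb_s - 1) / rate_mb_s ))
}

image_mb=10000   # "10GB" read as 10,000 MB (assumption)
rate=200         # MB/sec, the figure quoted in the message

old=$(scan_seconds "$image_mb" "$rate")   # pass over the old backup image
new=$(scan_seconds "$image_mb" "$rate")   # pass over the current VBD
echo "old backup: ${old}s, current VBD: ${new}s, total: $(( old + new ))s"
```

The point of the estimate is that this whole duration falls inside the pause window when rdiff-backup runs against the live file VBD, whereas a snapshot moves it outside the window.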
Luke Crawford
2006-Oct-17 23:22 UTC
RE: [Xen-users] Yet another backup proposal (file based vbd's & lvm)
On Wed, 18 Oct 2006, Roger Lucas wrote:
>> From what I've gathered in the list archives and google, LVM snapshots
>> are frowned upon, rdiff-backup seems to get decent remarks, and
>> dd-like backups are too slow and space consuming.
>
> ?? LVM frowned upon ??

LVM is not frowned upon. LVM used as a way to name and manage a bunch of partitions is quite stable. Using lvm2 writable snapshots to create CoW disks is rather iffy (which I bet is what the original poster was talking about).

I have not heard of recent problems using _read only_ lvm snapshots to take backups, so that part is relatively safe.

Note: there is a big difference between read-only and read-write snapshots. Read-write snapshots are much more complex to implement.
Tim Post
2006-Oct-18 04:30 UTC
Re: [Xen-users] Yet another backup proposal (file based vbd's & lvm)
On Wed, 2006-10-18 at 00:32 +0200, Kenneth Kalmer wrote:
> Greetings list

Hello :)

> This is yet another request for feedback on a proposed backup solution
> for a xen environment, and I'll appreciate any and all scrutiny of what I
> want to attempt.

Be careful what you ask for :P

> What I'd like to attempt to setup (based on the feedback I receive) is
> the following.
>
> 1. Pause the domU using "xm pause"
> 2. Sync the domU using "xm sysrq"
> 3. Use rdiff-backup to make a local backup of the file VBD
> 4. Resume the domU using "xm unpause"

I'm going to ask the same 'why not use lvm' question here, and toss in the idea of replacing NFS with AoE (ATA over Ethernet). LVM read-only snapshots are rather safe.

If you don't want to use LVM, set up ocfs2 and join dom-0 and the NAS to the same cluster.

Format a big partition on the NAS (ocfs2), and create your loops there as you would normally with ext3 file systems. Export it as 0 0 and all of your xen nodes (dom-0) have access.

Mount the big partition via AoE on dom-0, get the loop and boot it, then use the sysrq method to sync and rdiff to backup as you said. Doing this, however, is really only emulating what an lvm/clvm snapshot would do. Still, I admire your drive toward simplicity. I have no love for LVM either, but I recognize its usefulness in this type of setting, despite bad past experiences.

This would of course work with NFS, but NFS will be a bottleneck.

> After this I can rsync the backup directory to another server while
> the domU continues to run. This is based almost entirely on the
> xen-server-tools backup script by Christian Wieke from
> xmlvalidation.com.

Again, use AoE. Faster, easier ..

> We have all the domU's on the NFS to ease migration in case of
> hardware failures, and I need to be able to restore a backup on
> another machine within minutes of any critical failure. I believe the
> above solution will work for file-based VBD's, albeit not the best or
> fastest solution around.

I think with a better network fs and medium (AoE and ocfs2) you'd get the performance increase you want and simplify things, while removing the hassle of NFS. Migration would also be very easy. Remember, loops can be exported as block devices via AoE too :)

I'm still going to officially recommend LVM + AoE as it solves all of your problems, but I completely understand a reluctance to use it and the need for something a little different.

> Any feedback would be duly appreciated.

Best,
-Tim
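For readers unfamiliar with AoE, the export-and-attach flow Tim describes might look roughly like the plan printed below. Tim does not name any tools, so this sketch assumes the `vblade` userspace target on the storage server and the Linux `aoe` module plus `aoe-discover` from aoetools on each dom0. Reading "0 0" as shelf 0, slot 0 is also an assumption, and the interface `eth0`, the partition `/dev/sdb1`, and the mount point `/mnt/xen-images` are hypothetical. The script only prints commands; it runs none of them.

```shell
#!/bin/bash
# aoe_device_path SHELF SLOT
# Path under which the Linux aoe driver exposes a discovered target.
aoe_device_path() {
    echo "/dev/etherd/e${1}.${2}"
}

# aoe_plan SHELF SLOT NIC BACKING
# Prints the commands for each side of the setup; nothing is executed.
aoe_plan() {
    local shelf=$1 slot=$2 nic=$3 backing=$4
    echo "# on the storage server (vblade, assumed):"
    echo "vblade $shelf $slot $nic $backing"
    echo "# on each dom0 (aoe module and aoetools, assumed):"
    echo "modprobe aoe"
    echo "aoe-discover"
    echo "mount -t ocfs2 $(aoe_device_path "$shelf" "$slot") /mnt/xen-images"
}

# Hypothetical values: shelf 0, slot 0, NIC eth0, partition /dev/sdb1.
aoe_plan 0 0 eth0 /dev/sdb1
```

Mounting the same ocfs2 volume on several dom0's also requires the ocfs2 cluster services to be configured and running on every node, which this plan leaves out.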
Kenneth Kalmer
2006-Oct-18 05:33 UTC
Re: [Xen-users] Yet another backup proposal (file based vbd's & lvm)
On 10/18/06, Tim Post <tim.post@netkinetics.net> wrote:
> On Wed, 2006-10-18 at 00:32 +0200, Kenneth Kalmer wrote:
> > Greetings list
>
> Hello :)

Hi Tim, and Roger and Luke as well

> > This is yet another request for feedback on a proposed backup solution
> > for a xen environment, and I'll appreciate any and all scrutiny of what I
> > want to attempt.
>
> Be careful what you ask for :P

I love a good sense of humour :)

> > What I'd like to attempt to setup (based on the feedback I receive) is
> > the following.
> >
> > 1. Pause the domU using "xm pause"
> > 2. Sync the domU using "xm sysrq"
> > 3. Use rdiff-backup to make a local backup of the file VBD
> > 4. Resume the domU using "xm unpause"
>
> I'm going to ask the same 'why not use lvm' question here, and toss in
> the idea of replacing NFS with AoE. LVM read only snapshots are rather
> safe.

I'll be making the transition from file-backed to LVM as we get more hardware in... But AoE looks promising as well. We used file-backed VBD's to be ready for live migrations in the future...

> If you don't want to use LVM, setup ocfs2 and join dom-0 and the nas to
> the same cluster.
>
> Format a big partition on the nas (ocfs2) , create your loops there as
> you would normally with ext3 file systems. Export it as 0 0 and all of
> your xen nodes (dom-0) have access.
>
> Mount the big partition via AoE on dom-0, get the loop and boot it, then
> use the sysrq method to sync and rdiff to backup as you said. Doing
> this, however is only really basically emulating what a lvm/clvm
> snapshot would do. However I admire your drive toward simplicity, I have
> no love for LVM either, but realize its use in this type of setting,
> despite bad past experiences.
>
> This would of course work with NFS, but nfs will be a bottleneck.
>
> > After this I can rsync the backup directory to another server while
> > the domU continues to run. This is based almost entirely on the
> > xen-server-tools backup script by Christian Wieke from
> > xmlvalidation.com.
>
> Again, use AoE. Faster, easier ..
>
> > We have all the domU's on the NFS to ease migration in case of
> > hardware failures, and I need to be able to restore a backup on
> > another machine within minutes of any critical failure. I believe the
> > above solution will work for file-based VBD's, albeit not the best or
> > fastest solution around.
>
> I think with a better network fs and medium (aoe and ocfs2) you'd get
> the performance increase you want and simplify things, while removing
> the hassle of nfs. Migration would also be very easy. Remember, loops
> can be exported as block devices via AoE too :) I'm still going to
> officially recommend LVM + AoE as it solves all of your problems, but
> completely understand a reluctance to use it and the need for something
> a little different.

I'll definitely look into this setup. I was trying to avoid LVM for two reasons (Roger, this one's for you): the apparent snapshot issue that everyone seems to complain about throughout the list, and the issue of live migrations between hosts. But that only applies to using LVM as a backend for the domU's. I use LVM on the NAS with ext3 for the ability to grow my partitions as needed and resize the ext3 fs on top of the volumes, and it works beautifully.

Luke, I'll set up a test server here using read-only LVM snapshots and keep on backing them up to see if I get the lockups that the previous posts in the archive (and the kernel archives) mention. I'll report my successes here when I'm done.

Thanks all!

--
Kenneth Kalmer
kenneth.kalmer@gmail.com
Roger Lucas
2006-Oct-18 08:36 UTC
RE: [Xen-users] Yet another backup proposal (file based vbd's & lvm)
Hi Ken,

> I'll definitely look into this setup. I was trying to avoid LVM two
> reasons (Roger, this one's for you), the apparent snapshot issue that
> everyone seems to complain about throughout the list and the issue of
> live migrations between hosts. But this is only using LVM as a backend
> for the domU's. I use LVM on the NAS with ext3 for the ability to grow
> my partitions as needed and resize the ext3 fs on top of the volumes,
> and it works beautifully.
>
> Luke, I'll setup a test server here using read-only LVM snapshots and
> keep on backing them up to see if I get the lockups that the previous
> posts in the archive (and the kernel archives) mention. I'll report my
> successes here when I'm done.

If you are going to try LVM, I strongly recommend that you download the latest LVM code from RedHat and recompile it for your system...

http://sources.redhat.com/dm/
http://sourceware.org/lvm2/

As you say, there have been lockup problems with LVM snapshots in the past, but these have been fixed in the recent releases. We are running Xen-3.0.2-2 on this server with the following:

    root@hydra:~# uname -a
    Linux hydra 2.6.16-xen #1 SMP Thu Apr 13 18:46:07 BST 2006 i686 GNU/Linux
    root@hydra:~# lvm version
      LVM version:     2.02.07 (2006-07-17)
      Library version: 1.02.08 (2006-07-17)
      Driver version:  4.5.0
    root@hydra:~#

So far, no LVM or Xen problems at all with an uptime of several months. We use LVM snapshots daily both within the Dom-0 and also within the Dom-Us (which use LVM on top of the VBDs) to take backups.

BR,

Roger
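To compare a dom0's LVM tools against the release Roger reports as working, something like the sketch below could be used. It parses the "LVM version" line in the format shown in his output and compares versions with GNU `sort -V`. Treating 2.02.07 as a threshold is an assumption for illustration only: the message says that release works for him, not that it is a documented minimum. The function names are made up.

```shell
#!/bin/bash
# lvm_version_from_output
# Reads `lvm version` output on stdin and prints the "LVM version" number.
lvm_version_from_output() {
    sed -n 's/^[[:space:]]*LVM version:[[:space:]]*\([0-9][0-9.]*\).*/\1/p' | head -n 1
}

# version_at_least HAVE WANT
# Succeeds when HAVE >= WANT, using GNU sort's version ordering.
version_at_least() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Example against the line quoted in the message (no LVM tools needed):
sample='  LVM version:     2.02.07 (2006-07-17)'
have=$(printf '%s\n' "$sample" | lvm_version_from_output)
if version_at_least "$have" 2.02.07; then
    echo "LVM $have is at least the release reported as working"
fi
```

On a real dom0 the input would come from `lvm version` piped into `lvm_version_from_output` instead of the sample string.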
Kenneth Kalmer
2006-Oct-18 21:36 UTC
Re: [Xen-users] Yet another backup proposal (file based vbd's & lvm)
On 10/18/06, Roger Lucas <roger@planbit.co.uk> wrote:
> Hi Ken,
>
> > I'll definitely look into this setup. I was trying to avoid LVM two
> > reasons (Roger, this one's for you), the apparent snapshot issue that
> > everyone seems to complain about throughout the list and the issue of
> > live migrations between hosts. But this is only using LVM as a backend
> > for the domU's. I use LVM on the NAS with ext3 for the ability to grow
> > my partitions as needed and resize the ext3 fs on top of the volumes,
> > and it works beautifully.
> >
> > Luke, I'll setup a test server here using read-only LVM snapshots and
> > keep on backing them up to see if I get the lockups that the previous
> > posts in the archive (and the kernel archives) mention. I'll report my
> > successes here when I'm done.
>
> If you are going to try LVM, I strongly recommend that you download the
> latest LVM code from RedHat and recompile it for your system...
>
> http://sources.redhat.com/dm/
> http://sourceware.org/lvm2/
>
> As you say, there have been lockup problems with LVM snapshots in the
> past, but these have been fixed in the recent releases. We are running
> Xen-3.0.2-2 on this server with the following:
>
> root@hydra:~# uname -a
> Linux hydra 2.6.16-xen #1 SMP Thu Apr 13 18:46:07 BST 2006 i686 GNU/Linux
> root@hydra:~# lvm version
>   LVM version:     2.02.07 (2006-07-17)
>   Library version: 1.02.08 (2006-07-17)
>   Driver version:  4.5.0
> root@hydra:~#
>
> So far, no LVM or Xen problems at all with an uptime of several months.
> We use LVM snapshots daily both within the Dom-0 and also within the
> Dom-Us (which use LVM on top of the VBDs) to take backups.
>
> BR,
>
> Roger

Thanks Roger, good to have confirmation that this does indeed work as expected! It seems like the best solution. Once we've completed our tests I'll start the tedious process of migrating all our domU's to LVM backends... Sigh

--
Kenneth Kalmer
kenneth.kalmer@gmail.com
Mark Williamson
2006-Oct-21 02:48 UTC
Re: [Xen-users] Yet another backup proposal (file based vbd's & lvm)
On Wednesday 18 October 2006 00:11, Roger Lucas wrote:
> > From what I've gathered in the list archives and google, LVM snapshots
> > are frowned upon, rdiff-backup seems to get decent remarks, and
> > dd-like backups are too slow and space consuming.
>
> ?? LVM frowned upon ??
>
> Can you clarify this - we use LVM here with Xen-3.0.2 and it works well.
> There are a lot of notes and discussions of using LVM as a block device for
> the DomU's. AFAIK, Xen+LVM is pretty widely used.

LVM snapshots are frowned upon because if you keep them running for a long time (e.g. to emulate a copy-on-write block device) they tend to chew up lots of memory, and eventually cause out-of-memory conditions in dom0 - bad!

Using them to snapshot during a backup is probably less of a problem than these long-lived snapshots. Various folks have been looking at alternatives for long-lived snapshots - the blktap approach supports a few, for instance.

Cheers,
Mark

--
Dave: Just a question. What use is a unicycle with no seat? And no pedals!
Mark: To answer a question with a question: What use is a skateboard?
Dave: Skateboards have wheels.
Mark: My wheel has a wheel!
Mark Williamson
2006-Oct-21 03:06 UTC
Re: [Xen-users] Yet another backup proposal (file based vbd's & lvm)
> > What I'd like to attempt to setup (based on the feedback I receive) is
> > the following.
> >
> > 1. Pause the domU using "xm pause"
> > 2. Sync the domU using "xm sysrq"
> > 3. Use rdiff-backup to make a local backup of the file VBD
> > 4. Resume the domU using "xm unpause"
>
> 1. Pause the domU using "xm pause"
> 2. Sync the domU using "xm sysrq"
> 3. Use LVM to snapshot to LV providing the VBD to the DomU
> 4. Resume the domU using "xm unpause"
> 5. Use rdiff-backup to make a local backup of the snapshot LV
> 6. Remove the snapshot

You're both going to have to change the sync-ing stage to come before the pause - otherwise the domain won't be running and therefore won't be able to sync its disks!

Cheers,
Mark
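Putting Roger's snapshot sequence together with Mark's correction (sync first, then pause), the procedure might be scripted roughly as follows. This is a dry-run sketch, not a tested backup tool: by default it only prints the commands. The snapshot name, the 1G copy-on-write size, and the 5-second wait after the sysrq are guesses, and the read-only `--permission r` flag follows Luke's remark that read-only snapshots are the safer kind. The backup step is passed in by the caller and receives the snapshot device path as its last argument.

```shell
#!/bin/bash
# Dry-run by default: commands are printed, not executed, unless DRY_RUN=0.
DRY_RUN=${DRY_RUN:-1}
SYNC_WAIT=${SYNC_WAIT:-5}     # seconds to let the guest finish syncing (a guess)
SNAP_SIZE=${SNAP_SIZE:-1G}    # copy-on-write space for the snapshot (a guess)

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# snapshot_backup DOMU VG LV BACKUP_COMMAND [ARGS...]
# BACKUP_COMMAND is invoked with the snapshot device path appended.
snapshot_backup() {
    if [ "$#" -lt 4 ]; then
        echo "usage: snapshot_backup DOMU VG LV BACKUP_COMMAND [ARGS...]" >&2
        return 2
    fi
    local domu=$1 vg=$2 lv=$3
    shift 3
    local snap="${lv}-snap"   # hypothetical snapshot name
    local rc

    # Sync while the guest is still running; a paused guest cannot act on it.
    run xm sysrq "$domu" s
    run sleep "$SYNC_WAIT"

    run xm pause "$domu" || return 1
    run lvcreate --snapshot --permission r --size "$SNAP_SIZE" \
        --name "$snap" "/dev/${vg}/${lv}"
    rc=$?
    # Always resume the guest, even if the snapshot could not be created.
    run xm unpause "$domu"
    [ "$rc" -eq 0 ] || return "$rc"

    # The slow part runs against the snapshot while the guest is live again.
    run "$@" "/dev/${vg}/${snap}"
    rc=$?
    run lvremove -f "/dev/${vg}/${snap}"
    return "$rc"
}
```

For example, `snapshot_backup web1 vg0 web1-disk backup-image` (all four names hypothetical) prints the seven commands in order. The backup command is left to the caller because, as far as I know, rdiff-backup works on directory trees, so the snapshot would most likely have to be mounted read-only before rdiff-backup can read its contents.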