Yes, but the biggest issue is how to recover. You'll need to recover the
whole storage, not a single snapshot, and that can last for days.

On 23 Mar 2017 at 9:24 PM, "Alvin Starr" <alvin at netvel.net> wrote:

> For volume backups you need something like snapshots.
>
> If you take a snapshot A of a live volume L, that snapshot stays at that
> moment in time, and you can rsync it to another system or use something
> like deltacp.pl to copy it.
>
> The usual process is to delete the snapshot once it's copied and then
> repeat the process when the next backup is required.
>
> That process does require rsync/deltacp to read the complete volume on
> both systems, which can take a long time.
>
> I was kicking around an idea to try to handle snapshot deltas better.
>
> The idea is that you could take your initial snapshot A and then sync
> that snapshot to your backup system.
>
> At a later point you could take another snapshot B.
>
> Because snapshots contain copies of the original data at the time of the
> snapshot, and unmodified data points to the live volume, it is possible
> to tell which blocks of data have changed since the snapshot was taken.
>
> Now that you have a second snapshot, you can in essence perform a diff on
> the A and B snapshots to get only the blocks that changed up to the time
> that B was taken.
>
> These blocks could be copied to the backup image, and you should have a
> clone of the B snapshot.
>
> You would not have to read the whole volume image, just the changed
> blocks, dramatically improving the speed of the backup.
>
> At this point you can delete the A snapshot and promote the B snapshot to
> be the A snapshot for the next backup round.
>
> On 03/23/2017 03:53 PM, Gandalf Corvotempesta wrote:
>
> Are backups consistent?
> What happens if the header on shard0 is synced while referring to some
> data on shard450, and by the time rsync processes shard450 that data has
> been changed by subsequent writes?
>
> The header would be backed up out of sync with the rest of the image.
>
> On 23 Mar 2017 at 8:48 PM, "Joe Julian" <joe at julianfamily.org> wrote:
>
>> The rsync protocol only passes blocks that have actually changed. Raw
>> changes fewer bits. You're right, though, that it still has to check the
>> entire file for those changes.
>>
>> On 03/23/17 12:47, Gandalf Corvotempesta wrote:
>>
>> Raw or qcow doesn't change anything about the backup.
>> Georep always has to sync the whole file.
>>
>> Additionally, raw images have far fewer features than qcow.
>>
>> On 23 Mar 2017 at 8:40 PM, "Joe Julian" <joe at julianfamily.org> wrote:
>>
>>> I always use raw images. And yes, sharding would also be good.
>>>
>>> On 03/23/17 12:36, Gandalf Corvotempesta wrote:
>>>
>>> Georep exposes another problem:
>>> when using Gluster as storage for VMs, the VM file is saved as qcow.
>>> Changes happen inside the qcow, so rsync has to sync the whole file
>>> every time.
>>>
>>> A small workaround would be sharding, as rsync then has to sync only
>>> the changed shards, but I don't think this is a good solution.
>>>
>>> On 23 Mar 2017 at 8:33 PM, "Joe Julian" <joe at julianfamily.org> wrote:
>>>
>>>> In many cases, a full backup set is just not feasible. Georep to the
>>>> same or a different DC may be an option if the bandwidth can keep up
>>>> with the change set. If not, maybe break the data up into smaller,
>>>> more manageable volumes where you only keep a smaller set of critical
>>>> data and just back that up. Perhaps an object store (Swift?) might
>>>> handle fault tolerance and distribution better for some workloads.
>>>>
>>>> There's no one right answer.
>>>>
>>>> On 03/23/17 12:23, Gandalf Corvotempesta wrote:
>>>>
>>>> Backing up from inside each VM doesn't solve the problem.
>>>> If you have to back up 500 VMs you need more than one day, and what if
>>>> you have to restore the whole Gluster storage?
>>>>
>>>> How many days do you need to restore 1PB?
>>>>
>>>> Probably the only solution is a georep in the same datacenter/rack
>>>> with a similar cluster, ready to become the master storage.
>>>> In that case you don't need to restore anything, as the data is
>>>> already there, only a little bit back in time, but this doubles the
>>>> TCO.
>>>>
>>>> On 23 Mar 2017 at 6:39 PM, "Serkan Çoban" <cobanserkan at gmail.com>
>>>> wrote:
>>>>
>>>>> Assuming a backup window of 12 hours, you need to send data at 25GB/s
>>>>> to the backup solution.
>>>>> Using 10G Ethernet on the hosts, you need at least 25 hosts to handle
>>>>> 25GB/s.
>>>>> You can create an EC Gluster cluster that can handle these rates, or
>>>>> you can just back up the valuable data from inside the VMs using
>>>>> open-source backup tools like borg, attic, restic, etc.
>>>>>
>>>>> On Thu, Mar 23, 2017 at 7:48 PM, Gandalf Corvotempesta
>>>>> <gandalf.corvotempesta at gmail.com> wrote:
>>>>> > Let's assume a 1PB storage full of VM images, with each brick on
>>>>> > ZFS, replica 3, sharding enabled.
>>>>> >
>>>>> > How do you back up/restore that amount of data?
>>>>> >
>>>>> > Backing up daily is impossible: you'll never finish one backup
>>>>> > before the next one is due to start (in other words, you need more
>>>>> > than 24 hours).
>>>>> >
>>>>> > Restoring is even worse. You need more than 24 hours with the whole
>>>>> > cluster down.
>>>>> >
>>>>> > You can't rely on ZFS snapshots due to sharding (the snapshot taken
>>>>> > from one node is useless without all the other nodes holding the
>>>>> > related shards), and you still have the same restore speed.
>>>>> >
>>>>> > How do you back this up?
>>>>> >
>>>>> > Even georep isn't enough if you have to restore the whole storage
>>>>> > in case of disaster.
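A minimal sketch of the snapshot-delta idea Alvin describes above, written
as a hypothetical Python script: once snapshot A has already been mirrored
to the backup image, only the chunks where snapshot B differs need to be
written. The device paths and chunk size are illustrative, and this naive
chunk-by-chunk comparison still reads both snapshots end to end; the real
gain Alvin is after would come from taking the changed-block list out of
the snapshot's copy-on-write metadata instead.

    # Hypothetical sketch: apply only the chunks that differ between two
    # snapshots of the same volume to an existing backup image.
    # Paths and chunk size are examples, not a real deployment.

    CHUNK = 4 * 1024 * 1024  # compare in 4 MiB windows

    def sync_delta(snap_a, snap_b, backup_image):
        """Copy chunks that changed between snapshots A and B into the backup."""
        copied = 0
        with open(snap_a, "rb") as a, open(snap_b, "rb") as b, \
             open(backup_image, "r+b") as out:
            offset = 0
            while True:
                old = a.read(CHUNK)
                new = b.read(CHUNK)
                if not new:
                    break                    # reached the end of snapshot B
                if old != new:
                    out.seek(offset)         # rewrite only the changed chunk
                    out.write(new)
                    copied += len(new)
                offset += len(new)
        return copied

    if __name__ == "__main__":
        n = sync_delta("/dev/vg0/vol-snap-a", "/dev/vg0/vol-snap-b",
                       "/backup/vol.img")
        print("copied %d changed bytes" % n)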
That's true, and it can last much longer than days. I have a client with
some data sets that take months to copy, and they are not the biggest data
user in the world.

The biggest problem with backups is that some day you may need to restore
them.

On 03/23/2017 04:29 PM, Gandalf Corvotempesta wrote:
> Yes, but the biggest issue is how to recover. You'll need to recover the
> whole storage, not a single snapshot, and that can last for days.

--
Alvin Starr           ||   voice: (905)513-7688
Netvel Inc.           ||   Cell:  (416)806-0133
alvin at netvel.net      ||
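To put numbers on how long a full backup or restore takes, here is the
back-of-the-envelope arithmetic behind Serkan's estimate earlier in the
thread, as a small Python snippet. The 1PB size, the 12-hour window and the
10GbE links come from the thread; the ~1.25 GB/s of usable throughput per
link is an assumed round figure.

    volume_bytes = 10**15            # 1 PB
    window_seconds = 12 * 3600       # 12-hour backup window
    per_host_bytes_s = 1.25e9        # assumed usable rate of one 10GbE link

    required_rate = volume_bytes / window_seconds
    hosts_needed = required_rate / per_host_bytes_s

    print("aggregate rate: %.1f GB/s" % (required_rate / 1e9))  # ~23.1 GB/s
    print("10GbE hosts needed: %.0f" % hosts_needed)            # ~19, before overhead

Serkan's "25GB/s and at least 25 hosts" is the same calculation with some
headroom added, and a restore has to move the same volume of data back, so
the window math applies in both directions.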
Don't snapshot the entire Gluster volume; keep a rolling routine for
snapshotting the individual VMs and rsync those. As already mentioned, you
need to "itemize" the backups - trying to manage backups for the whole
volume as a single unit is just crazy! Also, for long-term backups,
maintaining just the core data of each VM is far more manageable.

I settled on oVirt for our platform, and do the following (sketched in
rough form after the quoted message below):

- A cronjob regularly snapshots & clones each VM, whose image is then
  rsynced to our backup storage;
- The backup server snapshots the VM's *image* backup volume to maintain
  history/versioning;
- These full images are only maintained for 30 days, for DR purposes;
- A separate routine rsyncs the VM's core data to its own *data* backup
  volume, which is snapshotted & maintained for 10 years;
- This could be made more efficient by using guestfish to extract the core
  data from the backup image, instead of basically rsyncing the data
  across the network twice.

The *active* storage layer uses Gluster on top of XFS & LVM. The *backup*
storage layer is a mirrored storage unit running ZFS on FreeNAS.

This of course doesn't allow for HA in the case of the entire cloud
failing. For that we'd use geo-rep & a big fat pipe.

D

On 23 March 2017 at 16:29, Gandalf Corvotempesta
<gandalf.corvotempesta at gmail.com> wrote:
> Yes, but the biggest issue is how to recover. You'll need to recover the
> whole storage, not a single snapshot, and that can last for days.
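A rough sketch, in hypothetical Python, of the per-VM rolling image backup
D describes: rsync each cloned VM image into its own ZFS dataset on the
backup server, snapshot that dataset for versioning, and prune snapshots
older than 30 days. The VM names, export path and dataset prefix are
illustrative, and the oVirt snapshot/clone step itself is not shown.

    import subprocess
    from datetime import datetime, timedelta

    VMS = ["vm01", "vm02"]            # hypothetical VM names
    DATASET_PREFIX = "backup/images"  # hypothetical ZFS dataset prefix
    KEEP_DAYS = 30                    # full images kept for 30 days (DR)

    def backup_vm(vm):
        src = "/export/%s.img" % vm              # image cloned from the VM snapshot
        dst = "/%s/%s/" % (DATASET_PREFIX, vm)   # dataset mountpoint on the backup box
        subprocess.run(["rsync", "-a", "--inplace", src, dst], check=True)
        stamp = datetime.now().strftime("%Y%m%d-%H%M")
        subprocess.run(["zfs", "snapshot",
                        "%s/%s@%s" % (DATASET_PREFIX, vm, stamp)], check=True)

    def prune_vm(vm):
        cutoff = datetime.now() - timedelta(days=KEEP_DAYS)
        names = subprocess.run(
            ["zfs", "list", "-H", "-t", "snapshot", "-o", "name",
             "-r", "%s/%s" % (DATASET_PREFIX, vm)],
            capture_output=True, text=True, check=True).stdout.splitlines()
        for name in names:
            stamp = name.rsplit("@", 1)[-1]
            try:
                taken = datetime.strptime(stamp, "%Y%m%d-%H%M")
            except ValueError:
                continue                         # not one of our snapshots
            if taken < cutoff:
                subprocess.run(["zfs", "destroy", name], check=True)

    if __name__ == "__main__":
        for vm in VMS:
            backup_vm(vm)
            prune_vm(vm)

The separate long-term "core data" rsync would follow the same pattern with
a 10-year retention on its own dataset instead of 30 days.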