Hi,

I have the following filesystem:

Filesystem  Size  Used  Avail  Capacity  iused   ifree      %iused
/dev/da0a   1.3T  422G  823G   34%       565952  182833470  0%

It runs off a 3ware 9550, on a dual core Opteron 242 with 1GB.
The system is used as an SMB/NFS server for my other systems here.

I would like to make weekly snapshots, but manually running mksnap_ffs
freezes access to the disk (I sort of expected that) and the process
never terminates. I let it sit overnight, but looking at gstat did not
reveal any activity whatsoever...
The disk was not released, and mksnap_ffs could not be terminated.
In the end I had to reboot the system.

So:
 - How long should I expect making a snapshot to take:
   5, 15, 30min, 1, 2 hours or even more???
 - How do I diagnose the reason why it is not terminating?

--WjW
Willem Jan Withagen wrote:
> I would like to make weekly snapshots, but manually running mksnap_ffs
> freezes access to the disk (I sort of expected that) but the process
> never terminates.
>
> So:
> - How long should I expect making a snapshot to take:
>   5, 15, 30min, 1, 2 hour or even more???

For a point of reference, I have two 300GB Serial ATA disks in a RAID1
config that I take daily snapshots of. df info:

Filesystem   1K-blocks  Used       Avail      Capacity  Mounted on
/dev/ar0s1d  283810134  160945668  117188264  58%       /r1

As of last night, this snapshot took 18m59.77s to complete.

-Proto
On Tue, Jan 02, 2007 at 09:06:24PM +0100, Willem Jan Withagen wrote:
> So:
> - How long should I expect making a snapshot to take:
>   5, 15, 30min, 1, 2 hour or even more???
> - How do I diagnose the reason why it is not terminating?

You forgot to mention which revision of FreeBSD you are running, and
whether you are using quotas or anything else on the filesystem that
could impact this.
Gary Palmer wrote:
> You forgot to mention what revision of FreeBSD you are running, and
> if you are using quotas or anything else on the filesystem that
> could impact this.

Yes, I pressed send somewhat too fast:

[~] wjw@bigsurf> uname -a
FreeBSD bigsurf.digiware.nl 6.2-PRERELEASE FreeBSD 6.2-PRERELEASE #3: Wed Sep 27 15:57:20 CEST 2006 wjw@bigsurf.digiware.nl:/usr/obj/usr/src/sys/BIGSURF amd64

--WjW
On Wed, Jan 03, 2007 at 12:05:26AM +0100, Willem Jan Withagen wrote:
> Yes, I pressed send somewhat too fast:
>
> [~] wjw@bigsurf> uname -a
> FreeBSD bigsurf.digiware.nl 6.2-PRERELEASE FreeBSD 6.2-PRERELEASE #3: Wed
> Sep 27 15:57:20 CEST 2006
> wjw@bigsurf.digiware.nl:/usr/obj/usr/src/sys/BIGSURF amd64

See
http://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/kerneldebug-deadlocks.html
for instructions on how to gather the information needed to debug the
problem.
Willem Jan Withagen wrote:
> So:
> - How long should I expect making a snapshot to take:
>   5, 15, 30min, 1, 2 hour or even more???

This depends on how many cylinder groups you have. If you have a lot of
large files, using "newfs -b 32768" instead of the default settings
would speed up the process drastically. Note that this might be
infeasible because you already have data on the disk. Another
suggestion is to split the volume into smaller slices, which would
reduce the impact.

BTW, our experience with a semi-full 1.3T volume is that the snapshot
takes about 1 hour on FreeBSD 5.x, but I doubt it is really comparable
to your situation, as the hardware is very different.

> - How do I diagnose the reason why it is not terminating?

This might be somewhat complicated. Check out the developers' handbook.

Cheers,
--
Xin LI <delphij@delphij.net> http://www.delphij.net/
FreeBSD - The Power to Serve!
On Tue, Jan 02, 2007 at 09:06:24PM +0100, Willem Jan Withagen wrote:
> So:
> - How long should I expect making a snapshot to take:
>   5, 15, 30min, 1, 2 hour or even more???

Yes :)

Snapshots were not designed for use in this way (they were designed to
support background fsck and allow faster system recovery after power
failure), so they don't scale as well as you might like on large
filesystems.

Kris
Kris Kennaway writes:
| On Tue, Jan 16, 2007 at 10:13:57AM -0800, Doug Ambrisko wrote:
|
| > FWIW, with this patch I find making snap-shots a lot more reliable:
| >
| > --- sys/ufs/ffs/ffs_snapshot.c.orig  Wed Mar 22 09:42:31 2006
| > +++ sys/ufs/ffs/ffs_snapshot.c       Mon Nov 20 14:59:13 2006
| > @@ -282,6 +282,8 @@ restart:
| >                  if (error)
| >                          goto out;
| >                  bawrite(nbp);
| > +                if (cg % 10 == 0)
| > +                        ffs_syncvnode(vp, MNT_WAIT);
| >          }
| >          /*
| >           * Copy all the cylinder group maps. Although the
| > @@ -303,6 +305,8 @@ restart:
| >                          goto out;
| >                  error = cgaccount(cg, vp, nbp, 1);
| >                  bawrite(nbp);
| > +                if (cg % 10 == 0)
| > +                        ffs_syncvnode(vp, MNT_WAIT);
| >                  if (error)
| >                          goto out;
| >          }
| >
| > or things can get wedged. We have some other patches as well that might
| > be required. As a hack on a local server we have been using snap shots
| > to do a "hot" back-up of a data base each morning. This is based on
| > 6.x.
|
| What do you mean by "get wedged"? Are you seeing a deadlock, and if
| so then what are the details? When you say 6.x, do you mean
| up-to-date RELENG_6? There were various snapshot deadlock fixes
| committed over the past year including some in the past few months.

The file-system would come to a stop: processes stuck on bio, snap-shots
not finishing, etc. This was caused by the system running out of usable
buffers. The change forces them to be flushed every so often. This is
independent of locking. 10 might be too aggressive. Some scaling by
nbuf would probably be better.

Doug A.
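For illustration only, the "scale with nbuf" idea could be sketched in
user space roughly like this. It is not part of the patch and not kernel
code. The names snap_flush_interval, snap_count_flushes and
SNAP_FLUSH_DIVISOR are hypothetical, and the 1/8 fraction is an assumed
value, not something measured or taken from the thread. The sketch only
shows how a fixed "cg % 10" could become an interval derived from the
number of buffers, and how many flushes that would imply.

```c
#include <assert.h>

/*
 * Assumption: allow roughly 1/SNAP_FLUSH_DIVISOR of the buffers to be
 * dirtied by cylinder group copies between forced flushes.
 * The value 8 is made up for this sketch.
 */
#define SNAP_FLUSH_DIVISOR 8

/* Number of cylinder groups to process between forced flushes. */
static long
snap_flush_interval(long nbuf)
{
	long interval;

	assert(nbuf > 0);
	interval = nbuf / SNAP_FLUSH_DIVISOR;
	/* Never go below 1, so the modulo below stays well defined. */
	return (interval < 1 ? 1 : interval);
}

/*
 * Count the flushes issued while walking ncg cylinder groups, using
 * the same "cg % interval == 0" test as the patch uses with 10.
 */
static long
snap_count_flushes(long ncg, long nbuf)
{
	long interval = snap_flush_interval(nbuf);
	long flushes = 0;
	long cg;

	for (cg = 0; cg < ncg; cg++) {
		if (cg % interval == 0)
			flushes++;
	}
	return (flushes);
}
```

With these assumed numbers, snap_flush_interval(80) gives an interval
of 10, which matches the constant in the patch, and a machine with more
buffers would flush less often.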