Hi folks,

How can I guarantee consistency for ZFS snapshots? If I am running a db or any other app on my ZFS and want to take a snapshot, is there any filesystem-level command to quiesce ZFS before taking the snapshot, or do I have to rely on the app itself? Can I do something like lockfs? If I take snapshots on the storage, how can I guarantee consistency of those snapshots? Are there any methods to quiesce the FS, after which I can take snapshots on the storage?

Thanks for any input.

This message posted from opensolaris.org
ganesh wrote:
> Hi folks,
> How can I guarantee consistency for ZFS snapshots?
> If I am running a db or any other app on my ZFS and want to take a
> snapshot, is there any filesystem-level command to quiesce ZFS before
> taking the snapshot, or do I have to rely on the app itself?

You almost always have to quiesce the app in order to flush its buffers.

> Can I do something like lockfs? If I take snapshots on the storage,
> how can I guarantee consistency of those snapshots? Are there any
> methods to quiesce the FS, after which I can take snapshots on the
> storage?

zfs snapshot

-- richard
> How can I guarantee consistency for ZFS snapshots?

Filesystem consistency or application/data consistency?

> If I am running a db or any other app on my ZFS and want to take a
> snapshot, is there any filesystem-level command to quiesce ZFS before
> taking the snapshot, or do I have to rely on the app itself?

Because ZFS is taking the snapshot, it is able to guarantee filesystem consistency. However, it cannot speak to the data or application contents. You have to do that, and ensure the application has a consistent on-disk image at the time of the snapshot. This is the same as any other snapshot or copy technique would require.

> Can I do something like lockfs? If I take snapshots on the storage,
> how can I guarantee consistency of those snapshots? Are there any
> methods to quiesce the FS, after which I can take snapshots on the
> storage?

At the filesystem level, that's all taken care of.

--
Darren Dunham    ddunham at taos.com
Senior Technical Consultant    TAOS    http://www.taos.com/
Got some Dr Pepper?    San Francisco, CA bay area
< This line left intentionally blank to confuse you. >
Both levels, application and filesystem. If I put the database in hot backup mode, then I will have to ensure that the filesystem is consistent as well. So, are you saying that taking a ZFS snapshot is the only method to guarantee consistency in the filesystem, since it flushes all the buffers to the filesystem?

Just curious, is there any manual way of telling ZFS to flush the buffers after quiescing the db, other than taking a ZFS snapshot?
> If I put the database in hot backup mode, then I will have to ensure
> that the filesystem is consistent as well. So, are you saying that
> taking a ZFS snapshot is the only method to guarantee consistency in
> the filesystem, since it flushes all the buffers to the filesystem?

The ZFS filesystem is always consistent on disk. By taking a snapshot, you simply make a consistent copy of the filesystem available.

Flushing buffers would be a way of making sure all the writes have made it to storage. That's a different statement than consistency.

> Just curious, is there any manual way of telling ZFS to flush the
> buffers after quiescing the db, other than taking a ZFS snapshot?

The usual methods of doing this on a filesystem are to run sync, or call fsync(), but that's not anything specific to ZFS.

If you're not taking a snapshot, why would you want ZFS to flush the buffers?

-- Darren Dunham
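As a minimal sketch of the fsync() route mentioned above (nothing here is ZFS-specific), assuming Python 3 on a POSIX host: the helper name `durable_write` and the file name `important.dat` are hypothetical, chosen only for illustration. `os.fsync()` flushes one file descriptor, while `os.sync()` is the programmatic counterpart of running the sync command.

```python
import os
import tempfile


def durable_write(path: str, data: bytes) -> None:
    """Write data to path and block until the OS reports it has been flushed."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        view = memoryview(data)
        while view:                      # os.write may write fewer bytes than asked
            written = os.write(fd, view)
            view = view[written:]
        os.fsync(fd)                     # flush this one file to stable storage
    finally:
        os.close(fd)


if __name__ == "__main__":
    demo_dir = tempfile.mkdtemp()
    demo_path = os.path.join(demo_dir, "important.dat")   # hypothetical file name
    durable_write(demo_path, b"committed record\n")
    if hasattr(os, "sync"):
        os.sync()                        # system-wide flush, like the sync command
```

Note that this only addresses durability ("did the writes reach storage?"); whether the bytes written form a consistent application state is still up to the application.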
> > If I put the database in hot backup mode, then I will have to ensure
> > that the filesystem is consistent as well. So, are you saying that
> > taking a ZFS snapshot is the only method to guarantee consistency in
> > the filesystem, since it flushes all the buffers to the filesystem?
>
> The ZFS filesystem is always consistent on disk. By taking a snapshot,
> you simply make a consistent copy of the filesystem available.
>
> Flushing buffers would be a way of making sure all the writes have made
> it to storage. That's a different statement than consistency.
>
> > Just curious, is there any manual way of telling ZFS to flush the
> > buffers after quiescing the db, other than taking a ZFS snapshot?
>
> The usual methods of doing this on a filesystem are to run sync, or call
> fsync(), but that's not anything specific to ZFS.
>
> If you're not taking a snapshot, why would you want ZFS to flush the
> buffers?

Maybe just as a way to confirm that some really important data or change made it to the storage?

Ivan.
> > > Just curious, is there any manual way of telling ZFS to flush the
> > > buffers after quiescing the db, other than taking a ZFS snapshot?
> >
> > If you're not taking a snapshot, why would you want ZFS to flush the
> > buffers?
>
> Maybe just as a way to confirm that some really important data or
> change made it to the storage?

Could be. There are lots of reasons to. But because of the discussion about snapshots, I was curious about what the OP had in mind.

-- Darren Dunham
Thanks Darren. Let me try to put this in points:

1. ZFS atomic operation that commits data.
2. Writes come into the app.
3. The db is put in hot backup mode.
4. Snapshot taken on storage.
5. ZFS atomic operation that commits data.

So if I do a snap restore, ZFS might revert to point 1, but from the db perspective it is inconsistent and we would need to do a recovery. Correct?

Let's say that between 3 and 4 I manually wanted to do what ZFS does in point 5. Would just sync do the job here? Taking a ZFS snapshot might take a snap at point 1, which might not work from the db perspective. Am I getting this right?

TIA
> Thanks Darren. Let me try to put this in points:
>
> 1. ZFS atomic operation that commits data.
> 2. Writes come into the app.
> 3. The db is put in hot backup mode.
> 4. Snapshot taken on storage.
> 5. ZFS atomic operation that commits data.
>
> So if I do a snap restore, ZFS might revert to point 1, but from the db
> perspective it is inconsistent and we would need to do a recovery.
> Correct?

Right. So you'll want to synchronize your snapshots with database consistency. Just like doing backups.

> Let's say that between 3 and 4 I manually wanted to do what ZFS does in
> point 5. Would just sync do the job here?

It should. Under normal operation, the disk is updated every 10 seconds, or sooner if the cache is filled due to write operations.

> Taking a ZFS snapshot might take a snap at point 1, which might not
> work from the db perspective.

Right. I just wanted to know why you want to sync the disk yourself if you weren't taking a snapshot. If this is important for the database, then it should be doing it itself before it reports that the data has been written.

-- Darren Dunham
>> 1. ZFS atomic operation that commits data.
>> 2. Writes come into the app.
>> 3. The db is put in hot backup mode.
>> 4. Snapshot taken on storage.
>> 5. ZFS atomic operation that commits data.
>>
>> So if I do a snap restore, ZFS might revert to point 1, but from the db
>> perspective it is inconsistent and we would need to do a recovery.
>> Correct?
>
> Right. So you'll want to synchronize your snapshots with database
> consistency. Just like doing backups.

I have gotten the feeling that everyone is misunderstanding everyone else in this thread ;)

My understanding is that a zfs snapshot that can be proven to have happened subsequent to a particular write() (or link(), etc.) is guaranteed to contain the data that was written. Anything else would massively decrease the usefulness of snapshots. Is this correct? If it is not, feel free to ignore the remainder of this e-mail.

If it is, then I don't see why the filesystem would be reverted to (1). It should in fact be guaranteed to revert to (4) (unless the creation of the snapshot is itself not guaranteed to be persistent without an explicit global "sync" by the administrator, but I doubt this is the case?).

Regardless of the details of snapshots, I think the point that needs making to the OP is that, regardless of filesystem issues, the data as written to that filesystem by the application must always be consistent from the perspective of the application, and that a snapshot just gives you a snapshot of a filesystem for which any read will return whatever it would have returned exactly at the point of the snapshot. If the application has not written the data, it will not be part of the snapshot. Thus if the application has writes pending that are needed for consistency, those writes must complete prior to snapshotting.
The syncing, which I assume refers to fsync() and/or the "sync" command, is about ensuring that the view of the filesystem (or usually a subset of it) as seen by applications is actually committed to persistent storage. This is done either to guarantee that some application-level data is committed and will remain in the face of a crash (e.g. a banking application does an SQL COMMIT), or as an overkill way of ensuring that some I/O operation B physically happens after some I/O operation A, such that in the event of a crash B will never appear on disk if A does not also appear (e.g. a database maintaining internal transactional consistency).

Now, assuming that snapshots work in the way I assume and ask about above, the use of a zfs snapshot at a point in time when the application has written consistent data to the filesystem is sufficient to guarantee consistency in the event of a crash. Essentially the zfs snapshot can be used to achieve the effect of fsync(), with the added benefit of being able to administratively roll back to the previous version rather than just guaranteeing that there is some consistent state to return to.

(Incidentally, since, according to a post here on the list in response to a related question I had, ZFS already guarantees ordering of writes, there are presumably some pretty significant performance improvements to be had if a database were made aware of this and allowed a weaker form of COMMIT where you drop the persistence requirement but keep the consistency requirement.)

--
/ Peter Schuller

PGP userID: 0xE9758B7D or 'Peter Schuller <peter.schuller at infidyne.com>'
Key retrieval: Send an E-Mail to getpgpkey at scode.org
E-Mail: peter.schuller at infidyne.com Web: http://www.scode.org
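The "operation B never appears on disk unless operation A also appears" use of fsync() described above can be sketched roughly as follows. This is an illustration under assumptions, not how any particular database does it: the function names and the log/data file split are hypothetical, and the sketch relies only on fsync() as a barrier between the two writes.

```python
import os


def append_durably(path: str, record: bytes) -> None:
    """Append record to path and fsync before returning."""
    with open(path, "ab") as f:
        f.write(record)
        f.flush()                # push Python's userspace buffer to the OS
        os.fsync(f.fileno())     # ask the OS to push it to stable storage


def ordered_update(log_path: str, data_path: str, record: bytes) -> None:
    """Make operation A (the log append) durable before issuing operation B."""
    append_durably(log_path, record)    # A: must be on disk first
    append_durably(data_path, record)   # B: only issued once A is durable
```

The first fsync() is the "overkill" part: it forces persistence of A when all the application really needs is that B is never persisted ahead of A.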
> >> 1. ZFS atomic operation that commits data.
> >> 2. Writes come into the app.
> >> 3. The db is put in hot backup mode.
> >> 4. Snapshot taken on storage.
> >> 5. ZFS atomic operation that commits data.
> >>
> >> So if I do a snap restore, ZFS might revert to point 1, but from the db
> >> perspective it is inconsistent and we would need to do a recovery.
> >> Correct?
> >
> > Right. So you'll want to synchronize your snapshots with database
> > consistency. Just like doing backups.
>
> I have gotten the feeling that everyone is misunderstanding everyone
> else in this thread ;)

I don't know that everyone is misunderstanding, but I did make a blunder with my "Right". You are correct that the snap restore should have no reason to revert to point 1. The snapshot at point 4 would be certain to commit data, just as points 1 and 5 do.

-- Darren Dunham
Thanks Darren, so a sync should do the job for me in that case. How about locking the FS so that I don't miss any new writes further on? Is there anything similar to lockfs?
Darren Dunham
2007-Jun-08 21:18 UTC
[zfs-discuss] Re: Re: Re: Re: ZFS consistency guarantee
> Thanks Darren, so a sync should do the job for me in that case. How
> about locking the FS so that I don't miss any new writes further on?

I'm not sure I understand what you might miss here. Normally you'd ask your application to make itself consistent, take a snapshot, then when the snapshot was complete you'd notify the application that it was free to do whatever it did normally. What would being able to lock the FS accomplish for you?

> Is there anything similar to lockfs?

Not that I'm aware of, but I haven't looked.

-- Darren Dunham
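The "make the application consistent, take the snapshot, then release the application" sequence described above could be scripted along these lines. This is a sketch under assumptions: the `quiesce` and `resume` callables stand in for whatever the application offers (for example, entering and leaving a database's hot backup mode) and are hypothetical, as are the function names. The only real command used is `zfs snapshot <dataset>@<name>`.

```python
import subprocess
from contextlib import contextmanager
from typing import Callable, Iterator, Sequence


def run_checked(argv: Sequence[str]) -> None:
    """Run a command and raise CalledProcessError if it exits non-zero."""
    subprocess.run(list(argv), check=True)


@contextmanager
def quiesced(quiesce: Callable[[], None], resume: Callable[[], None]) -> Iterator[None]:
    """Hold the application in a consistent state for the duration of the block."""
    quiesce()          # hypothetical hook, e.g. enter hot backup mode
    try:
        yield
    finally:
        resume()       # always release the application, even if the snapshot fails


def consistent_snapshot(
    dataset: str,
    name: str,
    quiesce: Callable[[], None],
    resume: Callable[[], None],
    run: Callable[[Sequence[str]], None] = run_checked,
) -> None:
    """Quiesce the application, run `zfs snapshot dataset@name`, then resume."""
    with quiesced(quiesce, resume):
        run(["zfs", "snapshot", f"{dataset}@{name}"])
```

A call might look like `consistent_snapshot("tank/db", "nightly", begin_backup, end_backup)`, where the dataset name and both hooks are placeholders. The `run` parameter exists only so the ordering can be exercised without a ZFS pool.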
Richard L. Hamilton
2007-Jun-09 10:14 UTC
[zfs-discuss] Re: Re: Re: Re: Re: ZFS consistency guarantee
I wish there were a uniform way whereby applications could register their ability to achieve or release consistency on demand, and if registered, could also communicate back that they had either achieved consistency on-disk, or were unable to do so. That would allow backup procedures to automatically talk to apps capable of such functions, to get them to a known state on-disk before taking a snapshot.

That would allow one to, for example, not stop a DBMS, but simply have it seem to pause for a moment while achieving consistency and until told that the snapshot was complete, thus providing minimum impact while still having fully usable backups (and without needing to do the database backups _through_ the DBMS).

Something I heard once leads me to believe that some such facility or convention for how to communicate such issues with e.g. database server processes exists on Windows. If they've got it, we really ought to have something even better, right? :-)

(That's of course not specific to ZFS, but would be useful with any filesystem that can take snapshots.)
David Magda
2007-Jun-09 18:40 UTC
[zfs-discuss] Re: Re: Re: Re: Re: ZFS consistency guarantee
On Jun 9, 2007, at 06:14, Richard L. Hamilton wrote:

> I wish there was a uniform way whereby applications could
> register their ability to achieve or release consistency on demand,
> and if registered, could also communicate back that they had
> either achieved consistency on-disk, or were unable to do so.

This doesn't (necessarily) have to be built into the kernel: a user-space program could work (say, like D-Bus). Any interested program can register / listen for updates (read-only), and any time a backup program is about to run it can send out an event saying that the /foo/bar directory tree (and below) will be backed up (or snapshotted (word?)).

Suppose you have /var/run/backups| where the user ownership is the backup program's user (or a SUID program is used to send notifications) and the group is read-only (640). Process owners (like oracle, mysql, postgres, etc.) are then members of that read-only group, so that they can't affect other programs by sending spurious messages.
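To make the register-and-notify idea from the last two messages concrete, here is a small in-process sketch. Everything in it is hypothetical: the class and method names are invented for illustration, real applications would live in separate processes and talk over something like a FIFO or a message bus, and all paths are assumed to be absolute.

```python
import os
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Registration:
    root: str                     # directory tree the application keeps its data under
    quiesce: Callable[[], bool]   # returns True once on-disk state is consistent
    resume: Callable[[], None]    # called when the backup or snapshot is complete


class BackupNotifier:
    """Hypothetical coordinator between a backup program and registered apps."""

    def __init__(self) -> None:
        self._registrations: List[Registration] = []

    def register(self, reg: Registration) -> None:
        self._registrations.append(reg)

    def _affected(self, tree: str) -> List[Registration]:
        tree = os.path.normpath(tree)
        hits = []
        for reg in self._registrations:
            root = os.path.normpath(reg.root)
            # An app is affected if its data root is at or below the tree.
            if os.path.commonpath([tree, root]) == tree:
                hits.append(reg)
        return hits

    def run_backup(self, tree: str, backup: Callable[[], None]) -> bool:
        """Quiesce affected apps, run backup(), then resume them.

        Returns False, without running backup(), if any affected app
        reports that it could not reach a consistent state.
        """
        paused: List[Registration] = []
        try:
            for reg in self._affected(tree):
                if not reg.quiesce():
                    return False
                paused.append(reg)
            backup()
            return True
        finally:
            for reg in reversed(paused):
                reg.resume()   # release only the apps that were actually paused
```

The boolean returned by `quiesce` corresponds to the "achieved consistency, or unable to do so" report asked for above; a real design would also need timeouts and the permission model sketched around /var/run/backups|.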
Thanks Darren, but the snapshot taken at 4 would be a snapshot on the storage and not on the host, so the storage system wouldn't really have to bother about flushing the host FS or about consistency, which would be more a function of the host FS or app?
Darren Dunham
2007-Jun-11 04:19 UTC
[zfs-discuss] Re: Re: Re: Re: ZFS consistency guarantee
> Thanks Darren, but the snapshot taken at 4 would be a snapshot on
> the storage and not on the host

I'm not sure what the difference is. I was discussing this in the context of ZFS, so all the snapshots I meant were ZFS/host snapshots.

> so the storage system wouldn't really have to bother about flushing
> the host FS or about consistency, which would be more a function of
> the host FS or app?

I don't really understand what you mean there. Do you have a specific example or situation that you're thinking of?

-- Darren Dunham