My understanding is that the answers to the questions posed below are both YES due to the transactional design of ZFS. However, I'm working with some folks who need more details or documents describing the design/behavior without having to look through all the source code.

[b]Scenario 1[/b]
* Create file
* Open and write data to file
* Issue fsync() call for file

[b]Question:[/b] Is it guaranteed that the write to the directory occurs prior to the write to the file?

[b]Scenario 2[/b]
* Write an extended attribute (such as a file version number) for a file.
* Open and write data to file
* Issue fsync() call for file

[b]Question:[/b] Is it guaranteed that the extended attribute write occurs prior to the write to the file?

Additionally, is it possible that the behavior in these scenarios differs between Solaris 10 U4 and an SXDE 01/08 implementation (snv_b79)?

This message posted from opensolaris.org
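For readers who want the first scenario in concrete terms, here is a minimal sketch of the create / write / fsync() sequence from the application's side. It is written in Python for brevity; the helper name `create_write_fsync` and the file name are hypothetical, and the sketch only shows the calls an application would issue. It says nothing about how ZFS orders the resulting writes on disk.

```python
import os
import tempfile


def create_write_fsync(directory, name, payload):
    """Scenario 1 from the application's view: create, write, fsync().

    Hypothetical helper for illustration only.
    """
    path = os.path.join(directory, name)
    # Creating and opening happen in a single call: O_CREAT | O_EXCL
    # creates the file and fails if it already exists.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o644)
    try:
        view = memoryview(payload)
        # os.write() may write fewer bytes than requested, so loop.
        while view:
            written = os.write(fd, view)
            view = view[written:]
        # Ask the OS to push this file's data and metadata to stable storage.
        os.fsync(fd)
    finally:
        os.close(fd)
    return path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        p = create_write_fsync(tmp, "example.dat", b"version 1\n")
        with open(p, "rb") as f:
            print(f.read())
```

The question in the post is about what is durable, and in what order, at the moment `os.fsync()` returns; the sketch itself cannot answer that and is only meant to pin down the sequence being asked about.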
> My understanding is that the answers to the questions posed below are both YES due to the transactional design of ZFS. However, I'm working with some folks who need more details or documents describing the design/behavior without having to look through all the source code.
>
> [b]Scenario 1[/b]
> * Create file
> * Open and write data to file
> * Issue fsync() call for file
>
> [b]Question:[/b] Is it guaranteed that the write to the directory occurs prior to the write to the file?

It is guaranteed that the view will be consistent (note that you cannot create a file without opening it, though).

There are, of course, three objects being written: the directory entry, the file node and the file content.

ZFS guarantees that:

    if the directory entry exists, it will point to a valid node
    if the node exists, it will point to valid file content

But all of these can be in the same transaction group and show up at once (or none will show at all).

In total, think of the following transaction happening for file create + write & sync (sync only alters timing, so it is not really relevant):

    a) writing the directory entry
    b) writing the node (create file w/ size 0)
    c) writing the file's content
    d) writing the node (update size to reflect written content and block pointers)

Only a partial order is defined: b needs to happen before a, and c before d (b < a, c < d).

In theory, you can have a sequence of c, b+d, a.

> [b]Scenario 2[/b]
> * Write an extended attribute (such as a file version number) for a file.
> * Open and write data to file
> * Issue fsync() call for file
>
> [b]Question:[/b] Is it guaranteed that the extended attribute write occurs prior to the write to the file?

No, unless the attribute is fsync'ed also.

> Additionally, is it possible that the behavior in these scenarios differs between Solaris 10 U4 and an SXDE 01/08 implementation (snv_b79)?

Why does the customer want to know?

Casper
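The partial order described in this reply (b < a, c < d) can be made concrete by enumerating every ordering of the four steps that respects it. The following is a small illustrative model in Python, not anything taken from ZFS: the names `STEPS`, `CONSTRAINTS` and `valid_orders` are made up for this sketch, and it treats b and d as separate steps, whereas the reply notes they may be combined into a single node write (b+d).

```python
from itertools import permutations

# The four logical steps named in the reply above.
STEPS = {
    "a": "write directory entry",
    "b": "write node (create file with size 0)",
    "c": "write file content",
    "d": "write node (update size and block pointers)",
}

# (x, y) means "x must happen before y": b < a and c < d.
CONSTRAINTS = [("b", "a"), ("c", "d")]


def valid_orders(steps, constraints):
    """Return every ordering of `steps` that satisfies `constraints`."""
    orders = []
    for perm in permutations(sorted(steps)):
        position = {step: index for index, step in enumerate(perm)}
        if all(position[x] < position[y] for x, y in constraints):
            orders.append(perm)
    return orders


if __name__ == "__main__":
    for order in valid_orders(STEPS, CONSTRAINTS):
        print(" -> ".join(order))
```

Of the 24 possible orderings, 6 satisfy both constraints, and the sequence c, b, d, a (the "c, b+d, a" case with the node write split in two) is among them. As the reply points out, all of these steps may also land in one transaction group and become visible together, which this model does not attempt to represent.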
Todd Moore wrote:
> My understanding is that the answers to the questions posed below are both YES due to the transactional design of ZFS. However, I'm working with some folks who need more details or documents describing the design/behavior without having to look through all the source code.
>
> [b]Scenario 1[/b]
> * Create file
> * Open and write data to file
> * Issue fsync() call for file
>
> [b]Question:[/b] Is it guaranteed that the write to the directory occurs prior to the write to the file?

Yes, this is guaranteed.

> [b]Scenario 2[/b]
> * Write an extended attribute (such as a file version number) for a file.
> * Open and write data to file
> * Issue fsync() call for file
>
> [b]Question:[/b] Is it guaranteed that the extended attribute write occurs prior to the write to the file?

Again, yes, this is guaranteed in ZFS. ZFS writes all transactions related to the specified file, plus any other transactions not related to the file that may be needed to create the file.

> Additionally, is it possible that the behavior in these scenarios differs between Solaris 10 U4 and an SXDE 01/08 implementation (snv_b79)?

No, the ZFS code has always been this way. The ZIL, which handles this behaviour, is described at http://blogs.sun.com/perrin/entry/the_lumberjack but this may be insufficient detail for you.

Neil.