Hi,

we're working on a stackable versioning file system for 2.4.x. Versioning can easily create lots of files for a file that gets modified frequently, and our current design puts all versions of a file in the same directory as the main file. We are therefore evaluating how stable and efficient different combinations of file systems would be in this scenario.

We've run our versionfs on ext2 and ext3, with and without the HTree patches. We started with 2.4.20, and managed to tickle the "buffer credits" bug reported here:

http://www.spinics.net/lists/ext3/msg02297.html

So we went to 2.4.21 and 2.4.22, as they include the patch reported in the above message. But we're still getting the bug. We can reproduce it using postmark inside versionfs, mounted on top of ext3, in 2.4.22:

kernel: Assertion failure in do_get_write_access() at transaction.c:720: "handle->h_buffer_credits > 0"

Now, it's quite possible that we're not doing something right in versionfs which messes ext3 up. We're unable to tickle this assertion directly via ext3. However, given that our benchmarks run fine on ext2 and that at least one reported buffer_credits problem was a true ext3 bug, I'm seeking help/advice.

1. Are there any more known ext3 bugs of the sort that have been reported? If so, are there fixes anywhere? (We didn't see anything new wrt ext3 in the 2.4.23-pre series.)

2. To help us narrow down the problem, could anyone suggest what versionfs might be doing wrong that can mess ext3 up wrt the buffer credits? Can you suggest any tests we might do to help track the problem?

FWIW, we've managed to narrow down the problem to the area in our code that uses the sendfile functionality. We use sendfile inside our file system to make a copy of a file before it'd be modified, for versioning purposes.

We'll keep digging and let this list know what we find.

Thanks,
Erez.
<chrisl_ext3_user@cli.mailshell.com>
2003-Sep-22 20:02 UTC
Re: journal buffer_credits problem
From Erez Zadok <ezk@cs.sunysb.edu> on 22 Sep 2003:

> Now, it's quite possible that we're not doing something right in versionfs
> which messes ext3 up. We're unable to tickle this assertion directly

That is very possible. Would you mind showing me the related versionfs code? Ext3 needs to reserve the number of blocks it will modify when it creates the transaction. If your versionfs dirties some blocks, which I think it will, it needs to reserve the extra blocks in ext3_journal_start(). Not doing so can result in the problem you describe.

> 1. Are there any more known ext3 bugs of the sort that have been reported?
> If so, are there fixes anywhere? (We didn't see anything new wrt ext3 in
> the 2.4.23-pre series.)

I think there is a good chance you did not reserve it right.

> FWIW, we've managed to narrow down the problem to the area in our code that
> uses the sendfile functionality. We use sendfile inside our file system to
> make a copy of a file before it'd be modified, for versioning purposes.

Again, I would like to take a look at your related versionfs change.

BTW, you might want to post on ext2-devel as well.

Regards,

Chris
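To make the credit accounting described above concrete, here is a minimal user-space sketch. It is a toy model, not kernel code: the `toy_` names are hypothetical, only the credit counter of a handle is modelled, and the assumption is that each newly dirtied block costs one credit. Where the real code would trip the `handle->h_buffer_credits > 0` assertion, the model simply returns -1.

```c
#include <assert.h>

/* Toy stand-in for a journal handle; only the credit counter is modelled. */
struct toy_handle {
    int h_buffer_credits;   /* blocks this handle may still dirty */
};

/* Reserve nblocks credits up front, the way a caller is expected to
 * size its request when it starts a transaction. */
struct toy_handle toy_journal_start(int nblocks)
{
    assert(nblocks >= 0);
    struct toy_handle h = { nblocks };
    return h;
}

/* Model of dirtying one block: costs one credit (an assumption of this
 * sketch). Returns 0 on success, -1 where the kernel would assert. */
int toy_get_write_access(struct toy_handle *h)
{
    if (h->h_buffer_credits <= 0)
        return -1;
    h->h_buffer_credits--;
    return 0;
}

/* Try to dirty `count` blocks; return how many succeeded before the
 * handle ran out of credits. */
int toy_dirty_blocks(struct toy_handle *h, int count)
{
    int done = 0;
    while (done < count && toy_get_write_access(h) == 0)
        done++;
    return done;
}
```

In this model a handle started with 3 credits can dirty 3 blocks; a fourth attempt fails, which corresponds to the situation where extra work was done under a handle that never reserved room for it.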
[--snip--]
> > Again, I would like to take a look at your related versionfs change.
> >
> > BTW, you might want to post on ext2-devel as well.
> >
> > Regards,
> >
> > Chris
>
> Thanks Chris. We had to first understand what these buffer_credits are.
> And we believe we may have fixed the problem (preliminary tests no longer
> tickle the problem). This was due to the way we were nesting another file
> creation+write (for the version file) inside a normal prepare/commit write.
>
> I'll let my student Akshat post a more detailed explanation of the fix's
> hypothesis.
>
> Erez.

Chris,

Here's a detailed explanation of what we're trying to do, and what we understand of it. It would be useful if you could confirm that our understanding is correct.

In versionfs, we make a copy of a file whenever it changes. To do so, we do a sendfile-like operation after doing an ext3_prepare_write() so that we can copy the old contents of the file before the file changes permanently. A prepare_write will start a transaction and reserve a number of buffer credits for itself. In our code, we are nesting the file copy for versioning between the prepare_write and the commit_write. As a result, the copy operation will eat up buffer credits from the current transaction. ext3_prepare_write obviously does not reserve credits for us because it has no idea that we intend to do a whole lot more ext3 activity in between. I think that for the most part, it so happens that the number of credits reserved in ext3_prepare_write is conservatively quite high and we can squeeze in our copy file without running out of credits, but sometimes we get unlucky and there aren't enough credits for copying.

As Erez said, we were nesting the copy operation between a prepare and commit write. If instead we do the copy before ext3_prepare_write and do not bother ext3 between ext3_prepare_write and ext3_commit_write, we don't run into the buffer_credit problem, since the copy operation will happen in its own transaction and will be allocated as many buffer credits as required.

Regards,
Akshat Aranya
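The difference between the two orderings can be illustrated with a small user-space simulation. This is only a sketch of the reasoning in this message: the `sim_` names and all credit numbers are made up for illustration, and real ext3 credit costs depend on the operation and file layout.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy handle: just a pool of remaining credits. */
struct sim_handle {
    int credits;
};

/* Consume n credits; return false if the handle would run out. */
bool sim_use(struct sim_handle *h, int n)
{
    assert(n >= 0);
    if (n > h->credits)
        return false;
    h->credits -= n;
    return true;
}

/* Nested ordering: the versioning copy runs between prepare_write and
 * commit_write, so it draws on the credits reserved for the write. */
bool sim_write_nested(int reserved, int write_cost, int copy_cost)
{
    struct sim_handle h = { reserved };   /* reserved by prepare_write */
    if (!sim_use(&h, copy_cost))          /* copy eats the write's credits */
        return false;
    return sim_use(&h, write_cost);       /* the write itself */
}

/* Reordered: the copy happens first, in its own transaction sized for
 * the copy, and the write then has its full reservation to itself. */
bool sim_write_copy_first(int reserved, int write_cost, int copy_cost)
{
    struct sim_handle copy = { copy_cost };
    if (!sim_use(&copy, copy_cost))
        return false;
    struct sim_handle h = { reserved };
    return sim_use(&h, write_cost);
}
```

With a reservation of 10 and a write cost of 4, the nested ordering succeeds for a copy costing 3 but fails for a copy costing 8, which matches the "sometimes we get unlucky" behaviour; the copy-first ordering succeeds in both cases.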
> > I think that for the most part, it so happens that the number of credits
> > reserved in ext3_prepare_write is conservatively quite high and we can
> > squeeze in our copy file without running out of credits, but sometimes we
> > get unlucky and there aren't enough credits for copying.
> > As Erez said, we were nesting the copy operation between a prepare and
> > commit write. If instead, we do the copy before ext3_prepare_write and not
> > bother ext3 between ext3_prepare_write and ext3_commit_write, we don't run
> > into the buffer_credit problem, since the copy operation will happen in its
> > own transaction and will be allocated as many buffer credits as required.
>
> You might need to nest the transaction as well.
> What if a power failure happens after backing up the file and before
> modifying the original file? Do you leave the backup file around?

We do need to keep the backup file around, since it is not really a backup file but a version of the file before it was changed. We do need to nest the transaction as you mentioned, but there does not seem to be an easy way to do so without losing portability. Basically, we rely on the VFS ops to call the underlying file system, so we cannot do an explicit journal_start(), since that would break the file system independence property of our system. We need to be careful not to perform file system operations that are specific to the semantics of one particular file system.

Is there any way you can suggest to get a nested transaction (through a VFS operation or an operation exposed by the file system to VFS) when there's a transaction already associated with the current process?

> Chris

Cheers,
Akshat Aranya
SUNY, Stony Brook
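For readers following the nesting question, the sketch below models what a nested start looks like when a handle is already associated with the current process. It assumes, as in the 2.4 JBD code as far as I recall it, that a nested start reuses the existing handle and bumps a reference count instead of reserving new credits. All `demo_` names are hypothetical, and a single global stands in for the per-process handle pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Toy handle: remaining credits plus a nesting reference count. */
struct demo_handle {
    int credits;
    int refs;
};

/* Stands in for the handle associated with the current process. */
struct demo_handle *demo_current = NULL;
struct demo_handle demo_storage;

/* Start a handle. If one is already active, reuse it: the nested
 * request for nblocks does not add credits (assumption of this sketch). */
struct demo_handle *demo_start(int nblocks)
{
    assert(nblocks >= 0);
    if (demo_current != NULL) {
        demo_current->refs++;
        return demo_current;
    }
    demo_storage.credits = nblocks;
    demo_storage.refs = 1;
    demo_current = &demo_storage;
    return demo_current;
}

/* Stop a handle; only the outermost stop releases it. */
void demo_stop(struct demo_handle *h)
{
    assert(h == demo_current && h->refs > 0);
    if (--h->refs == 0)
        demo_current = NULL;
}
```

Under this model, an outer start with 5 credits followed by a nested start asking for 20 still leaves only 5 credits on the shared handle, so nesting by itself would not give the copy its own reservation.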