Josef Bacik
2007-Aug-07 15:06 UTC
[Btrfs-devel] [RFC] wait for more writers to join the transaction
Hello, Sorry for my long silence, getting all situated took longer than I expected. I just wrote something very quickly to try and make btrfs sleep longer if there were writers while we were sleeping during an fsync. Running the fs_mark test, btrfs by itself gets walloped by ext3, so I'm ignoring that for now. With btrfs+my patch we seem to get about the same performance, except in the -S2 and -S5 cases, which are Sync method: SYSTEM SYNC/SINGLE FSYNC: Issue sync() after main write loop and 1 file fsync() per subdirectory Sync method: POST: Reopen and fsync() each file in order after main write loop Now I don't think thats good enough so this is going to require more work, but I'm also kind of new to journalling stuff so I wanted to run this by people to see if I'm going in the right direction. Might as well get put on the right track before I go hacking around at things. Thank you, Josef diff -r 1765acb6d268 transaction.c --- a/transaction.c Tue Aug 07 16:35:25 2007 -0400 +++ b/transaction.c Tue Aug 07 17:29:12 2007 -0400 @@ -56,6 +56,7 @@ static int join_transaction(struct btrfs root->fs_info->generation++; root->fs_info->running_transaction = cur_trans; cur_trans->num_writers = 0; + cur_trans->writers_while_asleep = 0; cur_trans->transid = root->fs_info->generation; init_waitqueue_head(&cur_trans->writer_wait); init_waitqueue_head(&cur_trans->commit_wait); @@ -67,6 +68,7 @@ static int join_transaction(struct btrfs init_bit_radix(&cur_trans->dirty_pages); } cur_trans->num_writers++; + cur_trans->writers_while_asleep++; return 0; } @@ -454,12 +456,16 @@ int btrfs_commit_transaction(struct btrf mutex_lock(&root->fs_info->trans_mutex); } } - while (trans->transaction->num_writers > 1) { + while (trans->transaction->num_writers > 1 || + trans->transaction->writers_while_asleep > 0) { + WARN_ON(cur_trans != trans->transaction); + prepare_to_wait(&trans->transaction->writer_wait, &wait, TASK_UNINTERRUPTIBLE); if (trans->transaction->num_writers <= 1) break; + trans->transaction->writers_while_asleep = 0; mutex_unlock(&root->fs_info->fs_mutex); mutex_unlock(&root->fs_info->trans_mutex); schedule(); diff -r 1765acb6d268 transaction.h --- a/transaction.h Tue Aug 07 16:35:25 2007 -0400 +++ b/transaction.h Tue Aug 07 17:23:30 2007 -0400 @@ -26,6 +26,7 @@ struct btrfs_transaction { int in_commit; int use_count; int commit_done; + unsigned long writers_while_asleep; struct list_head list; struct radix_tree_root dirty_pages; unsigned long start_time;
Chris Mason
2007-Aug-07 16:35 UTC
[Btrfs-devel] [RFC] wait for more writers to join the transaction
On Tue, 7 Aug 2007 18:11:54 -0400 Josef Bacik <jwhiter@redhat.com> wrote:> Hello, > > Sorry for my long silence, getting all situated took longer than I > expected. I just wrote something very quickly to try and make btrfs > sleep longer if there were writers while we were sleeping during an > fsync. Running the fs_mark test, btrfs by itself gets walloped by > ext3, so I'm ignoring that for now. With btrfs+my patch we seem to > get about the same performance, except in the -S2 and -S5 cases, > which arefs_mark can test lots of different things. In the multi-threaded tests, the big fat fs_mutex in btrfs is going to hurt us quite a lot, so that part isn't horribly surprising (this is very high on my list to fix). How did you mount ext3? If you're running on sata or IDE drives, make sure to use mount -o barrier=1. btrfs does write barriers by default, so to be fair you have to turn them on in ext3 as well.> > Sync method: SYSTEM SYNC/SINGLE FSYNC: Issue sync() after main write > loop and 1 file fsync() per subdirectoryI would just use the default sync method. ext3 has some code to basically make the fsync a noop if the transaction that changed the file last is already committed. This would be the next thing I ask you to add to btrfs ;)> > Sync method: POST: Reopen and fsync() each file in order after main > write loop > > Now I don't think thats good enough so this is going to require more > work, but I'm also kind of new to journalling stuff so I wanted to > run this by people to see if I'm going in the right direction. Might > as well get put on the right track before I go hacking around at > things. Thank you, > > Josef > > diff -r 1765acb6d268 transaction.c > --- a/transaction.c Tue Aug 07 16:35:25 2007 -0400 > +++ b/transaction.c Tue Aug 07 17:29:12 2007 -0400 > @@ -56,6 +56,7 @@ static int join_transaction(struct btrfs > root->fs_info->generation++; > root->fs_info->running_transaction = cur_trans; > cur_trans->num_writers = 0; > + cur_trans->writers_while_asleep = 0; > cur_trans->transid = root->fs_info->generation; > init_waitqueue_head(&cur_trans->writer_wait); > init_waitqueue_head(&cur_trans->commit_wait); > @@ -67,6 +68,7 @@ static int join_transaction(struct btrfs > init_bit_radix(&cur_trans->dirty_pages); > } > cur_trans->num_writers++; > + cur_trans->writers_while_asleep++; > return 0; > } > > @@ -454,12 +456,16 @@ int btrfs_commit_transaction(struct btrf > mutex_lock(&root->fs_info->trans_mutex); > } > } > - while (trans->transaction->num_writers > 1) { > + while (trans->transaction->num_writers > 1 || > + trans->transaction->writers_while_asleep > 0) { > + > WARN_ON(cur_trans != trans->transaction); > + > prepare_to_wait(&trans->transaction->writer_wait, > &wait, TASK_UNINTERRUPTIBLE); > if (trans->transaction->num_writers <= 1) > break;I think you need the writers_while_asleep check here too (before the break). Thanks for trying this out. -chris