Wang Shilong
2013-Dec-02 17:33 UTC
[PATCH v3] Btrfs: fix wrong super generation mismatch when scrubbing supers
From: Wang Shilong <wangsl.fnst@cn.fujitsu.com> We came a race condition when scrubbing superblocks, the story is: In commiting transaction, we will update last_trans_commited after writting superblocks. if a scrub start after writting superblocks and before last_trans_commited, generation mismatch happens! We fix it by protecting writting superblock and updating last_trans_commited with tree_log_mutex. Reported-by: Sebastian Ochmann <ochmann@informatik.uni-bonn.de> Signed-off-by: Wang Shilong <wangsl.fnst@cn.fujitsu.com> --- Changelog: v2->v3:move tree_log_mutex out of device_list_mutex. v1->v2: use right way to fix the problem. --- fs/btrfs/scrub.c | 11 +++++++---- fs/btrfs/transaction.c | 13 ++++++++++--- 2 files changed, 17 insertions(+), 7 deletions(-) diff --git a/fs/btrfs/scrub.c b/fs/btrfs/scrub.c index 561e2f1..a9ed102 100644 --- a/fs/btrfs/scrub.c +++ b/fs/btrfs/scrub.c @@ -2887,6 +2887,7 @@ int btrfs_scrub_dev(struct btrfs_fs_info *fs_info, u64 devid, u64 start, } + mutex_lock(&fs_info->tree_log_mutex); mutex_lock(&fs_info->fs_devices->device_list_mutex); dev = btrfs_find_device(fs_info, devid, NULL, NULL); if (!dev || (dev->missing && !is_dev_replace)) { @@ -2932,14 +2933,16 @@ int btrfs_scrub_dev(struct btrfs_fs_info *fs_info, u64 devid, u64 start, atomic_inc(&fs_info->scrubs_running); mutex_unlock(&fs_info->scrub_lock); + /* + * holding tree_log_mutex we can avoid generation mismatch while + * scrubbing superblocks, see comments in commiting transaction + * when updating last_trans_commited. + */ if (!is_dev_replace) { - /* - * by holding device list mutex, we can - * kick off writing super in log tree sync. - */ ret = scrub_supers(sctx, dev); } mutex_unlock(&fs_info->fs_devices->device_list_mutex); + mutex_unlock(&fs_info->tree_log_mutex); if (!ret) ret = scrub_enumerate_chunks(sctx, dev, start, end, diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c index c6a872a..052eb22 100644 --- a/fs/btrfs/transaction.c +++ b/fs/btrfs/transaction.c @@ -1898,15 +1898,22 @@ int btrfs_commit_transaction(struct btrfs_trans_handle *trans, goto cleanup_transaction; } + btrfs_finish_extent_commit(trans, root); + + /* + * we must gurantee last_trans_commited update is protected by + * tree_log_mutex with write_ctree_super together, otherwise, + * scubbing super will come in before updating last_trans_commited + * and we will get generation mismatch when scrubbing superblocks. + */ + root->fs_info->last_trans_committed = cur_trans->transid; + /* * the super is written, we can safely allow the tree-loggers * to go about their business */ mutex_unlock(&root->fs_info->tree_log_mutex); - btrfs_finish_extent_commit(trans, root); - - root->fs_info->last_trans_committed = cur_trans->transid; /* * We needn''t acquire the lock here because there is no other task * which can change it. -- 1.8.4 -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Liu Bo
2013-Dec-03 04:57 UTC
Re: [PATCH v3] Btrfs: fix wrong super generation mismatch when scrubbing supers
On Tue, Dec 03, 2013 at 01:33:39AM +0800, Wang Shilong wrote:> From: Wang Shilong <wangsl.fnst@cn.fujitsu.com> > > We came a race condition when scrubbing superblocks, the story is: > > In commiting transaction, we will update last_trans_commited after > writting superblocks. if a scrub start after writting superblocks > and before last_trans_commited, generation mismatch happens! > > We fix it by protecting writting superblock and updating last_trans_commited > with tree_log_mutex. > > Reported-by: Sebastian Ochmann <ochmann@informatik.uni-bonn.de> > Signed-off-by: Wang Shilong <wangsl.fnst@cn.fujitsu.com> > --- > Changelog: > v2->v3:move tree_log_mutex out of device_list_mutex. > v1->v2: use right way to fix the problem. > --- > fs/btrfs/scrub.c | 11 +++++++---- > fs/btrfs/transaction.c | 13 ++++++++++--- > 2 files changed, 17 insertions(+), 7 deletions(-) > > diff --git a/fs/btrfs/scrub.c b/fs/btrfs/scrub.c > index 561e2f1..a9ed102 100644 > --- a/fs/btrfs/scrub.c > +++ b/fs/btrfs/scrub.c > @@ -2887,6 +2887,7 @@ int btrfs_scrub_dev(struct btrfs_fs_info *fs_info, u64 devid, u64 start, > } > > > + mutex_lock(&fs_info->tree_log_mutex); > mutex_lock(&fs_info->fs_devices->device_list_mutex); > dev = btrfs_find_device(fs_info, devid, NULL, NULL); > if (!dev || (dev->missing && !is_dev_replace)) { > @@ -2932,14 +2933,16 @@ int btrfs_scrub_dev(struct btrfs_fs_info *fs_info, u64 devid, u64 start, > atomic_inc(&fs_info->scrubs_running); > mutex_unlock(&fs_info->scrub_lock); > > + /* > + * holding tree_log_mutex we can avoid generation mismatch while > + * scrubbing superblocks, see comments in commiting transaction > + * when updating last_trans_commited. > + */ > if (!is_dev_replace) { > - /* > - * by holding device list mutex, we can > - * kick off writing super in log tree sync. > - */ > ret = scrub_supers(sctx, dev); > } > mutex_unlock(&fs_info->fs_devices->device_list_mutex); > + mutex_unlock(&fs_info->tree_log_mutex);IIRC, we already have btrfs_scrub_{pause, continue}() to avoid race situations between committing transaction and scrub processes, why not use that instead? (Actually I don''t like adding another lock unless it''s been proved necessary and correct with lockdep.) thanks, -liubo> > if (!ret) > ret = scrub_enumerate_chunks(sctx, dev, start, end, > diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c > index c6a872a..052eb22 100644 > --- a/fs/btrfs/transaction.c > +++ b/fs/btrfs/transaction.c > @@ -1898,15 +1898,22 @@ int btrfs_commit_transaction(struct btrfs_trans_handle *trans, > goto cleanup_transaction; > } > > + btrfs_finish_extent_commit(trans, root); > + > + /* > + * we must gurantee last_trans_commited update is protected by > + * tree_log_mutex with write_ctree_super together, otherwise, > + * scubbing super will come in before updating last_trans_commited > + * and we will get generation mismatch when scrubbing superblocks. > + */ > + root->fs_info->last_trans_committed = cur_trans->transid; > + > /* > * the super is written, we can safely allow the tree-loggers > * to go about their business > */ > mutex_unlock(&root->fs_info->tree_log_mutex); > > - btrfs_finish_extent_commit(trans, root); > - > - root->fs_info->last_trans_committed = cur_trans->transid; > /* > * We needn''t acquire the lock here because there is no other task > * which can change it. > -- > 1.8.4 > > -- > To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html-- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Wang Shilong
2013-Dec-03 05:06 UTC
Re: [PATCH v3] Btrfs: fix wrong super generation mismatch when scrubbing supers
Hi Liu, On 12/03/2013 12:57 PM, Liu Bo wrote:> On Tue, Dec 03, 2013 at 01:33:39AM +0800, Wang Shilong wrote: >> From: Wang Shilong <wangsl.fnst@cn.fujitsu.com> >> >> We came a race condition when scrubbing superblocks, the story is: >> >> In commiting transaction, we will update last_trans_commited after >> writting superblocks. if a scrub start after writting superblocks >> and before last_trans_commited, generation mismatch happens! >> >> We fix it by protecting writting superblock and updating last_trans_commited >> with tree_log_mutex. >> >> Reported-by: Sebastian Ochmann <ochmann@informatik.uni-bonn.de> >> Signed-off-by: Wang Shilong <wangsl.fnst@cn.fujitsu.com> >> --- >> Changelog: >> v2->v3:move tree_log_mutex out of device_list_mutex. >> v1->v2: use right way to fix the problem. >> --- >> fs/btrfs/scrub.c | 11 +++++++---- >> fs/btrfs/transaction.c | 13 ++++++++++--- >> 2 files changed, 17 insertions(+), 7 deletions(-) >> >> diff --git a/fs/btrfs/scrub.c b/fs/btrfs/scrub.c >> index 561e2f1..a9ed102 100644 >> --- a/fs/btrfs/scrub.c >> +++ b/fs/btrfs/scrub.c >> @@ -2887,6 +2887,7 @@ int btrfs_scrub_dev(struct btrfs_fs_info *fs_info, u64 devid, u64 start, >> } >> >> >> + mutex_lock(&fs_info->tree_log_mutex); >> mutex_lock(&fs_info->fs_devices->device_list_mutex); >> dev = btrfs_find_device(fs_info, devid, NULL, NULL); >> if (!dev || (dev->missing && !is_dev_replace)) { >> @@ -2932,14 +2933,16 @@ int btrfs_scrub_dev(struct btrfs_fs_info *fs_info, u64 devid, u64 start, >> atomic_inc(&fs_info->scrubs_running); >> mutex_unlock(&fs_info->scrub_lock); >> >> + /* >> + * holding tree_log_mutex we can avoid generation mismatch while >> + * scrubbing superblocks, see comments in commiting transaction >> + * when updating last_trans_commited. >> + */ >> if (!is_dev_replace) { >> - /* >> - * by holding device list mutex, we can >> - * kick off writing super in log tree sync. >> - */ >> ret = scrub_supers(sctx, dev); >> } >> mutex_unlock(&fs_info->fs_devices->device_list_mutex); >> + mutex_unlock(&fs_info->tree_log_mutex); > IIRC, we already have btrfs_scrub_{pause, continue}() to avoid race > situations between committing transaction and scrub processes, why not use that > instead?btrfs_scrub_{pause,continue} can not stop the following case from happening: thread 1 thread 2 |->write_supers |->start scrub |->using last_trans_commited(not updated yet) when scrubbing supers generation in disk is up to date but in memory is not. |->updating last_trans_commited Pleae correct me if i am wrong here. :-)> > (Actually I don''t like adding another lock unless it''s been proved necessary > and correct with lockdep.)Right, i should test if it can pass lockdep. Thanks for comments. Wang> > thanks, > -liubo > >> >> if (!ret) >> ret = scrub_enumerate_chunks(sctx, dev, start, end, >> diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c >> index c6a872a..052eb22 100644 >> --- a/fs/btrfs/transaction.c >> +++ b/fs/btrfs/transaction.c >> @@ -1898,15 +1898,22 @@ int btrfs_commit_transaction(struct btrfs_trans_handle *trans, >> goto cleanup_transaction; >> } >> >> + btrfs_finish_extent_commit(trans, root); >> + >> + /* >> + * we must gurantee last_trans_commited update is protected by >> + * tree_log_mutex with write_ctree_super together, otherwise, >> + * scubbing super will come in before updating last_trans_commited >> + * and we will get generation mismatch when scrubbing superblocks. >> + */ >> + root->fs_info->last_trans_committed = cur_trans->transid; >> + >> /* >> * the super is written, we can safely allow the tree-loggers >> * to go about their business >> */ >> mutex_unlock(&root->fs_info->tree_log_mutex); >> >> - btrfs_finish_extent_commit(trans, root); >> - >> - root->fs_info->last_trans_committed = cur_trans->transid; >> /* >> * We needn''t acquire the lock here because there is no other task >> * which can change it. >> -- >> 1.8.4 >> >> -- >> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in >> the body of a message to majordomo@vger.kernel.org >> More majordomo info at http://vger.kernel.org/majordomo-info.html-- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Miao Xie
2013-Dec-03 05:42 UTC
Re: [PATCH v3] Btrfs: fix wrong super generation mismatch when scrubbing supers
On tue, 03 Dec 2013 13:06:34 +0800, Wang Shilong wrote:> Hi Liu, > > On 12/03/2013 12:57 PM, Liu Bo wrote: >> On Tue, Dec 03, 2013 at 01:33:39AM +0800, Wang Shilong wrote: >>> From: Wang Shilong <wangsl.fnst@cn.fujitsu.com> >>> >>> We came a race condition when scrubbing superblocks, the story is: >>> >>> In commiting transaction, we will update last_trans_commited after >>> writting superblocks. if a scrub start after writting superblocks >>> and before last_trans_commited, generation mismatch happens! >>> >>> We fix it by protecting writting superblock and updating last_trans_commited >>> with tree_log_mutex. >>> >>> Reported-by: Sebastian Ochmann <ochmann@informatik.uni-bonn.de> >>> Signed-off-by: Wang Shilong <wangsl.fnst@cn.fujitsu.com> >>> --- >>> Changelog: >>> v2->v3:move tree_log_mutex out of device_list_mutex. >>> v1->v2: use right way to fix the problem. >>> --- >>> fs/btrfs/scrub.c | 11 +++++++---- >>> fs/btrfs/transaction.c | 13 ++++++++++--- >>> 2 files changed, 17 insertions(+), 7 deletions(-) >>> >>> diff --git a/fs/btrfs/scrub.c b/fs/btrfs/scrub.c >>> index 561e2f1..a9ed102 100644 >>> --- a/fs/btrfs/scrub.c >>> +++ b/fs/btrfs/scrub.c >>> @@ -2887,6 +2887,7 @@ int btrfs_scrub_dev(struct btrfs_fs_info *fs_info, u64 devid, u64 start, >>> } >>> + mutex_lock(&fs_info->tree_log_mutex); >>> mutex_lock(&fs_info->fs_devices->device_list_mutex); >>> dev = btrfs_find_device(fs_info, devid, NULL, NULL); >>> if (!dev || (dev->missing && !is_dev_replace)) { >>> @@ -2932,14 +2933,16 @@ int btrfs_scrub_dev(struct btrfs_fs_info *fs_info, u64 devid, u64 start, >>> atomic_inc(&fs_info->scrubs_running); >>> mutex_unlock(&fs_info->scrub_lock); >>> + /* >>> + * holding tree_log_mutex we can avoid generation mismatch while >>> + * scrubbing superblocks, see comments in commiting transaction >>> + * when updating last_trans_commited. >>> + */ >>> if (!is_dev_replace) { >>> - /* >>> - * by holding device list mutex, we can >>> - * kick off writing super in log tree sync. >>> - */ >>> ret = scrub_supers(sctx, dev); >>> } >>> mutex_unlock(&fs_info->fs_devices->device_list_mutex); >>> + mutex_unlock(&fs_info->tree_log_mutex); >> IIRC, we already have btrfs_scrub_{pause, continue}() to avoid race >> situations between committing transaction and scrub processes, why not use that >> instead? > btrfs_scrub_{pause,continue} can not stop the following case from happening: > > thread 1 thread 2 > |->write_supers > |->start scrub > |->using last_trans_commited(not updated yet) when scrubbing supers > generation in disk is up to date but in memory is not. > |->updating last_trans_commited > > Pleae correct me if i am wrong here. :-)Moving btrfs_finish_extent_commit() into the log mutex may make the log tasks be blocked for a lot time. I think the better way to fix is prevent the scrubber from starting while the transaction is being committed.(wait scrub_pause_req == 0 before scrubbing the super block) Thanks Miao>> >> (Actually I don''t like adding another lock unless it''s been proved necessary >> and correct with lockdep.) > Right, i should test if it can pass lockdep. > > Thanks for comments. > Wang >> >> thanks, >> -liubo >> >>> if (!ret) >>> ret = scrub_enumerate_chunks(sctx, dev, start, end, >>> diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c >>> index c6a872a..052eb22 100644 >>> --- a/fs/btrfs/transaction.c >>> +++ b/fs/btrfs/transaction.c >>> @@ -1898,15 +1898,22 @@ int btrfs_commit_transaction(struct btrfs_trans_handle *trans, >>> goto cleanup_transaction; >>> } >>> + btrfs_finish_extent_commit(trans, root); >>> + >>> + /* >>> + * we must gurantee last_trans_commited update is protected by >>> + * tree_log_mutex with write_ctree_super together, otherwise, >>> + * scubbing super will come in before updating last_trans_commited >>> + * and we will get generation mismatch when scrubbing superblocks. >>> + */ >>> + root->fs_info->last_trans_committed = cur_trans->transid; >>> + >>> /* >>> * the super is written, we can safely allow the tree-loggers >>> * to go about their business >>> */ >>> mutex_unlock(&root->fs_info->tree_log_mutex); >>> - btrfs_finish_extent_commit(trans, root); >>> - >>> - root->fs_info->last_trans_committed = cur_trans->transid; >>> /* >>> * We needn''t acquire the lock here because there is no other task >>> * which can change it. >>> -- >>> 1.8.4 >>> >>> -- >>> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in >>> the body of a message to majordomo@vger.kernel.org >>> More majordomo info at http://vger.kernel.org/majordomo-info.html > > -- > To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html >-- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Liu Bo
2013-Dec-03 06:08 UTC
Re: [PATCH v3] Btrfs: fix wrong super generation mismatch when scrubbing supers
On Tue, Dec 03, 2013 at 01:06:34PM +0800, Wang Shilong wrote:> Hi Liu, > > On 12/03/2013 12:57 PM, Liu Bo wrote: > >On Tue, Dec 03, 2013 at 01:33:39AM +0800, Wang Shilong wrote: > >>From: Wang Shilong <wangsl.fnst@cn.fujitsu.com> > >> > >>We came a race condition when scrubbing superblocks, the story is: > >> > >>In commiting transaction, we will update last_trans_commited after > >>writting superblocks. if a scrub start after writting superblocks > >>and before last_trans_commited, generation mismatch happens! > >> > >>We fix it by protecting writting superblock and updating last_trans_commited > >>with tree_log_mutex. > >> > >>Reported-by: Sebastian Ochmann <ochmann@informatik.uni-bonn.de> > >>Signed-off-by: Wang Shilong <wangsl.fnst@cn.fujitsu.com> > >>--- > >>Changelog: > >> v2->v3:move tree_log_mutex out of device_list_mutex. > >> v1->v2: use right way to fix the problem. > >>--- > >> fs/btrfs/scrub.c | 11 +++++++---- > >> fs/btrfs/transaction.c | 13 ++++++++++--- > >> 2 files changed, 17 insertions(+), 7 deletions(-) > >> > >>diff --git a/fs/btrfs/scrub.c b/fs/btrfs/scrub.c > >>index 561e2f1..a9ed102 100644 > >>--- a/fs/btrfs/scrub.c > >>+++ b/fs/btrfs/scrub.c > >>@@ -2887,6 +2887,7 @@ int btrfs_scrub_dev(struct btrfs_fs_info *fs_info, u64 devid, u64 start, > >> } > >>+ mutex_lock(&fs_info->tree_log_mutex); > >> mutex_lock(&fs_info->fs_devices->device_list_mutex); > >> dev = btrfs_find_device(fs_info, devid, NULL, NULL); > >> if (!dev || (dev->missing && !is_dev_replace)) { > >>@@ -2932,14 +2933,16 @@ int btrfs_scrub_dev(struct btrfs_fs_info *fs_info, u64 devid, u64 start, > >> atomic_inc(&fs_info->scrubs_running); > >> mutex_unlock(&fs_info->scrub_lock); > >>+ /* > >>+ * holding tree_log_mutex we can avoid generation mismatch while > >>+ * scrubbing superblocks, see comments in commiting transaction > >>+ * when updating last_trans_commited. > >>+ */ > >> if (!is_dev_replace) { > >>- /* > >>- * by holding device list mutex, we can > >>- * kick off writing super in log tree sync. > >>- */ > >> ret = scrub_supers(sctx, dev); > >> } > >> mutex_unlock(&fs_info->fs_devices->device_list_mutex); > >>+ mutex_unlock(&fs_info->tree_log_mutex); > >IIRC, we already have btrfs_scrub_{pause, continue}() to avoid race > >situations between committing transaction and scrub processes, why not use that > >instead? > btrfs_scrub_{pause,continue} can not stop the following case from happening: > > thread 1 thread 2 > |->write_supers > |->start scrub > |->using last_trans_commited(not updated yet) when scrubbing supers > generation in disk is up to date but in memory is not. > |->updating last_trans_commited > > Pleae correct me if i am wrong here. :-)One possible way is to check @scrub_pause_req inside scrub_supers(), before starting the real scrubing super work. scrub_super() { while (scrub_pause_req) wait for (scrub_pause_req == 0); ... } As we have a atomic_inc(&fs_info->scrubs_running) before scrub_supers(), it''d force committing transaction to wait for scrub if the scrub process is the former one in timeline. thanks, -liubo> > > >(Actually I don''t like adding another lock unless it''s been proved necessary > >and correct with lockdep.) > Right, i should test if it can pass lockdep. > > Thanks for comments. > Wang > > > >thanks, > >-liubo > > > >> if (!ret) > >> ret = scrub_enumerate_chunks(sctx, dev, start, end, > >>diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c > >>index c6a872a..052eb22 100644 > >>--- a/fs/btrfs/transaction.c > >>+++ b/fs/btrfs/transaction.c > >>@@ -1898,15 +1898,22 @@ int btrfs_commit_transaction(struct btrfs_trans_handle *trans, > >> goto cleanup_transaction; > >> } > >>+ btrfs_finish_extent_commit(trans, root); > >>+ > >>+ /* > >>+ * we must gurantee last_trans_commited update is protected by > >>+ * tree_log_mutex with write_ctree_super together, otherwise, > >>+ * scubbing super will come in before updating last_trans_commited > >>+ * and we will get generation mismatch when scrubbing superblocks. > >>+ */ > >>+ root->fs_info->last_trans_committed = cur_trans->transid; > >>+ > >> /* > >> * the super is written, we can safely allow the tree-loggers > >> * to go about their business > >> */ > >> mutex_unlock(&root->fs_info->tree_log_mutex); > >>- btrfs_finish_extent_commit(trans, root); > >>- > >>- root->fs_info->last_trans_committed = cur_trans->transid; > >> /* > >> * We needn''t acquire the lock here because there is no other task > >> * which can change it. > >>-- > >>1.8.4 > >> > >>-- > >>To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in > >>the body of a message to majordomo@vger.kernel.org > >>More majordomo info at http://vger.kernel.org/majordomo-info.html >-- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Miao Xie
2013-Dec-03 08:31 UTC
Re: [PATCH v3] Btrfs: fix wrong super generation mismatch when scrubbing supers
On tue, 3 Dec 2013 14:08:24 +0800, Liu Bo wrote:> On Tue, Dec 03, 2013 at 01:06:34PM +0800, Wang Shilong wrote: >> Hi Liu, >> >> On 12/03/2013 12:57 PM, Liu Bo wrote: >>> On Tue, Dec 03, 2013 at 01:33:39AM +0800, Wang Shilong wrote: >>>> From: Wang Shilong <wangsl.fnst@cn.fujitsu.com> >>>> >>>> We came a race condition when scrubbing superblocks, the story is: >>>> >>>> In commiting transaction, we will update last_trans_commited after >>>> writting superblocks. if a scrub start after writting superblocks >>>> and before last_trans_commited, generation mismatch happens! >>>> >>>> We fix it by protecting writting superblock and updating last_trans_commited >>>> with tree_log_mutex. >>>> >>>> Reported-by: Sebastian Ochmann <ochmann@informatik.uni-bonn.de> >>>> Signed-off-by: Wang Shilong <wangsl.fnst@cn.fujitsu.com> >>>> --- >>>> Changelog: >>>> v2->v3:move tree_log_mutex out of device_list_mutex. >>>> v1->v2: use right way to fix the problem. >>>> --- >>>> fs/btrfs/scrub.c | 11 +++++++---- >>>> fs/btrfs/transaction.c | 13 ++++++++++--- >>>> 2 files changed, 17 insertions(+), 7 deletions(-) >>>> >>>> diff --git a/fs/btrfs/scrub.c b/fs/btrfs/scrub.c >>>> index 561e2f1..a9ed102 100644 >>>> --- a/fs/btrfs/scrub.c >>>> +++ b/fs/btrfs/scrub.c >>>> @@ -2887,6 +2887,7 @@ int btrfs_scrub_dev(struct btrfs_fs_info *fs_info, u64 devid, u64 start, >>>> } >>>> + mutex_lock(&fs_info->tree_log_mutex); >>>> mutex_lock(&fs_info->fs_devices->device_list_mutex); >>>> dev = btrfs_find_device(fs_info, devid, NULL, NULL); >>>> if (!dev || (dev->missing && !is_dev_replace)) { >>>> @@ -2932,14 +2933,16 @@ int btrfs_scrub_dev(struct btrfs_fs_info *fs_info, u64 devid, u64 start, >>>> atomic_inc(&fs_info->scrubs_running); >>>> mutex_unlock(&fs_info->scrub_lock); >>>> + /* >>>> + * holding tree_log_mutex we can avoid generation mismatch while >>>> + * scrubbing superblocks, see comments in commiting transaction >>>> + * when updating last_trans_commited. >>>> + */ >>>> if (!is_dev_replace) { >>>> - /* >>>> - * by holding device list mutex, we can >>>> - * kick off writing super in log tree sync. >>>> - */ >>>> ret = scrub_supers(sctx, dev); >>>> } >>>> mutex_unlock(&fs_info->fs_devices->device_list_mutex); >>>> + mutex_unlock(&fs_info->tree_log_mutex); >>> IIRC, we already have btrfs_scrub_{pause, continue}() to avoid race >>> situations between committing transaction and scrub processes, why not use that >>> instead? >> btrfs_scrub_{pause,continue} can not stop the following case from happening: >> >> thread 1 thread 2 >> |->write_supers >> |->start scrub >> |->using last_trans_commited(not updated yet) when scrubbing supers >> generation in disk is up to date but in memory is not. >> |->updating last_trans_commited >> >> Pleae correct me if i am wrong here. :-) > > One possible way is to check @scrub_pause_req inside scrub_supers(), > before starting the real scrubing super work. > > scrub_super() > { > while (scrub_pause_req) > wait for (scrub_pause_req == 0); > > ... > } > > As we have a atomic_inc(&fs_info->scrubs_running) before scrub_supers(), > it''d force committing transaction to wait for scrub if the scrub process > is the former one in timeline.Great minds think alike! Thanks Miao> > thanks, > -liubo > >>> >>> (Actually I don''t like adding another lock unless it''s been proved necessary >>> and correct with lockdep.) >> Right, i should test if it can pass lockdep. >> >> Thanks for comments. >> Wang >>> >>> thanks, >>> -liubo >>> >>>> if (!ret) >>>> ret = scrub_enumerate_chunks(sctx, dev, start, end, >>>> diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c >>>> index c6a872a..052eb22 100644 >>>> --- a/fs/btrfs/transaction.c >>>> +++ b/fs/btrfs/transaction.c >>>> @@ -1898,15 +1898,22 @@ int btrfs_commit_transaction(struct btrfs_trans_handle *trans, >>>> goto cleanup_transaction; >>>> } >>>> + btrfs_finish_extent_commit(trans, root); >>>> + >>>> + /* >>>> + * we must gurantee last_trans_commited update is protected by >>>> + * tree_log_mutex with write_ctree_super together, otherwise, >>>> + * scubbing super will come in before updating last_trans_commited >>>> + * and we will get generation mismatch when scrubbing superblocks. >>>> + */ >>>> + root->fs_info->last_trans_committed = cur_trans->transid; >>>> + >>>> /* >>>> * the super is written, we can safely allow the tree-loggers >>>> * to go about their business >>>> */ >>>> mutex_unlock(&root->fs_info->tree_log_mutex); >>>> - btrfs_finish_extent_commit(trans, root); >>>> - >>>> - root->fs_info->last_trans_committed = cur_trans->transid; >>>> /* >>>> * We needn''t acquire the lock here because there is no other task >>>> * which can change it. >>>> -- >>>> 1.8.4 >>>> >>>> -- >>>> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in >>>> the body of a message to majordomo@vger.kernel.org >>>> More majordomo info at http://vger.kernel.org/majordomo-info.html >> > -- > To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html >-- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Sebastian Ochmann
2013-Dec-03 19:14 UTC
Re: [PATCH v3] Btrfs: fix wrong super generation mismatch when scrubbing supers
Hello, I know, the discussion on how to fix the problem best is still on-going, but I wanted to add that I tried v3 of the patch against btrfs-next on my machine. Without the patch, I was able to reproduce the problem within a few minutes; after applying it, I wasn''t able to trigger it for 50 minutes now. I can''t tell whether the problem would reoccur when running my little test for another week or so, but I can tell that my machine did not catch fire either. So it seems like you''re on the right track. :) I can also try another version of the patch when it becomes available. Thanks, Sebastian On 02.12.2013 18:33, Wang Shilong wrote:> From: Wang Shilong <wangsl.fnst@cn.fujitsu.com> > > We came a race condition when scrubbing superblocks, the story is: > > In commiting transaction, we will update last_trans_commited after > writting superblocks. if a scrub start after writting superblocks > and before last_trans_commited, generation mismatch happens! > > We fix it by protecting writting superblock and updating last_trans_commited > with tree_log_mutex. > > Reported-by: Sebastian Ochmann <ochmann@informatik.uni-bonn.de> > Signed-off-by: Wang Shilong <wangsl.fnst@cn.fujitsu.com> > --- > Changelog: > v2->v3:move tree_log_mutex out of device_list_mutex. > v1->v2: use right way to fix the problem. > --- > fs/btrfs/scrub.c | 11 +++++++---- > fs/btrfs/transaction.c | 13 ++++++++++--- > 2 files changed, 17 insertions(+), 7 deletions(-) > > diff --git a/fs/btrfs/scrub.c b/fs/btrfs/scrub.c > index 561e2f1..a9ed102 100644 > --- a/fs/btrfs/scrub.c > +++ b/fs/btrfs/scrub.c > @@ -2887,6 +2887,7 @@ int btrfs_scrub_dev(struct btrfs_fs_info *fs_info, u64 devid, u64 start, > } > > > + mutex_lock(&fs_info->tree_log_mutex); > mutex_lock(&fs_info->fs_devices->device_list_mutex); > dev = btrfs_find_device(fs_info, devid, NULL, NULL); > if (!dev || (dev->missing && !is_dev_replace)) { > @@ -2932,14 +2933,16 @@ int btrfs_scrub_dev(struct btrfs_fs_info *fs_info, u64 devid, u64 start, > atomic_inc(&fs_info->scrubs_running); > mutex_unlock(&fs_info->scrub_lock); > > + /* > + * holding tree_log_mutex we can avoid generation mismatch while > + * scrubbing superblocks, see comments in commiting transaction > + * when updating last_trans_commited. > + */ > if (!is_dev_replace) { > - /* > - * by holding device list mutex, we can > - * kick off writing super in log tree sync. > - */ > ret = scrub_supers(sctx, dev); > } > mutex_unlock(&fs_info->fs_devices->device_list_mutex); > + mutex_unlock(&fs_info->tree_log_mutex); > > if (!ret) > ret = scrub_enumerate_chunks(sctx, dev, start, end, > diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c > index c6a872a..052eb22 100644 > --- a/fs/btrfs/transaction.c > +++ b/fs/btrfs/transaction.c > @@ -1898,15 +1898,22 @@ int btrfs_commit_transaction(struct btrfs_trans_handle *trans, > goto cleanup_transaction; > } > > + btrfs_finish_extent_commit(trans, root); > + > + /* > + * we must gurantee last_trans_commited update is protected by > + * tree_log_mutex with write_ctree_super together, otherwise, > + * scubbing super will come in before updating last_trans_commited > + * and we will get generation mismatch when scrubbing superblocks. > + */ > + root->fs_info->last_trans_committed = cur_trans->transid; > + > /* > * the super is written, we can safely allow the tree-loggers > * to go about their business > */ > mutex_unlock(&root->fs_info->tree_log_mutex); > > - btrfs_finish_extent_commit(trans, root); > - > - root->fs_info->last_trans_committed = cur_trans->transid; > /* > * We needn''t acquire the lock here because there is no other task > * which can change it. >-- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Wang Shilong
2013-Dec-04 02:43 UTC
Re: [PATCH v3] Btrfs: fix wrong super generation mismatch when scrubbing supers
On 12/04/2013 03:14 AM, Sebastian Ochmann wrote:> Hello, > > I know, the discussion on how to fix the problem best is still > on-going, but I wanted to add that I tried v3 of the patch against > btrfs-next on my machine. Without the patch, I was able to reproduce > the problem within a few minutes; after applying it, I wasn''t able to > trigger it for 50 minutes now. > > I can''t tell whether the problem would reoccur when running my little > test for another week or so, but I can tell that my machine did not > catch fire either. So it seems like you''re on the right track. :) > > I can also try another version of the patch when it becomes available.Thanks very much, new patch will be sent later.:-) Thanks, Wang> > Thanks, > Sebastian > > On 02.12.2013 18:33, Wang Shilong wrote: >> From: Wang Shilong <wangsl.fnst@cn.fujitsu.com> >> >> We came a race condition when scrubbing superblocks, the story is: >> >> In commiting transaction, we will update last_trans_commited after >> writting superblocks. if a scrub start after writting superblocks >> and before last_trans_commited, generation mismatch happens! >> >> We fix it by protecting writting superblock and updating >> last_trans_commited >> with tree_log_mutex. >> >> Reported-by: Sebastian Ochmann <ochmann@informatik.uni-bonn.de> >> Signed-off-by: Wang Shilong <wangsl.fnst@cn.fujitsu.com> >> --- >> Changelog: >> v2->v3:move tree_log_mutex out of device_list_mutex. >> v1->v2: use right way to fix the problem. >> --- >> fs/btrfs/scrub.c | 11 +++++++---- >> fs/btrfs/transaction.c | 13 ++++++++++--- >> 2 files changed, 17 insertions(+), 7 deletions(-) >> >> diff --git a/fs/btrfs/scrub.c b/fs/btrfs/scrub.c >> index 561e2f1..a9ed102 100644 >> --- a/fs/btrfs/scrub.c >> +++ b/fs/btrfs/scrub.c >> @@ -2887,6 +2887,7 @@ int btrfs_scrub_dev(struct btrfs_fs_info >> *fs_info, u64 devid, u64 start, >> } >> >> >> + mutex_lock(&fs_info->tree_log_mutex); >> mutex_lock(&fs_info->fs_devices->device_list_mutex); >> dev = btrfs_find_device(fs_info, devid, NULL, NULL); >> if (!dev || (dev->missing && !is_dev_replace)) { >> @@ -2932,14 +2933,16 @@ int btrfs_scrub_dev(struct btrfs_fs_info >> *fs_info, u64 devid, u64 start, >> atomic_inc(&fs_info->scrubs_running); >> mutex_unlock(&fs_info->scrub_lock); >> >> + /* >> + * holding tree_log_mutex we can avoid generation mismatch while >> + * scrubbing superblocks, see comments in commiting transaction >> + * when updating last_trans_commited. >> + */ >> if (!is_dev_replace) { >> - /* >> - * by holding device list mutex, we can >> - * kick off writing super in log tree sync. >> - */ >> ret = scrub_supers(sctx, dev); >> } >> mutex_unlock(&fs_info->fs_devices->device_list_mutex); >> + mutex_unlock(&fs_info->tree_log_mutex); >> >> if (!ret) >> ret = scrub_enumerate_chunks(sctx, dev, start, end, >> diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c >> index c6a872a..052eb22 100644 >> --- a/fs/btrfs/transaction.c >> +++ b/fs/btrfs/transaction.c >> @@ -1898,15 +1898,22 @@ int btrfs_commit_transaction(struct >> btrfs_trans_handle *trans, >> goto cleanup_transaction; >> } >> >> + btrfs_finish_extent_commit(trans, root); >> + >> + /* >> + * we must gurantee last_trans_commited update is protected by >> + * tree_log_mutex with write_ctree_super together, otherwise, >> + * scubbing super will come in before updating last_trans_commited >> + * and we will get generation mismatch when scrubbing superblocks. >> + */ >> + root->fs_info->last_trans_committed = cur_trans->transid; >> + >> /* >> * the super is written, we can safely allow the tree-loggers >> * to go about their business >> */ >> mutex_unlock(&root->fs_info->tree_log_mutex); >> >> - btrfs_finish_extent_commit(trans, root); >> - >> - root->fs_info->last_trans_committed = cur_trans->transid; >> /* >> * We needn''t acquire the lock here because there is no other task >> * which can change it. >> > > -- > To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html >-- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html