Patch 1-5 can make balance process bail out gracefully after we get into readonly by aborting transaction. Patch 6 addresses a hang when we have two or more trans handle aborted and committed themselves. Liu Bo (6): Btrfs: check for NULL pointer in updating reloc roots Btrfs: build up error handling for merge_reloc_roots Btrfs: free all recorded tree blocks on error Btrfs: do not BUG_ON in prepare_to_reloc Btrfs: do not BUG_ON on aborted situation Btrfs: avoid deadlock on transaction waiting list fs/btrfs/relocation.c | 74 +++++++++++++++++++++++++++++++++++++----------- fs/btrfs/transaction.c | 7 ++++ fs/btrfs/volumes.c | 9 ++++- 3 files changed, 71 insertions(+), 19 deletions(-) -- 1.7.7 -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Liu Bo
2013-Mar-04 16:25 UTC
[PATCH 1/6] Btrfs: check for NULL pointer in updating reloc roots
Add a check for NULL pointer to avoid invalid reference. Signed-off-by: Liu Bo <bo.li.liu@oracle.com> --- fs/btrfs/relocation.c | 2 ++ 1 files changed, 2 insertions(+), 0 deletions(-) diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c index 300e09a..d9be73b 100644 --- a/fs/btrfs/relocation.c +++ b/fs/btrfs/relocation.c @@ -1269,6 +1269,8 @@ static int __update_reloc_root(struct btrfs_root *root, int del) } spin_unlock(&rc->reloc_root_tree.lock); + if (!node) + return 0; BUG_ON((struct btrfs_root *)node->data != root); if (!del) { -- 1.7.7 -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Liu Bo
2013-Mar-04 16:25 UTC
[PATCH 2/6] Btrfs: build up error handling for merge_reloc_roots
We first use btrfs_std_error hook to replace with BUG_ON, and we also need to cleanup what is left, including reloc roots rbtree and reloc roots list. Here we use a helper function to cleanup both rbtree and list, and since this function can also be used in the balance recover path, we also make the change as well to keep code simple. Signed-off-by: Liu Bo <bo.li.liu@oracle.com> --- fs/btrfs/relocation.c | 47 +++++++++++++++++++++++++++++++++++------------ 1 files changed, 35 insertions(+), 12 deletions(-) diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c index d9be73b..8608afb 100644 --- a/fs/btrfs/relocation.c +++ b/fs/btrfs/relocation.c @@ -2240,13 +2240,28 @@ again: } static noinline_for_stack +void free_reloc_roots(struct list_head *list) +{ + struct btrfs_root *reloc_root; + + while (!list_empty(list)) { + reloc_root = list_entry(list->next, struct btrfs_root, + root_list); + __update_reloc_root(reloc_root, 1); + free_extent_buffer(reloc_root->node); + free_extent_buffer(reloc_root->commit_root); + kfree(reloc_root); + } +} + +static noinline_for_stack int merge_reloc_roots(struct reloc_control *rc) { struct btrfs_root *root; struct btrfs_root *reloc_root; LIST_HEAD(reloc_roots); int found = 0; - int ret; + int ret = 0; again: root = rc->extent_root; @@ -2272,20 +2287,33 @@ again: BUG_ON(root->reloc_root != reloc_root); ret = merge_reloc_root(rc, root); - BUG_ON(ret); + if (ret) + goto out; } else { list_del_init(&reloc_root->root_list); } ret = btrfs_drop_snapshot(reloc_root, rc->block_rsv, 0, 1); - BUG_ON(ret < 0); + if (ret < 0) { + if (list_empty(&reloc_root->root_list)) + list_add_tail(&reloc_root->root_list, + &reloc_roots); + goto out; + } } if (found) { found = 0; goto again; } +out: + if (ret) { + btrfs_std_error(root->fs_info, ret); + if (!list_empty(&reloc_roots)) + free_reloc_roots(&reloc_roots); + } + BUG_ON(!RB_EMPTY_ROOT(&rc->reloc_root_tree.rb_root)); - return 0; + return ret; } static void free_block_list(struct rb_root *blocks) @@ -4266,14 +4294,9 @@ int btrfs_recover_relocation(struct btrfs_root *root) out_free: kfree(rc); out: - while (!list_empty(&reloc_roots)) { - reloc_root = list_entry(reloc_roots.next, - struct btrfs_root, root_list); - list_del(&reloc_root->root_list); - free_extent_buffer(reloc_root->node); - free_extent_buffer(reloc_root->commit_root); - kfree(reloc_root); - } + if (!list_empty(&reloc_roots)) + free_reloc_roots(&reloc_roots); + btrfs_free_path(path); if (err == 0) { -- 1.7.7 -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
We''ve missed the ''free blocks'' part on ENOMEM error. Signed-off-by: Liu Bo <bo.li.liu@oracle.com> --- fs/btrfs/relocation.c | 9 ++++++--- 1 files changed, 6 insertions(+), 3 deletions(-) diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c index 8608afb..8cae3d7 100644 --- a/fs/btrfs/relocation.c +++ b/fs/btrfs/relocation.c @@ -2848,8 +2848,10 @@ int relocate_tree_blocks(struct btrfs_trans_handle *trans, int err = 0; path = btrfs_alloc_path(); - if (!path) - return -ENOMEM; + if (!path) { + err = -ENOMEM; + goto out_path; + } rb_node = rb_first(blocks); while (rb_node) { @@ -2888,10 +2890,11 @@ int relocate_tree_blocks(struct btrfs_trans_handle *trans, rb_node = rb_next(rb_node); } out: - free_block_list(blocks); err = finish_pending_nodes(trans, rc, path, err); btrfs_free_path(path); +out_path: + free_block_list(blocks); return err; } -- 1.7.7 -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
We can bail out from here gracefully instead of a cold BUG_ON. Signed-off-by: Liu Bo <bo.li.liu@oracle.com> --- fs/btrfs/relocation.c | 10 +++++++++- 1 files changed, 9 insertions(+), 1 deletions(-) diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c index 8cae3d7..c45c833 100644 --- a/fs/btrfs/relocation.c +++ b/fs/btrfs/relocation.c @@ -3731,7 +3731,15 @@ int prepare_to_relocate(struct reloc_control *rc) set_reloc_control(rc); trans = btrfs_join_transaction(rc->extent_root); - BUG_ON(IS_ERR(trans)); + if (IS_ERR(trans)) { + unset_reloc_control(rc); + /* + * extent tree is not a ref_cow tree and has no reloc_root to + * cleanup. And callers are responsible to free the above + * block rsv. + */ + return PTR_ERR(trans); + } btrfs_commit_transaction(trans, rc->extent_root); return 0; } -- 1.7.7 -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Btrfs balance can easily hit BUG_ON in these places, but we want to it bail out gracefully after we force the whole filesystem to readonly. So we use btrfs_std_error hook in place of BUG_ON. Signed-off-by: Liu Bo <bo.li.liu@oracle.com> --- fs/btrfs/relocation.c | 6 +++++- fs/btrfs/volumes.c | 9 +++++++-- 2 files changed, 12 insertions(+), 3 deletions(-) diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c index c45c833..9fd63e9 100644 --- a/fs/btrfs/relocation.c +++ b/fs/btrfs/relocation.c @@ -3771,7 +3771,11 @@ static noinline_for_stack int relocate_block_group(struct reloc_control *rc) while (1) { progress++; trans = btrfs_start_transaction(rc->extent_root, 0); - BUG_ON(IS_ERR(trans)); + if (IS_ERR(trans)) { + err = PTR_ERR(trans); + trans = NULL; + break; + } restart: if (update_backref_cache(trans, &rc->backref_cache)) { btrfs_end_transaction(trans, rc->extent_root); diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c index 5349e17..0a1559f 100644 --- a/fs/btrfs/volumes.c +++ b/fs/btrfs/volumes.c @@ -2357,7 +2357,11 @@ static int btrfs_relocate_chunk(struct btrfs_root *root, return ret; trans = btrfs_start_transaction(root, 0); - BUG_ON(IS_ERR(trans)); + if (IS_ERR(trans)) { + ret = PTR_ERR(trans); + btrfs_std_error(root->fs_info, ret); + return ret; + } lock_chunks(root); @@ -3024,7 +3028,8 @@ static void __cancel_balance(struct btrfs_fs_info *fs_info) unset_balance_control(fs_info); ret = del_balance_item(fs_info->tree_root); - BUG_ON(ret); + if (ret) + btrfs_std_error(fs_info, ret); atomic_set(&fs_info->mutually_exclusive_operation_running, 0); } -- 1.7.7 -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Liu Bo
2013-Mar-04 16:25 UTC
[PATCH 6/6] Btrfs: avoid deadlock on transaction waiting list
Only let one trans handle to wait for other handles, otherwise we will get ABBA issues. Signed-off-by: Liu Bo <bo.li.liu@oracle.com> --- fs/btrfs/transaction.c | 7 +++++++ 1 files changed, 7 insertions(+), 0 deletions(-) diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c index 4caa1fa..7cc38ed 100644 --- a/fs/btrfs/transaction.c +++ b/fs/btrfs/transaction.c @@ -1394,6 +1394,13 @@ static void cleanup_transaction(struct btrfs_trans_handle *trans, btrfs_abort_transaction(trans, root, err); spin_lock(&root->fs_info->trans_lock); + + if (list_empty(&cur_trans->list)) { + spin_unlock(&root->fs_info->trans_lock); + btrfs_end_transaction(trans, root); + return; + } + list_del_init(&cur_trans->list); if (cur_trans == root->fs_info->running_transaction) { root->fs_info->trans_no_join = 1; -- 1.7.7 -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
David Sterba
2013-Mar-18 14:16 UTC
Re: [PATCH 2/6] Btrfs: build up error handling for merge_reloc_roots
On Tue, Mar 05, 2013 at 12:25:37AM +0800, Liu Bo wrote:> We first use btrfs_std_error hook to replace with BUG_ON, and we > also need to cleanup what is left, including reloc roots rbtree > and reloc roots list. > Here we use a helper function to cleanup both rbtree and list, and > since this function can also be used in the balance recover path, > we also make the change as well to keep code simple.I''ve noticed that return value from merge_reloc_roots is never checked in the callers. Did you verify that this is ok? For example when called from btrfs_recover_relocation, is it possible that the reloc-to-be-recovered is still left unprocessed but the filesystem is going to be silently mounted read-write? The mount and remount callpaths check for recover_relocation errors and do not proceed otherwise. In RO mount, it''s not called. So, either merge_reloc_roots callers should catch the errors or there are none by design (ie. the whole reloc operation is restartable). david -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Liu Bo
2013-Mar-19 12:16 UTC
Re: [PATCH 2/6] Btrfs: build up error handling for merge_reloc_roots
On Mon, Mar 18, 2013 at 03:16:12PM +0100, David Sterba wrote:> On Tue, Mar 05, 2013 at 12:25:37AM +0800, Liu Bo wrote: > > We first use btrfs_std_error hook to replace with BUG_ON, and we > > also need to cleanup what is left, including reloc roots rbtree > > and reloc roots list. > > Here we use a helper function to cleanup both rbtree and list, and > > since this function can also be used in the balance recover path, > > we also make the change as well to keep code simple. > > I''ve noticed that return value from merge_reloc_roots is never checked > in the callers. Did you verify that this is ok?Yeah, it''s fine. Actually we set fs to RO once we get error here, as we have recorded a balance item and , balance can start over again the next time.> > For example when called from btrfs_recover_relocation, is it possible > that the reloc-to-be-recovered is still left unprocessed but the > filesystem is going to be silently mounted read-write?hmm, I''m not able to picture such a case...> > The mount and remount callpaths check for recover_relocation errors and > do not proceed otherwise. In RO mount, it''s not called. > > So, either merge_reloc_roots callers should catch the errors or there > are none by design (ie. the whole reloc operation is restartable). >it''s desigend to be always ready for a crash or a power off. thanks, liubo -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
David Sterba
2013-Mar-28 14:15 UTC
Re: [PATCH 2/6] Btrfs: build up error handling for merge_reloc_roots
On Tue, Mar 19, 2013 at 08:16:41PM +0800, Liu Bo wrote:> On Mon, Mar 18, 2013 at 03:16:12PM +0100, David Sterba wrote: > > I''ve noticed that return value from merge_reloc_roots is never checked > > in the callers. Did you verify that this is ok? > > Yeah, it''s fine.Then it''s ok to change return value to ''void'' so it does not look like an unhandled errorcode.> Actually we set fs to RO once we get error here, as we have recorded a balance > item and , balance can start over again the next time.Yeah, if we get there via transaction abort. My background motivation is to implement a (much) more responsive balance cancel. This is different from the poweroff "cancel", because it does not keep any in-memory state. david -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html