Hi Chris, I''ve got 5 small fixes for the tree mod log and a comment fix, here. Each of them is making the overall situation a little better. I''m not sure whether these also fix the problems liubo was seeing, we''re still sorting that out. So there''s probably one more fix to come, but if so, I don''t expect anything huge (hrhr). You can also pull them from the usual location: git://git.jan-o-sch.net/btrfs-unstable for-chris Thanks, -Jan Jan Schmidt (6): Btrfs: don''t put removals from push_node_left into tree mod log twice Btrfs: fix a tree mod logging issue for root replacement operations Btrfs: tree mod log''s old roots could still be part of the tree Btrfs: determine level of old roots Btrfs: fix extent buffer reference for tree mod log roots Btrfs: comment for loop in tree_mod_log_insert_move fs/btrfs/backref.c | 4 +-- fs/btrfs/ctree.c | 59 ++++++++++++++++++++++++++++++++++++++++----------- fs/btrfs/ctree.h | 1 + 3 files changed, 48 insertions(+), 16 deletions(-) -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Jan Schmidt
2012-Oct-23 13:55 UTC
[PATCH 1/6] Btrfs: don''t put removals from push_node_left into tree mod log twice
Independant of the check (push_items < src_items) tree_mod_log_eb_copy did log the removal of the old data entries from the source buffer. Therefore, we must not call tree_mod_log_eb_move if the check evaluates to true, as that would log the removal twice, finally resulting in (rewinded) buffers with wrong values for header_nritems. Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net> --- fs/btrfs/ctree.c | 9 +++++++-- 1 files changed, 7 insertions(+), 2 deletions(-) diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c index b334362..44a7e25 100644 --- a/fs/btrfs/ctree.c +++ b/fs/btrfs/ctree.c @@ -1225,6 +1225,8 @@ tree_mod_log_rewind(struct btrfs_fs_info *fs_info, struct extent_buffer *eb, free_extent_buffer(eb); __tree_mod_log_rewind(eb_rewin, time_seq, tm); + WARN_ON(btrfs_header_nritems(eb_rewin) > + BTRFS_NODEPTRS_PER_BLOCK(fs_info->fs_root)); return eb_rewin; } @@ -1280,6 +1282,7 @@ get_old_root(struct btrfs_root *root, u64 time_seq) else WARN_ON(btrfs_header_level(eb) != 0); extent_buffer_get(eb); + WARN_ON(btrfs_header_nritems(eb) > BTRFS_NODEPTRS_PER_BLOCK(root)); return eb; } @@ -2970,8 +2973,10 @@ static int push_node_left(struct btrfs_trans_handle *trans, push_items * sizeof(struct btrfs_key_ptr)); if (push_items < src_nritems) { - tree_mod_log_eb_move(root->fs_info, src, 0, push_items, - src_nritems - push_items); + /* + * don''t call tree_mod_log_eb_move here, key removal was already + * fully logged by tree_mod_log_eb_copy above. + */ memmove_extent_buffer(src, btrfs_node_key_ptr_offset(0), btrfs_node_key_ptr_offset(push_items), (src_nritems - push_items) * -- 1.7.1 -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Jan Schmidt
2012-Oct-23 13:55 UTC
[PATCH 2/6] Btrfs: fix a tree mod logging issue for root replacement operations
Avoid the implicit free by tree_mod_log_set_root_pointer, which is wrong in two places. Where needed, we call tree_mod_log_free_eb explicitly now. Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net> --- fs/btrfs/ctree.c | 10 ++-------- 1 files changed, 2 insertions(+), 8 deletions(-) diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c index 44a7e25..e6b75cc 100644 --- a/fs/btrfs/ctree.c +++ b/fs/btrfs/ctree.c @@ -647,8 +647,6 @@ tree_mod_log_insert_root(struct btrfs_fs_info *fs_info, if (tree_mod_dont_log(fs_info, NULL)) return 0; - __tree_mod_log_free_eb(fs_info, old_root); - ret = tree_mod_alloc(fs_info, flags, &tm); if (ret < 0) goto out; @@ -926,12 +924,7 @@ static noinline int update_ref_for_cow(struct btrfs_trans_handle *trans, ret = btrfs_dec_ref(trans, root, buf, 1, 1); BUG_ON(ret); /* -ENOMEM */ } - /* - * don''t log freeing in case we''re freeing the root node, this - * is done by tree_mod_log_set_root_pointer later - */ - if (buf != root->node && btrfs_header_level(buf) != 0) - tree_mod_log_free_eb(root->fs_info, buf); + tree_mod_log_free_eb(root->fs_info, buf); clean_tree_block(trans, root, buf); *last_ref = 1; } @@ -1728,6 +1721,7 @@ static noinline int balance_level(struct btrfs_trans_handle *trans, goto enospc; } + tree_mod_log_free_eb(root->fs_info, root->node); tree_mod_log_set_root_pointer(root, child); rcu_assign_pointer(root->node, child); -- 1.7.1 -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Jan Schmidt
2012-Oct-23 13:55 UTC
[PATCH 3/6] Btrfs: tree mod log''s old roots could still be part of the tree
Tree mod log treated old root buffers as always empty buffers when starting the rewind operations. However, the old root may still be part of the current tree at a lower level, with still some valid entries. Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net> --- fs/btrfs/ctree.c | 16 ++++++++++++++-- 1 files changed, 14 insertions(+), 2 deletions(-) diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c index e6b75cc..1b60c56 100644 --- a/fs/btrfs/ctree.c +++ b/fs/btrfs/ctree.c @@ -1239,6 +1239,7 @@ get_old_root(struct btrfs_root *root, u64 time_seq) struct tree_mod_root *old_root = NULL; u64 old_generation = 0; u64 logical; + u32 blocksize; eb = btrfs_read_lock_root_node(root); tm = __tree_mod_log_oldest_root(root->fs_info, root, time_seq); @@ -1254,10 +1255,21 @@ get_old_root(struct btrfs_root *root, u64 time_seq) } tm = tree_mod_log_search(root->fs_info, logical, time_seq); - if (old_root) + if (old_root && tm && tm->op != MOD_LOG_KEY_REMOVE_WHILE_FREEING) { + blocksize = btrfs_level_size(root, old_root->level); + eb = read_tree_block(root, logical, blocksize, 0); + if (!eb) { + pr_warn("btrfs: failed to read tree block %llu from get_old_root\n", + logical); + WARN_ON(1); + } else { + eb = btrfs_clone_extent_buffer(eb); + } + } else if (old_root) { eb = alloc_dummy_extent_buffer(logical, root->nodesize); - else + } else { eb = btrfs_clone_extent_buffer(root->node); + } btrfs_tree_read_unlock(root->node); free_extent_buffer(root->node); if (!eb) -- 1.7.1 -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
In btrfs_find_all_roots'' termination condition, we compare the level of the old buffer we got from btrfs_search_old_slot to the level of the current root node. We''d better compare it to the level of the rewinded root node. Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net> --- fs/btrfs/backref.c | 4 +--- fs/btrfs/ctree.c | 17 +++++++++++++++++ fs/btrfs/ctree.h | 1 + 3 files changed, 19 insertions(+), 3 deletions(-) diff --git a/fs/btrfs/backref.c b/fs/btrfs/backref.c index f318793..65608fb 100644 --- a/fs/btrfs/backref.c +++ b/fs/btrfs/backref.c @@ -283,9 +283,7 @@ static int __resolve_indirect_ref(struct btrfs_fs_info *fs_info, goto out; } - rcu_read_lock(); - root_level = btrfs_header_level(root->node); - rcu_read_unlock(); + root_level = btrfs_old_root_level(root, time_seq); if (root_level + 1 == level) goto out; diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c index 1b60c56..7093ff5 100644 --- a/fs/btrfs/ctree.c +++ b/fs/btrfs/ctree.c @@ -1292,6 +1292,23 @@ get_old_root(struct btrfs_root *root, u64 time_seq) return eb; } +int btrfs_old_root_level(struct btrfs_root *root, u64 time_seq) +{ + struct tree_mod_elem *tm; + int level; + + tm = __tree_mod_log_oldest_root(root->fs_info, root, time_seq); + if (tm && tm->op == MOD_LOG_ROOT_REPLACE) { + level = tm->old_root.level; + } else { + rcu_read_lock(); + level = btrfs_header_level(root->node); + rcu_read_unlock(); + } + + return level; +} + static inline int should_cow_block(struct btrfs_trans_handle *trans, struct btrfs_root *root, struct extent_buffer *buf) diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h index 1630be8..34c5a44 100644 --- a/fs/btrfs/ctree.h +++ b/fs/btrfs/ctree.h @@ -3120,6 +3120,7 @@ static inline u64 btrfs_inc_tree_mod_seq(struct btrfs_fs_info *fs_info) { return atomic_inc_return(&fs_info->tree_mod_seq); } +int btrfs_old_root_level(struct btrfs_root *root, u64 time_seq); /* root-item.c */ int btrfs_find_root_ref(struct btrfs_root *tree_root, -- 1.7.1 -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Jan Schmidt
2012-Oct-23 13:55 UTC
[PATCH 5/6] Btrfs: fix extent buffer reference for tree mod log roots
In get_old_root we grab a lock on the extent buffer before we obtain a reference on that buffer. That order is changed now. Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net> --- fs/btrfs/ctree.c | 2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c index 7093ff5..bed06e8 100644 --- a/fs/btrfs/ctree.c +++ b/fs/btrfs/ctree.c @@ -1274,6 +1274,7 @@ get_old_root(struct btrfs_root *root, u64 time_seq) free_extent_buffer(root->node); if (!eb) return NULL; + extent_buffer_get(eb); btrfs_tree_read_lock(eb); if (old_root) { btrfs_set_header_bytenr(eb, eb->start); @@ -1286,7 +1287,6 @@ get_old_root(struct btrfs_root *root, u64 time_seq) __tree_mod_log_rewind(eb, time_seq, tm); else WARN_ON(btrfs_header_level(eb) != 0); - extent_buffer_get(eb); WARN_ON(btrfs_header_nritems(eb) > BTRFS_NODEPTRS_PER_BLOCK(root)); return eb; -- 1.7.1 -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Jan Schmidt
2012-Oct-23 13:55 UTC
[PATCH 6/6] Btrfs: comment for loop in tree_mod_log_insert_move
Emphasis the way tree_mod_log_insert_move avoids adding MOD_LOG_KEY_REMOVE_WHILE_MOVING operations, depending on the direction of the move operation. Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net> --- fs/btrfs/ctree.c | 5 +++++ 1 files changed, 5 insertions(+), 0 deletions(-) diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c index bed06e8..c608b3c 100644 --- a/fs/btrfs/ctree.c +++ b/fs/btrfs/ctree.c @@ -596,6 +596,11 @@ tree_mod_log_insert_move(struct btrfs_fs_info *fs_info, if (tree_mod_dont_log(fs_info, eb)) return 0; + /* + * When we override something during the move, we log these removals. + * This can only happen when we move towards the beginning of the + * buffer, i.e. dst_slot < src_slot. + */ for (i = 0; i + dst_slot < src_slot && i < nr_items; i++) { ret = tree_mod_log_insert_key_locked(fs_info, eb, i + dst_slot, MOD_LOG_KEY_REMOVE_WHILE_MOVING); -- 1.7.1 -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Liu Bo
2012-Oct-23 15:23 UTC
Re: [PATCH 1/6] Btrfs: don''t put removals from push_node_left into tree mod log twice
On 10/23/2012 09:55 PM, Jan Schmidt wrote:> Independant of the check (push_items < src_items) tree_mod_log_eb_copy did > log the removal of the old data entries from the source buffer. Therefore, > we must not call tree_mod_log_eb_move if the check evaluates to true, as > that would log the removal twice, finally resulting in (rewinded) buffers > with wrong values for header_nritems. >Reviewed-by: Liu Bo <bo.li.liu@oracle.com> Tested-by: Liu Bo <bo.li.liu@oracle.com> thanks, liubo> Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net> > --- > fs/btrfs/ctree.c | 9 +++++++-- > 1 files changed, 7 insertions(+), 2 deletions(-) > > diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c > index b334362..44a7e25 100644 > --- a/fs/btrfs/ctree.c > +++ b/fs/btrfs/ctree.c > @@ -1225,6 +1225,8 @@ tree_mod_log_rewind(struct btrfs_fs_info *fs_info, struct extent_buffer *eb, > free_extent_buffer(eb); > > __tree_mod_log_rewind(eb_rewin, time_seq, tm); > + WARN_ON(btrfs_header_nritems(eb_rewin) > > + BTRFS_NODEPTRS_PER_BLOCK(fs_info->fs_root)); > > return eb_rewin; > } > @@ -1280,6 +1282,7 @@ get_old_root(struct btrfs_root *root, u64 time_seq) > else > WARN_ON(btrfs_header_level(eb) != 0); > extent_buffer_get(eb); > + WARN_ON(btrfs_header_nritems(eb) > BTRFS_NODEPTRS_PER_BLOCK(root)); > > return eb; > } > @@ -2970,8 +2973,10 @@ static int push_node_left(struct btrfs_trans_handle *trans, > push_items * sizeof(struct btrfs_key_ptr)); > > if (push_items < src_nritems) { > - tree_mod_log_eb_move(root->fs_info, src, 0, push_items, > - src_nritems - push_items); > + /* > + * don''t call tree_mod_log_eb_move here, key removal was already > + * fully logged by tree_mod_log_eb_copy above. > + */ > memmove_extent_buffer(src, btrfs_node_key_ptr_offset(0), > btrfs_node_key_ptr_offset(push_items), > (src_nritems - push_items) * >-- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Liu Bo
2012-Oct-23 15:28 UTC
Re: [PATCH 2/6] Btrfs: fix a tree mod logging issue for root replacement operations
On 10/23/2012 09:55 PM, Jan Schmidt wrote:> Avoid the implicit free by tree_mod_log_set_root_pointer, which is wrong in > two places. Where needed, we call tree_mod_log_free_eb explicitly now. > > Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net> > --- > fs/btrfs/ctree.c | 10 ++-------- > 1 files changed, 2 insertions(+), 8 deletions(-) > > diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c > index 44a7e25..e6b75cc 100644 > --- a/fs/btrfs/ctree.c > +++ b/fs/btrfs/ctree.c > @@ -647,8 +647,6 @@ tree_mod_log_insert_root(struct btrfs_fs_info *fs_info, > if (tree_mod_dont_log(fs_info, NULL)) > return 0; > > - __tree_mod_log_free_eb(fs_info, old_root); > - > ret = tree_mod_alloc(fs_info, flags, &tm); > if (ret < 0) > goto out; > @@ -926,12 +924,7 @@ static noinline int update_ref_for_cow(struct btrfs_trans_handle *trans, > ret = btrfs_dec_ref(trans, root, buf, 1, 1); > BUG_ON(ret); /* -ENOMEM */ > } > - /* > - * don''t log freeing in case we''re freeing the root node, this > - * is done by tree_mod_log_set_root_pointer later > - */ > - if (buf != root->node && btrfs_header_level(buf) != 0) > - tree_mod_log_free_eb(root->fs_info, buf); > + tree_mod_log_free_eb(root->fs_info, buf);Since we''ve owned a check for if eb''s level is 0 in tree_mod_dont_log(), we can also get rid of these level checks in other places, can''t we?> clean_tree_block(trans, root, buf); > *last_ref = 1; > } > @@ -1728,6 +1721,7 @@ static noinline int balance_level(struct btrfs_trans_handle *trans, > goto enospc; > } > > + tree_mod_log_free_eb(root->fs_info, root->node); > tree_mod_log_set_root_pointer(root, child); > rcu_assign_pointer(root->node, child); > >Reviewed-by: Liu Bo <bo.li.liu@oracle.com> Tested-by: Liu Bo <bo.li.liu@oracle.com> thanks, liubo -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Liu Bo
2012-Oct-23 15:30 UTC
Re: [PATCH 3/6] Btrfs: tree mod log''s old roots could still be part of the tree
On 10/23/2012 09:55 PM, Jan Schmidt wrote:> Tree mod log treated old root buffers as always empty buffers when starting > the rewind operations. However, the old root may still be part of the > current tree at a lower level, with still some valid entries. >Reviewed-by: Liu Bo <bo.li.liu@oracle.com> Tested-by: Liu Bo <bo.li.liu@oracle.com> thanks, liubo> Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net> > --- > fs/btrfs/ctree.c | 16 ++++++++++++++-- > 1 files changed, 14 insertions(+), 2 deletions(-) > > diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c > index e6b75cc..1b60c56 100644 > --- a/fs/btrfs/ctree.c > +++ b/fs/btrfs/ctree.c > @@ -1239,6 +1239,7 @@ get_old_root(struct btrfs_root *root, u64 time_seq) > struct tree_mod_root *old_root = NULL; > u64 old_generation = 0; > u64 logical; > + u32 blocksize; > > eb = btrfs_read_lock_root_node(root); > tm = __tree_mod_log_oldest_root(root->fs_info, root, time_seq); > @@ -1254,10 +1255,21 @@ get_old_root(struct btrfs_root *root, u64 time_seq) > } > > tm = tree_mod_log_search(root->fs_info, logical, time_seq); > - if (old_root) > + if (old_root && tm && tm->op != MOD_LOG_KEY_REMOVE_WHILE_FREEING) { > + blocksize = btrfs_level_size(root, old_root->level); > + eb = read_tree_block(root, logical, blocksize, 0); > + if (!eb) { > + pr_warn("btrfs: failed to read tree block %llu from get_old_root\n", > + logical); > + WARN_ON(1); > + } else { > + eb = btrfs_clone_extent_buffer(eb); > + } > + } else if (old_root) { > eb = alloc_dummy_extent_buffer(logical, root->nodesize); > - else > + } else { > eb = btrfs_clone_extent_buffer(root->node); > + } > btrfs_tree_read_unlock(root->node); > free_extent_buffer(root->node); > if (!eb) >-- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 10/23/2012 09:55 PM, Jan Schmidt wrote:> In btrfs_find_all_roots'' termination condition, we compare the level of the > old buffer we got from btrfs_search_old_slot to the level of the current > root node. We''d better compare it to the level of the rewinded root node. >Reviewed-by: Liu Bo <bo.li.liu@oracle.com> Tested-by: Liu Bo <bo.li.liu@oracle.com> thanks, liubo> Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net> > --- > fs/btrfs/backref.c | 4 +--- > fs/btrfs/ctree.c | 17 +++++++++++++++++ > fs/btrfs/ctree.h | 1 + > 3 files changed, 19 insertions(+), 3 deletions(-) > > diff --git a/fs/btrfs/backref.c b/fs/btrfs/backref.c > index f318793..65608fb 100644 > --- a/fs/btrfs/backref.c > +++ b/fs/btrfs/backref.c > @@ -283,9 +283,7 @@ static int __resolve_indirect_ref(struct btrfs_fs_info *fs_info, > goto out; > } > > - rcu_read_lock(); > - root_level = btrfs_header_level(root->node); > - rcu_read_unlock(); > + root_level = btrfs_old_root_level(root, time_seq); > > if (root_level + 1 == level) > goto out; > diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c > index 1b60c56..7093ff5 100644 > --- a/fs/btrfs/ctree.c > +++ b/fs/btrfs/ctree.c > @@ -1292,6 +1292,23 @@ get_old_root(struct btrfs_root *root, u64 time_seq) > return eb; > } > > +int btrfs_old_root_level(struct btrfs_root *root, u64 time_seq) > +{ > + struct tree_mod_elem *tm; > + int level; > + > + tm = __tree_mod_log_oldest_root(root->fs_info, root, time_seq); > + if (tm && tm->op == MOD_LOG_ROOT_REPLACE) { > + level = tm->old_root.level; > + } else { > + rcu_read_lock(); > + level = btrfs_header_level(root->node); > + rcu_read_unlock(); > + } > + > + return level; > +} > + > static inline int should_cow_block(struct btrfs_trans_handle *trans, > struct btrfs_root *root, > struct extent_buffer *buf) > diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h > index 1630be8..34c5a44 100644 > --- a/fs/btrfs/ctree.h > +++ b/fs/btrfs/ctree.h > @@ -3120,6 +3120,7 @@ static inline u64 btrfs_inc_tree_mod_seq(struct btrfs_fs_info *fs_info) > { > return atomic_inc_return(&fs_info->tree_mod_seq); > } > +int btrfs_old_root_level(struct btrfs_root *root, u64 time_seq); > > /* root-item.c */ > int btrfs_find_root_ref(struct btrfs_root *tree_root, >-- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Liu Bo
2012-Oct-23 15:32 UTC
Re: [PATCH 5/6] Btrfs: fix extent buffer reference for tree mod log roots
On 10/23/2012 09:55 PM, Jan Schmidt wrote:> In get_old_root we grab a lock on the extent buffer before we obtain a > reference on that buffer. That order is changed now. >Reviewed-by: Liu Bo <bo.li.liu@oracle.com> thanks, liubo> Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net> > --- > fs/btrfs/ctree.c | 2 +- > 1 files changed, 1 insertions(+), 1 deletions(-) > > diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c > index 7093ff5..bed06e8 100644 > --- a/fs/btrfs/ctree.c > +++ b/fs/btrfs/ctree.c > @@ -1274,6 +1274,7 @@ get_old_root(struct btrfs_root *root, u64 time_seq) > free_extent_buffer(root->node); > if (!eb) > return NULL; > + extent_buffer_get(eb); > btrfs_tree_read_lock(eb); > if (old_root) { > btrfs_set_header_bytenr(eb, eb->start); > @@ -1286,7 +1287,6 @@ get_old_root(struct btrfs_root *root, u64 time_seq) > __tree_mod_log_rewind(eb, time_seq, tm); > else > WARN_ON(btrfs_header_level(eb) != 0); > - extent_buffer_get(eb); > WARN_ON(btrfs_header_nritems(eb) > BTRFS_NODEPTRS_PER_BLOCK(root)); > > return eb; >-- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Liu Bo
2012-Oct-23 15:32 UTC
Re: [PATCH 6/6] Btrfs: comment for loop in tree_mod_log_insert_move
On 10/23/2012 09:55 PM, Jan Schmidt wrote:> Emphasis the way tree_mod_log_insert_move avoids adding > MOD_LOG_KEY_REMOVE_WHILE_MOVING operations, depending on the direction of > the move operation. >Good work. Reviewed-by: Liu Bo <bo.li.liu@oracle.com> thanks, liubo> Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net> > --- > fs/btrfs/ctree.c | 5 +++++ > 1 files changed, 5 insertions(+), 0 deletions(-) > > diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c > index bed06e8..c608b3c 100644 > --- a/fs/btrfs/ctree.c > +++ b/fs/btrfs/ctree.c > @@ -596,6 +596,11 @@ tree_mod_log_insert_move(struct btrfs_fs_info *fs_info, > if (tree_mod_dont_log(fs_info, eb)) > return 0; > > + /* > + * When we override something during the move, we log these removals. > + * This can only happen when we move towards the beginning of the > + * buffer, i.e. dst_slot < src_slot. > + */ > for (i = 0; i + dst_slot < src_slot && i < nr_items; i++) { > ret = tree_mod_log_insert_key_locked(fs_info, eb, i + dst_slot, > MOD_LOG_KEY_REMOVE_WHILE_MOVING); >-- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Jan Schmidt
2012-Oct-23 15:51 UTC
Re: [PATCH 2/6] Btrfs: fix a tree mod logging issue for root replacement operations
On Tue, October 23, 2012 at 17:28 (+0200), Liu Bo wrote:> On 10/23/2012 09:55 PM, Jan Schmidt wrote: >> Avoid the implicit free by tree_mod_log_set_root_pointer, which is wrong in >> two places. Where needed, we call tree_mod_log_free_eb explicitly now. >> >> Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net> >> --- >> fs/btrfs/ctree.c | 10 ++-------- >> 1 files changed, 2 insertions(+), 8 deletions(-) >> >> diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c >> index 44a7e25..e6b75cc 100644 >> --- a/fs/btrfs/ctree.c >> +++ b/fs/btrfs/ctree.c >> @@ -647,8 +647,6 @@ tree_mod_log_insert_root(struct btrfs_fs_info *fs_info, >> if (tree_mod_dont_log(fs_info, NULL)) >> return 0; >> >> - __tree_mod_log_free_eb(fs_info, old_root); >> - >> ret = tree_mod_alloc(fs_info, flags, &tm); >> if (ret < 0) >> goto out; >> @@ -926,12 +924,7 @@ static noinline int update_ref_for_cow(struct btrfs_trans_handle *trans, >> ret = btrfs_dec_ref(trans, root, buf, 1, 1); >> BUG_ON(ret); /* -ENOMEM */ >> } >> - /* >> - * don''t log freeing in case we''re freeing the root node, this >> - * is done by tree_mod_log_set_root_pointer later >> - */ >> - if (buf != root->node && btrfs_header_level(buf) != 0) >> - tree_mod_log_free_eb(root->fs_info, buf); >> + tree_mod_log_free_eb(root->fs_info, buf); > > Since we''ve owned a check for if eb''s level is 0 in tree_mod_dont_log(), > we can also get rid of these level checks in other places, can''t we?We can, yes. There may be hot paths where its worth an extra check to save the overhead. That would require looking closer at the individual sites. Here it definitely wasn''t worth any extra check, so I dropped it en passant. Thanks for reviewing and testing! -Jan>> clean_tree_block(trans, root, buf); >> *last_ref = 1; >> } >> @@ -1728,6 +1721,7 @@ static noinline int balance_level(struct btrfs_trans_handle *trans, >> goto enospc; >> } >> >> + tree_mod_log_free_eb(root->fs_info, root->node); >> tree_mod_log_set_root_pointer(root, child); >> rcu_assign_pointer(root->node, child); >> >> > > Reviewed-by: Liu Bo <bo.li.liu@oracle.com> > Tested-by: Liu Bo <bo.li.liu@oracle.com> > > thanks, > liubo-- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Liu Bo
2012-Oct-24 09:55 UTC
Re: [PATCH 3/6] Btrfs: tree mod log''s old roots could still be part of the tree
On 10/23/2012 09:55 PM, Jan Schmidt wrote:> Tree mod log treated old root buffers as always empty buffers when starting > the rewind operations. However, the old root may still be part of the > current tree at a lower level, with still some valid entries. > > Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net> > --- > fs/btrfs/ctree.c | 16 ++++++++++++++-- > 1 files changed, 14 insertions(+), 2 deletions(-) > > diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c > index e6b75cc..1b60c56 100644 > --- a/fs/btrfs/ctree.c > +++ b/fs/btrfs/ctree.c > @@ -1239,6 +1239,7 @@ get_old_root(struct btrfs_root *root, u64 time_seq) > struct tree_mod_root *old_root = NULL; > u64 old_generation = 0; > u64 logical; > + u32 blocksize; > > eb = btrfs_read_lock_root_node(root); > tm = __tree_mod_log_oldest_root(root->fs_info, root, time_seq); > @@ -1254,10 +1255,21 @@ get_old_root(struct btrfs_root *root, u64 time_seq) > } > > tm = tree_mod_log_search(root->fs_info, logical, time_seq); > - if (old_root) > + if (old_root && tm && tm->op != MOD_LOG_KEY_REMOVE_WHILE_FREEING) { > + blocksize = btrfs_level_size(root, old_root->level); > + eb = read_tree_block(root, logical, blocksize, 0);I found that read_tree_block() may use GFP_NOFS if we need to create an eb, and this may block read_lock(). It''s not always blocked but it did block, so we''d better give an atomic version of this. thanks, liubo> + if (!eb) { > + pr_warn("btrfs: failed to read tree block %llu from get_old_root\n", > + logical); > + WARN_ON(1); > + } else { > + eb = btrfs_clone_extent_buffer(eb); > + } > + } else if (old_root) { > eb = alloc_dummy_extent_buffer(logical, root->nodesize); > - else > + } else { > eb = btrfs_clone_extent_buffer(root->node); > + } > btrfs_tree_read_unlock(root->node); > free_extent_buffer(root->node); > if (!eb) >-- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Jan Schmidt
2012-Oct-24 10:45 UTC
[PATCH] Btrfs: drop locks before calling read_tree_block from get_old_root
get_old_root needs its lock on the root node only to clone that buffer. However, when we call read_tree_block, we know already that we''re not going to clone the root node. Thus we can safely unlock before, to avoid issues where read_tree_block makes potentially sleeping allocations. Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net> --- Thanks for pointing that out, liubo! I added this additional patch on top of my patch staple at: git://git.jan-o-sch.net/btrfs-unstable for-chris I also made a second branch, because this last commit actually fixes that''s introduced by one of the patches in the patch set. So I folded it into the respective commit: git://git.jan-o-sch.net/btrfs-unstable for-chris-fixed I made sure that the resulting code is the same. Chris: please pull either one for rc3. -Jan --- fs/btrfs/ctree.c | 9 +++++++-- 1 files changed, 7 insertions(+), 2 deletions(-) diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c index c608b3c..eba44b0 100644 --- a/fs/btrfs/ctree.c +++ b/fs/btrfs/ctree.c @@ -1261,6 +1261,8 @@ get_old_root(struct btrfs_root *root, u64 time_seq) tm = tree_mod_log_search(root->fs_info, logical, time_seq); if (old_root && tm && tm->op != MOD_LOG_KEY_REMOVE_WHILE_FREEING) { + btrfs_tree_read_unlock(root->node); + free_extent_buffer(root->node); blocksize = btrfs_level_size(root, old_root->level); eb = read_tree_block(root, logical, blocksize, 0); if (!eb) { @@ -1271,12 +1273,15 @@ get_old_root(struct btrfs_root *root, u64 time_seq) eb = btrfs_clone_extent_buffer(eb); } } else if (old_root) { + btrfs_tree_read_unlock(root->node); + free_extent_buffer(root->node); eb = alloc_dummy_extent_buffer(logical, root->nodesize); } else { eb = btrfs_clone_extent_buffer(root->node); + btrfs_tree_read_unlock(root->node); + free_extent_buffer(root->node); } - btrfs_tree_read_unlock(root->node); - free_extent_buffer(root->node); + if (!eb) return NULL; extent_buffer_get(eb); -- 1.7.1 -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Liu Bo
2012-Oct-24 14:18 UTC
Re: [PATCH] Btrfs: drop locks before calling read_tree_block from get_old_root
On 10/24/2012 06:45 PM, Jan Schmidt wrote:> get_old_root needs its lock on the root node only to clone that buffer. > However, when we call read_tree_block, we know already that we''re not > going to clone the root node. Thus we can safely unlock before, to avoid > issues where read_tree_block makes potentially sleeping allocations. >I tested this for 20 times, it''s no more blocked. So you can add Reviewed-by: Liu Bo <bo.li.liu@oracle.com> Tested-by: Liu Bo <bo.li.liu@oracle.com> thanks, liubo> Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net> > --- > Thanks for pointing that out, liubo! > > I added this additional patch on top of my patch staple at: > > git://git.jan-o-sch.net/btrfs-unstable for-chris > > I also made a second branch, because this last commit actually fixes > that''s introduced by one of the patches in the patch set. So I folded > it into the respective commit: > > git://git.jan-o-sch.net/btrfs-unstable for-chris-fixed > > I made sure that the resulting code is the same. > > Chris: please pull either one for rc3. > > -Jan > > --- > fs/btrfs/ctree.c | 9 +++++++-- > 1 files changed, 7 insertions(+), 2 deletions(-) > > diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c > index c608b3c..eba44b0 100644 > --- a/fs/btrfs/ctree.c > +++ b/fs/btrfs/ctree.c > @@ -1261,6 +1261,8 @@ get_old_root(struct btrfs_root *root, u64 time_seq) > > tm = tree_mod_log_search(root->fs_info, logical, time_seq); > if (old_root && tm && tm->op != MOD_LOG_KEY_REMOVE_WHILE_FREEING) { > + btrfs_tree_read_unlock(root->node); > + free_extent_buffer(root->node); > blocksize = btrfs_level_size(root, old_root->level); > eb = read_tree_block(root, logical, blocksize, 0); > if (!eb) { > @@ -1271,12 +1273,15 @@ get_old_root(struct btrfs_root *root, u64 time_seq) > eb = btrfs_clone_extent_buffer(eb); > } > } else if (old_root) { > + btrfs_tree_read_unlock(root->node); > + free_extent_buffer(root->node); > eb = alloc_dummy_extent_buffer(logical, root->nodesize); > } else { > eb = btrfs_clone_extent_buffer(root->node); > + btrfs_tree_read_unlock(root->node); > + free_extent_buffer(root->node); > } > - btrfs_tree_read_unlock(root->node); > - free_extent_buffer(root->node); > + > if (!eb) > return NULL; > extent_buffer_get(eb); >-- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html