Several users reported this crash of NULL pointer or general protection, the story is that we add a rbtree for speedup ulist iteration, and we use krealloc() to address ulist growth, and krealloc() use memcpy to copy old data to new memory area, so it''s OK for an array as it doesn''t use pointers while it''s not OK for a rbtree as it uses pointers. So krealloc() will mess up our rbtree and it ends up with crash. Signed-off-by: Liu Bo <bo.li.liu@oracle.com> --- fs/btrfs/ulist.c | 13 ++++++++++++- 1 files changed, 12 insertions(+), 1 deletions(-) diff --git a/fs/btrfs/ulist.c b/fs/btrfs/ulist.c index 7b417e2..69a9c32 100644 --- a/fs/btrfs/ulist.c +++ b/fs/btrfs/ulist.c @@ -73,7 +73,6 @@ void ulist_fini(struct ulist *ulist) if (ulist->nodes_alloced > ULIST_SIZE) kfree(ulist->nodes); ulist->nodes_alloced = 0; /* in case ulist_fini is called twice */ - ulist->root = RB_ROOT; } EXPORT_SYMBOL(ulist_fini); @@ -205,6 +204,7 @@ int ulist_add_merge(struct ulist *ulist, u64 val, u64 aux, u64 new_alloced = ulist->nodes_alloced + 128; struct ulist_node *new_nodes; void *old = NULL; + int i; /* * if nodes_alloced == ULIST_SIZE no memory has been allocated @@ -222,6 +222,17 @@ int ulist_add_merge(struct ulist *ulist, u64 val, u64 aux, memcpy(new_nodes, ulist->int_nodes, sizeof(ulist->int_nodes)); + /* + * krealloc actually uses memcpy, which does not copy rb_node + * pointers, so we have to do it ourselves. Otherwise we may + * be bitten by crashes. + */ + for (i = 0; i < ulist->nnodes; i++) { + rb_erase(&ulist->nodes[i].rb_node, &ulist->root); + ret = ulist_rbtree_insert(ulist, &new_nodes[i]); + BUG_ON(ret); + } + ulist->nodes = new_nodes; ulist->nodes_alloced = new_alloced; } -- 1.7.7 -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Josef Bacik
2013-Jun-26 12:38 UTC
Re: [PATCH] Btrfs: fix crash regarding to ulist_add_merge
On Wed, Jun 26, 2013 at 12:02:51PM +0800, Liu Bo wrote:> Several users reported this crash of NULL pointer or general protection, > the story is that we add a rbtree for speedup ulist iteration, and we > use krealloc() to address ulist growth, and krealloc() use memcpy to copy > old data to new memory area, so it''s OK for an array as it doesn''t use > pointers while it''s not OK for a rbtree as it uses pointers. > > So krealloc() will mess up our rbtree and it ends up with crash. > > Signed-off-by: Liu Bo <bo.li.liu@oracle.com> > --- > fs/btrfs/ulist.c | 13 ++++++++++++- > 1 files changed, 12 insertions(+), 1 deletions(-) > > diff --git a/fs/btrfs/ulist.c b/fs/btrfs/ulist.c > index 7b417e2..69a9c32 100644 > --- a/fs/btrfs/ulist.c > +++ b/fs/btrfs/ulist.c > @@ -73,7 +73,6 @@ void ulist_fini(struct ulist *ulist) > if (ulist->nodes_alloced > ULIST_SIZE) > kfree(ulist->nodes); > ulist->nodes_alloced = 0; /* in case ulist_fini is called twice */ > - ulist->root = RB_ROOT;Why this change ^^? Josef -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Zach Brown
2013-Jun-26 20:18 UTC
Re: [PATCH] Btrfs: fix crash regarding to ulist_add_merge
On Wed, Jun 26, 2013 at 12:02:51PM +0800, Liu Bo wrote:> Several users reported this crash of NULL pointer or general protection, > the story is that we add a rbtree for speedup ulist iteration, and we > use krealloc() to address ulist growth, and krealloc() use memcpy to copy > old data to new memory area, so it''s OK for an array as it doesn''t use > pointers while it''s not OK for a rbtree as it uses pointers. > > So krealloc() will mess up our rbtree and it ends up with crash.It''s not just krealloc(), the initial memcpy() out of the int_nodes array is also buggy.> + /* > + * krealloc actually uses memcpy, which does not copy rb_node > + * pointers, so we have to do it ourselves. Otherwise we may > + * be bitten by crashes. > + */ > + for (i = 0; i < ulist->nnodes; i++) { > + rb_erase(&ulist->nodes[i].rb_node, &ulist->root); > + ret = ulist_rbtree_insert(ulist, &new_nodes[i]); > + BUG_ON(ret); > + }This''ll work for the int_nodes case where you still have access to the old pointers for rb_erase() to reference. But in the krealloc() case the rb_erase() will be trying to reference freed memmory because krealloc() frees the old pointer on success. And all the tree balancing in the deletion and re-insertion dance is totally unneeded. And another f-ing BUG_ON()! Just fixup all the pointers: ptrdiff_t diff = new_nodes - old; ulist->root.rb_node += diff; for (i = 0; i < ulist->nnodes; i++) { ulist->nodes[i].rb_node.rb_left += diff; ulist->nodes[i].rb_node.rb_left += diff; } Yeah, it''s insane, but no more so than using krealloc() for an array with internal pointers in the first place. - z -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Zach Brown
2013-Jun-26 20:19 UTC
Re: [PATCH] Btrfs: fix crash regarding to ulist_add_merge
> ptrdiff_t diff = new_nodes - old; > ulist->root.rb_node += diff; > for (i = 0; i < ulist->nnodes; i++) { > ulist->nodes[i].rb_node.rb_left += diff; > ulist->nodes[i].rb_node.rb_left += diff; > }(rb_right in the second case, obviously. Programming in mutt leaves a bit to be desired! - z) -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, Jun 26, 2013 at 08:38:21AM -0400, Josef Bacik wrote:> On Wed, Jun 26, 2013 at 12:02:51PM +0800, Liu Bo wrote: > > Several users reported this crash of NULL pointer or general protection, > > the story is that we add a rbtree for speedup ulist iteration, and we > > use krealloc() to address ulist growth, and krealloc() use memcpy to copy > > old data to new memory area, so it''s OK for an array as it doesn''t use > > pointers while it''s not OK for a rbtree as it uses pointers. > > > > So krealloc() will mess up our rbtree and it ends up with crash. > > > > Signed-off-by: Liu Bo <bo.li.liu@oracle.com> > > --- > > fs/btrfs/ulist.c | 13 ++++++++++++- > > 1 files changed, 12 insertions(+), 1 deletions(-) > > > > diff --git a/fs/btrfs/ulist.c b/fs/btrfs/ulist.c > > index 7b417e2..69a9c32 100644 > > --- a/fs/btrfs/ulist.c > > +++ b/fs/btrfs/ulist.c > > @@ -73,7 +73,6 @@ void ulist_fini(struct ulist *ulist) > > if (ulist->nodes_alloced > ULIST_SIZE) > > kfree(ulist->nodes); > > ulist->nodes_alloced = 0; /* in case ulist_fini is called twice */ > > - ulist->root = RB_ROOT; > > Why this change ^^?Ahh, another finger error...actually I was thinking that this ulist_fini() will be followed by ulist_init() in ulist_reinit() or just be freed in ulist_free(). thanks, liubo -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, Jun 26, 2013 at 01:18:29PM -0700, Zach Brown wrote:> On Wed, Jun 26, 2013 at 12:02:51PM +0800, Liu Bo wrote: > > Several users reported this crash of NULL pointer or general protection, > > the story is that we add a rbtree for speedup ulist iteration, and we > > use krealloc() to address ulist growth, and krealloc() use memcpy to copy > > old data to new memory area, so it''s OK for an array as it doesn''t use > > pointers while it''s not OK for a rbtree as it uses pointers. > > > > So krealloc() will mess up our rbtree and it ends up with crash. > > It''s not just krealloc(), the initial memcpy() out of the int_nodes > array is also buggy. > > > + /* > > + * krealloc actually uses memcpy, which does not copy rb_node > > + * pointers, so we have to do it ourselves. Otherwise we may > > + * be bitten by crashes. > > + */ > > + for (i = 0; i < ulist->nnodes; i++) { > > + rb_erase(&ulist->nodes[i].rb_node, &ulist->root); > > + ret = ulist_rbtree_insert(ulist, &new_nodes[i]); > > + BUG_ON(ret); > > + } > > This''ll work for the int_nodes case where you still have access to the > old pointers for rb_erase() to reference. > > But in the krealloc() case the rb_erase() will be trying to reference > freed memmory because krealloc() frees the old pointer on success.Yeah, I realize that you''re absolutely right, but my box didn''t complain about the abused old pointers when we''re not in int_nodes case, which is weird...> > And all the tree balancing in the deletion and re-insertion dance is > totally unneeded. And another f-ing BUG_ON()! > > Just fixup all the pointers: > > ptrdiff_t diff = new_nodes - old; > ulist->root.rb_node += diff; > for (i = 0; i < ulist->nnodes; i++) { > ulist->nodes[i].rb_node.rb_left += diff; > ulist->nodes[i].rb_node.rb_left += diff; > } > > Yeah, it''s insane, but no more so than using krealloc() for an array > with internal pointers in the first place.I doubt if it can work, I''d prefer the re-insert dance. thanks, liubo -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, Jun 26, 2013 at 01:18:29PM -0700, Zach Brown wrote:> On Wed, Jun 26, 2013 at 12:02:51PM +0800, Liu Bo wrote: > > Several users reported this crash of NULL pointer or general protection, > > the story is that we add a rbtree for speedup ulist iteration, and we > > use krealloc() to address ulist growth, and krealloc() use memcpy to copy > > old data to new memory area, so it''s OK for an array as it doesn''t use > > pointers while it''s not OK for a rbtree as it uses pointers. > > > > So krealloc() will mess up our rbtree and it ends up with crash. > > It''s not just krealloc(), the initial memcpy() out of the int_nodes > array is also buggy. > > > + /* > > + * krealloc actually uses memcpy, which does not copy rb_node > > + * pointers, so we have to do it ourselves. Otherwise we may > > + * be bitten by crashes. > > + */ > > + for (i = 0; i < ulist->nnodes; i++) { > > + rb_erase(&ulist->nodes[i].rb_node, &ulist->root); > > + ret = ulist_rbtree_insert(ulist, &new_nodes[i]); > > + BUG_ON(ret); > > + } > > This''ll work for the int_nodes case where you still have access to the > old pointers for rb_erase() to reference. > > But in the krealloc() case the rb_erase() will be trying to reference > freed memmory because krealloc() frees the old pointer on success. > > And all the tree balancing in the deletion and re-insertion dance is > totally unneeded. And another f-ing BUG_ON()! > > Just fixup all the pointers: > > ptrdiff_t diff = new_nodes - old; > ulist->root.rb_node += diff; > for (i = 0; i < ulist->nnodes; i++) { > ulist->nodes[i].rb_node.rb_left += diff; > ulist->nodes[i].rb_node.rb_left += diff; > } > > Yeah, it''s insane, but no more so than using krealloc() for an array > with internal pointers in the first place.Well, it doesn''t work, [ 9062.312202] ulist_add_merge: diff 18446612136922603520 [ 9062.312231] general protection fault: 0000 [#1] SMP thanks, liubo -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Zach Brown
2013-Jun-27 02:23 UTC
Re: [PATCH] Btrfs: fix crash regarding to ulist_add_merge
> > But in the krealloc() case the rb_erase() will be trying to reference > > freed memmory because krealloc() frees the old pointer on success. > > Yeah, I realize that you''re absolutely right, but my box > didn''t complain about the abused old pointers when we''re not in int_nodes > case, which is weird...The freed space probably just hasn''t been reused yet. Have you tried with CONFIG_DEBUG_PAGEALLOC or CONFIG_DEBUG_SLAB?> > Yeah, it''s insane, but no more so than using krealloc() for an array > > with internal pointers in the first place. > > I doubt if it can work, I''d prefer the re-insert dance.It should, but it is a disgusting hack. Not worth it if you can''t get it going. Re-initializing the nodes instead of removing them after they''re moved should work. But really, this is all bonkers. A ulist implementation that doesn''t require this fixup would be better. Maybe lose the array and have a simple list_head and slab of allocated structs. Reliable first, performant second, presuming there''s data to justify it. - z -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, Jun 26, 2013 at 07:23:41PM -0700, Zach Brown wrote:> > > But in the krealloc() case the rb_erase() will be trying to reference > > > freed memmory because krealloc() frees the old pointer on success. > > > > Yeah, I realize that you''re absolutely right, but my box > > didn''t complain about the abused old pointers when we''re not in int_nodes > > case, which is weird... > > The freed space probably just hasn''t been reused yet. Have you tried > with CONFIG_DEBUG_PAGEALLOC or CONFIG_DEBUG_SLAB? > > > > Yeah, it''s insane, but no more so than using krealloc() for an array > > > with internal pointers in the first place. > > > > I doubt if it can work, I''d prefer the re-insert dance. > > It should, but it is a disgusting hack. Not worth it if you can''t get > it going. > > Re-initializing the nodes instead of removing them after they''re moved > should work. > > But really, this is all bonkers. A ulist implementation that doesn''t > require this fixup would be better. Maybe lose the array and have a > simple list_head and slab of allocated structs. Reliable first, > performant second, presuming there''s data to justify it.I agree, I''m trying to work it out and will test it with DEBUG_PAGEALLOC ;) thanks, liubo -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html