Filipe David Borba Manana
2013-Aug-26 18:00 UTC
[RFC PATCH] Btrfs: WIP, fix broken ctree.c:btrfs_prev_leaf()
Note: this is a work in progress patch, not yet complete, not meant to be yet taken. The btrfs_prev_leaf function often returns the same leaf again and a return status of 0 (success) - this is wrong for 2 reasons: 1) Shouldn''t return 0 when it''s the same leaf as before; 2) More importantly, it returns the same leaf in some cases where there''s actually a previous (to the left) leaf. Consider the following example of an fs tree captured from a btrfs-debug-tree output (with item specific data removed for the sake of readability): fs tree key (FS_TREE ROOT_ITEM 0) node 37023744 level 1 items 5 free 116 generation 15 owner 5 fs uuid 35176913-421b-4be9-9c11-6cdf81dd39a9 chunk uuid e13b5f43-8ea1-4867-b6c0-fa4b97caeb43 key (256 INODE_ITEM 0) block 38113280 (9305) gen 15 key (257 EXTENT_DATA 2097901568) block 31617024 (7719) gen 14 key (257 EXTENT_DATA 5093457920) block 34889728 (8518) gen 14 key (257 EXTENT_DATA 5093797888) block 34881536 (8516) gen 14 key (257 EXTENT_DATA 5093969920) block 37027840 (9040) gen 15 leaf 38113280 items 49 free space 71 generation 15 owner 5 fs uuid 35176913-421b-4be9-9c11-6cdf81dd39a9 chunk uuid e13b5f43-8ea1-4867-b6c0-fa4b97caeb43 item 0 key (256 INODE_ITEM 0) itemoff 3835 itemsize 160 item 1 key (256 INODE_REF 256) itemoff 3823 itemsize 12 item 2 key (256 DIR_ITEM 496027801) itemoff 3787 itemsize 36 item 3 key (256 DIR_INDEX 2) itemoff 3751 itemsize 36 item 4 key (257 INODE_ITEM 0) itemoff 3591 itemsize 160 item 5 key (257 INODE_REF 256) itemoff 3575 itemsize 16 item 6 key (257 EXTENT_DATA 0) itemoff 3522 itemsize 53 item 7 key (257 EXTENT_DATA 40960000) itemoff 3469 itemsize 53 item 8 key (257 EXTENT_DATA 806969344) itemoff 3416 itemsize 53 item 9 key (257 EXTENT_DATA 1572978688) itemoff 3363 itemsize 53 item 10 key (257 EXTENT_DATA 2053185536) itemoff 3310 itemsize 53 item 11 key (257 EXTENT_DATA 2093875200) itemoff 3257 itemsize 53 item 12 key (257 EXTENT_DATA 2097242112) itemoff 3204 itemsize 53 item 13 key (257 EXTENT_DATA 2097537024) itemoff 3151 itemsize 53 item 14 key (257 EXTENT_DATA 2097582080) itemoff 3098 itemsize 53 item 15 key (257 EXTENT_DATA 2097594368) itemoff 3045 itemsize 53 item 16 key (257 EXTENT_DATA 2097602560) itemoff 2992 itemsize 53 item 17 key (257 EXTENT_DATA 2097623040) itemoff 2939 itemsize 53 item 18 key (257 EXTENT_DATA 2097635328) itemoff 2886 itemsize 53 item 19 key (257 EXTENT_DATA 2097643520) itemoff 2833 itemsize 53 item 20 key (257 EXTENT_DATA 2097651712) itemoff 2780 itemsize 53 item 21 key (257 EXTENT_DATA 2097659904) itemoff 2727 itemsize 53 item 22 key (257 EXTENT_DATA 2097668096) itemoff 2674 itemsize 53 item 23 key (257 EXTENT_DATA 2097676288) itemoff 2621 itemsize 53 item 24 key (257 EXTENT_DATA 2097684480) itemoff 2568 itemsize 53 item 25 key (257 EXTENT_DATA 2097692672) itemoff 2515 itemsize 53 item 26 key (257 EXTENT_DATA 2097704960) itemoff 2462 itemsize 53 item 27 key (257 EXTENT_DATA 2097713152) itemoff 2409 itemsize 53 item 28 key (257 EXTENT_DATA 2097721344) itemoff 2356 itemsize 53 item 29 key (257 EXTENT_DATA 2097729536) itemoff 2303 itemsize 53 item 30 key (257 EXTENT_DATA 2097737728) itemoff 2250 itemsize 53 item 31 key (257 EXTENT_DATA 2097745920) itemoff 2197 itemsize 53 item 32 key (257 EXTENT_DATA 2097754112) itemoff 2144 itemsize 53 item 33 key (257 EXTENT_DATA 2097766400) itemoff 2091 itemsize 53 item 34 key (257 EXTENT_DATA 2097774592) itemoff 2038 itemsize 53 item 35 key (257 EXTENT_DATA 2097782784) itemoff 1985 itemsize 53 item 36 key (257 EXTENT_DATA 2097790976) itemoff 1932 itemsize 53 item 37 key (257 EXTENT_DATA 2097799168) itemoff 1879 itemsize 53 item 38 key (257 EXTENT_DATA 2097807360) itemoff 1826 itemsize 53 item 39 key (257 EXTENT_DATA 2097815552) itemoff 1773 itemsize 53 item 40 key (257 EXTENT_DATA 2097823744) itemoff 1720 itemsize 53 item 41 key (257 EXTENT_DATA 2097831936) itemoff 1667 itemsize 53 item 42 key (257 EXTENT_DATA 2097840128) itemoff 1614 itemsize 53 item 43 key (257 EXTENT_DATA 2097852416) itemoff 1561 itemsize 53 item 44 key (257 EXTENT_DATA 2097860608) itemoff 1508 itemsize 53 item 45 key (257 EXTENT_DATA 2097868800) itemoff 1455 itemsize 53 item 46 key (257 EXTENT_DATA 2097876992) itemoff 1402 itemsize 53 item 47 key (257 EXTENT_DATA 2097885184) itemoff 1349 itemsize 53 item 48 key (257 EXTENT_DATA 2097893376) itemoff 1296 itemsize 53 leaf 31617024 items 26 free space 1967 generation 14 owner 5 fs uuid 35176913-421b-4be9-9c11-6cdf81dd39a9 chunk uuid e13b5f43-8ea1-4867-b6c0-fa4b97caeb43 item 0 key (257 EXTENT_DATA 2097901568) itemoff 3942 itemsize 53 item 1 key (257 EXTENT_DATA 2097909760) itemoff 3889 itemsize 53 item 2 key (257 EXTENT_DATA 2097917952) itemoff 3836 itemsize 53 item 3 key (257 EXTENT_DATA 2862002176) itemoff 3783 itemsize 53 item 4 key (257 EXTENT_DATA 3626090496) itemoff 3730 itemsize 53 item 5 key (257 EXTENT_DATA 4065660928) itemoff 3677 itemsize 53 item 6 key (257 EXTENT_DATA 4077395968) itemoff 3624 itemsize 53 item 7 key (257 EXTENT_DATA 4077895680) itemoff 3571 itemsize 53 item 8 key (257 EXTENT_DATA 5054418944) itemoff 3518 itemsize 53 item 9 key (257 EXTENT_DATA 5092200448) itemoff 3465 itemsize 53 item 10 key (257 EXTENT_DATA 5093294080) itemoff 3412 itemsize 53 item 11 key (257 EXTENT_DATA 5093355520) itemoff 3359 itemsize 53 item 12 key (257 EXTENT_DATA 5093363712) itemoff 3306 itemsize 53 item 13 key (257 EXTENT_DATA 5093371904) itemoff 3253 itemsize 53 item 14 key (257 EXTENT_DATA 5093380096) itemoff 3200 itemsize 53 item 15 key (257 EXTENT_DATA 5093384192) itemoff 3147 itemsize 53 item 16 key (257 EXTENT_DATA 5093392384) itemoff 3094 itemsize 53 item 17 key (257 EXTENT_DATA 5093400576) itemoff 3041 itemsize 53 item 18 key (257 EXTENT_DATA 5093404672) itemoff 2988 itemsize 53 item 19 key (257 EXTENT_DATA 5093412864) itemoff 2935 itemsize 53 item 20 key (257 EXTENT_DATA 5093416960) itemoff 2882 itemsize 53 item 21 key (257 EXTENT_DATA 5093425152) itemoff 2829 itemsize 53 item 22 key (257 EXTENT_DATA 5093433344) itemoff 2776 itemsize 53 item 23 key (257 EXTENT_DATA 5093437440) itemoff 2723 itemsize 53 item 24 key (257 EXTENT_DATA 5093445632) itemoff 2670 itemsize 53 item 25 key (257 EXTENT_DATA 5093453824) itemoff 2617 itemsize 53 leaf 34889728 items 51 free space 17 generation 14 owner 5 fs uuid 35176913-421b-4be9-9c11-6cdf81dd39a9 chunk uuid e13b5f43-8ea1-4867-b6c0-fa4b97caeb43 item 0 key (257 EXTENT_DATA 5093457920) itemoff 3942 itemsize 53 item 1 key (257 EXTENT_DATA 5093466112) itemoff 3889 itemsize 53 item 2 key (257 EXTENT_DATA 5093470208) itemoff 3836 itemsize 53 item 3 key (257 EXTENT_DATA 5093478400) itemoff 3783 itemsize 53 item 4 key (257 EXTENT_DATA 5093482496) itemoff 3730 itemsize 53 item 5 key (257 EXTENT_DATA 5093490688) itemoff 3677 itemsize 53 item 6 key (257 EXTENT_DATA 5093494784) itemoff 3624 itemsize 53 item 7 key (257 EXTENT_DATA 5093502976) itemoff 3571 itemsize 53 item 8 key (257 EXTENT_DATA 5093511168) itemoff 3518 itemsize 53 item 9 key (257 EXTENT_DATA 5093515264) itemoff 3465 itemsize 53 item 10 key (257 EXTENT_DATA 5093523456) itemoff 3412 itemsize 53 item 11 key (257 EXTENT_DATA 5093527552) itemoff 3359 itemsize 53 item 12 key (257 EXTENT_DATA 5093539840) itemoff 3306 itemsize 53 item 13 key (257 EXTENT_DATA 5093543936) itemoff 3253 itemsize 53 item 14 key (257 EXTENT_DATA 5093552128) itemoff 3200 itemsize 53 item 15 key (257 EXTENT_DATA 5093560320) itemoff 3147 itemsize 53 item 16 key (257 EXTENT_DATA 5093564416) itemoff 3094 itemsize 53 item 17 key (257 EXTENT_DATA 5093572608) itemoff 3041 itemsize 53 item 18 key (257 EXTENT_DATA 5093580800) itemoff 2988 itemsize 53 item 19 key (257 EXTENT_DATA 5093584896) itemoff 2935 itemsize 53 item 20 key (257 EXTENT_DATA 5093593088) itemoff 2882 itemsize 53 item 21 key (257 EXTENT_DATA 5093597184) itemoff 2829 itemsize 53 item 22 key (257 EXTENT_DATA 5093605376) itemoff 2776 itemsize 53 item 23 key (257 EXTENT_DATA 5093613568) itemoff 2723 itemsize 53 item 24 key (257 EXTENT_DATA 5093617664) itemoff 2670 itemsize 53 item 25 key (257 EXTENT_DATA 5093625856) itemoff 2617 itemsize 53 item 26 key (257 EXTENT_DATA 5093629952) itemoff 2564 itemsize 53 item 27 key (257 EXTENT_DATA 5093638144) itemoff 2511 itemsize 53 item 28 key (257 EXTENT_DATA 5093642240) itemoff 2458 itemsize 53 item 29 key (257 EXTENT_DATA 5093650432) itemoff 2405 itemsize 53 item 30 key (257 EXTENT_DATA 5093654528) itemoff 2352 itemsize 53 item 31 key (257 EXTENT_DATA 5093662720) itemoff 2299 itemsize 53 item 32 key (257 EXTENT_DATA 5093670912) itemoff 2246 itemsize 53 item 33 key (257 EXTENT_DATA 5093675008) itemoff 2193 itemsize 53 item 34 key (257 EXTENT_DATA 5093683200) itemoff 2140 itemsize 53 item 35 key (257 EXTENT_DATA 5093691392) itemoff 2087 itemsize 53 item 36 key (257 EXTENT_DATA 5093695488) itemoff 2034 itemsize 53 item 37 key (257 EXTENT_DATA 5093703680) itemoff 1981 itemsize 53 item 38 key (257 EXTENT_DATA 5093711872) itemoff 1928 itemsize 53 item 39 key (257 EXTENT_DATA 5093715968) itemoff 1875 itemsize 53 item 40 key (257 EXTENT_DATA 5093724160) itemoff 1822 itemsize 53 item 41 key (257 EXTENT_DATA 5093728256) itemoff 1769 itemsize 53 item 42 key (257 EXTENT_DATA 5093736448) itemoff 1716 itemsize 53 item 43 key (257 EXTENT_DATA 5093740544) itemoff 1663 itemsize 53 item 44 key (257 EXTENT_DATA 5093748736) itemoff 1610 itemsize 53 item 45 key (257 EXTENT_DATA 5093756928) itemoff 1557 itemsize 53 item 46 key (257 EXTENT_DATA 5093761024) itemoff 1504 itemsize 53 item 47 key (257 EXTENT_DATA 5093769216) itemoff 1451 itemsize 53 item 48 key (257 EXTENT_DATA 5093777408) itemoff 1398 itemsize 53 item 49 key (257 EXTENT_DATA 5093781504) itemoff 1345 itemsize 53 item 50 key (257 EXTENT_DATA 5093789696) itemoff 1292 itemsize 53 leaf 34881536 items 26 free space 1967 generation 14 owner 5 fs uuid 35176913-421b-4be9-9c11-6cdf81dd39a9 chunk uuid e13b5f43-8ea1-4867-b6c0-fa4b97caeb43 item 0 key (257 EXTENT_DATA 5093797888) itemoff 3942 itemsize 53 item 1 key (257 EXTENT_DATA 5093801984) itemoff 3889 itemsize 53 item 2 key (257 EXTENT_DATA 5093810176) itemoff 3836 itemsize 53 item 3 key (257 EXTENT_DATA 5093814272) itemoff 3783 itemsize 53 item 4 key (257 EXTENT_DATA 5093822464) itemoff 3730 itemsize 53 item 5 key (257 EXTENT_DATA 5093830656) itemoff 3677 itemsize 53 item 6 key (257 EXTENT_DATA 5093834752) itemoff 3624 itemsize 53 item 7 key (257 EXTENT_DATA 5093842944) itemoff 3571 itemsize 53 item 8 key (257 EXTENT_DATA 5093851136) itemoff 3518 itemsize 53 item 9 key (257 EXTENT_DATA 5093855232) itemoff 3465 itemsize 53 item 10 key (257 EXTENT_DATA 5093863424) itemoff 3412 itemsize 53 item 11 key (257 EXTENT_DATA 5093871616) itemoff 3359 itemsize 53 item 12 key (257 EXTENT_DATA 5093875712) itemoff 3306 itemsize 53 item 13 key (257 EXTENT_DATA 5093883904) itemoff 3253 itemsize 53 item 14 key (257 EXTENT_DATA 5093892096) itemoff 3200 itemsize 53 item 15 key (257 EXTENT_DATA 5093896192) itemoff 3147 itemsize 53 item 16 key (257 EXTENT_DATA 5093904384) itemoff 3094 itemsize 53 item 17 key (257 EXTENT_DATA 5093912576) itemoff 3041 itemsize 53 item 18 key (257 EXTENT_DATA 5093916672) itemoff 2988 itemsize 53 item 19 key (257 EXTENT_DATA 5093924864) itemoff 2935 itemsize 53 item 20 key (257 EXTENT_DATA 5093933056) itemoff 2882 itemsize 53 item 21 key (257 EXTENT_DATA 5093937152) itemoff 2829 itemsize 53 item 22 key (257 EXTENT_DATA 5093945344) itemoff 2776 itemsize 53 item 23 key (257 EXTENT_DATA 5093949440) itemoff 2723 itemsize 53 item 24 key (257 EXTENT_DATA 5093957632) itemoff 2670 itemsize 53 item 25 key (257 EXTENT_DATA 5093961728) itemoff 2617 itemsize 53 leaf 37027840 items 34 free space 1343 generation 15 owner 5 fs uuid 35176913-421b-4be9-9c11-6cdf81dd39a9 chunk uuid e13b5f43-8ea1-4867-b6c0-fa4b97caeb43 item 0 key (257 EXTENT_DATA 5093969920) itemoff 3942 itemsize 53 item 1 key (257 EXTENT_DATA 5093978112) itemoff 3889 itemsize 53 item 2 key (257 EXTENT_DATA 5093982208) itemoff 3836 itemsize 53 item 3 key (257 EXTENT_DATA 5093990400) itemoff 3783 itemsize 53 item 4 key (257 EXTENT_DATA 5093994496) itemoff 3730 itemsize 53 item 5 key (257 EXTENT_DATA 5094002688) itemoff 3677 itemsize 53 item 6 key (257 EXTENT_DATA 5094006784) itemoff 3624 itemsize 53 item 7 key (257 EXTENT_DATA 5094014976) itemoff 3571 itemsize 53 item 8 key (257 EXTENT_DATA 5094023168) itemoff 3518 itemsize 53 item 9 key (257 EXTENT_DATA 5094027264) itemoff 3465 itemsize 53 item 10 key (257 EXTENT_DATA 5094035456) itemoff 3412 itemsize 53 item 11 key (257 EXTENT_DATA 5094039552) itemoff 3359 itemsize 53 item 12 key (257 EXTENT_DATA 5094047744) itemoff 3306 itemsize 53 item 13 key (257 EXTENT_DATA 5094055936) itemoff 3253 itemsize 53 item 14 key (257 EXTENT_DATA 5094064128) itemoff 3200 itemsize 53 item 15 key (257 EXTENT_DATA 5094072320) itemoff 3147 itemsize 53 item 16 key (257 EXTENT_DATA 5094076416) itemoff 3094 itemsize 53 item 17 key (257 EXTENT_DATA 5094084608) itemoff 3041 itemsize 53 item 18 key (257 EXTENT_DATA 5094088704) itemoff 2988 itemsize 53 item 19 key (257 EXTENT_DATA 5094096896) itemoff 2935 itemsize 53 item 20 key (257 EXTENT_DATA 5094100992) itemoff 2882 itemsize 53 item 21 key (257 EXTENT_DATA 5094109184) itemoff 2829 itemsize 53 item 22 key (257 EXTENT_DATA 5094117376) itemoff 2776 itemsize 53 item 23 key (257 EXTENT_DATA 5094121472) itemoff 2723 itemsize 53 item 24 key (257 EXTENT_DATA 5094129664) itemoff 2670 itemsize 53 item 25 key (257 EXTENT_DATA 5094137856) itemoff 2617 itemsize 53 item 26 key (257 EXTENT_DATA 5094141952) itemoff 2564 itemsize 53 item 27 key (257 EXTENT_DATA 5094150144) itemoff 2511 itemsize 53 item 28 key (257 EXTENT_DATA 5094154240) itemoff 2458 itemsize 53 item 29 key (257 EXTENT_DATA 5094158336) itemoff 2405 itemsize 53 item 30 key (257 EXTENT_DATA 5876654080) itemoff 2352 itemsize 53 item 31 key (257 EXTENT_DATA 6659149824) itemoff 2299 itemsize 53 item 32 key (257 EXTENT_DATA 7001882624) itemoff 2246 itemsize 53 item 33 key (257 EXTENT_DATA 7008460800) itemoff 2193 itemsize 53 Using the current btrfs_prev_leaf implementation, to lookup for the key (257 BTRFS_EXTENT_DATA_KEY 5093457920), which is the left most item of the third leaf, and then calling btrfs_leaf_prev against the path produced by btrfs_search_slot, returns the same leaf with a success return value (0): struct btrfs_path *path; struct btrfs_key key; struct btrfs_disk_key disk_key; struct extent_buffer *leaf; int ret, slot; path = btrfs_alloc_path(); key.objectid = 257; key.type = BTRFS_EXTENT_DATA_KEY; key.offset = 5093457920; ret = btrfs_search_slot(NULL, fs_info->fs_root, &key, path, 0, 0); BUG_ON(ret != 0); leaf = path->nodes[0]; slot = path->slots[0]; btrfs_item_key(leaf, &disk_key, slot); printf("\tslot: %d, parent slot: %d, key: ", slot, path->slots[1]); btrfs_print_key(&disk_key); printf("\n"); ret = btrfs_prev_leaf(info->fs_root, path); leaf = path->nodes[0]; slot = path->slots[0]; btrfs_item_key(leaf, &disk_key, slot); printf("\tprev leaf ret: %d, prev leaf key: ", ret); btrfs_print_key(&disk_key); printf("\n"); The output sent to stdout from the above code when using the current implementation of btrfs_prev_leaf is: slot: 0, parent slot: 2, key: key (257 EXTENT_DATA 5093457920) prev leaf ret: 0, prev leaf key: key (257 EXTENT_DATA 5093457920) Using the new version of btrfs_prev_leaf, introduced by this change and which is based on the version found in btrfs-progs, produces correct results, i.e. it returns the previous leaf with the correct slot set (right most item of the previous leaf). The output sent to stdout with the fixed btrfs_prev_leaf is: slot: 0, parent slot: 2, key: key (257 EXTENT_DATA 5093457920) prev leaf ret: 0, prev leaf key: key (257 EXTENT_DATA 5093453824) The above fs tree was produced with the following code: $ mkfs.btrfs -f /dev/sdb3 $ mount /dev/sdb3 /mnt/btrfs $ dd if=/dev/zero of=/mnt/btrfs/foobar bs=4096 count=10000000 $ umount /mnt/btrfs $ btrfs-debug-tree /dev/sdb3 Signed-off-by: Filipe David Borba Manana <fdmanana@gmail.com> --- fs/btrfs/ctree.c | 113 +++++++++++++++++++++++++++++++++++++++--------------- 1 file changed, 82 insertions(+), 31 deletions(-) diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c index 5fa521b..df68600 100644 --- a/fs/btrfs/ctree.c +++ b/fs/btrfs/ctree.c @@ -42,6 +42,7 @@ static void del_ptr(struct btrfs_root *root, struct btrfs_path *path, static void tree_mod_log_free_eb(struct btrfs_fs_info *fs_info, struct extent_buffer *eb); static int btrfs_prev_leaf(struct btrfs_root *root, struct btrfs_path *path); +static void read_lock_node(struct extent_buffer *b, struct btrfs_path *path); struct btrfs_path *btrfs_alloc_path(void) { @@ -2646,16 +2647,8 @@ cow_done: BTRFS_WRITE_LOCK); } p->locks[level] = BTRFS_WRITE_LOCK; - } else { - err = btrfs_try_tree_read_lock(b); - if (!err) { - btrfs_set_path_blocking(p); - btrfs_tree_read_lock(b); - btrfs_clear_path_blocking(p, b, - BTRFS_READ_LOCK); - } - p->locks[level] = BTRFS_READ_LOCK; - } + } else + read_lock_node(b, p); p->nodes[level] = b; } } else { @@ -4776,30 +4769,74 @@ int btrfs_del_items(struct btrfs_trans_handle *trans, struct btrfs_root *root, */ static int btrfs_prev_leaf(struct btrfs_root *root, struct btrfs_path *path) { - struct btrfs_key key; - struct btrfs_disk_key found_key; - int ret; + int slot; + int level = 1; + int ret = 1; + struct extent_buffer *c; + struct extent_buffer *next = NULL, *next2 = NULL; + int *locks; - btrfs_item_key_to_cpu(path->nodes[0], &key, 0); + locks = kzalloc(sizeof(int) * BTRFS_MAX_LEVEL, GFP_NOFS); + if (!locks) + return -ENOMEM; + locks[0] = path->locks[0]; - if (key.offset > 0) - key.offset--; - else if (key.type > 0) - key.type--; - else if (key.objectid > 0) - key.objectid--; - else - return 1; + while (level < BTRFS_MAX_LEVEL) { + if (!path->nodes[level]) + goto out; - btrfs_release_path(path); - ret = btrfs_search_slot(NULL, root, &key, path, 0, 0); - if (ret < 0) - return ret; - btrfs_item_key(path->nodes[0], &found_key, 0); - ret = comp_keys(&found_key, &key); - if (ret < 0) - return 0; - return 1; + slot = path->slots[level]; + c = path->nodes[level]; + locks[level] = path->locks[level]; + if (slot == 0) { + level++; + if (level == BTRFS_MAX_LEVEL) + goto out; + continue; + } + slot--; + /* + * TODO: need to lock ''c'' ? + * Perhaps need to lock it, then check if + * extent_buffer_uptodate(c) returns true. If yes, all is ok? + * If it''s not, then repeat the tree search for the key at the + * original path->nodes[0] (item at the original path->slots[0]). + * Is this safe? Can it cause infinite retries? + */ + next = read_node_slot(root, c, slot); + if (!path->skip_locking) + read_lock_node(next, path); + break; + } + path->slots[level] = slot; + while (1) { + level--; + c = path->nodes[level]; + if (locks[level]) + btrfs_tree_unlock_rw(c, locks[level]); + free_extent_buffer(c); + slot = btrfs_header_nritems(next) - 1; + if (slot < 0) + goto out; + path->nodes[level] = next; + path->slots[level] = slot; + if (level == 0) { + ret = 0; + goto out; + } + next2 = read_node_slot(root, next, slot); + if (!path->skip_locking) + read_lock_node(next2, path); + if (locks[level + 1]) { + btrfs_tree_unlock_rw(next, locks[level + 1]); + locks[level + 1] = 0; + } + next = next2; + } + +out: + kfree(locks); + return ret; } /* @@ -5636,3 +5673,17 @@ int btrfs_previous_item(struct btrfs_root *root, } return 1; } + +static void read_lock_node(struct extent_buffer *b, struct btrfs_path *path) +{ + int success; + int level = btrfs_header_level(b); + + success = btrfs_try_tree_read_lock(b); + if (!success) { + btrfs_set_path_blocking(path); + btrfs_tree_read_lock(b); + btrfs_clear_path_blocking(path, b, BTRFS_READ_LOCK); + } + path->locks[level] = BTRFS_READ_LOCK; +} -- 1.7.9.5 -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Filipe David Manana
2013-Aug-27 12:02 UTC
Re: [RFC PATCH] Btrfs: WIP, fix broken ctree.c:btrfs_prev_leaf()
On Mon, Aug 26, 2013 at 7:00 PM, Filipe David Borba Manana <fdmanana@gmail.com> wrote:> Note: this is a work in progress patch, not yet complete, not meant to > be yet taken. > > The btrfs_prev_leaf function often returns the same leaf again and > a return status of 0 (success) - this is wrong for 2 reasons: > > 1) Shouldn''t return 0 when it''s the same leaf as before; > > 2) More importantly, it returns the same leaf in some cases where > there''s actually a previous (to the left) leaf. > > Consider the following example of an fs tree captured from a > btrfs-debug-tree output (with item specific data removed for > the sake of readability): > > fs tree key (FS_TREE ROOT_ITEM 0) > node 37023744 level 1 items 5 free 116 generation 15 owner 5 > fs uuid 35176913-421b-4be9-9c11-6cdf81dd39a9 > chunk uuid e13b5f43-8ea1-4867-b6c0-fa4b97caeb43 > key (256 INODE_ITEM 0) block 38113280 (9305) gen 15 > key (257 EXTENT_DATA 2097901568) block 31617024 (7719) gen 14 > key (257 EXTENT_DATA 5093457920) block 34889728 (8518) gen 14 > key (257 EXTENT_DATA 5093797888) block 34881536 (8516) gen 14 > key (257 EXTENT_DATA 5093969920) block 37027840 (9040) gen 15 > leaf 38113280 items 49 free space 71 generation 15 owner 5 > fs uuid 35176913-421b-4be9-9c11-6cdf81dd39a9 > chunk uuid e13b5f43-8ea1-4867-b6c0-fa4b97caeb43 > item 0 key (256 INODE_ITEM 0) itemoff 3835 itemsize 160 > item 1 key (256 INODE_REF 256) itemoff 3823 itemsize 12 > item 2 key (256 DIR_ITEM 496027801) itemoff 3787 itemsize 36 > item 3 key (256 DIR_INDEX 2) itemoff 3751 itemsize 36 > item 4 key (257 INODE_ITEM 0) itemoff 3591 itemsize 160 > item 5 key (257 INODE_REF 256) itemoff 3575 itemsize 16 > item 6 key (257 EXTENT_DATA 0) itemoff 3522 itemsize 53 > item 7 key (257 EXTENT_DATA 40960000) itemoff 3469 itemsize 53 > item 8 key (257 EXTENT_DATA 806969344) itemoff 3416 itemsize 53 > item 9 key (257 EXTENT_DATA 1572978688) itemoff 3363 itemsize 53 > item 10 key (257 EXTENT_DATA 2053185536) itemoff 3310 itemsize 53 > item 11 key (257 EXTENT_DATA 2093875200) itemoff 3257 itemsize 53 > item 12 key (257 EXTENT_DATA 2097242112) itemoff 3204 itemsize 53 > item 13 key (257 EXTENT_DATA 2097537024) itemoff 3151 itemsize 53 > item 14 key (257 EXTENT_DATA 2097582080) itemoff 3098 itemsize 53 > item 15 key (257 EXTENT_DATA 2097594368) itemoff 3045 itemsize 53 > item 16 key (257 EXTENT_DATA 2097602560) itemoff 2992 itemsize 53 > item 17 key (257 EXTENT_DATA 2097623040) itemoff 2939 itemsize 53 > item 18 key (257 EXTENT_DATA 2097635328) itemoff 2886 itemsize 53 > item 19 key (257 EXTENT_DATA 2097643520) itemoff 2833 itemsize 53 > item 20 key (257 EXTENT_DATA 2097651712) itemoff 2780 itemsize 53 > item 21 key (257 EXTENT_DATA 2097659904) itemoff 2727 itemsize 53 > item 22 key (257 EXTENT_DATA 2097668096) itemoff 2674 itemsize 53 > item 23 key (257 EXTENT_DATA 2097676288) itemoff 2621 itemsize 53 > item 24 key (257 EXTENT_DATA 2097684480) itemoff 2568 itemsize 53 > item 25 key (257 EXTENT_DATA 2097692672) itemoff 2515 itemsize 53 > item 26 key (257 EXTENT_DATA 2097704960) itemoff 2462 itemsize 53 > item 27 key (257 EXTENT_DATA 2097713152) itemoff 2409 itemsize 53 > item 28 key (257 EXTENT_DATA 2097721344) itemoff 2356 itemsize 53 > item 29 key (257 EXTENT_DATA 2097729536) itemoff 2303 itemsize 53 > item 30 key (257 EXTENT_DATA 2097737728) itemoff 2250 itemsize 53 > item 31 key (257 EXTENT_DATA 2097745920) itemoff 2197 itemsize 53 > item 32 key (257 EXTENT_DATA 2097754112) itemoff 2144 itemsize 53 > item 33 key (257 EXTENT_DATA 2097766400) itemoff 2091 itemsize 53 > item 34 key (257 EXTENT_DATA 2097774592) itemoff 2038 itemsize 53 > item 35 key (257 EXTENT_DATA 2097782784) itemoff 1985 itemsize 53 > item 36 key (257 EXTENT_DATA 2097790976) itemoff 1932 itemsize 53 > item 37 key (257 EXTENT_DATA 2097799168) itemoff 1879 itemsize 53 > item 38 key (257 EXTENT_DATA 2097807360) itemoff 1826 itemsize 53 > item 39 key (257 EXTENT_DATA 2097815552) itemoff 1773 itemsize 53 > item 40 key (257 EXTENT_DATA 2097823744) itemoff 1720 itemsize 53 > item 41 key (257 EXTENT_DATA 2097831936) itemoff 1667 itemsize 53 > item 42 key (257 EXTENT_DATA 2097840128) itemoff 1614 itemsize 53 > item 43 key (257 EXTENT_DATA 2097852416) itemoff 1561 itemsize 53 > item 44 key (257 EXTENT_DATA 2097860608) itemoff 1508 itemsize 53 > item 45 key (257 EXTENT_DATA 2097868800) itemoff 1455 itemsize 53 > item 46 key (257 EXTENT_DATA 2097876992) itemoff 1402 itemsize 53 > item 47 key (257 EXTENT_DATA 2097885184) itemoff 1349 itemsize 53 > item 48 key (257 EXTENT_DATA 2097893376) itemoff 1296 itemsize 53 > leaf 31617024 items 26 free space 1967 generation 14 owner 5 > fs uuid 35176913-421b-4be9-9c11-6cdf81dd39a9 > chunk uuid e13b5f43-8ea1-4867-b6c0-fa4b97caeb43 > item 0 key (257 EXTENT_DATA 2097901568) itemoff 3942 itemsize 53 > item 1 key (257 EXTENT_DATA 2097909760) itemoff 3889 itemsize 53 > item 2 key (257 EXTENT_DATA 2097917952) itemoff 3836 itemsize 53 > item 3 key (257 EXTENT_DATA 2862002176) itemoff 3783 itemsize 53 > item 4 key (257 EXTENT_DATA 3626090496) itemoff 3730 itemsize 53 > item 5 key (257 EXTENT_DATA 4065660928) itemoff 3677 itemsize 53 > item 6 key (257 EXTENT_DATA 4077395968) itemoff 3624 itemsize 53 > item 7 key (257 EXTENT_DATA 4077895680) itemoff 3571 itemsize 53 > item 8 key (257 EXTENT_DATA 5054418944) itemoff 3518 itemsize 53 > item 9 key (257 EXTENT_DATA 5092200448) itemoff 3465 itemsize 53 > item 10 key (257 EXTENT_DATA 5093294080) itemoff 3412 itemsize 53 > item 11 key (257 EXTENT_DATA 5093355520) itemoff 3359 itemsize 53 > item 12 key (257 EXTENT_DATA 5093363712) itemoff 3306 itemsize 53 > item 13 key (257 EXTENT_DATA 5093371904) itemoff 3253 itemsize 53 > item 14 key (257 EXTENT_DATA 5093380096) itemoff 3200 itemsize 53 > item 15 key (257 EXTENT_DATA 5093384192) itemoff 3147 itemsize 53 > item 16 key (257 EXTENT_DATA 5093392384) itemoff 3094 itemsize 53 > item 17 key (257 EXTENT_DATA 5093400576) itemoff 3041 itemsize 53 > item 18 key (257 EXTENT_DATA 5093404672) itemoff 2988 itemsize 53 > item 19 key (257 EXTENT_DATA 5093412864) itemoff 2935 itemsize 53 > item 20 key (257 EXTENT_DATA 5093416960) itemoff 2882 itemsize 53 > item 21 key (257 EXTENT_DATA 5093425152) itemoff 2829 itemsize 53 > item 22 key (257 EXTENT_DATA 5093433344) itemoff 2776 itemsize 53 > item 23 key (257 EXTENT_DATA 5093437440) itemoff 2723 itemsize 53 > item 24 key (257 EXTENT_DATA 5093445632) itemoff 2670 itemsize 53 > item 25 key (257 EXTENT_DATA 5093453824) itemoff 2617 itemsize 53 > leaf 34889728 items 51 free space 17 generation 14 owner 5 > fs uuid 35176913-421b-4be9-9c11-6cdf81dd39a9 > chunk uuid e13b5f43-8ea1-4867-b6c0-fa4b97caeb43 > item 0 key (257 EXTENT_DATA 5093457920) itemoff 3942 itemsize 53 > item 1 key (257 EXTENT_DATA 5093466112) itemoff 3889 itemsize 53 > item 2 key (257 EXTENT_DATA 5093470208) itemoff 3836 itemsize 53 > item 3 key (257 EXTENT_DATA 5093478400) itemoff 3783 itemsize 53 > item 4 key (257 EXTENT_DATA 5093482496) itemoff 3730 itemsize 53 > item 5 key (257 EXTENT_DATA 5093490688) itemoff 3677 itemsize 53 > item 6 key (257 EXTENT_DATA 5093494784) itemoff 3624 itemsize 53 > item 7 key (257 EXTENT_DATA 5093502976) itemoff 3571 itemsize 53 > item 8 key (257 EXTENT_DATA 5093511168) itemoff 3518 itemsize 53 > item 9 key (257 EXTENT_DATA 5093515264) itemoff 3465 itemsize 53 > item 10 key (257 EXTENT_DATA 5093523456) itemoff 3412 itemsize 53 > item 11 key (257 EXTENT_DATA 5093527552) itemoff 3359 itemsize 53 > item 12 key (257 EXTENT_DATA 5093539840) itemoff 3306 itemsize 53 > item 13 key (257 EXTENT_DATA 5093543936) itemoff 3253 itemsize 53 > item 14 key (257 EXTENT_DATA 5093552128) itemoff 3200 itemsize 53 > item 15 key (257 EXTENT_DATA 5093560320) itemoff 3147 itemsize 53 > item 16 key (257 EXTENT_DATA 5093564416) itemoff 3094 itemsize 53 > item 17 key (257 EXTENT_DATA 5093572608) itemoff 3041 itemsize 53 > item 18 key (257 EXTENT_DATA 5093580800) itemoff 2988 itemsize 53 > item 19 key (257 EXTENT_DATA 5093584896) itemoff 2935 itemsize 53 > item 20 key (257 EXTENT_DATA 5093593088) itemoff 2882 itemsize 53 > item 21 key (257 EXTENT_DATA 5093597184) itemoff 2829 itemsize 53 > item 22 key (257 EXTENT_DATA 5093605376) itemoff 2776 itemsize 53 > item 23 key (257 EXTENT_DATA 5093613568) itemoff 2723 itemsize 53 > item 24 key (257 EXTENT_DATA 5093617664) itemoff 2670 itemsize 53 > item 25 key (257 EXTENT_DATA 5093625856) itemoff 2617 itemsize 53 > item 26 key (257 EXTENT_DATA 5093629952) itemoff 2564 itemsize 53 > item 27 key (257 EXTENT_DATA 5093638144) itemoff 2511 itemsize 53 > item 28 key (257 EXTENT_DATA 5093642240) itemoff 2458 itemsize 53 > item 29 key (257 EXTENT_DATA 5093650432) itemoff 2405 itemsize 53 > item 30 key (257 EXTENT_DATA 5093654528) itemoff 2352 itemsize 53 > item 31 key (257 EXTENT_DATA 5093662720) itemoff 2299 itemsize 53 > item 32 key (257 EXTENT_DATA 5093670912) itemoff 2246 itemsize 53 > item 33 key (257 EXTENT_DATA 5093675008) itemoff 2193 itemsize 53 > item 34 key (257 EXTENT_DATA 5093683200) itemoff 2140 itemsize 53 > item 35 key (257 EXTENT_DATA 5093691392) itemoff 2087 itemsize 53 > item 36 key (257 EXTENT_DATA 5093695488) itemoff 2034 itemsize 53 > item 37 key (257 EXTENT_DATA 5093703680) itemoff 1981 itemsize 53 > item 38 key (257 EXTENT_DATA 5093711872) itemoff 1928 itemsize 53 > item 39 key (257 EXTENT_DATA 5093715968) itemoff 1875 itemsize 53 > item 40 key (257 EXTENT_DATA 5093724160) itemoff 1822 itemsize 53 > item 41 key (257 EXTENT_DATA 5093728256) itemoff 1769 itemsize 53 > item 42 key (257 EXTENT_DATA 5093736448) itemoff 1716 itemsize 53 > item 43 key (257 EXTENT_DATA 5093740544) itemoff 1663 itemsize 53 > item 44 key (257 EXTENT_DATA 5093748736) itemoff 1610 itemsize 53 > item 45 key (257 EXTENT_DATA 5093756928) itemoff 1557 itemsize 53 > item 46 key (257 EXTENT_DATA 5093761024) itemoff 1504 itemsize 53 > item 47 key (257 EXTENT_DATA 5093769216) itemoff 1451 itemsize 53 > item 48 key (257 EXTENT_DATA 5093777408) itemoff 1398 itemsize 53 > item 49 key (257 EXTENT_DATA 5093781504) itemoff 1345 itemsize 53 > item 50 key (257 EXTENT_DATA 5093789696) itemoff 1292 itemsize 53 > leaf 34881536 items 26 free space 1967 generation 14 owner 5 > fs uuid 35176913-421b-4be9-9c11-6cdf81dd39a9 > chunk uuid e13b5f43-8ea1-4867-b6c0-fa4b97caeb43 > item 0 key (257 EXTENT_DATA 5093797888) itemoff 3942 itemsize 53 > item 1 key (257 EXTENT_DATA 5093801984) itemoff 3889 itemsize 53 > item 2 key (257 EXTENT_DATA 5093810176) itemoff 3836 itemsize 53 > item 3 key (257 EXTENT_DATA 5093814272) itemoff 3783 itemsize 53 > item 4 key (257 EXTENT_DATA 5093822464) itemoff 3730 itemsize 53 > item 5 key (257 EXTENT_DATA 5093830656) itemoff 3677 itemsize 53 > item 6 key (257 EXTENT_DATA 5093834752) itemoff 3624 itemsize 53 > item 7 key (257 EXTENT_DATA 5093842944) itemoff 3571 itemsize 53 > item 8 key (257 EXTENT_DATA 5093851136) itemoff 3518 itemsize 53 > item 9 key (257 EXTENT_DATA 5093855232) itemoff 3465 itemsize 53 > item 10 key (257 EXTENT_DATA 5093863424) itemoff 3412 itemsize 53 > item 11 key (257 EXTENT_DATA 5093871616) itemoff 3359 itemsize 53 > item 12 key (257 EXTENT_DATA 5093875712) itemoff 3306 itemsize 53 > item 13 key (257 EXTENT_DATA 5093883904) itemoff 3253 itemsize 53 > item 14 key (257 EXTENT_DATA 5093892096) itemoff 3200 itemsize 53 > item 15 key (257 EXTENT_DATA 5093896192) itemoff 3147 itemsize 53 > item 16 key (257 EXTENT_DATA 5093904384) itemoff 3094 itemsize 53 > item 17 key (257 EXTENT_DATA 5093912576) itemoff 3041 itemsize 53 > item 18 key (257 EXTENT_DATA 5093916672) itemoff 2988 itemsize 53 > item 19 key (257 EXTENT_DATA 5093924864) itemoff 2935 itemsize 53 > item 20 key (257 EXTENT_DATA 5093933056) itemoff 2882 itemsize 53 > item 21 key (257 EXTENT_DATA 5093937152) itemoff 2829 itemsize 53 > item 22 key (257 EXTENT_DATA 5093945344) itemoff 2776 itemsize 53 > item 23 key (257 EXTENT_DATA 5093949440) itemoff 2723 itemsize 53 > item 24 key (257 EXTENT_DATA 5093957632) itemoff 2670 itemsize 53 > item 25 key (257 EXTENT_DATA 5093961728) itemoff 2617 itemsize 53 > leaf 37027840 items 34 free space 1343 generation 15 owner 5 > fs uuid 35176913-421b-4be9-9c11-6cdf81dd39a9 > chunk uuid e13b5f43-8ea1-4867-b6c0-fa4b97caeb43 > item 0 key (257 EXTENT_DATA 5093969920) itemoff 3942 itemsize 53 > item 1 key (257 EXTENT_DATA 5093978112) itemoff 3889 itemsize 53 > item 2 key (257 EXTENT_DATA 5093982208) itemoff 3836 itemsize 53 > item 3 key (257 EXTENT_DATA 5093990400) itemoff 3783 itemsize 53 > item 4 key (257 EXTENT_DATA 5093994496) itemoff 3730 itemsize 53 > item 5 key (257 EXTENT_DATA 5094002688) itemoff 3677 itemsize 53 > item 6 key (257 EXTENT_DATA 5094006784) itemoff 3624 itemsize 53 > item 7 key (257 EXTENT_DATA 5094014976) itemoff 3571 itemsize 53 > item 8 key (257 EXTENT_DATA 5094023168) itemoff 3518 itemsize 53 > item 9 key (257 EXTENT_DATA 5094027264) itemoff 3465 itemsize 53 > item 10 key (257 EXTENT_DATA 5094035456) itemoff 3412 itemsize 53 > item 11 key (257 EXTENT_DATA 5094039552) itemoff 3359 itemsize 53 > item 12 key (257 EXTENT_DATA 5094047744) itemoff 3306 itemsize 53 > item 13 key (257 EXTENT_DATA 5094055936) itemoff 3253 itemsize 53 > item 14 key (257 EXTENT_DATA 5094064128) itemoff 3200 itemsize 53 > item 15 key (257 EXTENT_DATA 5094072320) itemoff 3147 itemsize 53 > item 16 key (257 EXTENT_DATA 5094076416) itemoff 3094 itemsize 53 > item 17 key (257 EXTENT_DATA 5094084608) itemoff 3041 itemsize 53 > item 18 key (257 EXTENT_DATA 5094088704) itemoff 2988 itemsize 53 > item 19 key (257 EXTENT_DATA 5094096896) itemoff 2935 itemsize 53 > item 20 key (257 EXTENT_DATA 5094100992) itemoff 2882 itemsize 53 > item 21 key (257 EXTENT_DATA 5094109184) itemoff 2829 itemsize 53 > item 22 key (257 EXTENT_DATA 5094117376) itemoff 2776 itemsize 53 > item 23 key (257 EXTENT_DATA 5094121472) itemoff 2723 itemsize 53 > item 24 key (257 EXTENT_DATA 5094129664) itemoff 2670 itemsize 53 > item 25 key (257 EXTENT_DATA 5094137856) itemoff 2617 itemsize 53 > item 26 key (257 EXTENT_DATA 5094141952) itemoff 2564 itemsize 53 > item 27 key (257 EXTENT_DATA 5094150144) itemoff 2511 itemsize 53 > item 28 key (257 EXTENT_DATA 5094154240) itemoff 2458 itemsize 53 > item 29 key (257 EXTENT_DATA 5094158336) itemoff 2405 itemsize 53 > item 30 key (257 EXTENT_DATA 5876654080) itemoff 2352 itemsize 53 > item 31 key (257 EXTENT_DATA 6659149824) itemoff 2299 itemsize 53 > item 32 key (257 EXTENT_DATA 7001882624) itemoff 2246 itemsize 53 > item 33 key (257 EXTENT_DATA 7008460800) itemoff 2193 itemsize 53 > > Using the current btrfs_prev_leaf implementation, to lookup for the key > (257 BTRFS_EXTENT_DATA_KEY 5093457920), which is the left most item of > the third leaf, and then calling btrfs_leaf_prev against the path produced > by btrfs_search_slot, returns the same leaf with a success return value (0): > > struct btrfs_path *path; > struct btrfs_key key; > struct btrfs_disk_key disk_key; > struct extent_buffer *leaf; > int ret, slot; > > path = btrfs_alloc_path(); > key.objectid = 257; > key.type = BTRFS_EXTENT_DATA_KEY; > key.offset = 5093457920; > ret = btrfs_search_slot(NULL, fs_info->fs_root, &key, path, 0, 0); > BUG_ON(ret != 0); > > leaf = path->nodes[0]; > slot = path->slots[0]; > btrfs_item_key(leaf, &disk_key, slot); > printf("\tslot: %d, parent slot: %d, key: ", slot, path->slots[1]); > btrfs_print_key(&disk_key); > printf("\n"); > > ret = btrfs_prev_leaf(info->fs_root, path); > leaf = path->nodes[0]; > slot = path->slots[0]; > btrfs_item_key(leaf, &disk_key, slot); > printf("\tprev leaf ret: %d, prev leaf key: ", ret); > btrfs_print_key(&disk_key); > printf("\n"); > > The output sent to stdout from the above code when using the current implementation > of btrfs_prev_leaf is: > > slot: 0, parent slot: 2, key: key (257 EXTENT_DATA 5093457920) > prev leaf ret: 0, prev leaf key: key (257 EXTENT_DATA 5093457920) > > Using the new version of btrfs_prev_leaf, introduced by this change and which is > based on the version found in btrfs-progs, produces correct results, i.e. it returns > the previous leaf with the correct slot set (right most item of the previous leaf). > The output sent to stdout with the fixed btrfs_prev_leaf is: > > slot: 0, parent slot: 2, key: key (257 EXTENT_DATA 5093457920) > prev leaf ret: 0, prev leaf key: key (257 EXTENT_DATA 5093453824) > > The above fs tree was produced with the following code: > > $ mkfs.btrfs -f /dev/sdb3 > $ mount /dev/sdb3 /mnt/btrfs > $ dd if=/dev/zero of=/mnt/btrfs/foobar bs=4096 count=10000000 > $ umount /mnt/btrfs > $ btrfs-debug-tree /dev/sdb3 > > Signed-off-by: Filipe David Borba Manana <fdmanana@gmail.com> > --- > fs/btrfs/ctree.c | 113 +++++++++++++++++++++++++++++++++++++++--------------- > 1 file changed, 82 insertions(+), 31 deletions(-) > > diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c > index 5fa521b..df68600 100644 > --- a/fs/btrfs/ctree.c > +++ b/fs/btrfs/ctree.c > @@ -42,6 +42,7 @@ static void del_ptr(struct btrfs_root *root, struct btrfs_path *path, > static void tree_mod_log_free_eb(struct btrfs_fs_info *fs_info, > struct extent_buffer *eb); > static int btrfs_prev_leaf(struct btrfs_root *root, struct btrfs_path *path); > +static void read_lock_node(struct extent_buffer *b, struct btrfs_path *path); > > struct btrfs_path *btrfs_alloc_path(void) > { > @@ -2646,16 +2647,8 @@ cow_done: > BTRFS_WRITE_LOCK); > } > p->locks[level] = BTRFS_WRITE_LOCK; > - } else { > - err = btrfs_try_tree_read_lock(b); > - if (!err) { > - btrfs_set_path_blocking(p); > - btrfs_tree_read_lock(b); > - btrfs_clear_path_blocking(p, b, > - BTRFS_READ_LOCK); > - } > - p->locks[level] = BTRFS_READ_LOCK; > - } > + } else > + read_lock_node(b, p); > p->nodes[level] = b; > } > } else { > @@ -4776,30 +4769,74 @@ int btrfs_del_items(struct btrfs_trans_handle *trans, struct btrfs_root *root, > */ > static int btrfs_prev_leaf(struct btrfs_root *root, struct btrfs_path *path) > { > - struct btrfs_key key; > - struct btrfs_disk_key found_key; > - int ret; > + int slot; > + int level = 1; > + int ret = 1; > + struct extent_buffer *c; > + struct extent_buffer *next = NULL, *next2 = NULL; > + int *locks; > > - btrfs_item_key_to_cpu(path->nodes[0], &key, 0); > + locks = kzalloc(sizeof(int) * BTRFS_MAX_LEVEL, GFP_NOFS); > + if (!locks) > + return -ENOMEM; > + locks[0] = path->locks[0]; > > - if (key.offset > 0) > - key.offset--; > - else if (key.type > 0) > - key.type--; > - else if (key.objectid > 0) > - key.objectid--; > - else > - return 1; > + while (level < BTRFS_MAX_LEVEL) { > + if (!path->nodes[level]) > + goto out; > > - btrfs_release_path(path); > - ret = btrfs_search_slot(NULL, root, &key, path, 0, 0); > - if (ret < 0) > - return ret; > - btrfs_item_key(path->nodes[0], &found_key, 0); > - ret = comp_keys(&found_key, &key); > - if (ret < 0) > - return 0; > - return 1; > + slot = path->slots[level]; > + c = path->nodes[level]; > + locks[level] = path->locks[level]; > + if (slot == 0) { > + level++; > + if (level == BTRFS_MAX_LEVEL) > + goto out; > + continue; > + } > + slot--; > + /* > + * TODO: need to lock ''c'' ? > + * Perhaps need to lock it, then check if > + * extent_buffer_uptodate(c) returns true. If yes, all is ok? > + * If it''s not, then repeat the tree search for the key at the > + * original path->nodes[0] (item at the original path->slots[0]). > + * Is this safe? Can it cause infinite retries? > + */ > + next = read_node_slot(root, c, slot); > + if (!path->skip_locking) > + read_lock_node(next, path); > + break; > + } > + path->slots[level] = slot; > + while (1) { > + level--; > + c = path->nodes[level]; > + if (locks[level]) > + btrfs_tree_unlock_rw(c, locks[level]); > + free_extent_buffer(c); > + slot = btrfs_header_nritems(next) - 1; > + if (slot < 0) > + goto out; > + path->nodes[level] = next; > + path->slots[level] = slot; > + if (level == 0) { > + ret = 0; > + goto out; > + } > + next2 = read_node_slot(root, next, slot); > + if (!path->skip_locking) > + read_lock_node(next2, path); > + if (locks[level + 1]) { > + btrfs_tree_unlock_rw(next, locks[level + 1]); > + locks[level + 1] = 0; > + } > + next = next2; > + } > + > +out: > + kfree(locks); > + return ret; > } > > /* > @@ -5636,3 +5673,17 @@ int btrfs_previous_item(struct btrfs_root *root, > } > return 1; > } > + > +static void read_lock_node(struct extent_buffer *b, struct btrfs_path *path) > +{ > + int success; > + int level = btrfs_header_level(b); > + > + success = btrfs_try_tree_read_lock(b); > + if (!success) { > + btrfs_set_path_blocking(path); > + btrfs_tree_read_lock(b); > + btrfs_clear_path_blocking(path, b, BTRFS_READ_LOCK); > + } > + path->locks[level] = BTRFS_READ_LOCK; > +} > -- > 1.7.9.5 >Please ignore this. After debugging a little more, it turned out there''s no problem at all. Confusion came from the fact that btrfs_prev_leaf from btrfs-progs sets path->slots[0] correctly, while the kernel version doesn''t set it and forces the caller to use a slot of btrfs_header_nritems(leaf) - 1. It happened that in my test using the kernel version in btrfs-progs, I was using path->slots[0], which wasn''t modified by btrfs_prev_leaf and by coincidence using it pointed to a memory location containing the same leaf/item. Thanks Josef for some time spent on debugging this with me. -- Filipe David Manana, "Reasonable men adapt themselves to the world. Unreasonable men adapt the world to themselves. That''s why all progress depends on unreasonable men." -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html