Miao Xie
2013-Jul-03 13:25 UTC
[RFC PATCH 00/12] Btrfs-progs: introduce chunk recover function
This patchset introduces a chunk recover function, which is implemented by
scanning all the disks in the filesystem. For now, we can recover Single,
Dup and RAID1 chunks, as well as RAID0, RAID10, RAID5 and RAID6 metadata
chunks.

Miao Xie (11):
  Btrfs-progs: don't close the file descriptor 0 when closing a device
  Btrfs-progs: Don't free the devices when close the ctree
  Btrfs-progs: cleanup similar code in open_ctree_* and close_ctree
  Btrfs-progs: introduce common insert/search/delete functions for rb-tree
  Btrfs-progs: use rb-tree instead of extent cache tree for fs/file roots
  Btrfs-progs: extend the extent cache for the device extent
  Btrfs-progs: Add block group check funtion
  Btrfs-progs: Add chunk recover function - using old chunk items
  Btrfs-progs: introduce list_{first, next}_entry/list_splice_tail{_init}
  Btrfs-progs: Add chunk rebuild function for RAID1/SINGLE/DUP
  Btrfs-progs: recover raid0/raid10/raid5/raid6 metadata chunk

Wang Shilong (1):
  Btrfs-progs: fix missing recow roots when making btrfs filesystem

 Makefile          |    4 +-
 btrfs-find-root.c |  155 +----
 btrfs-list.c      |   19 +-
 btrfs.c           |    1 +
 btrfsck.h         |  183 ++++++
 cmds-check.c      |  810 +++++++++++++++++++----
 cmds-chunk.c      | 1837 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 commands.h        |    2 +
 ctree.h           |    4 +-
 disk-io.c         |  581 +++++++++--------
 disk-io.h         |   15 +-
 extent-cache.c    |  262 +++++---
 extent-cache.h    |   46 +-
 extent-tree.c     |    6 -
 extent_io.c       |   64 +-
 extent_io.h       |    6 +
 list.h            |   68 +-
 mkfs.c            |   56 +-
 rbtree.c          |   63 ++
 rbtree.h          |   24 +-
 repair.c          |    2 +-
 volumes.c         |   69 +-
 volumes.h         |    9 +-
 23 files changed, 3547 insertions(+), 739 deletions(-)
 create mode 100644 btrfsck.h
 create mode 100644 cmds-chunk.c

-- 
1.8.1.4

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Miao Xie
2013-Jul-03 13:25 UTC
[PATCH 01/12] Btrfs-progs: fix missing recow roots when making btrfs filesystem
From: Wang Shilong <wangsl-fnst@cn.fujitsu.com>

When making a btrfs filesystem, we first write each root leaf to its
specified location, and then we recow the roots. If we don't recow, some
trees are not in the correct block group.

Steps to reproduce:
	dd if=/dev/zero of=test.img bs=1M count=100
	mkfs.btrfs -f test.img
	btrfs-debug-tree test.img

extent tree key (EXTENT_TREE ROOT_ITEM 0)
leaf 4210688 items 10 free space 3349 generation 4 owner 2
fs uuid 2e08fd93-f24d-4f44-a226-e2116fcd544f
chunk uuid dc482988-6246-46ce-9329-68bcf6d3683c
	item 0 key (0 BLOCK_GROUP_ITEM 4194304) itemoff 3971 itemsize 24
		block group used 12288 chunk_objectid 256 flags 2
[..snip..]
	item 3 key (1138688 EXTENT_ITEM 4096) itemoff 3827 itemsize 42
		extent refs 1 gen 1 flags 2
		tree block key (0 UNKNOWN.0 0) level 0
	item 4 key (1138688 TREE_BLOCK_REF 7) itemoff 3827 itemsize 0
		tree block backref
[..snip..]
checksum tree key (CSUM_TREE ROOT_ITEM 0)
leaf 1138688 items 0 free space 3995 generation 1 owner 7
fs uuid 2e08fd93-f24d-4f44-a226-e2116fcd544f
chunk uuid dc482988-6246-46ce-9329-68bcf6d3683c

In the above example, the csum root leaf ends up in the system block
group, which is wrong; the csum root leaf should be in the metadata block
group.

Signed-off-by: Wang Shilong <wangsl-fnst@cn.fujitsu.com>
Reviewed-by: Miao Xie <miaox@cn.fujitsu.com>
---
 mkfs.c | 56 +++++++++++++++++++++-----------------------------------
 1 file changed, 21 insertions(+), 35 deletions(-)

diff --git a/mkfs.c b/mkfs.c
index 7ff60e5..b412b7e 100644
--- a/mkfs.c
+++ b/mkfs.c
@@ -145,45 +145,31 @@ err:
 	return ret;
 }
 
-static int recow_roots(struct btrfs_trans_handle *trans,
-			      struct btrfs_root *root)
+static void __recow_root(struct btrfs_trans_handle *trans,
+			 struct btrfs_root *root)
 {
 	int ret;
 	struct extent_buffer *tmp;
-	struct btrfs_fs_info *info = root->fs_info;
-
-	ret = __btrfs_cow_block(trans, info->fs_root, info->fs_root->node,
-				NULL, 0, &tmp, 0, 0);
-	BUG_ON(ret);
-	free_extent_buffer(tmp);
-
-	ret = __btrfs_cow_block(trans, info->tree_root, info->tree_root->node,
-				NULL, 0, &tmp, 0, 0);
-	BUG_ON(ret);
-	free_extent_buffer(tmp);
-
-	ret = __btrfs_cow_block(trans, info->extent_root,
-				info->extent_root->node, NULL, 0, &tmp, 0, 0);
-	BUG_ON(ret);
-	free_extent_buffer(tmp);
-
-	ret = __btrfs_cow_block(trans, info->chunk_root, info->chunk_root->node,
-				NULL, 0, &tmp, 0, 0);
-	BUG_ON(ret);
-	free_extent_buffer(tmp);

+	if (trans->transid != btrfs_root_generation(&root->root_item)) {
+		ret = __btrfs_cow_block(trans, root, root->node,
+					NULL, 0, &tmp, 0, 0);
+		BUG_ON(ret);
+		free_extent_buffer(tmp);
+	}
+}

-	ret = __btrfs_cow_block(trans, info->dev_root, info->dev_root->node,
-				NULL, 0, &tmp, 0, 0);
-	BUG_ON(ret);
-	free_extent_buffer(tmp);
-
-	ret = __btrfs_cow_block(trans, info->csum_root, info->csum_root->node,
-				NULL, 0, &tmp, 0, 0);
-	BUG_ON(ret);
-	free_extent_buffer(tmp);
+static void recow_roots(struct btrfs_trans_handle *trans,
+			struct btrfs_root *root)
+{
+	struct btrfs_fs_info *info = root->fs_info;

-	return 0;
+	__recow_root(trans, info->fs_root);
+	__recow_root(trans, info->tree_root);
+	__recow_root(trans, info->extent_root);
+	__recow_root(trans, info->chunk_root);
+	__recow_root(trans, info->dev_root);
+	__recow_root(trans, info->csum_root);
 }

 static int create_one_raid_group(struct btrfs_trans_handle *trans,
@@ -281,8 +267,6 @@ static int create_raid_groups(struct btrfs_trans_handle *trans,
 					    (allowed & metadata_profile));
 		BUG_ON(ret);

-		ret = recow_roots(trans, root);
-		BUG_ON(ret);
 	}
 	if (!mixed && num_devices > 1 && (allowed & data_profile)) {
 		ret = create_one_raid_group(trans, root,
@@ -290,6 +274,8 @@ static int create_raid_groups(struct btrfs_trans_handle *trans,
 					    (allowed & data_profile));
 		BUG_ON(ret);
 	}
+	recow_roots(trans, root);
+
 	return 0;
 }

-- 
1.8.1.4
Miao Xie
2013-Jul-03 13:25 UTC
[PATCH 02/12] Btrfs-progs: don't close the file descriptor 0 when closing a device
As we know, file descriptor 0 is special (it is the standard input), so we
shouldn't use it as the initial value of a device's file descriptor, or we
might close this special file descriptor by mistake when we close the
devices. "-1" is a better choice.

Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
---
 btrfs-find-root.c | 5 ++++-
 disk-io.c         | 5 +++--
 volumes.c         | 7 +++++--
 3 files changed, 12 insertions(+), 5 deletions(-)

diff --git a/btrfs-find-root.c b/btrfs-find-root.c
index 810d835..3e1396d 100644
--- a/btrfs-find-root.c
+++ b/btrfs-find-root.c
@@ -76,7 +76,10 @@ static int close_all_devices(struct btrfs_fs_info *fs_info)
 	list = &fs_info->fs_devices->devices;
 	list_for_each(next, list) {
 		device = list_entry(next, struct btrfs_device, dev_list);
-		close(device->fd);
+		if (device->fd != -1) {
+			close(device->fd);
+			device->fd = -1;
+		}
 	}
 	return 0;
 }
diff --git a/disk-io.c b/disk-io.c
index 9ffe6e4..4003636 100644
--- a/disk-io.c
+++ b/disk-io.c
@@ -1270,12 +1270,13 @@ static int close_all_devices(struct btrfs_fs_info *fs_info)
 	while (!list_empty(list)) {
 		device = list_entry(list->next, struct btrfs_device, dev_list);
 		list_del_init(&device->dev_list);
-		if (device->fd) {
+		if (device->fd != -1) {
 			fsync(device->fd);
 			if (posix_fadvise(device->fd, 0, 0, POSIX_FADV_DONTNEED))
 				fprintf(stderr, "Warning, could not drop caches\n");
+			close(device->fd);
+			device->fd = -1;
 		}
-		close(device->fd);
 		kfree(device->name);
 		kfree(device->label);
 		kfree(device);
diff --git a/volumes.c b/volumes.c
index d6f81f8..b88385b 100644
--- a/volumes.c
+++ b/volumes.c
@@ -116,6 +116,7 @@ static int device_list_add(const char *path,
 			/* we can safely leave the fs_devices entry around */
 			return -ENOMEM;
 		}
+		device->fd = -1;
 		device->devid = devid;
 		memcpy(device->uuid, disk_super->dev_item.uuid,
 		       BTRFS_UUID_SIZE);
@@ -161,8 +162,10 @@ int btrfs_close_devices(struct btrfs_fs_devices *fs_devices)
 again:
 	list_for_each(cur, &fs_devices->devices) {
 		device = list_entry(cur, struct btrfs_device, dev_list);
-		close(device->fd);
-		device->fd = -1;
+		if (device->fd != -1) {
+			close(device->fd);
+			device->fd = -1;
+		}
 		device->writeable = 0;
 	}

-- 
1.8.1.4
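For readers skimming the thread, the idea behind this patch can be sketched outside btrfs-progs. The snippet below is only an illustration under stated assumptions: `struct demo_device`, `demo_device_init()` and `demo_device_close()` are hypothetical names that do not exist in btrfs-progs, and the real change is the diff above. It shows why -1 is a safer "not open" value than 0: a zero-filled structure would otherwise name standard input, and an unguarded close() would close it.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical stand-in for struct btrfs_device; only the fd matters here. */
struct demo_device {
	int fd;
};

/* Mark the device as "not open". 0 would alias standard input. */
static void demo_device_init(struct demo_device *dev)
{
	dev->fd = -1;
}

/*
 * Close the descriptor only if one was really opened.
 * Returns 1 if a descriptor was closed, 0 if there was nothing to close.
 */
static int demo_device_close(struct demo_device *dev)
{
	if (dev->fd == -1)
		return 0;
	close(dev->fd);
	dev->fd = -1;	/* makes a second close a harmless no-op */
	return 1;
}
```

With this shape, calling demo_device_close() on a device that was never opened, or calling it twice, does not touch any descriptor.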
Miao Xie
2013-Jul-03 13:25 UTC
[PATCH 03/12] Btrfs-progs: Don't free the devices when close the ctree
Some commands (such as btrfs-convert) access the devices again after we
close the ctree, so it is better not to free the device objects when the
ctree is closed; otherwise we would need to re-allocate the memory for the
devices. We needn't worry about a memory leak, because all the memory is
freed when the task exits.

Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
---
 btrfs-find-root.c | 21 +--------------------
 disk-io.c         | 30 ++----------------------------
 volumes.c         |  3 +++
 3 files changed, 6 insertions(+), 48 deletions(-)

diff --git a/btrfs-find-root.c b/btrfs-find-root.c
index 3e1396d..da22c1d 100644
--- a/btrfs-find-root.c
+++ b/btrfs-find-root.c
@@ -65,25 +65,6 @@ int csum_block(void *buf, u32 len)
 	return ret;
 }

-static int close_all_devices(struct btrfs_fs_info *fs_info)
-{
-	struct list_head *list;
-	struct list_head *next;
-	struct btrfs_device *device;
-
-	return 0;
-
-	list = &fs_info->fs_devices->devices;
-	list_for_each(next, list) {
-		device = list_entry(next, struct btrfs_device, dev_list);
-		if (device->fd != -1) {
-			close(device->fd);
-			device->fd = -1;
-		}
-	}
-	return 0;
-}
-
 static struct btrfs_root *open_ctree_broken(int fd, const char *device)
 {
 	u32 sectorsize;
@@ -217,7 +198,7 @@ static struct btrfs_root *open_ctree_broken(int fd, const char *device)
 out_chunk:
 	free_extent_buffer(fs_info->chunk_root->node);
 out_devices:
-	close_all_devices(fs_info);
+	btrfs_close_devices(fs_info->fs_devices);
 out_cleanup:
 	extent_io_tree_cleanup(&fs_info->extent_cache);
 	extent_io_tree_cleanup(&fs_info->free_space_cache);
diff --git a/disk-io.c b/disk-io.c
index 4003636..a8176a5 100644
--- a/disk-io.c
+++ b/disk-io.c
@@ -35,8 +35,6 @@
 #include "utils.h"
 #include "print-tree.h"

-static int close_all_devices(struct btrfs_fs_info *fs_info);
-
 static int check_tree_block(struct btrfs_root *root, struct extent_buffer *buf)
 {

@@ -1028,7 +1026,7 @@ out_chunk:
 	if (fs_info->chunk_root)
 		free_extent_buffer(fs_info->chunk_root->node);
 out_devices:
-	close_all_devices(fs_info);
+	btrfs_close_devices(fs_info->fs_devices);
 out_cleanup:
 	extent_io_tree_cleanup(&fs_info->extent_cache);
 	extent_io_tree_cleanup(&fs_info->free_space_cache);
@@ -1261,30 +1259,6 @@ int write_ctree_super(struct btrfs_trans_handle *trans,
 	return ret;
 }

-static int close_all_devices(struct btrfs_fs_info *fs_info)
-{
-	struct list_head *list;
-	struct btrfs_device *device;
-
-	list = &fs_info->fs_devices->devices;
-	while (!list_empty(list)) {
-		device = list_entry(list->next, struct btrfs_device, dev_list);
-		list_del_init(&device->dev_list);
-		if (device->fd != -1) {
-			fsync(device->fd);
-			if (posix_fadvise(device->fd, 0, 0, POSIX_FADV_DONTNEED))
-				fprintf(stderr, "Warning, could not drop caches\n");
-			close(device->fd);
-			device->fd = -1;
-		}
-		kfree(device->name);
-		kfree(device->label);
-		kfree(device);
-	}
-	kfree(fs_info->fs_devices);
-	return 0;
-}
-
 static void free_mapping_cache(struct btrfs_fs_info *fs_info)
 {
 	struct cache_tree *cache_tree = &fs_info->mapping_tree.cache_tree;
@@ -1337,7 +1311,7 @@ int close_ctree(struct btrfs_root *root)
 		free(fs_info->log_root_tree);
 	}

-	close_all_devices(fs_info);
+	btrfs_close_devices(fs_info->fs_devices);
 	free_mapping_cache(fs_info);
 	extent_io_tree_cleanup(&fs_info->extent_cache);
 	extent_io_tree_cleanup(&fs_info->free_space_cache);
diff --git a/volumes.c b/volumes.c
index b88385b..0f6a35b 100644
--- a/volumes.c
+++ b/volumes.c
@@ -163,6 +163,9 @@ again:
 	list_for_each(cur, &fs_devices->devices) {
 		device = list_entry(cur, struct btrfs_device, dev_list);
 		if (device->fd != -1) {
+			fsync(device->fd);
+			if (posix_fadvise(device->fd, 0, 0, POSIX_FADV_DONTNEED))
+				fprintf(stderr, "Warning, could not drop caches\n");
 			close(device->fd);
 			device->fd = -1;
 		}
-- 
1.8.1.4
Miao Xie
2013-Jul-03 13:25 UTC
[PATCH 04/12] Btrfs-progs: cleanup similar code in open_ctree_* and close_ctree
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
---
 btrfs-find-root.c | 137 +++-------------
 disk-io.c         | 473 +++++++++++++++++++++++++++++++-----------------------
 disk-io.h         |  12 ++
 3 files changed, 307 insertions(+), 315 deletions(-)

diff --git a/btrfs-find-root.c b/btrfs-find-root.c
index da22c1d..f2cc1bf 100644
--- a/btrfs-find-root.c
+++ b/btrfs-find-root.c
@@ -67,74 +67,31 @@ int csum_block(void *buf, u32 len)

 static struct btrfs_root *open_ctree_broken(int fd, const char *device)
 {
-	u32 sectorsize;
-	u32 nodesize;
-	u32 leafsize;
-	u32 blocksize;
-	u32 stripesize;
-	u64 generation;
-	struct btrfs_root *tree_root = malloc(sizeof(struct btrfs_root));
-	struct btrfs_root *extent_root = malloc(sizeof(struct btrfs_root));
-	struct btrfs_root *chunk_root = malloc(sizeof(struct btrfs_root));
-	struct btrfs_root *dev_root = malloc(sizeof(struct btrfs_root));
-	struct btrfs_root *csum_root = malloc(sizeof(struct btrfs_root));
-	struct btrfs_fs_info *fs_info = malloc(sizeof(*fs_info));
-	int ret;
+	struct btrfs_fs_info *fs_info;
 	struct btrfs_super_block *disk_super;
 	struct btrfs_fs_devices *fs_devices = NULL;
-	u64 total_devs;
-	u64 features;
-
-	ret = btrfs_scan_one_device(fd, device, &fs_devices,
-				    &total_devs, BTRFS_SUPER_INFO_OFFSET);
+	struct extent_buffer *eb;
+	int ret;

-	if (ret) {
-		fprintf(stderr, "No valid Btrfs found on %s\n", device);
-		goto out;
+	fs_info = btrfs_new_fs_info(0, BTRFS_SUPER_INFO_OFFSET);
+	if (!fs_info) {
+		fprintf(stderr, "Failed to allocate memory for fs_info\n");
+		return NULL;
 	}

-	if (total_devs != 1) {
-		ret = btrfs_scan_for_fsid(fs_devices, total_devs, 1);
-		if (ret)
-			goto out;
-	}
-
-	memset(fs_info, 0, sizeof(*fs_info));
-	fs_info->super_copy = calloc(1, BTRFS_SUPER_INFO_SIZE);
-	fs_info->tree_root = tree_root;
-	fs_info->extent_root = extent_root;
-	fs_info->chunk_root = chunk_root;
-	fs_info->dev_root = dev_root;
-	fs_info->csum_root = csum_root;
-
-	fs_info->readonly = 1;
-
-	extent_io_tree_init(&fs_info->extent_cache);
-	extent_io_tree_init(&fs_info->free_space_cache);
-	extent_io_tree_init(&fs_info->block_group_cache);
-	extent_io_tree_init(&fs_info->pinned_extents);
-	extent_io_tree_init(&fs_info->pending_del);
-	extent_io_tree_init(&fs_info->extent_ins);
-	cache_tree_init(&fs_info->fs_root_cache);
-
-	cache_tree_init(&fs_info->mapping_tree.cache_tree);
+	ret = btrfs_scan_fs_devices(fd, device, &fs_devices);
+	if (ret)
+		goto out;

-	mutex_init(&fs_info->fs_mutex);
 	fs_info->fs_devices = fs_devices;
-	INIT_LIST_HEAD(&fs_info->dirty_cowonly_roots);
-	INIT_LIST_HEAD(&fs_info->space_info);
-
-	__setup_root(4096, 4096, 4096, 4096, tree_root,
-		     fs_info, BTRFS_ROOT_TREE_OBJECTID);

 	ret = btrfs_open_devices(fs_devices, O_RDONLY);
 	if (ret)
-		goto out_cleanup;
+		goto out_devices;

-	fs_info->super_bytenr = BTRFS_SUPER_INFO_OFFSET;
 	disk_super = fs_info->super_copy;
 	ret = btrfs_read_dev_super(fs_devices->latest_bdev,
-				   disk_super, BTRFS_SUPER_INFO_OFFSET);
+				   disk_super, fs_info->super_bytenr);
 	if (ret) {
 		printk("No valid btrfs found\n");
 		goto out_devices;
@@ -142,77 +99,27 @@ static struct btrfs_root *open_ctree_broken(int fd, const char *device)

 	memcpy(fs_info->fsid, &disk_super->fsid, BTRFS_FSID_SIZE);

-
-	features = btrfs_super_incompat_flags(disk_super) &
-		   ~BTRFS_FEATURE_INCOMPAT_SUPP;
-	if (features) {
-		printk("couldn't open because of unsupported "
-		       "option features (%Lx).\n", features);
-		goto out_devices;
-	}
-
-	features = btrfs_super_incompat_flags(disk_super);
-	if (!(features & BTRFS_FEATURE_INCOMPAT_MIXED_BACKREF)) {
-		features |= BTRFS_FEATURE_INCOMPAT_MIXED_BACKREF;
-		btrfs_set_super_incompat_flags(disk_super, features);
-	}
-
-	nodesize = btrfs_super_nodesize(disk_super);
-	leafsize = btrfs_super_leafsize(disk_super);
-	sectorsize = btrfs_super_sectorsize(disk_super);
-	stripesize = btrfs_super_stripesize(disk_super);
-	tree_root->nodesize = nodesize;
-	tree_root->leafsize = leafsize;
-	tree_root->sectorsize = sectorsize;
-	tree_root->stripesize = stripesize;
-
-	ret = btrfs_read_sys_array(tree_root);
+	ret = btrfs_check_fs_compatibility(disk_super, 0);
 	if (ret)
 		goto out_devices;
-	blocksize = btrfs_level_size(tree_root,
-				     btrfs_super_chunk_root_level(disk_super));
-	generation = btrfs_super_chunk_root_generation(disk_super);
-
-	__setup_root(nodesize, leafsize, sectorsize, stripesize,
-		     chunk_root, fs_info, BTRFS_CHUNK_TREE_OBJECTID);
-
-	chunk_root->node = read_tree_block(chunk_root,
-					   btrfs_super_chunk_root(disk_super),
-					   blocksize, generation);
-	if (!chunk_root->node) {
-		printk("Couldn't read chunk root\n");
-		goto out_devices;
-	}

-	read_extent_buffer(chunk_root->node, fs_info->chunk_tree_uuid,
-		(unsigned long)btrfs_header_chunk_tree_uuid(chunk_root->node),
-		BTRFS_UUID_SIZE);
+	ret = btrfs_setup_chunk_tree_and_device_map(fs_info);
+	if (ret)
+		goto out_chunk;

-	if (!(btrfs_super_flags(disk_super) & BTRFS_SUPER_FLAG_METADUMP)) {
-		ret = btrfs_read_chunk_tree(chunk_root);
-		if (ret)
-			goto out_chunk;
-	}
+	eb = fs_info->chunk_root->node;
+	read_extent_buffer(eb, fs_info->chunk_tree_uuid,
+			   (unsigned long)btrfs_header_chunk_tree_uuid(eb),
+			   BTRFS_UUID_SIZE);

 	return fs_info->chunk_root;
 out_chunk:
 	free_extent_buffer(fs_info->chunk_root->node);
+	btrfs_cleanup_all_caches(fs_info);
 out_devices:
 	btrfs_close_devices(fs_info->fs_devices);
-out_cleanup:
-	extent_io_tree_cleanup(&fs_info->extent_cache);
-	extent_io_tree_cleanup(&fs_info->free_space_cache);
-	extent_io_tree_cleanup(&fs_info->block_group_cache);
-	extent_io_tree_cleanup(&fs_info->pinned_extents);
-	extent_io_tree_cleanup(&fs_info->pending_del);
-	extent_io_tree_cleanup(&fs_info->extent_ins);
 out:
-	free(tree_root);
-	free(extent_root);
-	free(chunk_root);
-	free(dev_root);
-	free(csum_root);
-	free(fs_info);
+	btrfs_free_fs_info(fs_info);
 	return NULL;
 }

diff --git a/disk-io.c b/disk-io.c
index a8176a5..8116afc 100644
--- a/disk-io.c
+++ b/disk-io.c
@@ -796,61 +796,46 @@ struct btrfs_root *btrfs_read_fs_root(struct btrfs_fs_info *fs_info,
 	return root;
 }

-static struct btrfs_fs_info *__open_ctree_fd(int fp, const char *path,
-					     u64 sb_bytenr,
-					     u64 root_tree_bytenr, int writes,
-					     int partial)
+void btrfs_free_fs_info(struct btrfs_fs_info *fs_info)
 {
-	u32 sectorsize;
-	u32 nodesize;
-	u32 leafsize;
-	u32 blocksize;
-	u32 stripesize;
-	u64 generation;
-	struct btrfs_key key;
-	struct btrfs_root *tree_root = malloc(sizeof(struct btrfs_root));
-	struct btrfs_root *extent_root = malloc(sizeof(struct btrfs_root));
-	struct btrfs_root *chunk_root = malloc(sizeof(struct btrfs_root));
-	struct btrfs_root *dev_root = malloc(sizeof(struct btrfs_root));
-	struct btrfs_root *csum_root = malloc(sizeof(struct btrfs_root));
-	struct btrfs_fs_info *fs_info = malloc(sizeof(*fs_info));
-	int ret;
-	struct btrfs_super_block *disk_super;
-	struct btrfs_fs_devices *fs_devices = NULL;
-	u64 total_devs;
-	u64 features;
-
-	if (sb_bytenr == 0)
-		sb_bytenr = BTRFS_SUPER_INFO_OFFSET;
+	free(fs_info->tree_root);
+	free(fs_info->extent_root);
+	free(fs_info->chunk_root);
+	free(fs_info->dev_root);
+	free(fs_info->csum_root);
+	free(fs_info->super_copy);
+	free(fs_info->log_root_tree);
+	free(fs_info);
+}

-	/* try to drop all the caches */
-	if (posix_fadvise(fp, 0, 0, POSIX_FADV_DONTNEED))
-		fprintf(stderr, "Warning, could not drop caches\n");
+struct btrfs_fs_info *btrfs_new_fs_info(int writable, u64 sb_bytenr)
+{
+	struct btrfs_fs_info *fs_info;

-	ret = btrfs_scan_one_device(fp, path, &fs_devices,
-				    &total_devs, sb_bytenr);
+	fs_info = malloc(sizeof(struct btrfs_fs_info));
+	if (!fs_info)
+		return NULL;

-	if (ret) {
-		fprintf(stderr, "No valid Btrfs found on %s\n", path);
-		goto out;
-	}
+	memset(fs_info, 0, sizeof(struct btrfs_fs_info));

-	if (total_devs != 1) {
-		ret = btrfs_scan_for_fsid(fs_devices, total_devs, 1);
-		if (ret)
-			goto out;
-	}
+	fs_info->tree_root = malloc(sizeof(struct btrfs_root));
+	fs_info->extent_root = malloc(sizeof(struct btrfs_root));
+	fs_info->chunk_root = malloc(sizeof(struct btrfs_root));
+	fs_info->dev_root = malloc(sizeof(struct btrfs_root));
+	fs_info->csum_root = malloc(sizeof(struct btrfs_root));
+	fs_info->super_copy = malloc(BTRFS_SUPER_INFO_SIZE);

-	memset(fs_info, 0, sizeof(*fs_info));
-	fs_info->super_copy = calloc(1, BTRFS_SUPER_INFO_SIZE);
-	fs_info->tree_root = tree_root;
-	fs_info->extent_root = extent_root;
-	fs_info->chunk_root = chunk_root;
-	fs_info->dev_root = dev_root;
-	fs_info->csum_root = csum_root;
+	if (!fs_info->tree_root || !fs_info->extent_root ||
+	    !fs_info->chunk_root || !fs_info->dev_root ||
+	    !fs_info->csum_root || !fs_info->super_copy)
+		goto free_all;

-	if (!writes)
-		fs_info->readonly = 1;
+	memset(fs_info->super_copy, 0, BTRFS_SUPER_INFO_SIZE);
+	memset(fs_info->tree_root, 0, sizeof(struct btrfs_root));
+	memset(fs_info->extent_root, 0, sizeof(struct btrfs_root));
+	memset(fs_info->chunk_root, 0, sizeof(struct btrfs_root));
+	memset(fs_info->dev_root, 0, sizeof(struct btrfs_root));
+	memset(fs_info->csum_root, 0, sizeof(struct btrfs_root));

 	extent_io_tree_init(&fs_info->extent_cache);
 	extent_io_tree_init(&fs_info->free_space_cache);
@@ -858,139 +843,121 @@ static struct btrfs_fs_info *__open_ctree_fd(int fp, const char *path,
 	extent_io_tree_init(&fs_info->pinned_extents);
 	extent_io_tree_init(&fs_info->pending_del);
 	extent_io_tree_init(&fs_info->extent_ins);
-	cache_tree_init(&fs_info->fs_root_cache);

+	cache_tree_init(&fs_info->fs_root_cache);
 	cache_tree_init(&fs_info->mapping_tree.cache_tree);

 	mutex_init(&fs_info->fs_mutex);
-	fs_info->fs_devices = fs_devices;
 	INIT_LIST_HEAD(&fs_info->dirty_cowonly_roots);
 	INIT_LIST_HEAD(&fs_info->space_info);

-	__setup_root(4096, 4096, 4096, 4096, tree_root,
-		     fs_info, BTRFS_ROOT_TREE_OBJECTID);
-
-	if (writes)
-		ret = btrfs_open_devices(fs_devices, O_RDWR);
-	else
-		ret = btrfs_open_devices(fs_devices, O_RDONLY);
-	if (ret)
-		goto out_cleanup;
+	if (!writable)
+		fs_info->readonly = 1;

 	fs_info->super_bytenr = sb_bytenr;
-	disk_super = fs_info->super_copy;
-	ret = btrfs_read_dev_super(fs_devices->latest_bdev,
-				   disk_super, sb_bytenr);
-	if (ret) {
-		printk("No valid btrfs found\n");
-		goto out_devices;
-	}
-
-	memcpy(fs_info->fsid, &disk_super->fsid, BTRFS_FSID_SIZE);
+	fs_info->data_alloc_profile = (u64)-1;
+	fs_info->metadata_alloc_profile = (u64)-1;
+	fs_info->system_alloc_profile = fs_info->metadata_alloc_profile;
+	return fs_info;
+free_all:
+	btrfs_free_fs_info(fs_info);
+	return NULL;
+}

+int btrfs_check_fs_compatibility(struct btrfs_super_block *sb, int writable)
+{
+	u64 features;

-	features = btrfs_super_incompat_flags(disk_super) &
+	features = btrfs_super_incompat_flags(sb) &
 		   ~BTRFS_FEATURE_INCOMPAT_SUPP;
 	if (features) {
 		printk("couldn't open because of unsupported "
 		       "option features (%Lx).\n",
 		       (unsigned long long)features);
-		goto out_devices;
+		return -ENOTSUP;
 	}

-	features = btrfs_super_incompat_flags(disk_super);
+	features = btrfs_super_incompat_flags(sb);
 	if (!(features & BTRFS_FEATURE_INCOMPAT_MIXED_BACKREF)) {
 		features |= BTRFS_FEATURE_INCOMPAT_MIXED_BACKREF;
-		btrfs_set_super_incompat_flags(disk_super, features);
+		btrfs_set_super_incompat_flags(sb, features);
 	}

-	features = btrfs_super_compat_ro_flags(disk_super) &
+	features = btrfs_super_compat_ro_flags(sb) &
 		~BTRFS_FEATURE_COMPAT_RO_SUPP;
-	if (writes && features) {
+	if (writable && features) {
 		printk("couldn't open RDWR because of unsupported "
 		       "option features (%Lx).\n",
 		       (unsigned long long)features);
-		goto out_devices;
+		return -ENOTSUP;
 	}
+	return 0;
+}

-	nodesize = btrfs_super_nodesize(disk_super);
-	leafsize = btrfs_super_leafsize(disk_super);
-	sectorsize = btrfs_super_sectorsize(disk_super);
-	stripesize = btrfs_super_stripesize(disk_super);
-	tree_root->nodesize = nodesize;
-	tree_root->leafsize = leafsize;
-	tree_root->sectorsize = sectorsize;
-	tree_root->stripesize = stripesize;
+int btrfs_setup_all_roots(struct btrfs_fs_info *fs_info,
+			  u64 root_tree_bytenr, int partial)
+{
+	struct btrfs_super_block *sb = fs_info->super_copy;
+	struct btrfs_root *root;
+	struct btrfs_key key;
+	u32 sectorsize;
+	u32 nodesize;
+	u32 leafsize;
+	u32 stripesize;
+	u64 generation;
+	u32 blocksize;
+	int ret;

-	ret = btrfs_read_sys_array(tree_root);
-	if (ret)
-		goto out_devices;
-	blocksize = btrfs_level_size(tree_root,
-				     btrfs_super_chunk_root_level(disk_super));
-	generation = btrfs_super_chunk_root_generation(disk_super);
+	nodesize = btrfs_super_nodesize(sb);
+	leafsize = btrfs_super_leafsize(sb);
+	sectorsize = btrfs_super_sectorsize(sb);
+	stripesize = btrfs_super_stripesize(sb);

+	root = fs_info->tree_root;
 	__setup_root(nodesize, leafsize, sectorsize, stripesize,
-		     chunk_root, fs_info, BTRFS_CHUNK_TREE_OBJECTID);
-
-	chunk_root->node = read_tree_block(chunk_root,
-					   btrfs_super_chunk_root(disk_super),
-					   blocksize, generation);
-	if (!extent_buffer_uptodate(chunk_root->node)) {
-		printk("Couldn't read chunk root\n");
-		goto out_devices;
-	}
-
-	read_extent_buffer(chunk_root->node, fs_info->chunk_tree_uuid,
-		(unsigned long)btrfs_header_chunk_tree_uuid(chunk_root->node),
-		BTRFS_UUID_SIZE);
-
-	if (!(btrfs_super_flags(disk_super) & BTRFS_SUPER_FLAG_METADUMP)) {
-		ret = btrfs_read_chunk_tree(chunk_root);
-		if (ret) {
-			printk("Couldn't read chunk tree\n");
-			goto out_chunk;
-		}
-	}
-
-	blocksize = btrfs_level_size(tree_root,
-				     btrfs_super_root_level(disk_super));
-	generation = btrfs_super_generation(disk_super);
+		     root, fs_info, BTRFS_ROOT_TREE_OBJECTID);
+	blocksize = btrfs_level_size(root, btrfs_super_root_level(sb));
+	generation = btrfs_super_generation(sb);

 	if (!root_tree_bytenr)
-		root_tree_bytenr = btrfs_super_root(disk_super);
-	tree_root->node = read_tree_block(tree_root,
-					  root_tree_bytenr,
-					  blocksize, generation);
-	if (!extent_buffer_uptodate(tree_root->node)) {
-		printk("Couldn't read tree root\n");
-		goto out_failed;
+		root_tree_bytenr = btrfs_super_root(sb);
+	root->node = read_tree_block(root, root_tree_bytenr, blocksize,
+				     generation);
+	if (!extent_buffer_uptodate(root->node)) {
+		fprintf(stderr, "Couldn't read tree root\n");
+		return -EIO;
 	}
-	ret = find_and_setup_root(tree_root, fs_info,
-				  BTRFS_EXTENT_TREE_OBJECTID, extent_root);
+
+	ret = find_and_setup_root(root, fs_info, BTRFS_EXTENT_TREE_OBJECTID,
+				  fs_info->extent_root);
 	if (ret) {
 		printk("Couldn't setup extent tree\n");
-		goto out_failed;
+		return -EIO;
 	}
-	extent_root->track_dirty = 1;
+	fs_info->extent_root->track_dirty = 1;

-	ret = find_and_setup_root(tree_root, fs_info,
-				  BTRFS_DEV_TREE_OBJECTID, dev_root);
+	ret = find_and_setup_root(root, fs_info, BTRFS_DEV_TREE_OBJECTID,
+				  fs_info->dev_root);
 	if (ret) {
 		printk("Couldn't setup device tree\n");
-		goto out_failed;
+		return -EIO;
 	}
-	dev_root->track_dirty = 1;
+	fs_info->dev_root->track_dirty = 1;

-	ret = find_and_setup_root(tree_root, fs_info,
-				  BTRFS_CSUM_TREE_OBJECTID, csum_root);
+	ret = find_and_setup_root(root, fs_info, BTRFS_CSUM_TREE_OBJECTID,
+				  fs_info->csum_root);
 	if (ret) {
 		printk("Couldn't setup csum tree\n");
 		if (!partial)
-			goto out_failed;
+			return -EIO;
 	}
-	csum_root->track_dirty = 1;
+	fs_info->csum_root->track_dirty = 1;

-	find_and_setup_log_root(tree_root, fs_info, disk_super);
+	ret = find_and_setup_log_root(root, fs_info, sb);
+	if (ret) {
+		printk("Couldn't setup log root tree\n");
+		return -EIO;
+	}

 	fs_info->generation = generation;
 	fs_info->last_trans_committed = generation;
@@ -1002,18 +969,12 @@ static struct btrfs_fs_info *__open_ctree_fd(int fp, const char *path,

 	fs_info->fs_root = btrfs_read_fs_root(fs_info, &key);
 	if (!fs_info->fs_root)
-		goto out_failed;
-
-	fs_info->data_alloc_profile = (u64)-1;
-	fs_info->metadata_alloc_profile = (u64)-1;
-	fs_info->system_alloc_profile = fs_info->metadata_alloc_profile;
-
-	return fs_info;
-
-out_failed:
-	if (partial)
-		return fs_info;
+		return -EIO;
+	return 0;
+}

+void btrfs_release_all_roots(struct btrfs_fs_info *fs_info)
+{
 	if (fs_info->csum_root)
 		free_extent_buffer(fs_info->csum_root->node);
 	if (fs_info->dev_root)
@@ -1022,25 +983,179 @@ out_failed:
 		free_extent_buffer(fs_info->extent_root->node);
 	if (fs_info->tree_root)
 		free_extent_buffer(fs_info->tree_root->node);
-out_chunk:
+	if (fs_info->log_root_tree)
+		free_extent_buffer(fs_info->log_root_tree->node);
 	if (fs_info->chunk_root)
 		free_extent_buffer(fs_info->chunk_root->node);
-out_devices:
-	btrfs_close_devices(fs_info->fs_devices);
-out_cleanup:
+}
+
+static void free_mapping_cache(struct btrfs_fs_info *fs_info)
+{
+	struct cache_tree *cache_tree = &fs_info->mapping_tree.cache_tree;
+	struct cache_extent *ce;
+	struct map_lookup *map;
+
+	while ((ce = find_first_cache_extent(cache_tree, 0))) {
+		map = container_of(ce, struct map_lookup, ce);
+		remove_cache_extent(cache_tree, ce);
+		kfree(map);
+	}
+}
+
+void btrfs_cleanup_all_caches(struct btrfs_fs_info *fs_info)
+{
+	free_mapping_cache(fs_info);
 	extent_io_tree_cleanup(&fs_info->extent_cache);
 	extent_io_tree_cleanup(&fs_info->free_space_cache);
 	extent_io_tree_cleanup(&fs_info->block_group_cache);
 	extent_io_tree_cleanup(&fs_info->pinned_extents);
 	extent_io_tree_cleanup(&fs_info->pending_del);
 	extent_io_tree_cleanup(&fs_info->extent_ins);
+}
+
+int btrfs_scan_fs_devices(int fd, const char *path,
+			  struct btrfs_fs_devices **fs_devices)
+{
+	u64 total_devs;
+	int ret;
+
+	ret = btrfs_scan_one_device(fd, path, fs_devices,
+				    &total_devs, BTRFS_SUPER_INFO_OFFSET);
+	if (ret) {
+		fprintf(stderr, "No valid Btrfs found on %s\n", path);
+		return ret;
+	}
+
+	if (total_devs != 1) {
+		ret = btrfs_scan_for_fsid(*fs_devices, total_devs, 1);
+		if (ret)
+			return ret;
+	}
+	return 0;
+}
+
+int btrfs_setup_chunk_tree_and_device_map(struct btrfs_fs_info *fs_info)
+{
+	struct btrfs_super_block *sb = fs_info->super_copy;
+	u32 sectorsize;
+	u32 nodesize;
+	u32 leafsize;
+	u32 blocksize;
+	u32 stripesize;
+	u64 generation;
+	int ret;
+
+	nodesize = btrfs_super_nodesize(sb);
+	leafsize = btrfs_super_leafsize(sb);
+	sectorsize = btrfs_super_sectorsize(sb);
+	stripesize = btrfs_super_stripesize(sb);
+
+	__setup_root(nodesize, leafsize, sectorsize, stripesize,
+		     fs_info->chunk_root, fs_info, BTRFS_CHUNK_TREE_OBJECTID);
+
+	ret = btrfs_read_sys_array(fs_info->chunk_root);
+	if (ret)
+		return ret;
+
+	blocksize = btrfs_level_size(fs_info->chunk_root,
+				     btrfs_super_chunk_root_level(sb));
+	generation = btrfs_super_chunk_root_generation(sb);
+
+	fs_info->chunk_root->node = read_tree_block(fs_info->chunk_root,
+						    btrfs_super_chunk_root(sb),
+						    blocksize, generation);
+	if (!fs_info->chunk_root->node ||
+	    !extent_buffer_uptodate(fs_info->chunk_root->node)) {
+		fprintf(stderr, "Couldn't read chunk root\n");
+		return ret;
+	}
+
+	if (!(btrfs_super_flags(sb) & BTRFS_SUPER_FLAG_METADUMP)) {
+		ret = btrfs_read_chunk_tree(fs_info->chunk_root);
+		if (ret) {
+			fprintf(stderr, "Couldn't read chunk tree\n");
+			return ret;
+		}
+	}
+	return 0;
+}
+
+static struct btrfs_fs_info *__open_ctree_fd(int fp, const char *path,
+					     u64 sb_bytenr,
+					     u64 root_tree_bytenr, int writes,
+					     int partial)
+{
+	struct btrfs_fs_info *fs_info;
+	struct btrfs_super_block *disk_super;
+	struct btrfs_fs_devices *fs_devices = NULL;
+	struct extent_buffer *eb;
+	int ret;
+
+	if (sb_bytenr == 0)
+		sb_bytenr = BTRFS_SUPER_INFO_OFFSET;
+
+	/* try to drop all the caches */
+	if (posix_fadvise(fp, 0, 0, POSIX_FADV_DONTNEED))
+		fprintf(stderr, "Warning, could not drop caches\n");
+
+	fs_info = btrfs_new_fs_info(writes, sb_bytenr);
+	if (!fs_info) {
+		fprintf(stderr, "Failed to allocate memory for fs_info\n");
+		return NULL;
+	}
+
+	ret = btrfs_scan_fs_devices(fp, path, &fs_devices);
+	if (ret)
+		goto out;
+
+	fs_info->fs_devices = fs_devices;
+	if (writes)
+		ret = btrfs_open_devices(fs_devices, O_RDWR);
+	else
+		ret = btrfs_open_devices(fs_devices, O_RDONLY);
+	if (ret)
+		goto out_devices;
+
+
+	disk_super = fs_info->super_copy;
+	ret = btrfs_read_dev_super(fs_devices->latest_bdev,
+				   disk_super, sb_bytenr);
+	if (ret) {
+		printk("No valid btrfs found\n");
+		goto out_devices;
+	}
+
+	memcpy(fs_info->fsid, &disk_super->fsid, BTRFS_FSID_SIZE);
+
+	ret = btrfs_check_fs_compatibility(fs_info->super_copy, writes);
+	if (ret)
+		goto out_devices;
+
+	ret = btrfs_setup_chunk_tree_and_device_map(fs_info);
+	if (ret)
+		goto out_chunk;
+
+	eb = fs_info->chunk_root->node;
+	read_extent_buffer(eb, fs_info->chunk_tree_uuid,
+			   (unsigned long)btrfs_header_chunk_tree_uuid(eb),
+			   BTRFS_UUID_SIZE);
+
+	ret = btrfs_setup_all_roots(fs_info, root_tree_bytenr, partial);
+	if (ret)
+		goto out_failed;
+
+	return fs_info;
+
+out_failed:
+	if (partial)
+		return fs_info;
+out_chunk:
+	btrfs_release_all_roots(fs_info);
+	btrfs_cleanup_all_caches(fs_info);
+out_devices:
+	btrfs_close_devices(fs_devices);
 out:
-	free(tree_root);
-	free(extent_root);
-	free(chunk_root);
-	free(dev_root);
-	free(csum_root);
-	free(fs_info);
+	btrfs_free_fs_info(fs_info);
 	return NULL;
 }

@@ -1259,19 +1374,6 @@ int write_ctree_super(struct btrfs_trans_handle *trans,
 	return ret;
 }

-static void free_mapping_cache(struct btrfs_fs_info *fs_info)
-{
-	struct cache_tree *cache_tree = &fs_info->mapping_tree.cache_tree;
-	struct cache_extent *ce;
-	struct map_lookup *map;
-
-	while ((ce = find_first_cache_extent(cache_tree, 0))) {
-		map = container_of(ce, struct map_lookup, ce);
-		remove_cache_extent(cache_tree, ce);
-		kfree(map);
-	}
-}
-
 int close_ctree(struct btrfs_root *root)
 {
 	int ret;
@@ -1294,39 +1396,10 @@ int close_ctree(struct btrfs_root *root)

 	free_fs_roots(fs_info);

-	if (fs_info->extent_root->node)
-		free_extent_buffer(fs_info->extent_root->node);
-	if (fs_info->tree_root->node)
-		free_extent_buffer(fs_info->tree_root->node);
-	if (fs_info->chunk_root->node)
-		free_extent_buffer(fs_info->chunk_root->node);
-	if (fs_info->dev_root->node)
-		free_extent_buffer(fs_info->dev_root->node);
-	if (fs_info->csum_root->node)
-		free_extent_buffer(fs_info->csum_root->node);
-
-	if (fs_info->log_root_tree) {
-		if (fs_info->log_root_tree->node)
-			free_extent_buffer(fs_info->log_root_tree->node);
-		free(fs_info->log_root_tree);
-	}
-
+	btrfs_release_all_roots(fs_info);
 	btrfs_close_devices(fs_info->fs_devices);
-	free_mapping_cache(fs_info);
-	extent_io_tree_cleanup(&fs_info->extent_cache);
-	extent_io_tree_cleanup(&fs_info->free_space_cache);
-	extent_io_tree_cleanup(&fs_info->block_group_cache);
-	extent_io_tree_cleanup(&fs_info->pinned_extents);
-	extent_io_tree_cleanup(&fs_info->pending_del);
-	extent_io_tree_cleanup(&fs_info->extent_ins);
-
-	free(fs_info->tree_root);
-	free(fs_info->extent_root);
-	free(fs_info->chunk_root);
-	free(fs_info->dev_root);
-	free(fs_info->csum_root);
-	free(fs_info);
-
+	btrfs_cleanup_all_caches(fs_info);
+	btrfs_free_fs_info(fs_info);
 	return 0;
 }

diff --git a/disk-io.h b/disk-io.h
index c29ee8e..2fe2d72 100644
--- a/disk-io.h
+++ b/disk-io.h
@@ -47,6 +47,18 @@ int __setup_root(u32 nodesize, u32 leafsize, u32 sectorsize,
 			struct btrfs_fs_info *fs_info, u64 objectid);
 int clean_tree_block(struct btrfs_trans_handle *trans,
 		     struct btrfs_root *root, struct extent_buffer *buf);
+
+void btrfs_free_fs_info(struct btrfs_fs_info *fs_info);
+struct btrfs_fs_info *btrfs_new_fs_info(int writable, u64 sb_bytenr);
+int btrfs_check_fs_compatibility(struct btrfs_super_block *sb, int writable);
+int btrfs_setup_all_roots(struct btrfs_fs_info *fs_info,
+			  u64 root_tree_bytenr, int partial);
+void btrfs_release_all_roots(struct btrfs_fs_info *fs_info);
+void btrfs_cleanup_all_caches(struct btrfs_fs_info *fs_info);
+int btrfs_scan_fs_devices(int fd, const char *path,
+			  struct btrfs_fs_devices **fs_devices);
+int btrfs_setup_chunk_tree_and_device_map(struct btrfs_fs_info *fs_info);
+
 struct btrfs_root *open_ctree(const char *filename, u64 sb_bytenr, int writes);
 struct btrfs_root *open_ctree_fd(int fp, const char *path, u64 sb_bytenr,
 				 int writes);
-- 
1.8.1.4
Miao Xie
2013-Jul-03 13:25 UTC
[PATCH 05/12] Btrfs-progs: introduce common insert/search/delete functions for rb-tree
Many of the rb-tree insert/search/delete functions contain nearly identical code. Abstract the shared logic into common rb-tree functions and simplify the callers to use them.

Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
---
 btrfs-list.c   |  19 +++-----
 cmds-check.c   | 111 +++++++++++++++++++-----------------------
 disk-io.c      |  39 ++++++---------
 disk-io.h      |   2 +-
 extent-cache.c | 150 +++++++++++++++++++++++++++++----------------------------
 extent-cache.h |  36 ++++++++------
 extent_io.c    |  45 ++++++++---------
 rbtree.c       |  63 ++++++++++++++++++++++++
 rbtree.h       |  24 +++++++--
 repair.c       |   2 +-
 volumes.c      |  22 ++++-----
 11 files changed, 284 insertions(+), 229 deletions(-)

diff --git a/btrfs-list.c b/btrfs-list.c
index c3d35de..4fab858 100644
--- a/btrfs-list.c
+++ b/btrfs-list.c
@@ -513,8 +513,11 @@ static int add_root(struct root_lookup *root_lookup,
 	return 0;
 }

-void __free_root_info(struct root_info *ri)
+static void __free_root_info(struct rb_node *node)
 {
+	struct root_info *ri;
+
+	ri = rb_entry(node, struct root_info, rb_node);
 	if (ri->name)
 		free(ri->name);

@@ -527,19 +530,9 @@ void __free_root_info(struct root_info *ri)
 	free(ri);
 }

-void __free_all_subvolumn(struct root_lookup *root_tree)
+static inline void __free_all_subvolumn(struct root_lookup *root_tree)
 {
-	struct root_info *entry;
-	struct rb_node *n;
-
-	n = rb_first(&root_tree->root);
-	while (n) {
-		entry = rb_entry(n, struct root_info, rb_node);
-		rb_erase(n, &root_tree->root);
-		__free_root_info(entry);
-
-		n = rb_first(&root_tree->root);
-	}
+	rb_free_nodes(&root_tree->root, __free_root_info);
 }

 /*
diff --git a/cmds-check.c b/cmds-check.c
index 68cdd52..faf48e6 100644
--- a/cmds-check.c
+++ b/cmds-check.c
@@ -293,7 +293,7 @@ static struct inode_record *get_inode_rec(struct cache_tree *inode_cache,
 		if (ino == BTRFS_FREE_INO_OBJECTID)
 			rec->found_link = 1;

-		ret = insert_existing_cache_extent(inode_cache, &node->cache);
+		ret = insert_cache_extent(inode_cache, &node->cache);
 		BUG_ON(ret);
 	}
 	return rec;
@@ -614,7
+614,7 @@ again: ins->data = rec; rec->refs++; } - ret = insert_existing_cache_extent(dst, &ins->cache); + ret = insert_cache_extent(dst, &ins->cache); if (ret == -EEXIST) { conflict = get_inode_rec(dst, rec->ino, 1); merge_inode_recs(rec, conflict, dst); @@ -648,24 +648,19 @@ again: return 0; } -static void free_inode_recs(struct cache_tree *inode_cache) +static void free_inode_ptr(struct cache_extent *cache) { - struct cache_extent *cache; struct ptr_node *node; struct inode_record *rec; - while (1) { - cache = find_first_cache_extent(inode_cache, 0); - if (!cache) - break; - node = container_of(cache, struct ptr_node, cache); - rec = node->data; - remove_cache_extent(inode_cache, &node->cache); - free(node); - free_inode_rec(rec); - } + node = container_of(cache, struct ptr_node, cache); + rec = node->data; + free_inode_rec(rec); + free(node); } +FREE_EXTENT_CACHE_BASED_TREE(inode_recs, free_inode_ptr); + static struct shared_node *find_shared_node(struct cache_tree *shared, u64 bytenr) { @@ -692,7 +687,7 @@ static int add_shared_node(struct cache_tree *shared, u64 bytenr, u32 refs) cache_tree_init(&node->inode_cache); node->refs = refs; - ret = insert_existing_cache_extent(shared, &node->cache); + ret = insert_cache_extent(shared, &node->cache); BUG_ON(ret); return 0; } @@ -719,8 +714,8 @@ static int enter_shared_node(struct btrfs_root *root, u64 bytenr, u32 refs, if (wc->root_level == wc->active_node && btrfs_root_refs(&root->root_item) == 0) { if (--node->refs == 0) { - free_inode_recs(&node->root_cache); - free_inode_recs(&node->inode_cache); + free_inode_recs_tree(&node->root_cache); + free_inode_recs_tree(&node->inode_cache); remove_cache_extent(&wc->shared, &node->cache); free(node); } @@ -1427,7 +1422,7 @@ static struct root_record *get_root_rec(struct cache_tree *root_cache, rec->cache.start = objectid; rec->cache.size = 1; - ret = insert_existing_cache_extent(root_cache, &rec->cache); + ret = insert_cache_extent(root_cache, &rec->cache); BUG_ON(ret); } 
return rec; @@ -1460,29 +1455,24 @@ static struct root_backref *get_root_backref(struct root_record *rec, return backref; } -static void free_root_recs(struct cache_tree *root_cache) +static void free_root_record(struct cache_extent *cache) { - struct cache_extent *cache; struct root_record *rec; struct root_backref *backref; - while (1) { - cache = find_first_cache_extent(root_cache, 0); - if (!cache) - break; - rec = container_of(cache, struct root_record, cache); - remove_cache_extent(root_cache, &rec->cache); - - while (!list_empty(&rec->backrefs)) { - backref = list_entry(rec->backrefs.next, - struct root_backref, list); - list_del(&backref->list); - free(backref); - } - kfree(rec); + rec = container_of(cache, struct root_record, cache); + while (!list_empty(&rec->backrefs)) { + backref = list_entry(rec->backrefs.next, + struct root_backref, list); + list_del(&backref->list); + free(backref); } + + kfree(rec); } +FREE_EXTENT_CACHE_BASED_TREE(root_recs, free_root_record); + static int add_root_backref(struct cache_tree *root_cache, u64 root_id, u64 ref_root, u64 dir, u64 index, const char *name, int namelen, @@ -1541,7 +1531,7 @@ static int merge_root_recs(struct btrfs_root *root, struct inode_backref *backref; if (root->root_key.objectid == BTRFS_TREE_RELOC_OBJECTID) { - free_inode_recs(src_cache); + free_inode_recs_tree(src_cache); return 0; } @@ -1855,7 +1845,7 @@ static int check_fs_roots(struct btrfs_root *root, ret = check_fs_root(tmp_root, root_cache, &wc); if (ret) err = 1; - btrfs_free_fs_root(root->fs_info, tmp_root); + btrfs_free_fs_root(tmp_root); } else if (key.type == BTRFS_ROOT_REF_KEY || key.type == BTRFS_ROOT_BACKREF_KEY) { process_root_ref(leaf, path.slots[0], &key, @@ -1935,7 +1925,7 @@ static int all_backpointers_checked(struct extent_record *rec, int print_errs) (unsigned long long)rec->start, back->full_backref ? "parent" : "root", - back->full_backref ? + back->full_backref ? 
(unsigned long long)dback->parent: (unsigned long long)dback->root, (unsigned long long)dback->owner, @@ -2411,7 +2401,7 @@ static int add_extent_rec(struct cache_tree *extent_cache, rec->cache.start = start; rec->cache.size = nr; - ret = insert_existing_cache_extent(extent_cache, &rec->cache); + ret = insert_cache_extent(extent_cache, &rec->cache); BUG_ON(ret); bytes_used += nr; if (set_checked) { @@ -2538,10 +2528,10 @@ static int add_pending(struct cache_tree *pending, struct cache_tree *seen, u64 bytenr, u32 size) { int ret; - ret = insert_cache_extent(seen, bytenr, size); + ret = add_cache_extent(seen, bytenr, size); if (ret) return ret; - insert_cache_extent(pending, bytenr, size); + add_cache_extent(pending, bytenr, size); return 0; } @@ -3171,15 +3161,17 @@ static int run_next_block(struct btrfs_root *root, struct cache_extent *cache; int reada_bits; - ret = pick_next_pending(pending, reada, nodes, *last, bits, - bits_nr, &reada_bits); - if (ret == 0) { + nritems = pick_next_pending(pending, reada, nodes, *last, bits, + bits_nr, &reada_bits); + if (nritems == 0) return 1; - } + if (!reada_bits) { - for(i = 0; i < ret; i++) { - insert_cache_extent(reada, bits[i].start, - bits[i].size); + for(i = 0; i < nritems; i++) { + ret = add_cache_extent(reada, bits[i].start, + bits[i].size); + if (ret == -EEXIST) + continue; /* fixme, get the parent transid */ readahead_tree_block(root, bits[i].start, @@ -3295,7 +3287,7 @@ static int run_next_block(struct btrfs_root *root, ref = btrfs_item_ptr(buf, i, struct btrfs_shared_data_ref); add_data_backref(extent_cache, - key.objectid, key.offset, 0, 0, 0, + key.objectid, key.offset, 0, 0, 0, btrfs_shared_data_ref_count(buf, ref), 0, root->sectorsize); continue; @@ -4110,7 +4102,7 @@ static int process_duplicates(struct btrfs_root *root, remove_cache_extent(extent_cache, &tmp->cache); free(tmp); } - ret = insert_existing_cache_extent(extent_cache, &good->cache); + ret = insert_cache_extent(extent_cache, &good->cache); 
BUG_ON(ret); free(rec); return good->num_duplicates ? 0 : 1; @@ -4367,21 +4359,16 @@ static int prune_corrupt_blocks(struct btrfs_trans_handle *trans, return 0; } -static void free_corrupt_blocks(struct btrfs_fs_info *info) +static void free_corrupt_block(struct cache_extent *cache) { - struct cache_extent *cache; struct btrfs_corrupt_block *corrupt; - while (1) { - cache = find_first_cache_extent(info->corrupt_blocks, 0); - if (!cache) - break; - corrupt = container_of(cache, struct btrfs_corrupt_block, cache); - remove_cache_extent(info->corrupt_blocks, cache); - free(corrupt); - } + corrupt = container_of(cache, struct btrfs_corrupt_block, cache); + free(corrupt); } +FREE_EXTENT_CACHE_BASED_TREE(corrupt_blocks, free_corrupt_block); + static int check_block_group(struct btrfs_trans_handle *trans, struct btrfs_fs_info *info, struct map_lookup *map, @@ -4728,7 +4715,7 @@ again: goto out; } - free_corrupt_blocks(root->fs_info); + free_corrupt_blocks_tree(root->fs_info->corrupt_blocks); free_cache_tree(&seen); free_cache_tree(&pending); free_cache_tree(&reada); @@ -4746,7 +4733,7 @@ again: } out: if (repair) { - free_corrupt_blocks(root->fs_info); + free_corrupt_blocks_tree(root->fs_info->corrupt_blocks); root->fs_info->fsck_extent_cache = NULL; root->fs_info->free_extent_hook = NULL; root->fs_info->corrupt_blocks = NULL; @@ -5254,7 +5241,7 @@ int cmd_check(int argc, char **argv) fprintf(stderr, "checking root refs\n"); ret = check_root_refs(root, &root_cache); out: - free_root_recs(&root_cache); + free_root_recs_tree(&root_cache); close_ctree(root); if (found_old_backref) { /* diff --git a/disk-io.c b/disk-io.c index 8116afc..4541573 100644 --- a/disk-io.c +++ b/disk-io.c @@ -671,8 +671,7 @@ static int find_and_setup_log_root(struct btrfs_root *tree_root, } -int btrfs_free_fs_root(struct btrfs_fs_info *fs_info, - struct btrfs_root *root) +int btrfs_free_fs_root(struct btrfs_root *root) { if (root->node) free_extent_buffer(root->node); @@ -682,22 +681,16 @@ int 
btrfs_free_fs_root(struct btrfs_fs_info *fs_info, return 0; } -static int free_fs_roots(struct btrfs_fs_info *fs_info) +static void __free_fs_root(struct cache_extent *cache) { - struct cache_extent *cache; struct btrfs_root *root; - while (1) { - cache = find_first_cache_extent(&fs_info->fs_root_cache, 0); - if (!cache) - break; - root = container_of(cache, struct btrfs_root, cache); - remove_cache_extent(&fs_info->fs_root_cache, cache); - btrfs_free_fs_root(fs_info, root); - } - return 0; + root = container_of(cache, struct btrfs_root, cache); + btrfs_free_fs_root(root); } +FREE_EXTENT_CACHE_BASED_TREE(fs_roots, __free_fs_root); + struct btrfs_root *btrfs_read_fs_root_no_cache(struct btrfs_fs_info *fs_info, struct btrfs_key *location) { @@ -790,8 +783,7 @@ struct btrfs_root *btrfs_read_fs_root(struct btrfs_fs_info *fs_info, root->cache.start = location->objectid; root->cache.size = 1; - ret = insert_existing_cache_extent(&fs_info->fs_root_cache, - &root->cache); + ret = insert_cache_extent(&fs_info->fs_root_cache, &root->cache); BUG_ON(ret); return root; } @@ -989,22 +981,19 @@ void btrfs_release_all_roots(struct btrfs_fs_info *fs_info) free_extent_buffer(fs_info->chunk_root->node); } -static void free_mapping_cache(struct btrfs_fs_info *fs_info) +static void free_map_lookup(struct cache_extent *ce) { - struct cache_tree *cache_tree = &fs_info->mapping_tree.cache_tree; - struct cache_extent *ce; struct map_lookup *map; - while ((ce = find_first_cache_extent(cache_tree, 0))) { - map = container_of(ce, struct map_lookup, ce); - remove_cache_extent(cache_tree, ce); - kfree(map); - } + map = container_of(ce, struct map_lookup, ce); + kfree(map); } +FREE_EXTENT_CACHE_BASED_TREE(mapping_cache, free_map_lookup); + void btrfs_cleanup_all_caches(struct btrfs_fs_info *fs_info) { - free_mapping_cache(fs_info); + free_mapping_cache_tree(&fs_info->mapping_tree.cache_tree); extent_io_tree_cleanup(&fs_info->extent_cache); extent_io_tree_cleanup(&fs_info->free_space_cache); 
extent_io_tree_cleanup(&fs_info->block_group_cache); @@ -1394,7 +1383,7 @@ int close_ctree(struct btrfs_root *root) } btrfs_free_block_groups(fs_info); - free_fs_roots(fs_info); + free_fs_roots_tree(&fs_info->fs_root_cache); btrfs_release_all_roots(fs_info); btrfs_close_devices(fs_info->fs_devices); diff --git a/disk-io.h b/disk-io.h index 2fe2d72..e845459 100644 --- a/disk-io.h +++ b/disk-io.h @@ -78,7 +78,7 @@ struct btrfs_root *btrfs_read_fs_root(struct btrfs_fs_info *fs_info, struct btrfs_key *location); struct btrfs_root *btrfs_read_fs_root_no_cache(struct btrfs_fs_info *fs_info, struct btrfs_key *location); -int btrfs_free_fs_root(struct btrfs_fs_info *fs_info, struct btrfs_root *root); +int btrfs_free_fs_root(struct btrfs_root *root); void btrfs_mark_buffer_dirty(struct extent_buffer *buf); int btrfs_buffer_uptodate(struct extent_buffer *buf, u64 parent_transid); int btrfs_set_buffer_uptodate(struct extent_buffer *buf); diff --git a/extent-cache.c b/extent-cache.c index 3dd6434..a09fe87 100644 --- a/extent-cache.c +++ b/extent-cache.c @@ -20,65 +20,42 @@ #include "kerncompat.h" #include "extent-cache.h" +struct cache_extent_search_range { + u64 start; + u64 size; +}; + void cache_tree_init(struct cache_tree *tree) { - tree->root.rb_node = NULL; + tree->root = RB_ROOT; } -static struct rb_node *tree_insert(struct rb_root *root, u64 offset, - u64 size, struct rb_node *node) +static int cache_tree_comp_range(struct rb_node *node, void *data) { - struct rb_node ** p = &root->rb_node; - struct rb_node * parent = NULL; struct cache_extent *entry; + struct cache_extent_search_range *range; - while(*p) { - parent = *p; - entry = rb_entry(parent, struct cache_extent, rb_node); - - if (offset + size <= entry->start) - p = &(*p)->rb_left; - else if (offset >= entry->start + entry->size) - p = &(*p)->rb_right; - else - return parent; - } + range = (struct cache_extent_search_range *)data; + entry = rb_entry(node, struct cache_extent, rb_node); - entry = rb_entry(parent, 
struct cache_extent, rb_node); - rb_link_node(node, parent, p); - rb_insert_color(node, root); - return NULL; + if (entry->start + entry->size <= range->start) + return 1; + else if (range->start + range->size <= entry->start) + return -1; + else + return 0; } -static struct rb_node *__tree_search(struct rb_root *root, u64 offset, - u64 size, struct rb_node **prev_ret) +static int cache_tree_comp_nodes(struct rb_node *node1, struct rb_node *node2) { - struct rb_node * n = root->rb_node; - struct rb_node *prev = NULL; struct cache_extent *entry; - struct cache_extent *prev_entry = NULL; - - while(n) { - entry = rb_entry(n, struct cache_extent, rb_node); - prev = n; - prev_entry = entry; - - if (offset + size <= entry->start) - n = n->rb_left; - else if (offset >= entry->start + entry->size) - n = n->rb_right; - else - return n; - } - if (!prev_ret) - return NULL; + struct cache_extent_search_range range; - while(prev && offset >= prev_entry->start + prev_entry->size) { - prev = rb_next(prev); - prev_entry = rb_entry(prev, struct cache_extent, rb_node); - } - *prev_ret = prev; - return NULL; + entry = rb_entry(node2, struct cache_extent, rb_node); + range.start = entry->start; + range.size = entry->size; + + return cache_tree_comp_range(node1, (void *)&range); } struct cache_extent *alloc_cache_extent(u64 start, u64 size) @@ -87,63 +64,79 @@ struct cache_extent *alloc_cache_extent(u64 start, u64 size) if (!pe) return pe; + pe->start = start; pe->size = size; return pe; } -int insert_existing_cache_extent(struct cache_tree *tree, - struct cache_extent *pe) +int insert_cache_extent(struct cache_tree *tree, struct cache_extent *pe) { - struct rb_node *found; - - found = tree_insert(&tree->root, pe->start, pe->size, &pe->rb_node); - if (found) - return -EEXIST; - - return 0; + return rb_insert(&tree->root, &pe->rb_node, cache_tree_comp_nodes); } -int insert_cache_extent(struct cache_tree *tree, u64 start, u64 size) +int add_cache_extent(struct cache_tree *tree, u64 
start, u64 size) { struct cache_extent *pe = alloc_cache_extent(start, size); int ret; - ret = insert_existing_cache_extent(tree, pe); + + if (!pe) { + fprintf(stderr, "memory allocation failed\n"); + exit(1); + } + + ret = insert_cache_extent(tree, pe); if (ret) free(pe); + return ret; } struct cache_extent *find_cache_extent(struct cache_tree *tree, - u64 start, u64 size) + u64 start, u64 size) { - struct rb_node *prev; - struct rb_node *ret; + struct rb_node *node; struct cache_extent *entry; - ret = __tree_search(&tree->root, start, size, &prev); - if (!ret) + struct cache_extent_search_range range; + + range.start = start; + range.size = size; + node = rb_search(&tree->root, &range, cache_tree_comp_range, NULL); + if (!node) return NULL; - entry = rb_entry(ret, struct cache_extent, rb_node); + entry = rb_entry(node, struct cache_extent, rb_node); return entry; } -struct cache_extent *find_first_cache_extent(struct cache_tree *tree, - u64 start) +struct cache_extent *find_first_cache_extent(struct cache_tree *tree, u64 start) { - struct rb_node *prev; - struct rb_node *ret; + struct rb_node *next; + struct rb_node *node; struct cache_extent *entry; + struct cache_extent_search_range range; - ret = __tree_search(&tree->root, start, 1, &prev); - if (!ret) - ret = prev; - if (!ret) + range.start = start; + range.size = 1; + node = rb_search(&tree->root, &range, cache_tree_comp_range, &next); + if (!node) + node = next; + if (!node) return NULL; - entry = rb_entry(ret, struct cache_extent, rb_node); + + entry = rb_entry(node, struct cache_extent, rb_node); return entry; } +struct cache_extent *first_cache_extent(struct cache_tree *tree) +{ + struct rb_node *node = rb_first(&tree->root); + + if (!node) + return NULL; + return rb_entry(node, struct cache_extent, rb_node); +} + struct cache_extent *prev_cache_extent(struct cache_extent *pe) { struct rb_node *node = rb_prev(&pe->rb_node); @@ -162,9 +155,18 @@ struct cache_extent *next_cache_extent(struct cache_extent 
*pe) return rb_entry(node, struct cache_extent, rb_node); } -void remove_cache_extent(struct cache_tree *tree, - struct cache_extent *pe) +void remove_cache_extent(struct cache_tree *tree, struct cache_extent *pe) { rb_erase(&pe->rb_node, &tree->root); } +void cache_tree_free_extents(struct cache_tree *tree, + free_cache_extent free_func) +{ + struct cache_extent *ce; + + while ((ce = first_cache_extent(tree))) { + remove_cache_extent(tree, ce); + free_func(ce); + } +} diff --git a/extent-cache.h b/extent-cache.h index 4cd0f79..2979fc3 100644 --- a/extent-cache.h +++ b/extent-cache.h @@ -16,8 +16,8 @@ * Boston, MA 021110-1307, USA. */ -#ifndef __PENDING_EXTENT__ -#define __PENDING_EXTENT__ +#ifndef __EXTENT_CACHE_H__ +#define __EXTENT_CACHE_H__ #if BTRFS_FLAT_INCLUDES #include "kerncompat.h" @@ -38,28 +38,34 @@ struct cache_extent { }; void cache_tree_init(struct cache_tree *tree); -void remove_cache_extent(struct cache_tree *tree, - struct cache_extent *pe); -struct cache_extent *find_first_cache_extent(struct cache_tree *tree, - u64 start); + +struct cache_extent *first_cache_extent(struct cache_tree *tree); struct cache_extent *prev_cache_extent(struct cache_extent *pe); struct cache_extent *next_cache_extent(struct cache_extent *pe); + +struct cache_extent *find_first_cache_extent(struct cache_tree *tree, + u64 start); struct cache_extent *find_cache_extent(struct cache_tree *tree, - u64 start, u64 size); -int insert_cache_extent(struct cache_tree *tree, u64 start, u64 size); -int insert_existing_cache_extent(struct cache_tree *tree, - struct cache_extent *pe); + u64 start, u64 size); + +int add_cache_extent(struct cache_tree *tree, u64 start, u64 size); +int insert_cache_extent(struct cache_tree *tree, struct cache_extent *pe); +void remove_cache_extent(struct cache_tree *tree, struct cache_extent *pe); static inline int cache_tree_empty(struct cache_tree *tree) { return RB_EMPTY_ROOT(&tree->root); } -static inline void free_cache_extent(struct cache_extent 
*pe) -{ - free(pe); -} +typedef void (*free_cache_extent)(struct cache_extent *pe); -struct cache_extent *alloc_pending_extent(u64 start, u64 size); +void cache_tree_free_extents(struct cache_tree *tree, + free_cache_extent free_func); + +#define FREE_EXTENT_CACHE_BASED_TREE(name, free_func) \ +static void free_##name##_tree(struct cache_tree *tree) \ +{ \ + cache_tree_free_extents(tree, free_func); \ +} #endif diff --git a/extent_io.c b/extent_io.c index 5093aeb..1e5d25a 100644 --- a/extent_io.c +++ b/extent_io.c @@ -54,7 +54,7 @@ static struct extent_state *alloc_extent_state(void) return state; } -static void free_extent_state(struct extent_state *state) +static void btrfs_free_extent_state(struct extent_state *state) { state->refs--; BUG_ON(state->refs < 0); @@ -62,11 +62,17 @@ static void free_extent_state(struct extent_state *state) free(state); } -void extent_io_tree_cleanup(struct extent_io_tree *tree) +static void free_extent_state_func(struct cache_extent *cache) { struct extent_state *es; + + es = container_of(cache, struct extent_state, cache_node); + btrfs_free_extent_state(es); +} + +void extent_io_tree_cleanup(struct extent_io_tree *tree) +{ struct extent_buffer *eb; - struct cache_extent *cache; while(!list_empty(&tree->lru)) { eb = list_entry(tree->lru.next, struct extent_buffer, lru); @@ -78,14 +84,8 @@ void extent_io_tree_cleanup(struct extent_io_tree *tree) } free_extent_buffer(eb); } - while (1) { - cache = find_first_cache_extent(&tree->state, 0); - if (!cache) - break; - es = container_of(cache, struct extent_state, cache_node); - remove_cache_extent(&tree->state, &es->cache_node); - free_extent_state(es); - } + + cache_tree_free_extents(&tree->state, free_extent_state_func); } static inline void update_extent_state(struct extent_state *state) @@ -118,7 +118,7 @@ static int merge_state(struct extent_io_tree *tree, state->start = other->start; update_extent_state(state); remove_cache_extent(&tree->state, &other->cache_node); - 
free_extent_state(other); + btrfs_free_extent_state(other); } } other_node = next_cache_extent(&state->cache_node); @@ -130,7 +130,7 @@ static int merge_state(struct extent_io_tree *tree, other->start = state->start; update_extent_state(other); remove_cache_extent(&tree->state, &state->cache_node); - free_extent_state(state); + btrfs_free_extent_state(state); } } return 0; @@ -151,7 +151,7 @@ static int insert_state(struct extent_io_tree *tree, state->start = start; state->end = end; update_extent_state(state); - ret = insert_existing_cache_extent(&tree->state, &state->cache_node); + ret = insert_cache_extent(&tree->state, &state->cache_node); BUG_ON(ret); merge_state(tree, state); return 0; @@ -172,8 +172,7 @@ static int split_state(struct extent_io_tree *tree, struct extent_state *orig, update_extent_state(prealloc); orig->start = split; update_extent_state(orig); - ret = insert_existing_cache_extent(&tree->state, - &prealloc->cache_node); + ret = insert_cache_extent(&tree->state, &prealloc->cache_node); BUG_ON(ret); return 0; } @@ -189,7 +188,7 @@ static int clear_state_bit(struct extent_io_tree *tree, state->state &= ~bits; if (state->state == 0) { remove_cache_extent(&tree->state, &state->cache_node); - free_extent_state(state); + btrfs_free_extent_state(state); } else { merge_state(tree, state); } @@ -280,7 +279,7 @@ again: goto search_again; out: if (prealloc) - free_extent_state(prealloc); + btrfs_free_extent_state(prealloc); return set; search_again: @@ -408,7 +407,7 @@ again: prealloc = NULL; out: if (prealloc) - free_extent_state(prealloc); + btrfs_free_extent_state(prealloc); return err; search_again: if (start > end) @@ -587,7 +586,7 @@ static struct extent_buffer *__alloc_extent_buffer(struct extent_io_tree *tree, eb->cache_node.size = blocksize; free_some_buffers(tree); - ret = insert_existing_cache_extent(&tree->cache, &eb->cache_node); + ret = insert_cache_extent(&tree->cache, &eb->cache_node); if (ret) { free(eb); return NULL; @@ -622,7 +621,8 @@ 
struct extent_buffer *find_extent_buffer(struct extent_io_tree *tree, struct cache_extent *cache; cache = find_cache_extent(&tree->cache, bytenr, blocksize); - if (cache && cache->start == bytenr && cache->size == blocksize) { + if (cache && cache->start == bytenr && + cache->size == blocksize) { eb = container_of(cache, struct extent_buffer, cache_node); list_move_tail(&eb->lru, &tree->lru); eb->refs++; @@ -652,7 +652,8 @@ struct extent_buffer *alloc_extent_buffer(struct extent_io_tree *tree, struct cache_extent *cache; cache = find_cache_extent(&tree->cache, bytenr, blocksize); - if (cache && cache->start == bytenr && cache->size == blocksize) { + if (cache && cache->start == bytenr && + cache->size == blocksize) { eb = container_of(cache, struct extent_buffer, cache_node); list_move_tail(&eb->lru, &tree->lru); eb->refs++; diff --git a/rbtree.c b/rbtree.c index 6ad800f..4c06b0c 100644 --- a/rbtree.c +++ b/rbtree.c @@ -387,3 +387,66 @@ void rb_replace_node(struct rb_node *victim, struct rb_node *new, /* Copy the pointers/colour from the victim to the replacement */ *new = *victim; } + +int rb_insert(struct rb_root *root, struct rb_node *node, + rb_compare_nodes comp) +{ + struct rb_node **p = &root->rb_node; + struct rb_node *parent = NULL; + int ret; + + while(*p) { + parent = *p; + + ret = comp(parent, node); + if (ret < 0) + p = &(*p)->rb_left; + else if (ret > 0) + p = &(*p)->rb_right; + else + return -EEXIST; + } + + rb_link_node(node, parent, p); + rb_insert_color(node, root); + return 0; +} + +struct rb_node *rb_search(struct rb_root *root, void *key, rb_compare_keys comp, + struct rb_node **next_ret) +{ + struct rb_node *n = root->rb_node; + struct rb_node *parent = NULL; + int ret = 0; + + while(n) { + parent = n; + + ret = comp(n, key); + if (ret < 0) + n = n->rb_left; + else if (ret > 0) + n = n->rb_right; + else + return n; + } + + if (!next_ret) + return NULL; + + if (parent && ret > 0) + parent = rb_next(parent); + + *next_ret = parent; + return 
NULL;
+}
+
+void rb_free_nodes(struct rb_root *root, rb_free_node free_node)
+{
+	struct rb_node *node;
+
+	while ((node = rb_first(root))) {
+		rb_erase(node, root);
+		free_node(node);
+	}
+}
diff --git a/rbtree.h b/rbtree.h
index 8f717a9..48e5157 100644
--- a/rbtree.h
+++ b/rbtree.h
@@ -111,11 +111,8 @@ struct rb_node
 struct rb_root
 {
 	struct rb_node *rb_node;
-	void (*rotate_notify)(struct rb_node *old_parent, struct rb_node *node);
-
 };
 
-
 #define rb_parent(r)   ((struct rb_node *)((r)->rb_parent_color & ~3))
 #define rb_color(r)   ((r)->rb_parent_color & 1)
 #define rb_is_red(r)   (!rb_color(r))
@@ -161,4 +158,25 @@ static inline void rb_link_node(struct rb_node * node, struct rb_node * parent,
 	*rb_link = node;
 }
 
+/* The common insert/search/free functions */
+typedef int (*rb_compare_nodes)(struct rb_node *node1, struct rb_node *node2);
+typedef int (*rb_compare_keys)(struct rb_node *node, void *key);
+typedef void (*rb_free_node)(struct rb_node *node);
+
+int rb_insert(struct rb_root *root, struct rb_node *node,
+	      rb_compare_nodes comp);
+/*
+ * In some cases, we need return the next node if we don't find the node we
+ * specify. At this time, we can use next_ret.
+ */
+struct rb_node *rb_search(struct rb_root *root, void *key, rb_compare_keys comp,
+			  struct rb_node **next_ret);
+void rb_free_nodes(struct rb_root *root, rb_free_node free_node);
+
+#define FREE_RB_BASED_TREE(name, free_func)		\
+static void free_##name##_tree(struct rb_root *root)	\
+{							\
+	rb_free_nodes(root, free_func);			\
+}
+
 #endif /* _LINUX_RBTREE_H */
diff --git a/repair.c b/repair.c
index e640465..4f74742 100644
--- a/repair.c
+++ b/repair.c
@@ -41,7 +41,7 @@ int btrfs_add_corrupt_extent_record(struct btrfs_fs_info *info,
 	corrupt->cache.size = len;
 	corrupt->level = level;
 
-	ret = insert_existing_cache_extent(info->corrupt_blocks, &corrupt->cache);
+	ret = insert_cache_extent(info->corrupt_blocks, &corrupt->cache);
 	if (ret)
 		free(corrupt);
 	BUG_ON(ret && ret != -EEXIST);
diff --git a/volumes.c b/volumes.c
index 0f6a35b..e8e7907 100644
--- a/volumes.c
+++ b/volumes.c
@@ -665,12 +665,12 @@ int btrfs_alloc_chunk(struct btrfs_trans_handle *trans,
 {
 	u64 dev_offset;
 	struct btrfs_fs_info *info = extent_root->fs_info;
-	struct btrfs_root *chunk_root = extent_root->fs_info->chunk_root;
+	struct btrfs_root *chunk_root = info->chunk_root;
 	struct btrfs_stripe *stripes;
 	struct btrfs_device *device = NULL;
 	struct btrfs_chunk *chunk;
 	struct list_head private_devs;
-	struct list_head *dev_list = &extent_root->fs_info->fs_devices->devices;
+	struct list_head *dev_list = &info->fs_devices->devices;
 	struct list_head *cur;
 	struct map_lookup *map;
 	int min_stripe_size = 1 * 1024 * 1024;
@@ -890,9 +890,7 @@ again:
 	map->ce.start = key.offset;
 	map->ce.size = *num_bytes;
 
-	ret = insert_existing_cache_extent(
-			&extent_root->fs_info->mapping_tree.cache_tree,
-			&map->ce);
+	ret = insert_cache_extent(&info->mapping_tree.cache_tree, &map->ce);
 	BUG_ON(ret);
 
 	if (type & BTRFS_BLOCK_GROUP_SYSTEM) {
@@ -911,11 +909,11 @@ int btrfs_alloc_data_chunk(struct btrfs_trans_handle *trans,
 {
 	u64 dev_offset;
 	struct btrfs_fs_info *info = extent_root->fs_info;
-	struct btrfs_root *chunk_root = extent_root->fs_info->chunk_root;
+	struct btrfs_root *chunk_root = info->chunk_root;
 	struct btrfs_stripe *stripes;
 	struct btrfs_device *device = NULL;
 	struct btrfs_chunk *chunk;
-	struct list_head *dev_list = &extent_root->fs_info->fs_devices->devices;
+	struct list_head *dev_list = &info->fs_devices->devices;
 	struct list_head *cur;
 	struct map_lookup *map;
 	u64 calc_size = 8 * 1024 * 1024;
@@ -998,9 +996,7 @@ int btrfs_alloc_data_chunk(struct btrfs_trans_handle *trans,
 	map->ce.start = key.offset;
 	map->ce.size = num_bytes;
 
-	ret = insert_existing_cache_extent(
-			&extent_root->fs_info->mapping_tree.cache_tree,
-			&map->ce);
+	ret = insert_cache_extent(&info->mapping_tree.cache_tree, &map->ce);
 	BUG_ON(ret);
 
 	kfree(chunk);
@@ -1447,7 +1443,7 @@ int btrfs_bootstrap_super_map(struct btrfs_mapping_tree *map_tree,
 		map->stripes[i].dev = device;
 		i++;
 	}
-	ret = insert_existing_cache_extent(&map_tree->cache_tree, &map->ce);
+	ret = insert_cache_extent(&map_tree->cache_tree, &map->ce);
 	if (ret == -EEXIST) {
 		struct cache_extent *old;
 		struct map_lookup *old_map;
@@ -1455,7 +1451,7 @@ int btrfs_bootstrap_super_map(struct btrfs_mapping_tree *map_tree,
 		old_map = container_of(old, struct map_lookup, ce);
 		remove_cache_extent(&map_tree->cache_tree, old);
 		kfree(old_map);
-		ret = insert_existing_cache_extent(&map_tree->cache_tree,
+		ret = insert_cache_extent(&map_tree->cache_tree,
 						   &map->ce);
 	}
 	BUG_ON(ret);
@@ -1550,7 +1546,7 @@ static int read_one_chunk(struct btrfs_root *root, struct btrfs_key *key,
 		}
 	}
 
-	ret = insert_existing_cache_extent(&map_tree->cache_tree, &map->ce);
+	ret = insert_cache_extent(&map_tree->cache_tree, &map->ce);
 	BUG_ON(ret);
 
 	return 0;
--
1.8.1.4
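For readers who want to see the callback convention in isolation, here is a minimal sketch of a comparator-driven insert and a search that also reports the next node. It is not the rbtree.c code from this patch: it uses a plain unbalanced binary tree, and `struct tnode`, `tree_insert`, `tree_search`, `cmp_key` and `cmp_nodes` are hypothetical names chosen for illustration. The only parts taken from the thread are the shape of the callbacks and the sign convention (a positive result means the key sorts after the node). Returning -EEXIST on a duplicate is an assumption based on how callers in this series check for that value.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct rb_node, with the key stored inline. */
struct tnode {
	struct tnode *left, *right;
	uint64_t key;
};

typedef int (*compare_keys)(struct tnode *node, void *key);
typedef int (*compare_nodes)(struct tnode *node1, struct tnode *node2);

/* > 0: key sorts after node, < 0: key sorts before node, 0: match. */
static int cmp_key(struct tnode *node, void *key)
{
	uint64_t k = *(uint64_t *)key;

	if (k > node->key)
		return 1;
	if (k < node->key)
		return -1;
	return 0;
}

/* Node-vs-node comparison delegates to the key comparison. */
static int cmp_nodes(struct tnode *node1, struct tnode *node2)
{
	return cmp_key(node1, &node2->key);
}

/* Assumed behaviour: 0 on success, -EEXIST if an equal node exists. */
static int tree_insert(struct tnode **root, struct tnode *node,
		       compare_nodes comp)
{
	struct tnode **p = root;

	while (*p) {
		int ret = comp(*p, node);

		if (ret < 0)
			p = &(*p)->left;
		else if (ret > 0)
			p = &(*p)->right;
		else
			return -EEXIST;
	}
	node->left = node->right = NULL;
	*p = node;
	return 0;
}

/*
 * Returns the matching node, or NULL. On a miss, *next_ret (if given)
 * is set to the smallest node that sorts after the key, or NULL.
 */
static struct tnode *tree_search(struct tnode *root, void *key,
				 compare_keys comp, struct tnode **next_ret)
{
	struct tnode *n = root;
	struct tnode *next = NULL;

	while (n) {
		int ret = comp(n, key);

		if (ret < 0) {
			next = n;	/* last node we turned left at */
			n = n->left;
		} else if (ret > 0) {
			n = n->right;
		} else {
			return n;
		}
	}
	if (next_ret)
		*next_ret = next;
	return NULL;
}
```

With keys 10, 20 and 30 inserted, searching for 25 misses and hands back the node with key 30 through `next_ret`, which is the "find the first entry at or after this position" pattern the next_ret comment in rbtree.h describes.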
Miao Xie
2013-Jul-03 13:25 UTC
[PATCH 06/12] Btrfs-progs: use rb-tree instead of extent cache tree for fs/file roots
The fs/file roots are not extents, so it is better to manage them with an rb-tree than with the extent cache tree. Fix it.

Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
---
 ctree.h   |  4 ++--
 disk-io.c | 50 ++++++++++++++++++++++++++++++++++++--------------
 2 files changed, 38 insertions(+), 16 deletions(-)

diff --git a/ctree.h b/ctree.h
index 3fe14b0..4347c8a 100644
--- a/ctree.h
+++ b/ctree.h
@@ -909,7 +909,7 @@ struct btrfs_fs_info {
 	struct btrfs_root *dev_root;
 	struct btrfs_root *csum_root;
 
-	struct cache_tree fs_root_cache;
+	struct rb_root fs_root_tree;
 
 	/* the log root tree is a directory of all the other log roots */
 	struct btrfs_root *log_root_tree;
@@ -993,7 +993,7 @@ struct btrfs_root {
 	/* the dirty list is only used by non-reference counted roots */
 	struct list_head dirty_list;
 
-	struct cache_extent cache;
+	struct rb_node rb_node;
 };
 
 /*
diff --git a/disk-io.c b/disk-io.c
index 4541573..7140367 100644
--- a/disk-io.c
+++ b/disk-io.c
@@ -681,15 +681,15 @@ int btrfs_free_fs_root(struct btrfs_root *root)
 	return 0;
 }
 
-static void __free_fs_root(struct cache_extent *cache)
+static void __free_fs_root(struct rb_node *node)
 {
 	struct btrfs_root *root;
 
-	root = container_of(cache, struct btrfs_root, cache);
+	root = container_of(node, struct btrfs_root, rb_node);
 	btrfs_free_fs_root(root);
 }
 
-FREE_EXTENT_CACHE_BASED_TREE(fs_roots, __free_fs_root);
+FREE_RB_BASED_TREE(fs_roots, __free_fs_root);
 
 struct btrfs_root *btrfs_read_fs_root_no_cache(struct btrfs_fs_info *fs_info,
 					       struct btrfs_key *location)
@@ -751,11 +751,35 @@ insert:
 	return root;
 }
 
+static int btrfs_fs_roots_compare_objectids(struct rb_node *node,
+					    void *data)
+{
+	u64 objectid = *((u64 *)data);
+	struct btrfs_root *root;
+
+	root = rb_entry(node, struct btrfs_root, rb_node);
+	if (objectid > root->objectid)
+		return 1;
+	else if (objectid < root->objectid)
+		return -1;
+	else
+		return 0;
+}
+
+static int btrfs_fs_roots_compare_roots(struct rb_node *node1,
+					struct rb_node *node2)
+{
+	struct btrfs_root *root;
+
+	root = rb_entry(node2, struct btrfs_root, rb_node);
+	return btrfs_fs_roots_compare_objectids(node1, (void *)&root->objectid);
+}
+
 struct btrfs_root *btrfs_read_fs_root(struct btrfs_fs_info *fs_info,
 				      struct btrfs_key *location)
 {
 	struct btrfs_root *root;
-	struct cache_extent *cache;
+	struct rb_node *node;
 	int ret;
 
 	if (location->objectid == BTRFS_ROOT_TREE_OBJECTID)
@@ -772,18 +796,17 @@ struct btrfs_root *btrfs_read_fs_root(struct btrfs_fs_info *fs_info,
 	BUG_ON(location->objectid == BTRFS_TREE_RELOC_OBJECTID ||
 	       location->offset != (u64)-1);
 
-	cache = find_cache_extent(&fs_info->fs_root_cache,
-				  location->objectid, 1);
-	if (cache)
-		return container_of(cache, struct btrfs_root, cache);
+	node = rb_search(&fs_info->fs_root_tree, (void *)&location->objectid,
+			 btrfs_fs_roots_compare_objectids, NULL);
+	if (node)
+		return container_of(node, struct btrfs_root, rb_node);
 
 	root = btrfs_read_fs_root_no_cache(fs_info, location);
 	if (IS_ERR(root))
 		return root;
 
-	root->cache.start = location->objectid;
-	root->cache.size = 1;
-	ret = insert_cache_extent(&fs_info->fs_root_cache, &root->cache);
+	ret = rb_insert(&fs_info->fs_root_tree, &root->rb_node,
+			btrfs_fs_roots_compare_roots);
 	BUG_ON(ret);
 	return root;
 }
@@ -835,8 +858,7 @@ struct btrfs_fs_info *btrfs_new_fs_info(int writable, u64 sb_bytenr)
 	extent_io_tree_init(&fs_info->pinned_extents);
 	extent_io_tree_init(&fs_info->pending_del);
 	extent_io_tree_init(&fs_info->extent_ins);
-
-	cache_tree_init(&fs_info->fs_root_cache);
+	fs_info->fs_root_tree = RB_ROOT;
 	cache_tree_init(&fs_info->mapping_tree.cache_tree);
 
 	mutex_init(&fs_info->fs_mutex);
@@ -1383,7 +1405,7 @@ int close_ctree(struct btrfs_root *root)
 	}
 	btrfs_free_block_groups(fs_info);
 
-	free_fs_roots_tree(&fs_info->fs_root_cache);
+	free_fs_roots_tree(&fs_info->fs_root_tree);
 
 	btrfs_release_all_roots(fs_info);
 	btrfs_close_devices(fs_info->fs_devices);
--
1.8.1.4
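The pattern this patch relies on, a tree node embedded in a larger record and recovered with container_of, can be tried outside btrfs-progs. The sketch below is an illustration under assumptions: `struct node` and `struct fs_root` are hypothetical stand-ins for `struct rb_node` and `struct btrfs_root`, and the `container_of` macro is a simple offsetof-based version rather than the one in kerncompat.h. The two comparison functions mirror the logic of `btrfs_fs_roots_compare_objectids` and `btrfs_fs_roots_compare_roots` shown in the diff.

```c
#include <stddef.h>
#include <stdint.h>

/* Simple offsetof-based variant; an assumption, not the kerncompat.h macro. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical stand-in for struct rb_node. */
struct node {
	struct node *left, *right;
};

/* Hypothetical stand-in for struct btrfs_root: the node lives inside it. */
struct fs_root {
	uint64_t objectid;
	struct node rb_node;
};

/* Key comparison: > 0 if the searched objectid sorts after this root. */
static int compare_objectid(struct node *node, void *data)
{
	uint64_t objectid = *(uint64_t *)data;
	struct fs_root *root = container_of(node, struct fs_root, rb_node);

	if (objectid > root->objectid)
		return 1;
	if (objectid < root->objectid)
		return -1;
	return 0;
}

/* Node comparison: reuse the key comparison with node2's objectid as key. */
static int compare_roots(struct node *node1, struct node *node2)
{
	struct fs_root *root = container_of(node2, struct fs_root, rb_node);

	return compare_objectid(node1, &root->objectid);
}
```

Because the key is the objectid itself, a record no longer has to pretend to be an extent of size 1 starting at its objectid, which is what the removed `root->cache.start` / `root->cache.size` assignments did.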
Miao Xie
2013-Jul-03 13:25 UTC
[PATCH 07/12] Btrfs-progs: extend the extent cache for the device extent
As we know, btrfs can manage several devices in the same fs, so [offset, size] is not sufficient to uniquely identify a device extent: we also need the device id to tell apart device extents that have the same offset and size but are on different devices. So we added a member variable named objectid to the extent cache, and introduced some functions that make the extent cache suitable for managing device extents.

Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
---
 cmds-check.c   |  96 ++++++++++++++++++-----------------------
 extent-cache.c | 134 ++++++++++++++++++++++++++++++++++++++++++++++++++-------
 extent-cache.h |  18 ++++++--
 extent_io.c    |  19 ++++----
 volumes.c      |  15 ++++---
 5 files changed, 192 insertions(+), 90 deletions(-)

diff --git a/cmds-check.c b/cmds-check.c
index faf48e6..2a34839 100644
--- a/cmds-check.c
+++ b/cmds-check.c
@@ -268,7 +268,7 @@ static struct inode_record *get_inode_rec(struct cache_tree *inode_cache,
 	struct inode_record *rec = NULL;
 	int ret;
 
-	cache = find_cache_extent(inode_cache, ino, 1);
+	cache = lookup_cache_extent(inode_cache, ino, 1);
 	if (cache) {
 		node = container_of(cache, struct ptr_node, cache);
 		rec = node->data;
@@ -375,7 +375,7 @@ static void maybe_free_inode_rec(struct cache_tree *inode_cache,
 
 	BUG_ON(rec->refs != 1);
 	if (can_free_inode_rec(rec)) {
-		cache = find_cache_extent(inode_cache, rec->ino, 1);
+		cache = lookup_cache_extent(inode_cache, rec->ino, 1);
 		node = container_of(cache, struct ptr_node, cache);
 		BUG_ON(node->data != rec);
 		remove_cache_extent(inode_cache, &node->cache);
@@ -598,7 +598,7 @@ static int splice_shared_node(struct shared_node *src_node,
 	src = &src_node->root_cache;
 	dst = &dst_node->root_cache;
 again:
-	cache = find_first_cache_extent(src, 0);
+	cache = search_cache_extent(src, 0);
 	while (cache) {
 		node = container_of(cache, struct ptr_node, cache);
 		rec = node->data;
@@ -667,7 +667,7 @@ static struct shared_node *find_shared_node(struct cache_tree *shared,
 	struct cache_extent *cache;
 	struct
shared_node *node; - cache = find_cache_extent(shared, bytenr, 1); + cache = lookup_cache_extent(shared, bytenr, 1); if (cache) { node = container_of(cache, struct shared_node, cache); return node; @@ -1355,7 +1355,7 @@ static int check_inode_recs(struct btrfs_root *root, } while (1) { - cache = find_first_cache_extent(inode_cache, 0); + cache = search_cache_extent(inode_cache, 0); if (!cache) break; node = container_of(cache, struct ptr_node, cache); @@ -1412,7 +1412,7 @@ static struct root_record *get_root_rec(struct cache_tree *root_cache, struct root_record *rec = NULL; int ret; - cache = find_cache_extent(root_cache, objectid, 1); + cache = lookup_cache_extent(root_cache, objectid, 1); if (cache) { rec = container_of(cache, struct root_record, cache); } else { @@ -1536,7 +1536,7 @@ static int merge_root_recs(struct btrfs_root *root, } while (1) { - cache = find_first_cache_extent(src_cache, 0); + cache = search_cache_extent(src_cache, 0); if (!cache) break; node = container_of(cache, struct ptr_node, cache); @@ -1586,7 +1586,7 @@ static int check_root_refs(struct btrfs_root *root, /* fixme: this can not detect circular references */ while (loop) { loop = 0; - cache = find_first_cache_extent(root_cache, 0); + cache = search_cache_extent(root_cache, 0); while (1) { if (!cache) break; @@ -1613,7 +1613,7 @@ static int check_root_refs(struct btrfs_root *root, } } - cache = find_first_cache_extent(root_cache, 0); + cache = search_cache_extent(root_cache, 0); while (1) { if (!cache) break; @@ -1989,14 +1989,14 @@ static int free_all_extent_backrefs(struct extent_record *rec) return 0; } -static void free_extent_cache(struct btrfs_fs_info *fs_info, - struct cache_tree *extent_cache) +static void free_extent_record_cache(struct btrfs_fs_info *fs_info, + struct cache_tree *extent_cache) { struct cache_extent *cache; struct extent_record *rec; while (1) { - cache = find_first_cache_extent(extent_cache, 0); + cache = first_cache_extent(extent_cache); if (!cache) break; 
rec = container_of(cache, struct extent_record, cache); @@ -2108,7 +2108,7 @@ static int record_bad_block_io(struct btrfs_fs_info *info, struct cache_extent *cache; struct btrfs_key key; - cache = find_cache_extent(extent_cache, start, len); + cache = lookup_cache_extent(extent_cache, start, len); if (!cache) return 0; @@ -2130,7 +2130,7 @@ static int check_block(struct btrfs_root *root, int ret = 1; int level; - cache = find_cache_extent(extent_cache, buf->start, buf->len); + cache = lookup_cache_extent(extent_cache, buf->start, buf->len); if (!cache) return 1; rec = container_of(cache, struct extent_record, cache); @@ -2293,7 +2293,7 @@ static int add_extent_rec(struct cache_tree *extent_cache, int ret = 0; int dup = 0; - cache = find_cache_extent(extent_cache, start, nr); + cache = lookup_cache_extent(extent_cache, start, nr); if (cache) { rec = container_of(cache, struct extent_record, cache); if (inc_ref) @@ -2418,11 +2418,11 @@ static int add_tree_backref(struct cache_tree *extent_cache, u64 bytenr, struct tree_backref *back; struct cache_extent *cache; - cache = find_cache_extent(extent_cache, bytenr, 1); + cache = lookup_cache_extent(extent_cache, bytenr, 1); if (!cache) { add_extent_rec(extent_cache, NULL, bytenr, 1, 0, 0, 0, 0, 1, 0, 0); - cache = find_cache_extent(extent_cache, bytenr, 1); + cache = lookup_cache_extent(extent_cache, bytenr, 1); if (!cache) abort(); } @@ -2466,11 +2466,11 @@ static int add_data_backref(struct cache_tree *extent_cache, u64 bytenr, struct data_backref *back; struct cache_extent *cache; - cache = find_cache_extent(extent_cache, bytenr, 1); + cache = lookup_cache_extent(extent_cache, bytenr, 1); if (!cache) { add_extent_rec(extent_cache, NULL, bytenr, 1, 0, 0, 0, 0, 0, 0, max_size); - cache = find_cache_extent(extent_cache, bytenr, 1); + cache = lookup_cache_extent(extent_cache, bytenr, 1); if (!cache) abort(); } @@ -2545,7 +2545,7 @@ static int pick_next_pending(struct cache_tree *pending, struct cache_extent *cache; int 
ret; - cache = find_first_cache_extent(reada, 0); + cache = search_cache_extent(reada, 0); if (cache) { bits[0].start = cache->start; bits[1].size = cache->size; @@ -2556,12 +2556,12 @@ static int pick_next_pending(struct cache_tree *pending, if (node_start > 32768) node_start -= 32768; - cache = find_first_cache_extent(nodes, node_start); + cache = search_cache_extent(nodes, node_start); if (!cache) - cache = find_first_cache_extent(nodes, 0); + cache = search_cache_extent(nodes, 0); if (!cache) { - cache = find_first_cache_extent(pending, 0); + cache = search_cache_extent(pending, 0); if (!cache) return 0; ret = 0; @@ -2585,7 +2585,7 @@ static int pick_next_pending(struct cache_tree *pending, if (bits_nr - ret > 8) { u64 lookup = bits[0].start + bits[0].size; struct cache_extent *next; - next = find_first_cache_extent(pending, lookup); + next = search_cache_extent(pending, lookup); while(next) { if (next->start - lookup > 32768) break; @@ -3182,17 +3182,17 @@ static int run_next_block(struct btrfs_root *root, bytenr = bits[0].start; size = bits[0].size; - cache = find_cache_extent(pending, bytenr, size); + cache = lookup_cache_extent(pending, bytenr, size); if (cache) { remove_cache_extent(pending, cache); free(cache); } - cache = find_cache_extent(reada, bytenr, size); + cache = lookup_cache_extent(reada, bytenr, size); if (cache) { remove_cache_extent(reada, cache); free(cache); } - cache = find_cache_extent(nodes, bytenr, size); + cache = lookup_cache_extent(nodes, bytenr, size); if (cache) { remove_cache_extent(nodes, cache); free(cache); @@ -3400,7 +3400,7 @@ static int free_extent_hook(struct btrfs_trans_handle *trans, struct cache_tree *extent_cache = root->fs_info->fsck_extent_cache; is_data = owner >= BTRFS_FIRST_FREE_OBJECTID; - cache = find_cache_extent(extent_cache, bytenr, num_bytes); + cache = lookup_cache_extent(extent_cache, bytenr, num_bytes); if (!cache) return 0; @@ -4070,8 +4070,8 @@ static int process_duplicates(struct btrfs_root *root, 
good->refs = rec->refs; list_splice_init(&rec->backrefs, &good->backrefs); while (1) { - cache = find_cache_extent(extent_cache, good->start, - good->nr); + cache = lookup_cache_extent(extent_cache, good->start, + good->nr); if (!cache) break; tmp = container_of(cache, struct extent_record, cache); @@ -4244,7 +4244,8 @@ static int fixup_extent_refs(struct btrfs_trans_handle *trans, goto out; /* was this block corrupt? If so, don''t add references to it */ - cache = find_cache_extent(info->corrupt_blocks, rec->start, rec->max_size); + cache = lookup_cache_extent(info->corrupt_blocks, + rec->start, rec->max_size); if (cache) { ret = 0; goto out; @@ -4348,7 +4349,7 @@ static int prune_corrupt_blocks(struct btrfs_trans_handle *trans, struct cache_extent *cache; struct btrfs_corrupt_block *corrupt; - cache = find_first_cache_extent(info->corrupt_blocks, 0); + cache = search_cache_extent(info->corrupt_blocks, 0); while (1) { if (!cache) break; @@ -4407,7 +4408,7 @@ static int check_block_groups(struct btrfs_trans_handle *trans, /* this isn''t quite working */ return 0; - ce = find_first_cache_extent(&map_tree->cache_tree, 0); + ce = search_cache_extent(&map_tree->cache_tree, 0); while (1) { if (!ce) break; @@ -4463,7 +4464,7 @@ static int check_extent_refs(struct btrfs_trans_handle *trans, * In the worst case, this will be all the * extents in the FS */ - cache = find_first_cache_extent(extent_cache, 0); + cache = search_cache_extent(extent_cache, 0); while(cache) { rec = container_of(cache, struct extent_record, cache); btrfs_pin_extent(root->fs_info, @@ -4472,7 +4473,7 @@ static int check_extent_refs(struct btrfs_trans_handle *trans, } /* pin down all the corrupted blocks too */ - cache = find_first_cache_extent(root->fs_info->corrupt_blocks, 0); + cache = search_cache_extent(root->fs_info->corrupt_blocks, 0); while(cache) { rec = container_of(cache, struct extent_record, cache); btrfs_pin_extent(root->fs_info, @@ -4522,7 +4523,7 @@ static int check_extent_refs(struct 
btrfs_trans_handle *trans, while(1) { fixed = 0; - cache = find_first_cache_extent(extent_cache, 0); + cache = search_cache_extent(extent_cache, 0); if (!cache) break; rec = container_of(cache, struct extent_record, cache); @@ -4594,19 +4595,6 @@ repair_abort: return err; } -static void free_cache_tree(struct cache_tree *tree) -{ - struct cache_extent *cache; - - while (1) { - cache = find_first_cache_extent(tree, 0); - if (!cache) - break; - remove_cache_extent(tree, cache); - free(cache); - } -} - static int check_extents(struct btrfs_root *root, int repair) { struct cache_tree extent_cache; @@ -4716,11 +4704,11 @@ again: } free_corrupt_blocks_tree(root->fs_info->corrupt_blocks); - free_cache_tree(&seen); - free_cache_tree(&pending); - free_cache_tree(&reada); - free_cache_tree(&nodes); - free_extent_cache(root->fs_info, &extent_cache); + free_extent_cache_tree(&seen); + free_extent_cache_tree(&pending); + free_extent_cache_tree(&reada); + free_extent_cache_tree(&nodes); + free_extent_record_cache(root->fs_info, &extent_cache); goto again; } diff --git a/extent-cache.c b/extent-cache.c index a09fe87..84de87b 100644 --- a/extent-cache.c +++ b/extent-cache.c @@ -21,15 +21,11 @@ #include "extent-cache.h" struct cache_extent_search_range { + u64 objectid; u64 start; u64 size; }; -void cache_tree_init(struct cache_tree *tree) -{ - tree->root = RB_ROOT; -} - static int cache_tree_comp_range(struct rb_node *node, void *data) { struct cache_extent *entry; @@ -58,26 +54,62 @@ static int cache_tree_comp_nodes(struct rb_node *node1, struct rb_node *node2) return cache_tree_comp_range(node1, (void *)&range); } -struct cache_extent *alloc_cache_extent(u64 start, u64 size) +static int cache_tree_comp_range2(struct rb_node *node, void *data) +{ + struct cache_extent *entry; + struct cache_extent_search_range *range; + + range = (struct cache_extent_search_range *)data; + entry = rb_entry(node, struct cache_extent, rb_node); + + if (entry->objectid < range->objectid) + return 1; 
+ else if (entry->objectid > range->objectid) + return -1; + else if (entry->start + entry->size <= range->start) + return 1; + else if (range->start + range->size <= entry->start) + return -1; + else + return 0; +} + +static int cache_tree_comp_nodes2(struct rb_node *node1, struct rb_node *node2) +{ + struct cache_extent *entry; + struct cache_extent_search_range range; + + entry = rb_entry(node2, struct cache_extent, rb_node); + range.objectid = entry->objectid; + range.start = entry->start; + range.size = entry->size; + + return cache_tree_comp_range2(node1, (void *)&range); +} + +void cache_tree_init(struct cache_tree *tree) +{ + tree->root = RB_ROOT; +} + +static struct cache_extent * +alloc_cache_extent(u64 objectid, u64 start, u64 size) { struct cache_extent *pe = malloc(sizeof(*pe)); if (!pe) return pe; + pe->objectid = objectid; pe->start = start; pe->size = size; return pe; } -int insert_cache_extent(struct cache_tree *tree, struct cache_extent *pe) -{ - return rb_insert(&tree->root, &pe->rb_node, cache_tree_comp_nodes); -} - -int add_cache_extent(struct cache_tree *tree, u64 start, u64 size) +static int __add_cache_extent(struct cache_tree *tree, + u64 objectid, u64 start, u64 size) { - struct cache_extent *pe = alloc_cache_extent(start, size); + struct cache_extent *pe = alloc_cache_extent(objectid, start, size); int ret; if (!pe) { @@ -92,8 +124,29 @@ int add_cache_extent(struct cache_tree *tree, u64 start, u64 size) return ret; } -struct cache_extent *find_cache_extent(struct cache_tree *tree, - u64 start, u64 size) +int add_cache_extent(struct cache_tree *tree, u64 start, u64 size) +{ + return __add_cache_extent(tree, 0, start, size); +} + +int add_cache_extent2(struct cache_tree *tree, + u64 objectid, u64 start, u64 size) +{ + return __add_cache_extent(tree, objectid, start, size); +} + +int insert_cache_extent(struct cache_tree *tree, struct cache_extent *pe) +{ + return rb_insert(&tree->root, &pe->rb_node, cache_tree_comp_nodes); +} + +int 
insert_cache_extent2(struct cache_tree *tree, struct cache_extent *pe) +{ + return rb_insert(&tree->root, &pe->rb_node, cache_tree_comp_nodes2); +} + +struct cache_extent *lookup_cache_extent(struct cache_tree *tree, + u64 start, u64 size) { struct rb_node *node; struct cache_extent *entry; @@ -109,7 +162,25 @@ struct cache_extent *find_cache_extent(struct cache_tree *tree, return entry; } -struct cache_extent *find_first_cache_extent(struct cache_tree *tree, u64 start) +struct cache_extent *lookup_cache_extent2(struct cache_tree *tree, + u64 objectid, u64 start, u64 size) +{ + struct rb_node *node; + struct cache_extent *entry; + struct cache_extent_search_range range; + + range.objectid = objectid; + range.start = start; + range.size = size; + node = rb_search(&tree->root, &range, cache_tree_comp_range2, NULL); + if (!node) + return NULL; + + entry = rb_entry(node, struct cache_extent, rb_node); + return entry; +} + +struct cache_extent *search_cache_extent(struct cache_tree *tree, u64 start) { struct rb_node *next; struct rb_node *node; @@ -128,6 +199,27 @@ struct cache_extent *find_first_cache_extent(struct cache_tree *tree, u64 start) return entry; } +struct cache_extent *search_cache_extent2(struct cache_tree *tree, + u64 objectid, u64 start) +{ + struct rb_node *next; + struct rb_node *node; + struct cache_extent *entry; + struct cache_extent_search_range range; + + range.objectid = objectid; + range.start = start; + range.size = 1; + node = rb_search(&tree->root, &range, cache_tree_comp_range2, &next); + if (!node) + node = next; + if (!node) + return NULL; + + entry = rb_entry(node, struct cache_extent, rb_node); + return entry; +} + struct cache_extent *first_cache_extent(struct cache_tree *tree) { struct rb_node *node = rb_first(&tree->root); @@ -170,3 +262,13 @@ void cache_tree_free_extents(struct cache_tree *tree, free_func(ce); } } + +static void free_extent_cache(struct cache_extent *pe) +{ + free(pe); +} + +void free_extent_cache_tree(struct 
cache_tree *tree) +{ + cache_tree_free_extents(tree, free_extent_cache); +} diff --git a/extent-cache.h b/extent-cache.h index 2979fc3..cba83d1 100644 --- a/extent-cache.h +++ b/extent-cache.h @@ -33,6 +33,7 @@ struct cache_tree { struct cache_extent { struct rb_node rb_node; + u64 objectid; u64 start; u64 size; }; @@ -43,10 +44,9 @@ struct cache_extent *first_cache_extent(struct cache_tree *tree); struct cache_extent *prev_cache_extent(struct cache_extent *pe); struct cache_extent *next_cache_extent(struct cache_extent *pe); -struct cache_extent *find_first_cache_extent(struct cache_tree *tree, - u64 start); -struct cache_extent *find_cache_extent(struct cache_tree *tree, - u64 start, u64 size); +struct cache_extent *search_cache_extent(struct cache_tree *tree, u64 start); +struct cache_extent *lookup_cache_extent(struct cache_tree *tree, + u64 start, u64 size); int add_cache_extent(struct cache_tree *tree, u64 start, u64 size); int insert_cache_extent(struct cache_tree *tree, struct cache_extent *pe); @@ -68,4 +68,14 @@ static void free_##name##_tree(struct cache_tree *tree) \ cache_tree_free_extents(tree, free_func); \ } +void free_extent_cache_tree(struct cache_tree *tree); + +struct cache_extent *search_cache_extent2(struct cache_tree *tree, + u64 objectid, u64 start); +struct cache_extent *lookup_cache_extent2(struct cache_tree *tree, + u64 objectid, u64 start, u64 size); +int add_cache_extent2(struct cache_tree *tree, + u64 objectid, u64 start, u64 size); +int insert_cache_extent2(struct cache_tree *tree, struct cache_extent *pe); + #endif diff --git a/extent_io.c b/extent_io.c index 1e5d25a..377dec0 100644 --- a/extent_io.c +++ b/extent_io.c @@ -48,6 +48,7 @@ static struct extent_state *alloc_extent_state(void) state = malloc(sizeof(*state)); if (!state) return NULL; + state->cache_node.objectid = 0; state->refs = 1; state->state = 0; state->xprivate = 0; @@ -217,7 +218,7 @@ again: * this search will find the extents that end after * our range starts */ - 
node = find_first_cache_extent(&tree->state, start); + node = search_cache_extent(&tree->state, start); if (!node) goto out; state = container_of(node, struct extent_state, cache_node); @@ -311,7 +312,7 @@ again: * this search will find the extents that end after * our range starts */ - node = find_first_cache_extent(&tree->state, start); + node = search_cache_extent(&tree->state, start); if (!node) { err = insert_state(tree, prealloc, start, end, bits); BUG_ON(err == -EEXIST); @@ -438,7 +439,7 @@ int find_first_extent_bit(struct extent_io_tree *tree, u64 start, * this search will find all the extents that end after * our range starts. */ - node = find_first_cache_extent(&tree->state, start); + node = search_cache_extent(&tree->state, start); if (!node) goto out; @@ -465,7 +466,7 @@ int test_range_bit(struct extent_io_tree *tree, u64 start, u64 end, struct cache_extent *node; int bitset = 0; - node = find_first_cache_extent(&tree->state, start); + node = search_cache_extent(&tree->state, start); while (node && start <= end) { state = container_of(node, struct extent_state, cache_node); @@ -502,7 +503,7 @@ int set_state_private(struct extent_io_tree *tree, u64 start, u64 private) struct extent_state *state; int ret = 0; - node = find_first_cache_extent(&tree->state, start); + node = search_cache_extent(&tree->state, start); if (!node) { ret = -ENOENT; goto out; @@ -523,7 +524,7 @@ int get_state_private(struct extent_io_tree *tree, u64 start, u64 *private) struct extent_state *state; int ret = 0; - node = find_first_cache_extent(&tree->state, start); + node = search_cache_extent(&tree->state, start); if (!node) { ret = -ENOENT; goto out; @@ -620,7 +621,7 @@ struct extent_buffer *find_extent_buffer(struct extent_io_tree *tree, struct extent_buffer *eb = NULL; struct cache_extent *cache; - cache = find_cache_extent(&tree->cache, bytenr, blocksize); + cache = lookup_cache_extent(&tree->cache, bytenr, blocksize); if (cache && cache->start == bytenr && cache->size == 
blocksize) { eb = container_of(cache, struct extent_buffer, cache_node); @@ -636,7 +637,7 @@ struct extent_buffer *find_first_extent_buffer(struct extent_io_tree *tree, struct extent_buffer *eb = NULL; struct cache_extent *cache; - cache = find_first_cache_extent(&tree->cache, start); + cache = search_cache_extent(&tree->cache, start); if (cache) { eb = container_of(cache, struct extent_buffer, cache_node); list_move_tail(&eb->lru, &tree->lru); @@ -651,7 +652,7 @@ struct extent_buffer *alloc_extent_buffer(struct extent_io_tree *tree, struct extent_buffer *eb; struct cache_extent *cache; - cache = find_cache_extent(&tree->cache, bytenr, blocksize); + cache = lookup_cache_extent(&tree->cache, bytenr, blocksize); if (cache && cache->start == bytenr && cache->size == blocksize) { eb = container_of(cache, struct extent_buffer, cache_node); diff --git a/volumes.c b/volumes.c index e8e7907..a3acee8 100644 --- a/volumes.c +++ b/volumes.c @@ -1014,7 +1014,7 @@ int btrfs_num_copies(struct btrfs_mapping_tree *map_tree, u64 logical, u64 len) struct map_lookup *map; int ret; - ce = find_first_cache_extent(&map_tree->cache_tree, logical); + ce = search_cache_extent(&map_tree->cache_tree, logical); BUG_ON(!ce); BUG_ON(ce->start > logical || ce->start + ce->size < logical); map = container_of(ce, struct map_lookup, ce); @@ -1038,7 +1038,7 @@ int btrfs_next_metadata(struct btrfs_mapping_tree *map_tree, u64 *logical, struct cache_extent *ce; struct map_lookup *map; - ce = find_first_cache_extent(&map_tree->cache_tree, *logical); + ce = search_cache_extent(&map_tree->cache_tree, *logical); while (ce) { ce = next_cache_extent(ce); @@ -1069,7 +1069,7 @@ int btrfs_rmap_block(struct btrfs_mapping_tree *map_tree, u64 rmap_len; int i, j, nr = 0; - ce = find_first_cache_extent(&map_tree->cache_tree, chunk_start); + ce = search_cache_extent(&map_tree->cache_tree, chunk_start); BUG_ON(!ce); map = container_of(ce, struct map_lookup, ce); @@ -1181,7 +1181,7 @@ int __btrfs_map_block(struct 
btrfs_mapping_tree *map_tree, int rw,
 		stripes_allocated = 1;
 	}
 again:
-	ce = find_first_cache_extent(&map_tree->cache_tree, logical);
+	ce = search_cache_extent(&map_tree->cache_tree, logical);
 	if (!ce) {
 		if (multi)
 			kfree(multi);
@@ -1447,7 +1447,8 @@ int btrfs_bootstrap_super_map(struct btrfs_mapping_tree *map_tree,
 	if (ret == -EEXIST) {
 		struct cache_extent *old;
 		struct map_lookup *old_map;
-		old = find_cache_extent(&map_tree->cache_tree, logical, length);
+		old = lookup_cache_extent(&map_tree->cache_tree,
+					  logical, length);
 		old_map = container_of(old, struct map_lookup, ce);
 		remove_cache_extent(&map_tree->cache_tree, old);
 		kfree(old_map);
@@ -1466,7 +1467,7 @@ int btrfs_chunk_readonly(struct btrfs_root *root, u64 chunk_offset)
 	int readonly = 0;
 	int i;
 
-	ce = find_first_cache_extent(&map_tree->cache_tree, chunk_offset);
+	ce = search_cache_extent(&map_tree->cache_tree, chunk_offset);
 	BUG_ON(!ce);
 
 	map = container_of(ce, struct map_lookup, ce);
@@ -1508,7 +1509,7 @@ static int read_one_chunk(struct btrfs_root *root, struct btrfs_key *key,
 	logical = key->offset;
 	length = btrfs_chunk_length(leaf, chunk);
 
-	ce = find_first_cache_extent(&map_tree->cache_tree, logical);
+	ce = search_cache_extent(&map_tree->cache_tree, logical);
 
 	/* already mapped? */
 	if (ce && ce->start <= logical && ce->start + ce->size > logical) {
--
1.8.1.4
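The effect of adding objectid to the comparison can be checked with a small stand-alone sketch. It assumes, as the commit message says, that objectid carries the device id for device extents. `struct extent`, `comp_range` and `comp_range2` are hypothetical simplifications that compare two plain structs instead of rb-tree nodes. `comp_range2` follows the ordering of `cache_tree_comp_range2` in the diff. `comp_range` is an assumed objectid-blind counterpart, since the body of the original `cache_tree_comp_range` is not shown in this message.

```c
#include <stdint.h>

/* Hypothetical, simplified view of a cached extent. */
struct extent {
	uint64_t objectid;	/* device id for device extents (assumption) */
	uint64_t start;
	uint64_t size;
};

/* Assumed objectid-blind comparison: 0 means the two ranges overlap. */
static int comp_range(const struct extent *entry, const struct extent *range)
{
	if (entry->start + entry->size <= range->start)
		return 1;
	if (range->start + range->size <= entry->start)
		return -1;
	return 0;
}

/* Same ordering as cache_tree_comp_range2: objectid first, then range. */
static int comp_range2(const struct extent *entry, const struct extent *range)
{
	if (entry->objectid < range->objectid)
		return 1;
	if (entry->objectid > range->objectid)
		return -1;
	return comp_range(entry, range);
}
```

Two extents with the same start and size on devices 1 and 2 compare as equal under `comp_range`, so the second insert would be treated as a duplicate, while `comp_range2` orders them by device id and both can live in the same tree.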
Miao Xie
[PATCH 08/12] Btrfs-progs: Add block group check funtion
This patch adds functions to check the correspondence between block groups, chunks and device extents.

Original-signed-off-by: Cheng Yang <chenyang.fnst@cn.fujitsu.com>
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
---
 Makefile     |   2 +-
 btrfsck.h    | 118 +++++++++++++
 cmds-check.c | 555 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++-
 3 files changed, 665 insertions(+), 10 deletions(-)
 create mode 100644 btrfsck.h

diff --git a/Makefile b/Makefile
index da7438e..b45235e 100644
--- a/Makefile
+++ b/Makefile
@@ -14,7 +14,7 @@ cmds_objects = cmds-subvolume.o cmds-filesystem.o cmds-device.o cmds-scrub.o \
 libbtrfs_objects = send-stream.o send-utils.o rbtree.o btrfs-list.o crc32c.o
 libbtrfs_headers = send-stream.h send-utils.h send.h rbtree.h btrfs-list.h \
 	       crc32c.h list.h kerncompat.h radix-tree.h extent-cache.h \
-	       extent_io.h ioctl.h ctree.h
+	       extent_io.h ioctl.h ctree.h btrfsck.h
 
 CHECKFLAGS= -D__linux__ -Dlinux -D__STDC__ -Dunix -D__unix__ -Wbitwise \
 	    -Wuninitialized -Wshadow -Wundef
diff --git a/btrfsck.h b/btrfsck.h
new file mode 100644
index 0000000..37ac130
--- /dev/null
+++ b/btrfsck.h
@@ -0,0 +1,118 @@
+/*
+ * Copyright (C) 2013 Fujitsu.  All rights reserved.
+ * Written by Miao Xie <miaox@cn.fujitsu.com>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public
+ * License v2 as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public
+ * License along with this program; if not, write to the
+ * Free Software Foundation, Inc., 59 Temple Place - Suite 330,
+ * Boston, MA 021110-1307, USA.
+ */
+
+#ifndef __CHUNK_CHECK_H__
+#define __CHUNK_CHECK_H__
+
+#if BTRFS_FLAT_INCLUDES
+#include "kerncompat.h"
+#include "extent-cache.h"
+#include "list.h"
+#else
+#include <btrfs/kerncompat.h>
+#include <btrfs/extent-cache.h>
+#include <btrfs/list.h>
+#endif /* BTRFS_FLAT_INCLUDES */
+
+struct block_group_record {
+	struct cache_extent cache;
+	/* Used to identify the orphan block groups */
+	struct list_head list;
+
+	u64 objectid;
+	u8  type;
+	u64 offset;
+
+	u64 flags;
+};
+
+struct block_group_tree {
+	struct cache_tree tree;
+	struct list_head block_groups;
+};
+
+struct device_record {
+	struct rb_node node;
+	u64 devid;
+
+	u64 objectid;
+	u8  type;
+	u64 offset;
+
+	u64 total_byte;
+	u64 byte_used;
+
+	u64 real_used;
+};
+
+struct stripe {
+	u64 devid;
+	u64 offset;
+};
+
+struct chunk_record {
+	struct cache_extent cache;
+
+	u64 objectid;
+	u8  type;
+	u64 offset;
+
+	u64 length;
+	u64 type_flags;
+	u16 num_stripes;
+	u16 sub_stripes;
+	struct stripe stripes[0];
+};
+
+struct device_extent_record {
+	struct cache_extent cache;
+	/*
+	 * Used to identify the orphan device extents (the device extents
+	 * don't belong to a chunk or a device)
+	 */
+	struct list_head chunk_list;
+	struct list_head device_list;
+
+	u64 objectid;
+	u8  type;
+	u64 offset;
+
+	u64 chunk_objecteid;
+	u64 chunk_offset;
+	u64 length;
+};
+
+struct device_extent_tree {
+	struct cache_tree tree;
+	/*
+	 * The idea is:
+	 * When checking the chunk information, we move the device extents
+	 * that has its chunk to the chunk's device extents list. After the
+	 * check, if there are still some device extents in no_chunk_orphans,
+	 * it means there are some device extents which don't belong to any
+	 * chunk.
+	 *
+	 * The usage of no_device_orphans is the same as the first one, but it
+	 * is for the device information check.
+ */ + struct list_head no_chunk_orphans; + struct list_head no_device_orphans; +}; + +#endif diff --git a/cmds-check.c b/cmds-check.c index 2a34839..c65ae68 100644 --- a/cmds-check.c +++ b/cmds-check.c @@ -27,18 +27,17 @@ #include <unistd.h> #include <getopt.h> #include <uuid/uuid.h> -#include "kerncompat.h" #include "ctree.h" #include "volumes.h" #include "repair.h" #include "disk-io.h" #include "print-tree.h" #include "transaction.h" -#include "list.h" #include "version.h" #include "utils.h" #include "commands.h" #include "free-space-cache.h" +#include "btrfsck.h" static u64 bytes_used = 0; static u64 total_csum_bytes = 0; @@ -239,6 +238,21 @@ static u8 imode_to_type(u32 imode) #undef S_SHIFT } +static int device_record_compare(struct rb_node *node1, struct rb_node *node2) +{ + struct device_record *rec1; + struct device_record *rec2; + + rec1 = rb_entry(node1, struct device_record, node); + rec2 = rb_entry(node2, struct device_record, node); + if (rec1->devid > rec2->devid) + return -1; + else if (rec1->devid < rec2->devid) + return 1; + else + return 0; +} + static struct inode_record *clone_inode_rec(struct inode_record *orig_rec) { struct inode_record *rec; @@ -2603,6 +2617,98 @@ static int pick_next_pending(struct cache_tree *pending, return ret; } +static void free_chunk_record(struct cache_extent *cache) +{ + struct chunk_record *rec; + + rec = container_of(cache, struct chunk_record, cache); + free(rec); +} + +FREE_EXTENT_CACHE_BASED_TREE(chunk_cache, free_chunk_record); + +static void free_device_record(struct rb_node *node) +{ + struct device_record *rec; + + rec = container_of(node, struct device_record, node); + free(rec); +} + +FREE_RB_BASED_TREE(device_cache, free_device_record); + +static void block_group_tree_init(struct block_group_tree *tree) +{ + cache_tree_init(&tree->tree); + INIT_LIST_HEAD(&tree->block_groups); +} + +static int insert_block_group_record(struct block_group_tree *tree, + struct block_group_record *bg_rec) +{ + int ret; + + 
ret = insert_cache_extent(&tree->tree, &bg_rec->cache); + if (ret) + return ret; + + list_add_tail(&bg_rec->list, &tree->block_groups); + return 0; +} + +static void free_block_group_record(struct cache_extent *cache) +{ + struct block_group_record *rec; + + rec = container_of(cache, struct block_group_record, cache); + free(rec); +} + +static void free_block_group_tree(struct block_group_tree *tree) +{ + cache_tree_free_extents(&tree->tree, free_block_group_record); +} + +static void device_extent_tree_init(struct device_extent_tree *tree) +{ + cache_tree_init(&tree->tree); + INIT_LIST_HEAD(&tree->no_chunk_orphans); + INIT_LIST_HEAD(&tree->no_device_orphans); +} + +static int insert_device_extent_record(struct device_extent_tree *tree, + struct device_extent_record *de_rec) +{ + int ret; + + /* + * Device extent is a bit different from the other extents, because + * the extents which belong to the different devices may have the + * same start and size, so we need use the special extent cache + * search/insert functions. 
+ */ + ret = insert_cache_extent2(&tree->tree, &de_rec->cache); + if (ret) + return ret; + + list_add_tail(&de_rec->chunk_list, &tree->no_chunk_orphans); + list_add_tail(&de_rec->device_list, &tree->no_device_orphans); + return 0; +} + +static void free_device_extent_record(struct cache_extent *cache) +{ + struct device_extent_record *rec; + + rec = container_of(cache, struct device_extent_record, cache); + free(rec); +} + +static void free_device_extent_tree(struct device_extent_tree *tree) +{ + cache_tree_free_extents(&tree->tree, free_device_extent_record); +} + #ifdef BTRFS_COMPAT_EXTENT_TREE_V0 static int process_extent_ref_v0(struct cache_tree *extent_cache, struct extent_buffer *leaf, int slot) @@ -2622,6 +2728,172 @@ static int process_extent_ref_v0(struct cache_tree *extent_cache, } #endif +static inline unsigned long chunk_record_size(int num_stripes) +{ + return sizeof(struct chunk_record) + + sizeof(struct stripe) * num_stripes; +} + +static int process_chunk_item(struct cache_tree *chunk_cache, + struct btrfs_key *key, struct extent_buffer *eb, int slot) +{ + struct btrfs_chunk *ptr; + struct chunk_record *rec; + int num_stripes, i; + int ret = 0; + + ptr = btrfs_item_ptr(eb, + slot, struct btrfs_chunk); + + num_stripes = btrfs_chunk_num_stripes(eb, ptr); + + rec = malloc(chunk_record_size(num_stripes)); + if (!rec) { + fprintf(stderr, "memory allocation failed\n"); + return -ENOMEM; + } + + rec->cache.start = key->offset; + rec->cache.size = btrfs_chunk_length(eb, ptr); + + rec->objectid = key->objectid; + rec->type = key->type; + rec->offset = key->offset; + + rec->length = rec->cache.size; + rec->type_flags = btrfs_chunk_type(eb, ptr); + rec->num_stripes = num_stripes; + rec->sub_stripes = btrfs_chunk_sub_stripes(eb, ptr); + + for (i = 0; i < rec->num_stripes; ++i) { + rec->stripes[i].devid = + btrfs_stripe_devid_nr(eb, ptr, i); + rec->stripes[i].offset = + btrfs_stripe_offset_nr(eb, ptr, i); + } + + ret = insert_cache_extent(chunk_cache, &rec->cache); 
+ if (ret) { + fprintf(stderr, "Chunk[%llu, %llu] existed.\n", + rec->offset, rec->length); + free(rec); + } + + return ret; +} + +static int process_device_item(struct rb_root *dev_cache, + struct btrfs_key *key, struct extent_buffer *eb, int slot) +{ + struct btrfs_dev_item *ptr; + struct device_record *rec; + int ret = 0; + + ptr = btrfs_item_ptr(eb, + slot, struct btrfs_dev_item); + + rec = malloc(sizeof(*rec)); + if (!rec) { + fprintf(stderr, "memory allocation failed\n"); + return -ENOMEM; + } + + rec->devid = key->offset; + + rec->objectid = key->objectid; + rec->type = key->type; + rec->offset = key->offset; + + rec->devid = btrfs_device_id(eb, ptr); + rec->total_byte = btrfs_device_total_bytes(eb, ptr); + rec->byte_used = btrfs_device_bytes_used(eb, ptr); + + ret = rb_insert(dev_cache, &rec->node, device_record_compare); + if (ret) { + fprintf(stderr, "Device[%llu] existed.\n", rec->devid); + free(rec); + } + + return ret; +} + +static int process_block_group_item(struct block_group_tree *block_group_cache, + struct btrfs_key *key, struct extent_buffer *eb, int slot) +{ + struct btrfs_block_group_item *ptr; + struct block_group_record *rec; + int ret = 0; + + ptr = btrfs_item_ptr(eb, slot, + struct btrfs_block_group_item); + + rec = malloc(sizeof(*rec)); + if (!rec) { + fprintf(stderr, "memory allocation failed\n"); + return -ENOMEM; + } + + rec->cache.start = key->objectid; + rec->cache.size = key->offset; + + rec->objectid = key->objectid; + rec->type = key->type; + rec->offset = key->offset; + rec->flags = btrfs_disk_block_group_flags(eb, ptr); + + ret = insert_block_group_record(block_group_cache, rec); + if (ret) { + fprintf(stderr, "Block Group[%llu, %llu] existed.\n", + rec->objectid, rec->offset); + free(rec); + } + + return ret; +} + +static int +process_device_extent_item(struct device_extent_tree *dev_extent_cache, + struct btrfs_key *key, struct extent_buffer *eb, + int slot) +{ + int ret = 0; + + struct btrfs_dev_extent *ptr; + struct 
device_extent_record *rec; + + ptr = btrfs_item_ptr(eb, + slot, struct btrfs_dev_extent); + + rec = malloc(sizeof(*rec)); + if (!rec) { + fprintf(stderr, "memory allocation failed\n"); + return -ENOMEM; + } + + rec->cache.objectid = key->objectid; + rec->cache.start = key->offset; + + rec->objectid = key->objectid; + rec->type = key->type; + rec->offset = key->offset; + + rec->chunk_objecteid = + btrfs_dev_extent_chunk_objectid(eb, ptr); + rec->chunk_offset = + btrfs_dev_extent_chunk_offset(eb, ptr); + rec->length = btrfs_dev_extent_length(eb, ptr); + rec->cache.size = rec->length; + + ret = insert_device_extent_record(dev_extent_cache, rec); + if (ret) { + fprintf(stderr, "Device extent[%llu, %llu, %llu] existed.\n", + rec->objectid, rec->offset, rec->length); + free(rec); + } + + return ret; +} + static int process_extent_item(struct btrfs_root *root, struct cache_tree *extent_cache, struct extent_buffer *eb, int slot) @@ -3146,7 +3418,11 @@ static int run_next_block(struct btrfs_root *root, struct cache_tree *seen, struct cache_tree *reada, struct cache_tree *nodes, - struct cache_tree *extent_cache) + struct cache_tree *extent_cache, + struct cache_tree *chunk_cache, + struct rb_root *dev_cache, + struct block_group_tree *block_group_cache, + struct device_extent_tree *dev_extent_cache) { struct extent_buffer *buf; u64 bytenr; @@ -3246,8 +3522,24 @@ static int run_next_block(struct btrfs_root *root, btrfs_item_size_nr(buf, i); continue; } + if (key.type == BTRFS_CHUNK_ITEM_KEY) { + process_chunk_item(chunk_cache, &key, buf, i); + continue; + } + if (key.type == BTRFS_DEV_ITEM_KEY) { + process_device_item(dev_cache, &key, buf, i); + continue; + } if (key.type == BTRFS_BLOCK_GROUP_ITEM_KEY) { + process_block_group_item(block_group_cache, + &key, buf, i); + continue; + } + if (key.type == BTRFS_DEV_EXTENT_KEY) { + process_device_extent_item(dev_extent_cache, + &key, buf, i); continue; + } if (key.type == BTRFS_EXTENT_REF_V0_KEY) { #ifdef BTRFS_COMPAT_EXTENT_TREE_V0 @@ 
-4595,8 +4887,233 @@ repair_abort: return err; } -static int check_extents(struct btrfs_root *root, int repair) +static u64 calc_stripe_length(struct chunk_record *chunk_rec) +{ + u64 stripe_size; + + if (chunk_rec->type_flags & BTRFS_BLOCK_GROUP_RAID0) { + stripe_size = chunk_rec->length; + stripe_size /= chunk_rec->num_stripes; + } else if (chunk_rec->type_flags & BTRFS_BLOCK_GROUP_RAID10) { + stripe_size = chunk_rec->length * 2; + stripe_size /= chunk_rec->num_stripes; + } else if (chunk_rec->type_flags & BTRFS_BLOCK_GROUP_RAID5) { + stripe_size = chunk_rec->length; + stripe_size /= (chunk_rec->num_stripes - 1); + } else if (chunk_rec->type_flags & BTRFS_BLOCK_GROUP_RAID6) { + stripe_size = chunk_rec->length; + stripe_size /= (chunk_rec->num_stripes - 2); + } else { + stripe_size = chunk_rec->length; + } + return stripe_size; +} + +static int check_chunk_refs(struct chunk_record *chunk_rec, + struct block_group_tree *block_group_cache, + struct device_extent_tree *dev_extent_cache) +{ + struct cache_extent *block_group_item; + struct block_group_record *block_group_rec; + struct cache_extent *dev_extent_item; + struct device_extent_record *dev_extent_rec; + u64 devid; + u64 offset; + u64 length; + int i; + int ret = 0; + + block_group_item = lookup_cache_extent(&block_group_cache->tree, + chunk_rec->offset, + chunk_rec->length); + if (block_group_item) { + block_group_rec = container_of(block_group_item, + struct block_group_record, + cache); + if (chunk_rec->length != block_group_rec->offset || + chunk_rec->offset != block_group_rec->objectid || + chunk_rec->type_flags != block_group_rec->flags) { + fprintf(stderr, + "Chunk[%llu, %u, %llu]: length(%llu), offset(%llu), type(%llu) mismatch with block group[%llu, %u, %llu]: offset(%llu), objectid(%llu), flags(%llu)\n", + chunk_rec->objectid, + chunk_rec->type, + chunk_rec->offset, + chunk_rec->length, + chunk_rec->offset, + chunk_rec->type_flags, + block_group_rec->objectid, + block_group_rec->type, + 
block_group_rec->offset, + block_group_rec->offset, + block_group_rec->objectid, + block_group_rec->flags); + ret = -1; + } + list_del(&block_group_rec->list); + } else { + fprintf(stderr, + "Chunk[%llu, %u, %llu]: length(%llu), offset(%llu), type(%llu) is not found in block group\n", + chunk_rec->objectid, + chunk_rec->type, + chunk_rec->offset, + chunk_rec->length, + chunk_rec->offset, + chunk_rec->type_flags); + ret = -1; + } + + length = calc_stripe_length(chunk_rec); + for (i = 0; i < chunk_rec->num_stripes; ++i) { + devid = chunk_rec->stripes[i].devid; + offset = chunk_rec->stripes[i].offset; + dev_extent_item = lookup_cache_extent2(&dev_extent_cache->tree, + devid, offset, length); + if (dev_extent_item) { + dev_extent_rec = container_of(dev_extent_item, + struct device_extent_record, + cache); + if (dev_extent_rec->objectid != devid || + dev_extent_rec->offset != offset || + dev_extent_rec->chunk_offset != chunk_rec->offset || + dev_extent_rec->length != length) { + fprintf(stderr, + "Chunk[%llu, %u, %llu] stripe[%llu, %llu] dismatch dev extent[%llu, %llu, %llu]\n", + chunk_rec->objectid, + chunk_rec->type, + chunk_rec->offset, + chunk_rec->stripes[i].devid, + chunk_rec->stripes[i].offset, + dev_extent_rec->objectid, + dev_extent_rec->offset, + dev_extent_rec->length); + ret = -1; + } + list_del(&dev_extent_rec->chunk_list); + } else { + fprintf(stderr, + "Chunk[%llu, %u, %llu] stripe[%llu, %llu] is not found in dev extent\n", + chunk_rec->objectid, + chunk_rec->type, + chunk_rec->offset, + chunk_rec->stripes[i].devid, + chunk_rec->stripes[i].offset); + ret = -1; + } + } + return ret; +} + +/* check btrfs_chunk -> btrfs_dev_extent / btrfs_block_group_item */ +static int check_chunks(struct cache_tree *chunk_cache, + struct block_group_tree *block_group_cache, + struct device_extent_tree *dev_extent_cache) { + struct cache_extent *chunk_item; + struct chunk_record *chunk_rec; + struct block_group_record *bg_rec; + struct device_extent_record *dext_rec; + int 
err; + int ret = 0; + + chunk_item = first_cache_extent(chunk_cache); + while (chunk_item) { + chunk_rec = container_of(chunk_item, struct chunk_record, + cache); + err = check_chunk_refs(chunk_rec, block_group_cache, + dev_extent_cache); + if (err) + ret = err; + + chunk_item = next_cache_extent(chunk_item); + } + + list_for_each_entry(bg_rec, &block_group_cache->block_groups, list) { + fprintf(stderr, + "Block group[%llu, %llu] (flags = %llu) didn't find the relative chunk.\n", + bg_rec->objectid, bg_rec->offset, bg_rec->flags); + if (!ret) + ret = 1; + } + + list_for_each_entry(dext_rec, &dev_extent_cache->no_chunk_orphans, + chunk_list) { + fprintf(stderr, + "Device extent[%llu, %llu, %llu] didn't find the relative chunk.\n", + dext_rec->objectid, dext_rec->offset, dext_rec->length); + if (!ret) + ret = 1; + } + return ret; +} + + +static int check_device_used(struct device_record *dev_rec, + struct device_extent_tree *dext_cache) +{ + struct cache_extent *cache; + struct device_extent_record *dev_extent_rec; + u64 total_byte = 0; + + cache = search_cache_extent2(&dext_cache->tree, dev_rec->devid, 0); + while (cache) { + dev_extent_rec = container_of(cache, + struct device_extent_record, + cache); + if (dev_extent_rec->objectid != dev_rec->devid) + break; + + list_del(&dev_extent_rec->device_list); + total_byte += dev_extent_rec->length; + cache = next_cache_extent(cache); + } + + if (total_byte != dev_rec->byte_used) { + fprintf(stderr, + "Dev extent's total-byte(%llu) is not equal to byte-used(%llu) in dev[%llu, %u, %llu]\n", + total_byte, dev_rec->byte_used, dev_rec->objectid, + dev_rec->type, dev_rec->offset); + return -1; + } else { + return 0; + } +} + +/* check btrfs_dev_item -> btrfs_dev_extent */ +static int check_devices(struct rb_root *dev_cache, + struct device_extent_tree *dev_extent_cache) +{ + struct rb_node *dev_node; + struct device_record *dev_rec; + struct device_extent_record *dext_rec; + int err; + int ret = 0; + + dev_node = 
rb_first(dev_cache); + while (dev_node) { + dev_rec = container_of(dev_node, struct device_record, node); + err = check_device_used(dev_rec, dev_extent_cache); + if (err) + ret = err; + + dev_node = rb_next(dev_node); + } + list_for_each_entry(dext_rec, &dev_extent_cache->no_device_orphans, + device_list) { + fprintf(stderr, + "Device extent[%llu, %llu, %llu] didn't find its device.\n", + dext_rec->objectid, dext_rec->offset, dext_rec->length); + if (!ret) + ret = 1; + } + return ret; +} + +static int check_chunks_and_extents(struct btrfs_root *root, int repair) +{ + struct rb_root dev_cache; + struct cache_tree chunk_cache; + struct block_group_tree block_group_cache; + struct device_extent_tree dev_extent_cache; struct cache_tree extent_cache; struct cache_tree seen; struct cache_tree pending; @@ -4606,7 +5123,7 @@ static int check_extents(struct btrfs_root *root, int repair) struct btrfs_path path; struct btrfs_key key; struct btrfs_key found_key; - int ret; + int ret, err = 0; u64 last = 0; struct block_info *bits; int bits_nr; @@ -4615,6 +5132,11 @@ static int check_extents(struct btrfs_root *root, int repair) int slot; struct btrfs_root_item ri; + dev_cache = RB_ROOT; + cache_tree_init(&chunk_cache); + block_group_tree_init(&block_group_cache); + device_extent_tree_init(&dev_extent_cache); + cache_tree_init(&extent_cache); cache_tree_init(&seen); cache_tree_init(&pending); @@ -4686,12 +5208,14 @@ again: btrfs_release_path(root, &path); while(1) { ret = run_next_block(root, bits, bits_nr, &last, &pending, - &seen, &reada, &nodes, &extent_cache); + &seen, &reada, &nodes, &extent_cache, + &chunk_cache, &dev_cache, + &block_group_cache, &dev_extent_cache); if (ret != 0) break; } - ret = check_extent_refs(trans, root, &extent_cache, repair); + ret = check_extent_refs(trans, root, &extent_cache, repair); if (ret == -EAGAIN) { ret = btrfs_commit_transaction(trans, root); if (ret) @@ -4712,6 +5236,15 @@ again: goto again; } + err = check_chunks(&chunk_cache, 
&block_group_cache, + &dev_extent_cache); + if (err && !ret) + ret = err; + + err = check_devices(&dev_cache, &dev_extent_cache); + if (err && !ret) + ret = err; + if (trans) { int err; @@ -4727,6 +5260,10 @@ out: root->fs_info->corrupt_blocks = NULL; } free(bits); + free_chunk_cache_tree(&chunk_cache); + free_device_cache_tree(&dev_cache); + free_block_group_tree(&block_group_cache); + free_device_extent_tree(&dev_extent_cache); return ret; } @@ -5207,9 +5744,9 @@ int cmd_check(int argc, char **argv) exit(1); goto out; } - ret = check_extents(root, repair); + ret = check_chunks_and_extents(root, repair); if (ret) - fprintf(stderr, "Errors found in extent allocation tree\n"); + fprintf(stderr, "Errors found in extent allocation tree or chunk allocation\n"); fprintf(stderr, "checking free space cache\n"); ret = check_space_cache(root); -- 1.8.1.4
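For readers following the check logic in the patch above: the length that check_chunk_refs() expects of each device extent depends on the chunk's RAID profile. The arithmetic can be sketched in isolation as below. This is a minimal standalone illustration, not code from the patch. The SK_* constants and struct sk_chunk are hypothetical stand-ins for the BTRFS_BLOCK_GROUP_* flags and struct chunk_record, and the assert() guards are an addition of this sketch; only the divisions mirror calc_stripe_length().

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the BTRFS_BLOCK_GROUP_* profile bits. */
#define SK_RAID0  (1ULL << 0)
#define SK_RAID10 (1ULL << 1)
#define SK_RAID5  (1ULL << 2)
#define SK_RAID6  (1ULL << 3)

/* Hypothetical, trimmed-down view of a chunk record. */
struct sk_chunk {
	uint64_t length;      /* logical length of the chunk */
	uint64_t type_flags;  /* profile bits (SK_* above) */
	uint16_t num_stripes; /* device extents backing the chunk */
};

/* Expected length of every device extent that backs the chunk. */
uint64_t sk_stripe_length(const struct sk_chunk *c)
{
	if (c->type_flags & SK_RAID0) {
		/* Data is striped over all devices. */
		assert(c->num_stripes > 0);
		return c->length / c->num_stripes;
	}
	if (c->type_flags & SK_RAID10) {
		/* Striped over mirrored pairs: half the stripes carry data. */
		assert(c->num_stripes > 0);
		return c->length * 2 / c->num_stripes;
	}
	if (c->type_flags & SK_RAID5) {
		/* One stripe's worth of space holds parity. */
		assert(c->num_stripes > 1);
		return c->length / (c->num_stripes - 1);
	}
	if (c->type_flags & SK_RAID6) {
		/* Two stripes' worth of space hold parity. */
		assert(c->num_stripes > 2);
		return c->length / (c->num_stripes - 2);
	}
	/* SINGLE, DUP, RAID1: every stripe is a full copy of the chunk. */
	return c->length;
}
```

For example, a 3072-byte RAID5 chunk spread over 4 stripes should be backed by device extents of 1024 bytes each. A device extent whose recorded length differs from this value is what the patch reports as a mismatch.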
Miao Xie
2013-Jul-03 13:25 UTC
[PATCH 09/12] Btrfs-progs: Add chunk recover function - using old chunk items
Add the chunk-recover command, which checks or rebuilds the chunk tree when the system chunk array or the chunk tree is broken. Both structures are critical: if either one is broken, the whole btrfs filesystem becomes unusable even if all other data is intact. However, we still have some hints (fsid, checksum...) with which to salvage the old metadata. So this function first scans the whole filesystem and collects the needed items (chunk, block group and device extent), then checks the references between them. If the references are consistent, the chunk tree can be rebuilt and, with luck, the filesystem will be mountable again. Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com> Signed-off-by: Miao Xie <miaox@cn.fujitsu.com> --- Makefile | 2 +- btrfs.c | 1 + btrfsck.h | 64 +++ cmds-check.c | 285 +++++++----- cmds-chunk.c | 1399 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++ commands.h | 2 + disk-io.c | 22 +- disk-io.h | 1 + extent-tree.c | 6 - extent_io.h | 6 + volumes.c | 11 +- volumes.h | 4 + 12 files changed, 1672 insertions(+), 131 deletions(-) create mode 100644 cmds-chunk.c diff --git a/Makefile b/Makefile index b45235e..c43cb68 100644 --- a/Makefile +++ b/Makefile @@ -10,7 +10,7 @@ objects = ctree.o disk-io.o radix-tree.o extent-tree.o print-tree.o \ cmds_objects = cmds-subvolume.o cmds-filesystem.o cmds-device.o cmds-scrub.o \ cmds-inspect.o cmds-balance.o cmds-send.o cmds-receive.o \ cmds-quota.o cmds-qgroup.o cmds-replace.o cmds-check.o \ - cmds-restore.o + cmds-restore.o cmds-chunk.o libbtrfs_objects = send-stream.o send-utils.o rbtree.o btrfs-list.o crc32c.o libbtrfs_headers = send-stream.h send-utils.h send.h rbtree.h btrfs-list.h \ crc32c.h list.h kerncompat.h radix-tree.h extent-cache.h \ diff --git a/btrfs.c b/btrfs.c index 691adef..4e93e13 100644 --- a/btrfs.c +++ b/btrfs.c @@ -247,6 +247,7 @@ const struct cmd_group btrfs_cmd_group = { { "device", cmd_device, NULL, &device_cmd_group, 0 }, { "scrub", cmd_scrub, NULL, &scrub_cmd_group, 0 }, { "check", cmd_check, cmd_check_usage, NULL, 0 }, 
+ { "chunk-recover", cmd_chunk_recover, cmd_chunk_recover_usage, NULL, 0}, { "restore", cmd_restore, cmd_restore_usage, NULL, 0 }, { "inspect-internal", cmd_inspect, NULL, &inspect_cmd_group, 0 }, { "send", cmd_send, cmd_send_usage, NULL, 0 }, diff --git a/btrfsck.h b/btrfsck.h index 37ac130..a6151d5 100644 --- a/btrfsck.h +++ b/btrfsck.h @@ -35,6 +35,8 @@ struct block_group_record { /* Used to identify the orphan block groups */ struct list_head list; + u64 generation; + u64 objectid; u8 type; u64 offset; @@ -51,6 +53,8 @@ struct device_record { struct rb_node node; u64 devid; + u64 generation; + u64 objectid; u8 type; u64 offset; @@ -64,19 +68,31 @@ struct device_record { struct stripe { u64 devid; u64 offset; + u8 dev_uuid[BTRFS_UUID_SIZE]; }; struct chunk_record { struct cache_extent cache; + struct list_head list; + struct list_head dextents; + struct block_group_record *bg_rec; + + u64 generation; + u64 objectid; u8 type; u64 offset; + u64 owner; u64 length; u64 type_flags; + u64 stripe_len; u16 num_stripes; u16 sub_stripes; + u32 io_align; + u32 io_width; + u32 sector_size; struct stripe stripes[0]; }; @@ -89,6 +105,8 @@ struct device_extent_record { struct list_head chunk_list; struct list_head device_list; + u64 generation; + u64 objectid; u8 type; u64 offset; @@ -115,4 +133,50 @@ struct device_extent_tree { struct list_head no_device_orphans; }; +static inline unsigned long btrfs_chunk_record_size(int num_stripes) +{ + return sizeof(struct chunk_record) + + sizeof(struct stripe) * num_stripes; +} +void free_chunk_cache_tree(struct cache_tree *chunk_cache); + +/* For block group tree */ +static inline void block_group_tree_init(struct block_group_tree *tree) +{ + cache_tree_init(&tree->tree); + INIT_LIST_HEAD(&tree->block_groups); +} + +int insert_block_group_record(struct block_group_tree *tree, + struct block_group_record *bg_rec); +void free_block_group_tree(struct block_group_tree *tree); + +/* For device extent tree */ +static inline void 
device_extent_tree_init(struct device_extent_tree *tree) +{ + cache_tree_init(&tree->tree); + INIT_LIST_HEAD(&tree->no_chunk_orphans); + INIT_LIST_HEAD(&tree->no_device_orphans); +} + +int insert_device_extent_record(struct device_extent_tree *tree, + struct device_extent_record *de_rec); +void free_device_extent_tree(struct device_extent_tree *tree); + + +/* Create various in-memory record by on-disk data */ +struct chunk_record *btrfs_new_chunk_record(struct extent_buffer *leaf, + struct btrfs_key *key, + int slot); +struct block_group_record * +btrfs_new_block_group_record(struct extent_buffer *leaf, struct btrfs_key *key, + int slot); +struct device_extent_record * +btrfs_new_device_extent_record(struct extent_buffer *leaf, + struct btrfs_key *key, int slot); + +int check_chunks(struct cache_tree *chunk_cache, + struct block_group_tree *block_group_cache, + struct device_extent_tree *dev_extent_cache, + struct list_head *good, struct list_head *bad, int silent); #endif diff --git a/cmds-check.c b/cmds-check.c index c65ae68..c3c7575 100644 --- a/cmds-check.c +++ b/cmds-check.c @@ -2625,7 +2625,10 @@ static void free_chunk_record(struct cache_extent *cache) free(rec); } -FREE_EXTENT_CACHE_BASED_TREE(chunk_cache, free_chunk_record); +void free_chunk_cache_tree(struct cache_tree *chunk_cache) +{ + cache_tree_free_extents(chunk_cache, free_chunk_record); +} static void free_device_record(struct rb_node *node) { @@ -2637,14 +2640,8 @@ static void free_device_record(struct rb_node *node) FREE_RB_BASED_TREE(device_cache, free_device_record); -static void block_group_tree_init(struct block_group_tree *tree) -{ - cache_tree_init(&tree->tree); - INIT_LIST_HEAD(&tree->block_groups); -} - -static int insert_block_group_record(struct block_group_tree *tree, - struct block_group_record *bg_rec) +int insert_block_group_record(struct block_group_tree *tree, + struct block_group_record *bg_rec) { int ret; @@ -2664,20 +2661,13 @@ static void free_block_group_record(struct 
cache_extent *cache) free(rec); } -static void free_block_group_tree(struct block_group_tree *tree) +void free_block_group_tree(struct block_group_tree *tree) { cache_tree_free_extents(&tree->tree, free_block_group_record); } -static void device_extent_tree_init(struct device_extent_tree *tree) -{ - cache_tree_init(&tree->tree); - INIT_LIST_HEAD(&tree->no_chunk_orphans); - INIT_LIST_HEAD(&tree->no_device_orphans); -} - -static int insert_device_extent_record(struct device_extent_tree *tree, - struct device_extent_record *de_rec) +int insert_device_extent_record(struct device_extent_tree *tree, + struct device_extent_record *de_rec) { int ret; @@ -2704,7 +2694,7 @@ static void free_device_extent_record(struct cache_extent *cache) free(rec); } -static void free_device_extent_tree(struct device_extent_tree *tree) +void free_device_extent_tree(struct device_extent_tree *tree) { cache_tree_free_extents(&tree->tree, free_device_extent_record); } @@ -2728,50 +2718,69 @@ static int process_extent_ref_v0(struct cache_tree *extent_cache, } #endif -static inline unsigned long chunk_record_size(int num_stripes) -{ - return sizeof(struct chunk_record) + - sizeof(struct stripe) * num_stripes; -} - -static int process_chunk_item(struct cache_tree *chunk_cache, - struct btrfs_key *key, struct extent_buffer *eb, int slot) +struct chunk_record *btrfs_new_chunk_record(struct extent_buffer *leaf, + struct btrfs_key *key, + int slot) { struct btrfs_chunk *ptr; struct chunk_record *rec; int num_stripes, i; - int ret = 0; - - ptr = btrfs_item_ptr(eb, - slot, struct btrfs_chunk); - num_stripes = btrfs_chunk_num_stripes(eb, ptr); + ptr = btrfs_item_ptr(leaf, slot, struct btrfs_chunk); + num_stripes = btrfs_chunk_num_stripes(leaf, ptr); - rec = malloc(chunk_record_size(num_stripes)); + rec = malloc(btrfs_chunk_record_size(num_stripes)); if (!rec) { fprintf(stderr, "memory allocation failed\n"); - return -ENOMEM; + exit(-1); } + memset(rec, 0, btrfs_chunk_record_size(num_stripes)); + + 
INIT_LIST_HEAD(&rec->list); + INIT_LIST_HEAD(&rec->dextents); + rec->bg_rec = NULL; + rec->cache.start = key->offset; - rec->cache.size = btrfs_chunk_length(eb, ptr); + rec->cache.size = btrfs_chunk_length(leaf, ptr); + + rec->generation = btrfs_header_generation(leaf); rec->objectid = key->objectid; rec->type = key->type; rec->offset = key->offset; rec->length = rec->cache.size; - rec->type_flags = btrfs_chunk_type(eb, ptr); + rec->owner = btrfs_chunk_owner(leaf, ptr); + rec->stripe_len = btrfs_chunk_stripe_len(leaf, ptr); + rec->type_flags = btrfs_chunk_type(leaf, ptr); + rec->io_width = btrfs_chunk_io_width(leaf, ptr); + rec->io_align = btrfs_chunk_io_align(leaf, ptr); + rec->sector_size = btrfs_chunk_sector_size(leaf, ptr); rec->num_stripes = num_stripes; - rec->sub_stripes = btrfs_chunk_sub_stripes(eb, ptr); + rec->sub_stripes = btrfs_chunk_sub_stripes(leaf, ptr); for (i = 0; i < rec->num_stripes; ++i) { rec->stripes[i].devid = - btrfs_stripe_devid_nr(eb, ptr, i); + btrfs_stripe_devid_nr(leaf, ptr, i); rec->stripes[i].offset = - btrfs_stripe_offset_nr(eb, ptr, i); + btrfs_stripe_offset_nr(leaf, ptr, i); + read_extent_buffer(leaf, rec->stripes[i].dev_uuid, + (unsigned long)btrfs_stripe_dev_uuid_nr(ptr, i), + BTRFS_UUID_SIZE); } + return rec; +} + +static int process_chunk_item(struct cache_tree *chunk_cache, + struct btrfs_key *key, struct extent_buffer *eb, + int slot) +{ + struct chunk_record *rec; + int ret = 0; + + rec = btrfs_new_chunk_record(eb, key, slot); ret = insert_cache_extent(chunk_cache, &rec->cache); if (ret) { fprintf(stderr, "Chunk[%llu, %llu] existed.\n", @@ -2799,6 +2808,7 @@ static int process_device_item(struct rb_root *dev_cache, } rec->devid = key->offset; + rec->generation = btrfs_header_generation(eb); rec->objectid = key->objectid; rec->type = key->type; @@ -2817,30 +2827,45 @@ static int process_device_item(struct rb_root *dev_cache, return ret; } -static int process_block_group_item(struct block_group_tree *block_group_cache, - struct 
btrfs_key *key, struct extent_buffer *eb, int slot) +struct block_group_record * +btrfs_new_block_group_record(struct extent_buffer *leaf, struct btrfs_key *key, + int slot) { struct btrfs_block_group_item *ptr; struct block_group_record *rec; - int ret = 0; - - ptr = btrfs_item_ptr(eb, slot, - struct btrfs_block_group_item); rec = malloc(sizeof(*rec)); if (!rec) { fprintf(stderr, "memory allocation failed\n"); - return -ENOMEM; + exit(-1); } + memset(rec, 0, sizeof(*rec)); rec->cache.start = key->objectid; rec->cache.size = key->offset; + rec->generation = btrfs_header_generation(leaf); + rec->objectid = key->objectid; rec->type = key->type; rec->offset = key->offset; - rec->flags = btrfs_disk_block_group_flags(eb, ptr); + ptr = btrfs_item_ptr(leaf, slot, struct btrfs_block_group_item); + rec->flags = btrfs_disk_block_group_flags(leaf, ptr); + + INIT_LIST_HEAD(&rec->list); + + return rec; +} + +static int process_block_group_item(struct block_group_tree *block_group_cache, + struct btrfs_key *key, + struct extent_buffer *eb, int slot) +{ + struct block_group_record *rec; + int ret = 0; + + rec = btrfs_new_block_group_record(eb, key, slot); ret = insert_block_group_record(block_group_cache, rec); if (ret) { fprintf(stderr, "Block Group[%llu, %llu] existed.\n", @@ -2851,42 +2876,56 @@ static int process_block_group_item(struct block_group_tree *block_group_cache, return ret; } -static int -process_device_extent_item(struct device_extent_tree *dev_extent_cache, - struct btrfs_key *key, struct extent_buffer *eb, - int slot) +struct device_extent_record * +btrfs_new_device_extent_record(struct extent_buffer *leaf, + struct btrfs_key *key, int slot) { - int ret = 0; - - struct btrfs_dev_extent *ptr; struct device_extent_record *rec; - - ptr = btrfs_item_ptr(eb, - slot, struct btrfs_dev_extent); + struct btrfs_dev_extent *ptr; rec = malloc(sizeof(*rec)); if (!rec) { fprintf(stderr, "memory allocation failed\n"); - return -ENOMEM; + exit(-1); } + memset(rec, 0, 
sizeof(*rec)); rec->cache.objectid = key->objectid; rec->cache.start = key->offset; + rec->generation = btrfs_header_generation(leaf); + rec->objectid = key->objectid; rec->type = key->type; rec->offset = key->offset; + ptr = btrfs_item_ptr(leaf, slot, struct btrfs_dev_extent); rec->chunk_objecteid = - btrfs_dev_extent_chunk_objectid(eb, ptr); + btrfs_dev_extent_chunk_objectid(leaf, ptr); rec->chunk_offset = - btrfs_dev_extent_chunk_offset(eb, ptr); - rec->length = btrfs_dev_extent_length(eb, ptr); + btrfs_dev_extent_chunk_offset(leaf, ptr); + rec->length = btrfs_dev_extent_length(leaf, ptr); rec->cache.size = rec->length; + INIT_LIST_HEAD(&rec->chunk_list); + INIT_LIST_HEAD(&rec->device_list); + + return rec; +} + +static int +process_device_extent_item(struct device_extent_tree *dev_extent_cache, + struct btrfs_key *key, struct extent_buffer *eb, + int slot) +{ + struct device_extent_record *rec; + int ret; + + rec = btrfs_new_device_extent_record(eb, key, slot); ret = insert_device_extent_record(dev_extent_cache, rec); if (ret) { - fprintf(stderr, "Device extent[%llu, %llu, %llu] existed.\n", + fprintf(stderr, + "Device extent[%llu, %llu, %llu] existed.\n", rec->objectid, rec->offset, rec->length); free(rec); } @@ -4911,7 +4950,8 @@ static u64 calc_stripe_length(struct chunk_record *chunk_rec) static int check_chunk_refs(struct chunk_record *chunk_rec, struct block_group_tree *block_group_cache, - struct device_extent_tree *dev_extent_cache) + struct device_extent_tree *dev_extent_cache, + int silent) { struct cache_extent *block_group_item; struct block_group_record *block_group_rec; @@ -4933,32 +4973,36 @@ static int check_chunk_refs(struct chunk_record *chunk_rec, if (chunk_rec->length != block_group_rec->offset || chunk_rec->offset != block_group_rec->objectid || chunk_rec->type_flags != block_group_rec->flags) { + if (!silent) + fprintf(stderr, + "Chunk[%llu, %u, %llu]: length(%llu), offset(%llu), type(%llu) mismatch with block group[%llu, %u, %llu]: 
offset(%llu), objectid(%llu), flags(%llu)\n", + chunk_rec->objectid, + chunk_rec->type, + chunk_rec->offset, + chunk_rec->length, + chunk_rec->offset, + chunk_rec->type_flags, + block_group_rec->objectid, + block_group_rec->type, + block_group_rec->offset, + block_group_rec->offset, + block_group_rec->objectid, + block_group_rec->flags); + ret = -1; + } else { + list_del_init(&block_group_rec->list); + chunk_rec->bg_rec = block_group_rec; + } + } else { + if (!silent) fprintf(stderr, - "Chunk[%llu, %u, %llu]: length(%llu), offset(%llu), type(%llu) mismatch with block group[%llu, %u, %llu]: offset(%llu), objectid(%llu), flags(%llu)\n", + "Chunk[%llu, %u, %llu]: length(%llu), offset(%llu), type(%llu) is not found in block group\n", chunk_rec->objectid, chunk_rec->type, chunk_rec->offset, chunk_rec->length, chunk_rec->offset, - chunk_rec->type_flags, - block_group_rec->objectid, - block_group_rec->type, - block_group_rec->offset, - block_group_rec->offset, - block_group_rec->objectid, - block_group_rec->flags); - ret = -1; - } - list_del(&block_group_rec->list); - } else { - fprintf(stderr, - "Chunk[%llu, %u, %llu]: length(%llu), offset(%llu), type(%llu) is not found in block group\n", - chunk_rec->objectid, - chunk_rec->type, - chunk_rec->offset, - chunk_rec->length, - chunk_rec->offset, - chunk_rec->type_flags); + chunk_rec->type_flags); ret = -1; } @@ -4976,27 +5020,31 @@ static int check_chunk_refs(struct chunk_record *chunk_rec, dev_extent_rec->offset != offset || dev_extent_rec->chunk_offset != chunk_rec->offset || dev_extent_rec->length != length) { + if (!silent) + fprintf(stderr, + "Chunk[%llu, %u, %llu] stripe[%llu, %llu] dismatch dev extent[%llu, %llu, %llu]\n", + chunk_rec->objectid, + chunk_rec->type, + chunk_rec->offset, + chunk_rec->stripes[i].devid, + chunk_rec->stripes[i].offset, + dev_extent_rec->objectid, + dev_extent_rec->offset, + dev_extent_rec->length); + ret = -1; + } else { + list_move(&dev_extent_rec->chunk_list, + &chunk_rec->dextents); + } 
+ } else { + if (!silent) fprintf(stderr, - "Chunk[%llu, %u, %llu] stripe[%llu, %llu] dismatch dev extent[%llu, %llu, %llu]\n", + "Chunk[%llu, %u, %llu] stripe[%llu, %llu] is not found in dev extent\n", chunk_rec->objectid, chunk_rec->type, chunk_rec->offset, chunk_rec->stripes[i].devid, - chunk_rec->stripes[i].offset, - dev_extent_rec->objectid, - dev_extent_rec->offset, - dev_extent_rec->length); - ret = -1; - } - list_del(&dev_extent_rec->chunk_list); - } else { - fprintf(stderr, - "Chunk[%llu, %u, %llu] stripe[%llu, %llu] is not found in dev extent\n", - chunk_rec->objectid, - chunk_rec->type, - chunk_rec->offset, - chunk_rec->stripes[i].devid, - chunk_rec->stripes[i].offset); + chunk_rec->stripes[i].offset); ret = -1; } } @@ -5004,9 +5052,10 @@ static int check_chunk_refs(struct chunk_record *chunk_rec, } /* check btrfs_chunk -> btrfs_dev_extent / btrfs_block_group_item */ -static int check_chunks(struct cache_tree *chunk_cache, - struct block_group_tree *block_group_cache, - struct device_extent_tree *dev_extent_cache) +int check_chunks(struct cache_tree *chunk_cache, + struct block_group_tree *block_group_cache, + struct device_extent_tree *dev_extent_cache, + struct list_head *good, struct list_head *bad, int silent) { struct cache_extent *chunk_item; struct chunk_record *chunk_rec; @@ -5020,26 +5069,38 @@ static int check_chunks(struct cache_tree *chunk_cache, chunk_rec = container_of(chunk_item, struct chunk_record, cache); err = check_chunk_refs(chunk_rec, block_group_cache, - dev_extent_cache); - if (err) + dev_extent_cache, silent); + if (err) { ret = err; + if (bad) + list_add_tail(&chunk_rec->list, bad); + } else { + if (good) + list_add_tail(&chunk_rec->list, good); + } chunk_item = next_cache_extent(chunk_item); } list_for_each_entry(bg_rec, &block_group_cache->block_groups, list) { - fprintf(stderr, - "Block group[%llu, %llu] (flags = %llu) didn't find the relative chunk.\n", - bg_rec->objectid, bg_rec->offset, bg_rec->flags); + if (!silent) + 
fprintf(stderr, + "Block group[%llu, %llu] (flags = %llu) didn't find the relative chunk.\n", + bg_rec->objectid, + bg_rec->offset, + bg_rec->flags); if (!ret) ret = 1; } list_for_each_entry(dext_rec, &dev_extent_cache->no_chunk_orphans, chunk_list) { - fprintf(stderr, - "Device extent[%llu, %llu, %llu] didn't find the relative chunk.\n", - dext_rec->objectid, dext_rec->offset, dext_rec->length); + if (!silent) + fprintf(stderr, + "Device extent[%llu, %llu, %llu] didn't find the relative chunk.\n", + dext_rec->objectid, + dext_rec->offset, + dext_rec->length); if (!ret) ret = 1; } @@ -5237,7 +5298,7 @@ again: } err = check_chunks(&chunk_cache, &block_group_cache, - &dev_extent_cache); + &dev_extent_cache, NULL, NULL, 0); if (err && !ret) ret = err; diff --git a/cmds-chunk.c b/cmds-chunk.c new file mode 100644 index 0000000..35577ed --- /dev/null +++ b/cmds-chunk.c @@ -0,0 +1,1399 @@ +/* + * Copyright (C) 2013 Fujitsu. All rights reserved. + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public + * License v2 as published by the Free Software Foundation. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * General Public License for more details. + * + * You should have received a copy of the GNU General Public + * License along with this program; if not, write to the + * Free Software Foundation, Inc., 59 Temple Place - Suite 330, + * Boston, MA 021110-1307, USA. 
+ */ +#define _XOPEN_SOURCE 500 +#define _GNU_SOURCE + +#include <stdio.h> +#include <stdio_ext.h> +#include <stdlib.h> +#include <sys/types.h> +#include <sys/stat.h> +#include <fcntl.h> +#include <unistd.h> +#include <uuid/uuid.h> + +#include "kerncompat.h" +#include "list.h" +#include "radix-tree.h" +#include "ctree.h" +#include "extent-cache.h" +#include "disk-io.h" +#include "volumes.h" +#include "transaction.h" +#include "crc32c.h" +#include "utils.h" +#include "version.h" +#include "btrfsck.h" +#include "commands.h" + +#define BTRFS_CHUNK_TREE_REBUILD_ABORTED -7500 + +struct recover_control { + int verbose; + int yes; + + u16 csum_size; + u32 sectorsize; + u32 leafsize; + u64 generation; + u64 chunk_root_generation; + + struct btrfs_fs_devices *fs_devices; + + struct cache_tree chunk; + struct block_group_tree bg; + struct device_extent_tree devext; + + struct list_head good_chunks; + struct list_head bad_chunks; +}; + +static struct btrfs_chunk *create_chunk_item(struct chunk_record *record) +{ + struct btrfs_chunk *ret; + struct btrfs_stripe *chunk_stripe; + int i; + + if (!record || record->num_stripes == 0) + return NULL; + ret = malloc(btrfs_chunk_item_size(record->num_stripes)); + if (!ret) + return NULL; + btrfs_set_stack_chunk_length(ret, record->length); + btrfs_set_stack_chunk_owner(ret, record->owner); + btrfs_set_stack_chunk_stripe_len(ret, record->stripe_len); + btrfs_set_stack_chunk_type(ret, record->type_flags); + btrfs_set_stack_chunk_io_align(ret, record->io_align); + btrfs_set_stack_chunk_io_width(ret, record->io_width); + btrfs_set_stack_chunk_sector_size(ret, record->sector_size); + btrfs_set_stack_chunk_num_stripes(ret, record->num_stripes); + btrfs_set_stack_chunk_sub_stripes(ret, record->sub_stripes); + for (i = 0, chunk_stripe = &ret->stripe; i < record->num_stripes; + i++, chunk_stripe++) { + btrfs_set_stack_stripe_devid(chunk_stripe, + record->stripes[i].devid); + btrfs_set_stack_stripe_offset(chunk_stripe, + 
record->stripes[i].offset); + memcpy(chunk_stripe->dev_uuid, record->stripes[i].dev_uuid, + BTRFS_UUID_SIZE); + } + return ret; +} + +void init_recover_control(struct recover_control *rc, int verbose, int yes) +{ + memset(rc, 0, sizeof(struct recover_control)); + cache_tree_init(&rc->chunk); + block_group_tree_init(&rc->bg); + device_extent_tree_init(&rc->devext); + + INIT_LIST_HEAD(&rc->good_chunks); + INIT_LIST_HEAD(&rc->bad_chunks); + + rc->verbose = verbose; + rc->yes = yes; +} + +void free_recover_control(struct recover_control *rc) +{ + free_block_group_tree(&rc->bg); + free_chunk_cache_tree(&rc->chunk); + free_device_extent_tree(&rc->devext); +} + +static int process_block_group_item(struct block_group_tree *bg_cache, + struct extent_buffer *leaf, + struct btrfs_key *key, int slot) +{ + struct block_group_record *rec; + struct block_group_record *exist; + struct cache_extent *cache; + int ret = 0; + + rec = btrfs_new_block_group_record(leaf, key, slot); + if (!rec->cache.size) + goto free_out; +again: + cache = lookup_cache_extent(&bg_cache->tree, + rec->cache.start, + rec->cache.size); + if (cache) { + exist = container_of(cache, struct block_group_record, cache); + + /*check the generation and replace if needed*/ + if (exist->generation > rec->generation) + goto free_out; + if (exist->generation == rec->generation) { + int offset = offsetof(struct block_group_record, + generation); + /* + * According to the current kernel code, the following + * case is impossble, or there is something wrong in + * the kernel code. + */ + if (memcmp(((void *)exist) + offset, + ((void *)rec) + offset, + sizeof(*rec) - offset)) + ret = -EEXIST; + goto free_out; + } + remove_cache_extent(&bg_cache->tree, cache); + list_del_init(&exist->list); + free(exist); + /* + * We must do seach again to avoid the following cache. 
+ * /--old bg 1--//--old bg 2--/ + * /--new bg--/ + */ + goto again; + } + + ret = insert_block_group_record(bg_cache, rec); + BUG_ON(ret); +out: + return ret; +free_out: + free(rec); + goto out; +} + +static int process_chunk_item(struct cache_tree *chunk_cache, + struct extent_buffer *leaf, struct btrfs_key *key, + int slot) +{ + struct chunk_record *rec; + struct chunk_record *exist; + struct cache_extent *cache; + int ret = 0; + + rec = btrfs_new_chunk_record(leaf, key, slot); + if (!rec->cache.size) + goto free_out; +again: + cache = lookup_cache_extent(chunk_cache, rec->offset, rec->length); + if (cache) { + exist = container_of(cache, struct chunk_record, cache); + + if (exist->generation > rec->generation) + goto free_out; + if (exist->generation == rec->generation) { + int num_stripes = rec->num_stripes; + int rec_size = btrfs_chunk_record_size(num_stripes); + int offset = offsetof(struct chunk_record, generation); + + if (exist->num_stripes != rec->num_stripes || + memcmp(((void *)exist) + offset, + ((void *)rec) + offset, + rec_size - offset)) + ret = -EEXIST; + goto free_out; + } + remove_cache_extent(chunk_cache, cache); + free(exist); + goto again; + } + ret = insert_cache_extent(chunk_cache, &rec->cache); + BUG_ON(ret); +out: + return ret; +free_out: + free(rec); + goto out; +} + +static int process_device_extent_item(struct device_extent_tree *devext_cache, + struct extent_buffer *leaf, + struct btrfs_key *key, int slot) +{ + struct device_extent_record *rec; + struct device_extent_record *exist; + struct cache_extent *cache; + int ret = 0; + + rec = btrfs_new_device_extent_record(leaf, key, slot); + if (!rec->cache.size) + goto free_out; +again: + cache = lookup_cache_extent2(&devext_cache->tree, + rec->cache.objectid, + rec->cache.start, + rec->cache.size); + if (cache) { + exist = container_of(cache, struct device_extent_record, cache); + if (exist->generation > rec->generation) + goto free_out; + if (exist->generation == rec->generation) { + int 
offset = offsetof(struct device_extent_record, + generation); + if (memcmp(((void *)exist) + offset, + ((void *)rec) + offset, + sizeof(*rec) - offset)) + ret = -EEXIST; + goto free_out; + } + remove_cache_extent(&devext_cache->tree, cache); + list_del_init(&exist->chunk_list); + list_del_init(&exist->device_list); + free(exist); + goto again; + } + + ret = insert_device_extent_record(devext_cache, rec); + BUG_ON(ret); +out: + return ret; +free_out: + free(rec); + goto out; +} + +static void print_block_group_info(struct block_group_record *rec, char *prefix) +{ + if (prefix) + printf("%s", prefix); + printf("Block Group: start = %llu, len = %llu, flag = %llx\n", + rec->objectid, rec->offset, rec->flags); +} + +static void print_block_group_tree(struct block_group_tree *tree) +{ + struct cache_extent *cache; + struct block_group_record *rec; + + printf("All Block Groups:\n"); + for (cache = first_cache_extent(&tree->tree); cache; + cache = next_cache_extent(cache)) { + rec = container_of(cache, struct block_group_record, cache); + print_block_group_info(rec, "\t"); + } + printf("\n"); +} + +static void print_stripe_info(struct stripe *data, char *prefix1, char *prefix2, + int index) +{ + if (prefix1) + printf("%s", prefix1); + if (prefix2) + printf("%s", prefix2); + printf("[%2d] Stripe: devid = %llu, offset = %llu\n", + index, data->devid, data->offset); +} + +static void print_chunk_self_info(struct chunk_record *rec, char *prefix) +{ + int i; + + if (prefix) + printf("%s", prefix); + printf("Chunk: start = %llu, len = %llu, type = %llx, num_stripes = %u\n", + rec->offset, rec->length, rec->type_flags, rec->num_stripes); + if (prefix) + printf("%s", prefix); + printf(" Stripes list:\n"); + for (i = 0; i < rec->num_stripes; i++) + print_stripe_info(&rec->stripes[i], prefix, " ", i); +} + +static void print_chunk_tree(struct cache_tree *tree) +{ + struct cache_extent *n; + struct chunk_record *entry; + + printf("All Chunks:\n"); + for (n = first_cache_extent(tree); 
n; + n = next_cache_extent(n)) { + entry = container_of(n, struct chunk_record, cache); + print_chunk_self_info(entry, "\t"); + } + printf("\n"); +} + +static void print_device_extent_info(struct device_extent_record *rec, + char *prefix) +{ + if (prefix) + printf("%s", prefix); + printf("Device extent: devid = %llu, start = %llu, len = %llu, chunk offset = %llu\n", + rec->objectid, rec->offset, rec->length, rec->chunk_offset); +} + +static void print_device_extent_tree(struct device_extent_tree *tree) +{ + struct cache_extent *n; + struct device_extent_record *entry; + + printf("All Device Extents:\n"); + for (n = first_cache_extent(&tree->tree); n; + n = next_cache_extent(n)) { + entry = container_of(n, struct device_extent_record, cache); + print_device_extent_info(entry, "\t"); + } + printf("\n"); +} + +static void print_device_info(struct btrfs_device *device, char *prefix) +{ + if (prefix) + printf("%s", prefix); + printf("Device: id = %llu, name = %s\n", + device->devid, device->name); +} + +static void print_all_devices(struct list_head *devices) +{ + struct btrfs_device *dev; + + printf("All Devices:\n"); + list_for_each_entry(dev, devices, dev_list) + print_device_info(dev, "\t"); + printf("\n"); +} + +static void print_scan_result(struct recover_control *rc) +{ + if (!rc->verbose) + return; + + printf("DEVICE SCAN RESULT:\n"); + printf("Filesystem Information:\n"); + printf("\tsectorsize: %d\n", rc->sectorsize); + printf("\tleafsize: %d\n", rc->leafsize); + printf("\ttree root generation: %llu\n", rc->generation); + printf("\tchunk root generation: %llu\n", rc->chunk_root_generation); + printf("\n"); + + print_all_devices(&rc->fs_devices->devices); + print_block_group_tree(&rc->bg); + print_chunk_tree(&rc->chunk); + print_device_extent_tree(&rc->devext); +} + +static void print_chunk_info(struct chunk_record *chunk, char *prefix) +{ + struct device_extent_record *devext; + int i; + + print_chunk_self_info(chunk, prefix); + if (prefix) + printf("%s", 
prefix); + if (chunk->bg_rec) + print_block_group_info(chunk->bg_rec, " "); + else + printf(" No block group.\n"); + if (prefix) + printf("%s", prefix); + if (list_empty(&chunk->dextents)) { + printf(" No device extent.\n"); + } else { + printf(" Device extent list:\n"); + i = 0; + list_for_each_entry(devext, &chunk->dextents, chunk_list) { + if (prefix) + printf("%s", prefix); + printf("%s[%2d]", " ", i); + print_device_extent_info(devext, NULL); + i++; + } + } +} + +static void print_check_result(struct recover_control *rc) +{ + struct chunk_record *chunk; + struct block_group_record *bg; + struct device_extent_record *devext; + int total = 0; + int good = 0; + int bad = 0; + + if (!rc->verbose) + return; + + printf("CHECK RESULT:\n"); + printf("Healthy Chunks:\n"); + list_for_each_entry(chunk, &rc->good_chunks, list) { + print_chunk_info(chunk, " "); + good++; + total++; + } + printf("Bad Chunks:\n"); + list_for_each_entry(chunk, &rc->bad_chunks, list) { + print_chunk_info(chunk, " "); + bad++; + total++; + } + printf("\n"); + printf("Total Chunks:\t%d\n", total); + printf(" Heathy:\t%d\n", good); + printf(" Bad:\t%d\n", bad); + + printf("\n"); + printf("Orphan Block Groups:\n"); + list_for_each_entry(bg, &rc->bg.block_groups, list) + print_block_group_info(bg, " "); + + printf("\n"); + printf("Orphan Device Extents:\n"); + list_for_each_entry(devext, &rc->devext.no_chunk_orphans, chunk_list) + print_device_extent_info(devext, " "); +} + +static int check_chunk_by_metadata(struct recover_control *rc, + struct btrfs_root *root, + struct chunk_record *chunk, int bg_only) +{ + int ret; + int i; + int slot; + struct btrfs_path path; + struct btrfs_key key; + struct btrfs_root *dev_root; + struct stripe *stripe; + struct btrfs_dev_extent *dev_extent; + struct btrfs_block_group_item *bg_ptr; + struct extent_buffer *l; + + btrfs_init_path(&path); + + if (bg_only) + goto bg_check; + + dev_root = root->fs_info->dev_root; + for (i = 0; i < chunk->num_stripes; i++) { + 
stripe = &chunk->stripes[i]; + + key.objectid = stripe->devid; + key.offset = stripe->offset; + key.type = BTRFS_DEV_EXTENT_KEY; + + ret = btrfs_search_slot(NULL, dev_root, &key, &path, 0, 0); + if (ret < 0) { + fprintf(stderr, "Search device extent failed(%d)\n", + ret); + btrfs_release_path(root, &path); + return ret; + } else if (ret > 0) { + if (rc->verbose) + fprintf(stderr, + "No device extent[%llu, %llu]\n", + stripe->devid, stripe->offset); + btrfs_release_path(root, &path); + return -ENOENT; + } + l = path.nodes[0]; + slot = path.slots[0]; + dev_extent = btrfs_item_ptr(l, slot, struct btrfs_dev_extent); + if (chunk->offset != + btrfs_dev_extent_chunk_offset(l, dev_extent)) { + if (rc->verbose) + fprintf(stderr, + "Device tree unmatch with chunks dev_extent[%llu, %llu], chunk[%llu, %llu]\n", + btrfs_dev_extent_chunk_offset(l, + dev_extent), + btrfs_dev_extent_length(l, dev_extent), + chunk->offset, chunk->length); + btrfs_release_path(root, &path); + return -ENOENT; + } + btrfs_release_path(root, &path); + } + +bg_check: + key.objectid = chunk->offset; + key.type = BTRFS_BLOCK_GROUP_ITEM_KEY; + key.offset = chunk->length; + + ret = btrfs_search_slot(NULL, root->fs_info->extent_root, &key, &path, + 0, 0); + if (ret < 0) { + fprintf(stderr, "Search block group failed(%d)\n", ret); + btrfs_release_path(root, &path); + return ret; + } else if (ret > 0) { + if (rc->verbose) + fprintf(stderr, "No block group[%llu, %llu]\n", + key.objectid, key.offset); + btrfs_release_path(root, &path); + return -ENOENT; + } + + l = path.nodes[0]; + slot = path.slots[0]; + bg_ptr = btrfs_item_ptr(l, slot, struct btrfs_block_group_item); + if (chunk->type_flags != btrfs_disk_block_group_flags(l, bg_ptr)) { + if (rc->verbose) + fprintf(stderr, + "Chunk[%llu, %llu]'s type(%llu) is differemt with Block Group's type(%llu)\n", + chunk->offset, chunk->length, chunk->type_flags, + btrfs_disk_block_group_flags(l, bg_ptr)); + btrfs_release_path(root, &path); + return -ENOENT; + } + 
btrfs_release_path(root, &path); + return 0; +} + +static int check_all_chunks_by_metadata(struct recover_control *rc, + struct btrfs_root *root) +{ + struct chunk_record *chunk; + LIST_HEAD(orphan_chunks); + int ret = 0; + int err; + + list_for_each_entry(chunk, &rc->good_chunks, list) { + err = check_chunk_by_metadata(rc, root, chunk, 0); + if (err) { + if (err == -ENOENT) + list_move_tail(&chunk->list, &orphan_chunks); + else if (err && !ret) + ret = err; + } + } + + list_for_each_entry(chunk, &rc->bad_chunks, list) { + err = check_chunk_by_metadata(rc, root, chunk, 1); + if (err != -ENOENT && !ret) + ret = err ? err : -EINVAL; + } + list_splice(&orphan_chunks, &rc->bad_chunks); + return ret; +} + +static int extract_metadata_record(struct recover_control *rc, + struct extent_buffer *leaf) +{ + struct btrfs_key key; + int ret = 0; + int i; + u32 nritems; + + nritems = btrfs_header_nritems(leaf); + for (i = 0; i < nritems; i++) { + btrfs_item_key_to_cpu(leaf, &key, i); + switch (key.type) { + case BTRFS_BLOCK_GROUP_ITEM_KEY: + ret = process_block_group_item(&rc->bg, leaf, &key, i); + break; + case BTRFS_CHUNK_ITEM_KEY: + ret = process_chunk_item(&rc->chunk, leaf, &key, i); + break; + case BTRFS_DEV_EXTENT_KEY: + ret = process_device_extent_item(&rc->devext, leaf, + &key, i); + break; + } + if (ret) + break; + } + return ret; +} + +static inline int is_super_block_address(u64 offset) +{ + int i; + + for (i = 0; i < BTRFS_SUPER_MIRROR_MAX; i++) { + if (offset == btrfs_sb_offset(i)) + return 1; + } + return 0; +} + +static int scan_one_device(struct recover_control *rc, int fd) +{ + struct extent_buffer *buf; + u64 bytenr; + int ret = 0; + + buf = malloc(sizeof(*buf) + rc->leafsize); + if (!buf) + return -ENOMEM; + buf->len = rc->leafsize; + + bytenr = 0; + while (1) { + if (is_super_block_address(bytenr)) + bytenr += rc->sectorsize; + + if (pread64(fd, buf->data, rc->leafsize, bytenr) < + rc->leafsize) + break; + + if (memcmp_extent_buffer(buf, 
rc->fs_devices->fsid, + (unsigned long)btrfs_header_fsid(buf), + BTRFS_FSID_SIZE)) { + bytenr += rc->sectorsize; + continue; + } + + if (verify_tree_block_csum_silent(buf, rc->csum_size)) { + bytenr += rc->sectorsize; + continue; + } + + if (btrfs_header_level(buf) != 0) + goto next_node; + + switch (btrfs_header_owner(buf)) { + case BTRFS_EXTENT_TREE_OBJECTID: + case BTRFS_DEV_TREE_OBJECTID: + /* different tree use different generation */ + if (btrfs_header_generation(buf) > rc->generation) + break; + ret = extract_metadata_record(rc, buf); + if (ret) + goto out; + break; + case BTRFS_CHUNK_TREE_OBJECTID: + if (btrfs_header_generation(buf) > + rc->chunk_root_generation) + break; + ret = extract_metadata_record(rc, buf); + if (ret) + goto out; + break; + } +next_node: + bytenr += rc->leafsize; + } +out: + free(buf); + return ret; +} + +static int scan_devices(struct recover_control *rc) +{ + int ret = 0; + int fd; + struct btrfs_device *dev; + + list_for_each_entry(dev, &rc->fs_devices->devices, dev_list) { + fd = open(dev->name, O_RDONLY); + if (fd < 0) { + fprintf(stderr, "Failed to open device %s\n", + dev->name); + return -1; + } + ret = scan_one_device(rc, fd); + close(fd); + if (ret) + return ret; + } + return ret; +} + +static int build_device_map_by_chunk_record(struct btrfs_root *root, + struct chunk_record *chunk) +{ + int ret = 0; + int i; + u64 devid; + u8 uuid[BTRFS_UUID_SIZE]; + u16 num_stripes; + struct btrfs_mapping_tree *map_tree; + struct map_lookup *map; + struct stripe *stripe; + + map_tree = &root->fs_info->mapping_tree; + num_stripes = chunk->num_stripes; + map = malloc(btrfs_map_lookup_size(num_stripes)); + if (!map) + return -ENOMEM; + map->ce.start = chunk->offset; + map->ce.size = chunk->length; + map->num_stripes = num_stripes; + map->io_width = chunk->io_width; + map->io_align = chunk->io_align; + map->sector_size = chunk->sector_size; + map->stripe_len = chunk->stripe_len; + map->type = chunk->type_flags; + map->sub_stripes = 
chunk->sub_stripes; + + for (i = 0, stripe = chunk->stripes; i < num_stripes; i++, stripe++) { + devid = stripe->devid; + memcpy(uuid, stripe->dev_uuid, BTRFS_UUID_SIZE); + map->stripes[i].physical = stripe->offset; + map->stripes[i].dev = btrfs_find_device(root, devid, + uuid, NULL); + if (!map->stripes[i].dev) { + kfree(map); + return -EIO; + } + } + + ret = insert_cache_extent(&map_tree->cache_tree, &map->ce); + return ret; +} + +static int build_device_maps_by_chunk_records(struct recover_control *rc, + struct btrfs_root *root) +{ + int ret = 0; + struct chunk_record *chunk; + + list_for_each_entry(chunk, &rc->good_chunks, list) { + ret = build_device_map_by_chunk_record(root, chunk); + if (ret) + return ret; + } + return ret; +} + +static int block_group_remove_all_extent_items(struct btrfs_trans_handle *trans, + struct btrfs_root *root, + struct block_group_record *bg) +{ + struct btrfs_fs_info *fs_info = root->fs_info; + struct btrfs_key key; + struct btrfs_path path; + struct extent_buffer *leaf; + u64 start = bg->objectid; + u64 end = bg->objectid + bg->offset; + u64 old_val; + int nitems; + int ret; + int i; + int del_s, del_nr; + + btrfs_init_path(&path); + root = root->fs_info->extent_root; + + key.objectid = start; + key.offset = 0; + key.type = BTRFS_EXTENT_ITEM_KEY; +again: + ret = btrfs_search_slot(trans, root, &key, &path, -1, 1); + if (ret < 0) + goto err; + else if (ret > 0) + ret = 0; + + leaf = path.nodes[0]; + nitems = btrfs_header_nritems(leaf); + if (!nitems) { + /* The tree is empty. 
*/ + ret = 0; + goto err; + } + + if (path.slots[0] >= nitems) { + ret = btrfs_next_leaf(root, &path); + if (ret < 0) + goto err; + if (ret > 0) { + ret = 0; + goto err; + } + leaf = path.nodes[0]; + btrfs_item_key_to_cpu(leaf, &key, 0); + if (key.objectid >= end) + goto err; + btrfs_release_path(root, &path); + goto again; + } + + del_nr = 0; + del_s = -1; + for (i = path.slots[0]; i < nitems; i++) { + btrfs_item_key_to_cpu(leaf, &key, i); + if (key.objectid >= end) + break; + + if (key.type == BTRFS_BLOCK_GROUP_ITEM_KEY) { + if (del_nr == 0) + continue; + else + break; + } + + if (del_s == -1) + del_s = i; + del_nr++; + if (key.type == BTRFS_EXTENT_ITEM_KEY || + key.type == BTRFS_METADATA_ITEM_KEY) { + old_val = btrfs_super_bytes_used(fs_info->super_copy); + if (key.type == BTRFS_METADATA_ITEM_KEY) + old_val += root->leafsize; + else + old_val += key.offset; + btrfs_set_super_bytes_used(fs_info->super_copy, + old_val); + } + } + + if (del_nr) { + ret = btrfs_del_items(trans, root, &path, del_s, del_nr); + if (ret) + goto err; + } + + if (key.objectid < end) { + if (key.type == BTRFS_BLOCK_GROUP_ITEM_KEY) { + key.objectid += root->sectorsize; + key.type = BTRFS_EXTENT_ITEM_KEY; + key.offset = 0; + } + btrfs_release_path(root, &path); + goto again; + } +err: + btrfs_release_path(root, &path); + return ret; +} + +static int block_group_free_all_extent(struct btrfs_trans_handle *trans, + struct btrfs_root *root, + struct block_group_record *bg) +{ + struct btrfs_block_group_cache *cache; + struct btrfs_fs_info *info; + u64 start; + u64 end; + + info = root->fs_info; + cache = btrfs_lookup_block_group(info, bg->objectid); + if (!cache) + return -ENOENT; + + start = cache->key.objectid; + end = start + cache->key.offset - 1; + + set_extent_bits(&info->block_group_cache, start, end, + BLOCK_GROUP_DIRTY, GFP_NOFS); + set_extent_dirty(&info->free_space_cache, start, end, GFP_NOFS); + + btrfs_set_block_group_used(&cache->item, 0); + + return 0; +} + +static int 
remove_chunk_extent_item(struct btrfs_trans_handle *trans, + struct recover_control *rc, + struct btrfs_root *root) +{ + struct chunk_record *chunk; + int ret = 0; + + list_for_each_entry(chunk, &rc->good_chunks, list) { + if (!(chunk->type_flags & BTRFS_BLOCK_GROUP_SYSTEM)) + continue; + ret = block_group_remove_all_extent_items(trans, root, + chunk->bg_rec); + if (ret) + return ret; + + ret = block_group_free_all_extent(trans, root, chunk->bg_rec); + if (ret) + return ret; + } + return ret; +} + +static int __rebuild_chunk_root(struct btrfs_trans_handle *trans, + struct recover_control *rc, + struct btrfs_root *root) +{ + u64 min_devid = -1; + struct btrfs_device *dev; + struct extent_buffer *cow; + struct btrfs_disk_key disk_key; + int ret = 0; + + list_for_each_entry(dev, &rc->fs_devices->devices, dev_list) { + if (min_devid > dev->devid) + min_devid = dev->devid; + } + disk_key.objectid = BTRFS_DEV_ITEMS_OBJECTID; + disk_key.type = BTRFS_DEV_ITEM_KEY; + disk_key.offset = min_devid; + + cow = btrfs_alloc_free_block(trans, root, root->sectorsize, + BTRFS_CHUNK_TREE_OBJECTID, + &disk_key, 0, 0, 0); + btrfs_set_header_bytenr(cow, cow->start); + btrfs_set_header_generation(cow, trans->transid); + btrfs_set_header_nritems(cow, 0); + btrfs_set_header_level(cow, 0); + btrfs_set_header_backref_rev(cow, BTRFS_MIXED_BACKREF_REV); + btrfs_set_header_owner(cow, BTRFS_CHUNK_TREE_OBJECTID); + write_extent_buffer(cow, root->fs_info->fsid, + (unsigned long)btrfs_header_fsid(cow), + BTRFS_FSID_SIZE); + + write_extent_buffer(cow, root->fs_info->chunk_tree_uuid, + (unsigned long)btrfs_header_chunk_tree_uuid(cow), + BTRFS_UUID_SIZE); + + root->node = cow; + btrfs_mark_buffer_dirty(cow); + + return ret; +} + +static int __rebuild_device_items(struct btrfs_trans_handle *trans, + struct recover_control *rc, + struct btrfs_root *root) +{ + struct btrfs_device *dev; + struct btrfs_key key; + struct btrfs_dev_item *dev_item; + int ret = 0; + + dev_item = malloc(sizeof(struct 
btrfs_dev_item)); + if (!dev_item) + return -ENOMEM; + + list_for_each_entry(dev, &rc->fs_devices->devices, dev_list) { + key.objectid = BTRFS_DEV_ITEMS_OBJECTID; + key.type = BTRFS_DEV_ITEM_KEY; + key.offset = dev->devid; + + btrfs_set_stack_device_generation(dev_item, 0); + btrfs_set_stack_device_type(dev_item, dev->type); + btrfs_set_stack_device_id(dev_item, dev->devid); + btrfs_set_stack_device_total_bytes(dev_item, dev->total_bytes); + btrfs_set_stack_device_bytes_used(dev_item, dev->bytes_used); + btrfs_set_stack_device_io_align(dev_item, dev->io_align); + btrfs_set_stack_device_io_width(dev_item, dev->io_width); + btrfs_set_stack_device_sector_size(dev_item, dev->sector_size); + memcpy(dev_item->uuid, dev->uuid, BTRFS_UUID_SIZE); + memcpy(dev_item->fsid, dev->fs_devices->fsid, BTRFS_UUID_SIZE); + + ret = btrfs_insert_item(trans, root, &key, + dev_item, sizeof(*dev_item)); + } + + free(dev_item); + return ret; +} + +static int __rebuild_chunk_items(struct btrfs_trans_handle *trans, + struct recover_control *rc, + struct btrfs_root *root) +{ + struct btrfs_key key; + struct btrfs_chunk *chunk = NULL; + struct btrfs_root *chunk_root; + struct chunk_record *chunk_rec; + int ret; + + chunk_root = root->fs_info->chunk_root; + + list_for_each_entry(chunk_rec, &rc->good_chunks, list) { + chunk = create_chunk_item(chunk_rec); + if (!chunk) + return -ENOMEM; + + key.objectid = BTRFS_FIRST_CHUNK_TREE_OBJECTID; + key.type = BTRFS_CHUNK_ITEM_KEY; + key.offset = chunk_rec->offset; + + ret = btrfs_insert_item(trans, chunk_root, &key, chunk, + btrfs_chunk_item_size(chunk->num_stripes)); + free(chunk); + if (ret) + return ret; + } + return 0; +} + +static int rebuild_chunk_tree(struct btrfs_trans_handle *trans, + struct recover_control *rc, + struct btrfs_root *root) +{ + int ret = 0; + + root = root->fs_info->chunk_root; + + ret = __rebuild_chunk_root(trans, rc, root); + if (ret) + return ret; + + ret = __rebuild_device_items(trans, rc, root); + if (ret) + return ret; + + 
ret = __rebuild_chunk_items(trans, rc, root); + + return ret; +} + +static int rebuild_sys_array(struct recover_control *rc, + struct btrfs_root *root) +{ + struct btrfs_chunk *chunk; + struct btrfs_key key; + struct chunk_record *chunk_rec; + int ret = 0; + u16 num_stripes; + + btrfs_set_super_sys_array_size(root->fs_info->super_copy, 0); + + list_for_each_entry(chunk_rec, &rc->good_chunks, list) { + if (!(chunk_rec->type_flags & BTRFS_BLOCK_GROUP_SYSTEM)) + continue; + + num_stripes = chunk_rec->num_stripes; + chunk = create_chunk_item(chunk_rec); + if (!chunk) { + ret = -ENOMEM; + break; + } + + key.objectid = BTRFS_FIRST_CHUNK_TREE_OBJECTID; + key.type = BTRFS_CHUNK_ITEM_KEY; + key.offset = chunk_rec->offset; + + ret = btrfs_add_system_chunk(NULL, root, &key, chunk, + btrfs_chunk_item_size(num_stripes)); + free(chunk); + if (ret) + break; + } + return ret; + +} + +static struct btrfs_root * +open_ctree_with_broken_chunk(struct recover_control *rc) +{ + struct btrfs_fs_info *fs_info; + struct btrfs_super_block *disk_super; + struct extent_buffer *eb; + u32 sectorsize; + u32 nodesize; + u32 leafsize; + u32 stripesize; + int ret; + + fs_info = btrfs_new_fs_info(1, BTRFS_SUPER_INFO_OFFSET); + if (!fs_info) { + fprintf(stderr, "Failed to allocate memory for fs_info\n"); + return ERR_PTR(-ENOMEM); + } + + fs_info->fs_devices = rc->fs_devices; + ret = btrfs_open_devices(fs_info->fs_devices, O_RDWR); + if (ret) + goto out; + + disk_super = fs_info->super_copy; + ret = btrfs_read_dev_super(fs_info->fs_devices->latest_bdev, + disk_super, fs_info->super_bytenr); + if (ret) { + fprintf(stderr, "No valid btrfs found\n"); + goto out_devices; + } + + memcpy(fs_info->fsid, &disk_super->fsid, BTRFS_FSID_SIZE); + + ret = btrfs_check_fs_compatibility(disk_super, 1); + if (ret) + goto out_devices; + + nodesize = btrfs_super_nodesize(disk_super); + leafsize = btrfs_super_leafsize(disk_super); + sectorsize = btrfs_super_sectorsize(disk_super); + stripesize = 
btrfs_super_stripesize(disk_super); + + __setup_root(nodesize, leafsize, sectorsize, stripesize, + fs_info->chunk_root, fs_info, BTRFS_CHUNK_TREE_OBJECTID); + + ret = build_device_maps_by_chunk_records(rc, fs_info->chunk_root); + if (ret) + goto out_cleanup; + + ret = btrfs_setup_all_roots(fs_info, 0, 0); + if (ret) + goto out_failed; + + eb = fs_info->tree_root->node; + read_extent_buffer(eb, fs_info->chunk_tree_uuid, + (unsigned long)btrfs_header_chunk_tree_uuid(eb), + BTRFS_UUID_SIZE); + + return fs_info->fs_root; +out_failed: + btrfs_release_all_roots(fs_info); +out_cleanup: + btrfs_cleanup_all_caches(fs_info); +out_devices: + btrfs_close_devices(fs_info->fs_devices); +out: + btrfs_free_fs_info(fs_info); + return ERR_PTR(ret); +} + +static int recover_prepare(struct recover_control *rc, char *path) +{ + int ret; + int fd; + struct btrfs_super_block *sb; + struct btrfs_fs_devices *fs_devices; + + ret = 0; + fd = open(path, O_RDONLY); + if (fd < 0) { + fprintf(stderr, "open %s\n error.\n", path); + return -1; + } + + sb = malloc(sizeof(struct btrfs_super_block)); + if (!sb) { + fprintf(stderr, "allocating memory for sb failed.\n"); + ret = -ENOMEM; + goto fail_close_fd; + } + + ret = btrfs_read_dev_super(fd, sb, BTRFS_SUPER_INFO_OFFSET); + if (ret) { + fprintf(stderr, "read super block error\n"); + goto fail_free_sb; + } + + rc->sectorsize = btrfs_super_sectorsize(sb); + rc->leafsize = btrfs_super_leafsize(sb); + rc->generation = btrfs_super_generation(sb); + rc->chunk_root_generation = btrfs_super_chunk_root_generation(sb); + rc->csum_size = btrfs_super_csum_size(sb); + + /* if seed, the result of scanning below will be partial */ + if (btrfs_super_flags(sb) & BTRFS_SUPER_FLAG_SEEDING) { + fprintf(stderr, "this device is seed device\n"); + ret = -1; + goto fail_free_sb; + } + + ret = btrfs_scan_fs_devices(fd, path, &fs_devices); + if (ret) + goto fail_free_sb; + + rc->fs_devices = fs_devices; + + if (rc->verbose) + print_all_devices(&rc->fs_devices->devices); + 
+fail_free_sb: + free(sb); +fail_close_fd: + close(fd); + return ret; +} + +static int ask_user(char *question, int defval) +{ + char answer[5]; + char *defstr; + int i; + + if (defval == 1) + defstr = "[Y/n]"; + else if (defval == 0) + defstr = "[y/N]"; + else if (defval == -1) + defstr = "[y/n]"; + else + BUG_ON(1); +again: + printf("%s%s? ", question, defstr); + + i = 0; + while (i < 4 && scanf("%c", &answer[i])) { + if (answer[i] == '\n') { + answer[i] = '\0'; + break; + } else if (answer[i] == ' '){ + answer[i] = '\0'; + if (i == 0) + continue; + else + break; + } else if (answer[i] >= 'A' && answer[i] <= 'Z') { + answer[i] += 'a' - 'A'; + } + i++; + } + answer[i] = '\0'; + __fpurge(stdin); + + if (strlen(answer) == 0) { + if (defval != -1) + return defval; + else + goto again; + } + + if (!strcmp(answer, "yes") || + !strcmp(answer, "y")) + return 1; + + if (!strcmp(answer, "no") || + !strcmp(answer, "n")) + return 0; + + goto again; +} + +static int btrfs_recover_chunk_tree(char *path, int verbose, int yes) +{ + int ret = 0; + struct btrfs_root *root = NULL; + struct btrfs_trans_handle *trans; + struct recover_control rc; + + init_recover_control(&rc, verbose, yes); + + ret = recover_prepare(&rc, path); + if (ret) { + fprintf(stderr, "recover prepare error\n"); + return ret; + } + + ret = scan_devices(&rc); + if (ret) { + fprintf(stderr, "scan chunk headers error\n"); + goto fail_rc; + } + + if (cache_tree_empty(&rc.chunk) && + cache_tree_empty(&rc.bg.tree) && + cache_tree_empty(&rc.devext.tree)) { + fprintf(stderr, "no recoverable chunk\n"); + goto fail_rc; + } + + print_scan_result(&rc); + + ret = check_chunks(&rc.chunk, &rc.bg, &rc.devext, &rc.good_chunks, + &rc.bad_chunks, 1); + print_check_result(&rc); + if (ret) { + if (!list_empty(&rc.bg.block_groups) || + !list_empty(&rc.devext.no_chunk_orphans)) { + fprintf(stderr, + "There are some orphan block groups and device extents, we can't repair them now.\n"); + goto fail_rc; + } + /* + *
If the chunk is healthy, its block group item and device + * extent item should be written on the disks. So, it is very + * likely that the bad chunk is an old one that has been + * dropped from the fs. Don't deal with them now, we will + * check it after the fs is opened. + */ + } + + root = open_ctree_with_broken_chunk(&rc); + if (IS_ERR(root)) { + fprintf(stderr, "open with broken chunk error\n"); + ret = PTR_ERR(root); + goto fail_rc; + } + + ret = check_all_chunks_by_metadata(&rc, root); + if (ret) { + fprintf(stderr, "The chunks in memory can not match the metadata of the fs. Repair failed.\n"); + goto fail_close_ctree; + } + + if (!rc.yes) { + ret = ask_user("We are going to rebuild the chunk tree on disk, it might destroy the old metadata on the disk, Are you sure", + 0); + if (!ret) { + ret = BTRFS_CHUNK_TREE_REBUILD_ABORTED; + goto fail_close_ctree; + } + } + + trans = btrfs_start_transaction(root, 1); + ret = remove_chunk_extent_item(trans, &rc, root); + BUG_ON(ret); + + ret = rebuild_chunk_tree(trans, &rc, root); + BUG_ON(ret); + + ret = rebuild_sys_array(&rc, root); + BUG_ON(ret); + + btrfs_commit_transaction(trans, root); +fail_close_ctree: + close_ctree(root); +fail_rc: + free_recover_control(&rc); + return ret; +} + +const char * const cmd_chunk_recover_usage[] = { + "btrfs chunk-recover [options] <device>", + "Recover the chunk tree by scanning the devices one by one.", + "", + "-y Assume an answer of `yes' to all questions", + "-v Verbose mode", + "-h Help", + NULL +}; + +int cmd_chunk_recover(int argc, char *argv[]) +{ + int ret = 0; + char *file; + int yes = 0; + int verbose = 0; + + while (1) { + int c = getopt(argc, argv, "yvh"); + if (c < 0) + break; + switch (c) { + case 'y': + yes = 1; + break; + case 'v': + verbose = 1; + break; + case 'h': + default: + usage(cmd_chunk_recover_usage); + } + } + + argc = argc - optind; + if (argc == 0) + usage(cmd_chunk_recover_usage); + + file = argv[optind]; + + ret = check_mounted(file); + if
(ret) { + fprintf(stderr, "the device is busy\n"); + return ret; + } + + ret = btrfs_recover_chunk_tree(file, verbose, yes); + if (!ret) { + fprintf(stdout, "Recover the chunk tree successfully.\n"); + } else if (ret == BTRFS_CHUNK_TREE_REBUILD_ABORTED) { + ret = 0; + fprintf(stdout, "Abort to rebuild the on-disk chunk tree.\n"); + } else { + fprintf(stdout, "Fail to recover the chunk tree.\n"); + } + return ret; +} diff --git a/commands.h b/commands.h index 15c616d..65829f4 100644 --- a/commands.h +++ b/commands.h @@ -94,6 +94,7 @@ extern const struct cmd_group replace_cmd_group; extern const char * const cmd_send_usage[]; extern const char * const cmd_receive_usage[]; extern const char * const cmd_check_usage[]; +extern const char * const cmd_chunk_recover_usage[]; extern const char * const cmd_restore_usage[]; int cmd_subvolume(int argc, char **argv); @@ -102,6 +103,7 @@ int cmd_balance(int argc, char **argv); int cmd_device(int argc, char **argv); int cmd_scrub(int argc, char **argv); int cmd_check(int argc, char **argv); +int cmd_chunk_recover(int argc, char **argv); int cmd_inspect(int argc, char **argv); int cmd_send(int argc, char **argv); int cmd_receive(int argc, char **argv); diff --git a/disk-io.c b/disk-io.c index 7140367..a41d166 100644 --- a/disk-io.c +++ b/disk-io.c @@ -70,8 +70,8 @@ void btrfs_csum_final(u32 crc, char *result) *(__le32 *)result = ~cpu_to_le32(crc); } -int csum_tree_block_size(struct extent_buffer *buf, u16 csum_size, - int verify) +static int __csum_tree_block_size(struct extent_buffer *buf, u16 csum_size, + int verify, int silent) { char *result; u32 len; @@ -87,9 +87,11 @@ int csum_tree_block_size(struct extent_buffer *buf, u16 csum_size, if (verify) { if (memcmp_extent_buffer(buf, result, 0, csum_size)) { - printk("checksum verify failed on %llu found %08X " - "wanted %08X\n", (unsigned long long)buf->start, - *((u32 *)result), *((u32*)(char *)buf->data)); + if (!silent) + printk("checksum verify failed on %llu found %08X wanted 
%08X\n", + (unsigned long long)buf->start, + *((u32 *)result), + *((u32*)(char *)buf->data)); free(result); return 1; } @@ -100,6 +102,16 @@ int csum_tree_block_size(struct extent_buffer *buf, u16 csum_size, return 0; } +int csum_tree_block_size(struct extent_buffer *buf, u16 csum_size, int verify) +{ + return __csum_tree_block_size(buf, csum_size, verify, 0); +} + +int verify_tree_block_csum_silent(struct extent_buffer *buf, u16 csum_size) +{ + return __csum_tree_block_size(buf, csum_size, 1, 1); +} + int csum_tree_block(struct btrfs_root *root, struct extent_buffer *buf, int verify) { diff --git a/disk-io.h b/disk-io.h index e845459..5fed663 100644 --- a/disk-io.h +++ b/disk-io.h @@ -92,6 +92,7 @@ int btrfs_commit_transaction(struct btrfs_trans_handle *trans, int btrfs_open_device(struct btrfs_device *dev); int csum_tree_block_size(struct extent_buffer *buf, u16 csum_sectorsize, int verify); +int verify_tree_block_csum_silent(struct extent_buffer *buf, u16 csum_size); int csum_tree_block(struct btrfs_root *root, struct extent_buffer *buf, int verify); int btrfs_read_buffer(struct extent_buffer *buf, u64 parent_transid); diff --git a/extent-tree.c b/extent-tree.c index b0cfe0a..f597e16 100644 --- a/extent-tree.c +++ b/extent-tree.c @@ -28,12 +28,6 @@ #include "volumes.h" #include "free-space-cache.h" -#define BLOCK_GROUP_DATA EXTENT_WRITEBACK -#define BLOCK_GROUP_METADATA EXTENT_UPTODATE -#define BLOCK_GROUP_SYSTEM EXTENT_NEW - -#define BLOCK_GROUP_DIRTY EXTENT_DIRTY - #define PENDING_EXTENT_INSERT 0 #define PENDING_EXTENT_DELETE 1 #define PENDING_BACKREF_UPDATE 2 diff --git a/extent_io.h b/extent_io.h index a0308a9..2f5ff02 100644 --- a/extent_io.h +++ b/extent_io.h @@ -41,6 +41,12 @@ #define EXTENT_CSUM (1 << 9) #define EXTENT_IOBITS (EXTENT_LOCKED | EXTENT_WRITEBACK) +#define BLOCK_GROUP_DATA EXTENT_WRITEBACK +#define BLOCK_GROUP_METADATA EXTENT_UPTODATE +#define BLOCK_GROUP_SYSTEM EXTENT_NEW + +#define BLOCK_GROUP_DIRTY EXTENT_DIRTY + struct btrfs_fs_info; 
struct extent_io_tree { diff --git a/volumes.c b/volumes.c index a3acee8..42cd943 100644 --- a/volumes.c +++ b/volumes.c @@ -52,9 +52,6 @@ static inline int nr_data_stripes(struct map_lookup *map) #define is_parity_stripe(x) ( ((x) == BTRFS_RAID5_P_STRIPE) || ((x) == BTRFS_RAID6_Q_STRIPE) ) -#define map_lookup_size(n) (sizeof(struct map_lookup) + \ - (sizeof(struct btrfs_bio_stripe) * (n))) - static LIST_HEAD(fs_uuids); static struct btrfs_device *__find_device(struct list_head *head, u64 devid, @@ -823,7 +820,7 @@ again: if (!chunk) return -ENOMEM; - map = kmalloc(map_lookup_size(num_stripes), GFP_NOFS); + map = kmalloc(btrfs_map_lookup_size(num_stripes), GFP_NOFS); if (!map) { kfree(chunk); return -ENOMEM; @@ -935,7 +932,7 @@ int btrfs_alloc_data_chunk(struct btrfs_trans_handle *trans, if (!chunk) return -ENOMEM; - map = kmalloc(map_lookup_size(num_stripes), GFP_NOFS); + map = kmalloc(btrfs_map_lookup_size(num_stripes), GFP_NOFS); if (!map) { kfree(chunk); return -ENOMEM; @@ -1420,7 +1417,7 @@ int btrfs_bootstrap_super_map(struct btrfs_mapping_tree *map_tree, list_for_each(cur, &fs_devices->devices) { num_stripes++; } - map = kmalloc(map_lookup_size(num_stripes), GFP_NOFS); + map = kmalloc(btrfs_map_lookup_size(num_stripes), GFP_NOFS); if (!map) return -ENOMEM; @@ -1517,7 +1514,7 @@ static int read_one_chunk(struct btrfs_root *root, struct btrfs_key *key, } num_stripes = btrfs_chunk_num_stripes(leaf, chunk); - map = kmalloc(map_lookup_size(num_stripes), GFP_NOFS); + map = kmalloc(btrfs_map_lookup_size(num_stripes), GFP_NOFS); if (!map) return -ENOMEM; diff --git a/volumes.h b/volumes.h index 911f788..91277a7 100644 --- a/volumes.h +++ b/volumes.h @@ -103,6 +103,8 @@ struct map_lookup { #define btrfs_multi_bio_size(n) (sizeof(struct btrfs_multi_bio) + \ (sizeof(struct btrfs_bio_stripe) * (n))) +#define btrfs_map_lookup_size(n) (sizeof(struct map_lookup) + \ + (sizeof(struct btrfs_bio_stripe) * (n))) /* * Restriper''s general type filter @@ -190,4 +192,6 @@ int 
btrfs_add_system_chunk(struct btrfs_trans_handle *trans, int btrfs_chunk_readonly(struct btrfs_root *root, u64 chunk_offset); struct btrfs_device *btrfs_find_device_by_devid(struct btrfs_root *root, u64 devid, int instance); +struct btrfs_device *btrfs_find_device(struct btrfs_root *root, u64 devid, + u8 *uuid, u8 *fsid); #endif -- 1.8.1.4
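The yes/no prompt in this patch (ask_user) lower-cases a short answer, stops at the first space or newline, accepts only "y", "yes", "n" or "no", and falls back to the default on an empty answer. A minimal sketch of that parsing over a line that has already been read, with every write kept inside the buffer. The name parse_yes_no and its return convention (-1 meaning "ask again") are illustrative assumptions, not part of the patch:

```c
#include <ctype.h>
#include <string.h>

/*
 * Hypothetical helper, not from the patch.
 * Returns 1 for yes, 0 for no, defval for an empty answer,
 * and -1 when the caller should ask again.
 */
static int parse_yes_no(const char *line, int defval)
{
	char answer[4];		/* room for "yes" plus the terminator */
	size_t n = 0;

	/* Skip leading blanks. */
	while (*line == ' ' || *line == '\t')
		line++;

	/* Copy at most 3 characters, lower-casing as we go. */
	while (*line && *line != '\n' && *line != ' ' &&
	       n < sizeof(answer) - 1)
		answer[n++] = (char)tolower((unsigned char)*line++);
	answer[n] = '\0';

	/* Anything longer than "yes" cannot be a valid answer. */
	if (*line && *line != '\n' && *line != ' ')
		return -1;

	if (n == 0)
		return defval;	/* defval == -1 means "no default" */

	if (!strcmp(answer, "y") || !strcmp(answer, "yes"))
		return 1;
	if (!strcmp(answer, "n") || !strcmp(answer, "no"))
		return 0;
	return -1;
}
```

A caller would read one line with fgets() and loop while the result is -1.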
Miao Xie
2013-Jul-03 13:25 UTC
[PATCH 10/12] Btrfs-progs: introduce list_{first, next}_entry/list_splice_tail{_init}
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com> --- list.h | 68 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++--------- 1 file changed, 59 insertions(+), 9 deletions(-) diff --git a/list.h b/list.h index 50f4619..db7a58c 100644 --- a/list.h +++ b/list.h @@ -223,18 +223,18 @@ static inline int list_empty_careful(const struct list_head *head) return (next == head) && (next == head->prev); } -static inline void __list_splice(struct list_head *list, - struct list_head *head) +static inline void __list_splice(const struct list_head *list, + struct list_head *prev, + struct list_head *next) { struct list_head *first = list->next; struct list_head *last = list->prev; - struct list_head *at = head->next; - first->prev = head; - head->next = first; + first->prev = prev; + prev->next = first; - last->next = at; - at->prev = last; + last->next = next; + next->prev = last; } /** @@ -245,7 +245,19 @@ static inline void __list_splice(struct list_head *list, static inline void list_splice(struct list_head *list, struct list_head *head) { if (!list_empty(list)) - __list_splice(list, head); + __list_splice(list, head, head->next); +} + +/** + * list_splice_tail - join two lists, each list being a queue + * @list: the new list to add. + * @head: the place to add it in the first list. + */ +static inline void list_splice_tail(struct list_head *list, + struct list_head *head) +{ + if (!list_empty(list)) + __list_splice(list, head->prev, head); } /** @@ -259,7 +271,24 @@ static inline void list_splice_init(struct list_head *list, struct list_head *head) { if (!list_empty(list)) { - __list_splice(list, head); + __list_splice(list, head, head->next); + INIT_LIST_HEAD(list); + } +} + +/** + * list_splice_tail_init - join two lists and reinitialise the emptied list + * @list: the new list to add. + * @head: the place to add it in the first list. + * + * Each of the lists is a queue. 
+ * The list at @list is reinitialised + */ +static inline void list_splice_tail_init(struct list_head *list, + struct list_head *head) +{ + if (!list_empty(list)) { + __list_splice(list, head->prev, head); INIT_LIST_HEAD(list); } } @@ -274,6 +303,27 @@ static inline void list_splice_init(struct list_head *list, container_of(ptr, type, member) /** + * list_first_entry - get the first element from a list + * @ptr: the list head to take the element from. + * @type: the type of the struct this is embedded in. + * @member: the name of the list_struct within the struct. + * + * Note, that list is expected to be not empty. + */ +#define list_first_entry(ptr, type, member) \ + list_entry((ptr)->next, type, member) + +/** + * list_next_entry - get the next element from a list + * @ptr: the list head to take the element from. + * @member: the name of the list_struct within the struct. + * + * Note, that next is expected to be not null. + */ +#define list_next_entry(ptr, member) \ + list_entry((ptr)->member.next, typeof(*ptr), member) + +/** * list_for_each - iterate over a list * @pos: the &struct list_head to use as a loop cursor. * @head: the head for your list. -- 1.8.1.4
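The three-argument __list_splice() in this patch lets the same pointer surgery serve both list_splice() (insert right after the head) and list_splice_tail() (insert right before the head). A small standalone sketch of the difference, assuming a plain circular doubly linked list. The names struct item, list_init, list_append, splice_between and demo_splice are illustrative and are not taken from list.h:

```c
#include <stddef.h>

struct list_head {
	struct list_head *next, *prev;
};

struct item {
	int id;
	struct list_head link;
};

static void list_init(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

static void list_append(struct list_head *n, struct list_head *head)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

/* Same pointer updates as the three-argument __list_splice(). */
static void splice_between(const struct list_head *list,
			   struct list_head *prev, struct list_head *next)
{
	struct list_head *first = list->next;
	struct list_head *last = list->prev;

	first->prev = prev;
	prev->next = first;
	last->next = next;
	next->prev = last;
}

/*
 * Build head = {1, 2} and extra = {3, 4}, splice extra into head,
 * and copy the resulting ids into out[]. Returns the element count.
 */
static int demo_splice(int at_tail, int out[4])
{
	struct list_head head, extra, *pos;
	struct item items[4] = {
		{ .id = 1 }, { .id = 2 }, { .id = 3 }, { .id = 4 }
	};
	int n = 0;

	list_init(&head);
	list_init(&extra);
	list_append(&items[0].link, &head);
	list_append(&items[1].link, &head);
	list_append(&items[2].link, &extra);
	list_append(&items[3].link, &extra);

	if (at_tail)
		splice_between(&extra, head.prev, &head);  /* like list_splice_tail() */
	else
		splice_between(&extra, &head, head.next);  /* like list_splice() */

	for (pos = head.next; pos != &head; pos = pos->next) {
		struct item *it = (struct item *)
			((char *)pos - offsetof(struct item, link));
		out[n++] = it->id;
	}
	return n;
}
```

With at_tail == 0 the order is 3, 4, 1, 2. With at_tail != 0 it is 1, 2, 3, 4, which is the queue behaviour the kernel-doc comment describes.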
Miao Xie
2013-Jul-03 13:25 UTC
[PATCH 11/12] Btrfs-progs: Add chunk rebuild function for RAID1/SINGLE/DUP
Add chunk rebuild for RAID1/SINGLE/DUP to the chunk-recover command. Before this patch, chunk-recover could only scan for old chunk items and reuse them. With this patch, chunk-recover can use the references between chunks, block groups and device extents to rebuild the whole chunk tree even when the old chunk items are not available.
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
--- btrfsck.h | 1 + cmds-check.c | 31 ++++++----- cmds-chunk.c | 175 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++- volumes.c | 11 ++-- volumes.h | 5 +- 5 files changed, 197 insertions(+), 26 deletions(-) diff --git a/btrfsck.h b/btrfsck.h index a6151d5..f73c605 100644 --- a/btrfsck.h +++ b/btrfsck.h @@ -140,6 +140,7 @@ static inline unsigned long btrfs_chunk_record_size(int num_stripes) } void free_chunk_cache_tree(struct cache_tree *chunk_cache); +u64 calc_stripe_length(u64 type, u64 length, int num_stripes); /* For block group tree */ static inline void block_group_tree_init(struct block_group_tree *tree) { diff --git a/cmds-check.c b/cmds-check.c index c3c7575..185d91f 100644 --- a/cmds-check.c +++ b/cmds-check.c @@ -4926,24 +4926,24 @@ repair_abort: return err; } -static u64 calc_stripe_length(struct chunk_record *chunk_rec) +u64 calc_stripe_length(u64 type, u64 length, int num_stripes) { u64 stripe_size; - if (chunk_rec->type_flags & BTRFS_BLOCK_GROUP_RAID0) { - stripe_size = chunk_rec->length; - stripe_size /= chunk_rec->num_stripes; - } else if (chunk_rec->type_flags & BTRFS_BLOCK_GROUP_RAID10) { - stripe_size = chunk_rec->length * 2; - stripe_size /= chunk_rec->num_stripes; - } else if (chunk_rec->type_flags & BTRFS_BLOCK_GROUP_RAID5) { - stripe_size = chunk_rec->length; - stripe_size /= (chunk_rec->num_stripes - 1); - } else if (chunk_rec->type_flags & BTRFS_BLOCK_GROUP_RAID6) { - stripe_size = chunk_rec->length; - stripe_size /= (chunk_rec->num_stripes - 2); + if (type & BTRFS_BLOCK_GROUP_RAID0) { + stripe_size = length; + stripe_size /=
num_stripes; + } else if (type & BTRFS_BLOCK_GROUP_RAID10) { + stripe_size = length * 2; + stripe_size /= num_stripes; + } else if (type & BTRFS_BLOCK_GROUP_RAID5) { + stripe_size = length; + stripe_size /= (num_stripes - 1); + } else if (type & BTRFS_BLOCK_GROUP_RAID6) { + stripe_size = length; + stripe_size /= (num_stripes - 2); } else { - stripe_size = chunk_rec->length; + stripe_size = length; } return stripe_size; } @@ -5006,7 +5006,8 @@ static int check_chunk_refs(struct chunk_record *chunk_rec, ret = -1; } - length = calc_stripe_length(chunk_rec); + length = calc_stripe_length(chunk_rec->type_flags, chunk_rec->length, + chunk_rec->num_stripes); for (i = 0; i < chunk_rec->num_stripes; ++i) { devid = chunk_rec->stripes[i].devid; offset = chunk_rec->stripes[i].offset; diff --git a/cmds-chunk.c b/cmds-chunk.c index 35577ed..7b740a3 100644 --- a/cmds-chunk.c +++ b/cmds-chunk.c @@ -42,6 +42,7 @@ #include "commands.h" #define BTRFS_CHUNK_TREE_REBUILD_ABORTED -7500 +#define BTRFS_STRIPE_LEN (64 * 1024) struct recover_control { int verbose; @@ -1251,6 +1252,174 @@ again: goto again; } +static int btrfs_get_device_extents(u64 chunk_object, + struct list_head *orphan_devexts, + struct list_head *ret_list) +{ + struct device_extent_record *devext; + struct device_extent_record *next; + int count = 0; + + list_for_each_entry_safe(devext, next, orphan_devexts, chunk_list) { + if (devext->chunk_offset == chunk_object) { + list_move_tail(&devext->chunk_list, ret_list); + count++; + } + } + return count; +} + +static int calc_num_stripes(u64 type) +{ + if (type & (BTRFS_BLOCK_GROUP_RAID0 | + BTRFS_BLOCK_GROUP_RAID10 | + BTRFS_BLOCK_GROUP_RAID5 | + BTRFS_BLOCK_GROUP_RAID6)) + return 0; + else if (type & (BTRFS_BLOCK_GROUP_RAID1 | + BTRFS_BLOCK_GROUP_DUP)) + return 2; + else + return 1; +} + +static inline int calc_sub_nstripes(u64 type) +{ + if (type & BTRFS_BLOCK_GROUP_RAID10) + return 2; + else + return 1; +} + +static int btrfs_verify_device_extents(struct 
block_group_record *bg, + struct list_head *devexts, int ndevexts) +{ + struct device_extent_record *devext; + u64 strpie_length; + int expected_num_stripes; + + expected_num_stripes = calc_num_stripes(bg->flags); + if (!expected_num_stripes && expected_num_stripes != ndevexts) + return 1; + + strpie_length = calc_stripe_length(bg->flags, bg->offset, ndevexts); + list_for_each_entry(devext, devexts, chunk_list) { + if (devext->length != strpie_length) + return 1; + } + return 0; +} + +static int btrfs_rebuild_unordered_chunk_stripes(struct recover_control *rc, + struct chunk_record *chunk) +{ + struct device_extent_record *devext; + struct btrfs_device *device; + int i; + + devext = list_first_entry(&chunk->dextents, struct device_extent_record, + chunk_list); + for (i = 0; i < chunk->num_stripes; i++) { + chunk->stripes[i].devid = devext->objectid; + chunk->stripes[i].offset = devext->offset; + device = btrfs_find_device_by_devid(rc->fs_devices, + devext->objectid, + 0); + if (!device) + return -ENOENT; + BUG_ON(btrfs_find_device_by_devid(rc->fs_devices, + devext->objectid, + 1)); + memcpy(chunk->stripes[i].dev_uuid, device->uuid, + BTRFS_UUID_SIZE); + devext = list_next_entry(devext, chunk_list); + } + return 0; +} + +static int btrfs_rebuild_chunk_stripes(struct recover_control *rc, + struct chunk_record *chunk) +{ + int ret; + + if (chunk->type_flags & (BTRFS_BLOCK_GROUP_RAID10 | + BTRFS_BLOCK_GROUP_RAID0 | + BTRFS_BLOCK_GROUP_RAID5 | + BTRFS_BLOCK_GROUP_RAID6)) + BUG_ON(1); /* Fixme: implement in the next patch */ + else + ret = btrfs_rebuild_unordered_chunk_stripes(rc, chunk); + + return ret; +} + +static int btrfs_recover_chunks(struct recover_control *rc) +{ + struct chunk_record *chunk; + struct block_group_record *bg; + struct block_group_record *next; + LIST_HEAD(new_chunks); + LIST_HEAD(devexts); + int nstripes; + int ret; + + /* create the chunk by block group */ + list_for_each_entry_safe(bg, next, &rc->bg.block_groups, list) { + nstripes = 
btrfs_get_device_extents(bg->objectid, + &rc->devext.no_chunk_orphans, + &devexts); + chunk = malloc(btrfs_chunk_record_size(nstripes)); + if (!chunk) + return -ENOMEM; + memset(chunk, 0, btrfs_chunk_record_size(nstripes)); + INIT_LIST_HEAD(&chunk->dextents); + chunk->bg_rec = bg; + chunk->cache.start = bg->objectid; + chunk->cache.size = bg->offset; + chunk->objectid = BTRFS_FIRST_CHUNK_TREE_OBJECTID; + chunk->type = BTRFS_CHUNK_ITEM_KEY; + chunk->offset = bg->objectid; + chunk->generation = bg->generation; + chunk->length = bg->offset; + chunk->owner = BTRFS_CHUNK_TREE_OBJECTID; + chunk->stripe_len = BTRFS_STRIPE_LEN; + chunk->type_flags = bg->flags; + chunk->io_width = BTRFS_STRIPE_LEN; + chunk->io_align = BTRFS_STRIPE_LEN; + chunk->sector_size = rc->sectorsize; + chunk->sub_stripes = calc_sub_nstripes(bg->flags); + + ret = insert_cache_extent(&rc->chunk, &chunk->cache); + BUG_ON(ret); + + if (!nstripes) { + list_add_tail(&chunk->list, &rc->bad_chunks); + continue; + } + + list_splice_init(&devexts, &chunk->dextents); + + ret = btrfs_verify_device_extents(bg, &devexts, nstripes); + if (ret) { + list_add_tail(&chunk->list, &rc->bad_chunks); + continue; + } + + chunk->num_stripes = nstripes; + ret = btrfs_rebuild_chunk_stripes(rc, chunk); + if (ret) + list_add_tail(&chunk->list, &rc->bad_chunks); + else + list_add_tail(&chunk->list, &rc->good_chunks); + } + /* + * Don't worry about the lost orphan device extents, they don't + * have its chunk and block group, they must be the old ones that + * we have dropped.
+ */ + return 0; +} + static int btrfs_recover_chunk_tree(char *path, int verbose, int yes) { int ret = 0; @@ -1287,9 +1456,9 @@ static int btrfs_recover_chunk_tree(char *path, int verbose, int yes) if (ret) { if (!list_empty(&rc.bg.block_groups) || !list_empty(&rc.devext.no_chunk_orphans)) { - fprintf(stderr, - "There are some orphan block groups and device extents, we can't repair them now.\n"); - goto fail_rc; + ret = btrfs_recover_chunks(&rc); + if (ret) + goto fail_rc; } /* * If the chunk is healthy, its block group item and device diff --git a/volumes.c b/volumes.c index 42cd943..ab282d3 100644 --- a/volumes.c +++ b/volumes.c @@ -1386,16 +1386,15 @@ struct btrfs_device *btrfs_find_device(struct btrfs_root *root, u64 devid, return NULL; } -struct btrfs_device *btrfs_find_device_by_devid(struct btrfs_root *root, - u64 devid, int instance) +struct btrfs_device * +btrfs_find_device_by_devid(struct btrfs_fs_devices *fs_devices, + u64 devid, int instance) { - struct list_head *head = &root->fs_info->fs_devices->devices; + struct list_head *head = &fs_devices->devices; struct btrfs_device *dev; - struct list_head *cur; int num_found = 0; - list_for_each(cur, head) { - dev = list_entry(cur, struct btrfs_device, dev_list); + list_for_each_entry(dev, head, dev_list) { if (dev->devid == devid && num_found++ == instance) return dev; } diff --git a/volumes.h b/volumes.h index 91277a7..0b894fd 100644 --- a/volumes.h +++ b/volumes.h @@ -190,8 +190,9 @@ int btrfs_add_system_chunk(struct btrfs_trans_handle *trans, struct btrfs_root *root, struct btrfs_key *key, struct btrfs_chunk *chunk, int item_size); int btrfs_chunk_readonly(struct btrfs_root *root, u64 chunk_offset); -struct btrfs_device *btrfs_find_device_by_devid(struct btrfs_root *root, - u64 devid, int instance); +struct btrfs_device * +btrfs_find_device_by_devid(struct btrfs_fs_devices *fs_devices, + u64 devid, int instance); struct btrfs_device *btrfs_find_device(struct btrfs_root *root, u64 devid, u8
*fsid); #endif -- 1.8.1.4
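The reworked calc_stripe_length() in this patch derives the per-device stripe length from the chunk's logical length, its profile, and the number of stripes. A standalone sketch of the same arithmetic with a few worked values. The BG_* constants are local stand-ins assumed to match the BTRFS_BLOCK_GROUP_* profile bits, and stripe_length is an illustrative name:

```c
#include <stdint.h>

/* Local stand-ins for the btrfs block group profile bits (assumed values). */
#define BG_RAID0	(1ULL << 3)
#define BG_RAID1	(1ULL << 4)
#define BG_DUP		(1ULL << 5)
#define BG_RAID10	(1ULL << 6)
#define BG_RAID5	(1ULL << 7)
#define BG_RAID6	(1ULL << 8)

/*
 * Mirrors the branches of calc_stripe_length() in the patch.
 * Like the original, it expects enough stripes for the profile
 * (at least 2 for RAID5, at least 3 for RAID6) so the divisor is non-zero.
 */
static uint64_t stripe_length(uint64_t type, uint64_t length, int num_stripes)
{
	if (type & BG_RAID0)	/* data striped over every device */
		return length / (uint64_t)num_stripes;
	if (type & BG_RAID10)	/* striped over mirrored pairs */
		return length * 2 / (uint64_t)num_stripes;
	if (type & BG_RAID5)	/* one stripe's worth of parity */
		return length / (uint64_t)(num_stripes - 1);
	if (type & BG_RAID6)	/* two stripes' worth of parity */
		return length / (uint64_t)(num_stripes - 2);
	return length;		/* SINGLE, DUP, RAID1: full copies */
}
```

For a 1 GiB chunk this gives 256 MiB per device extent for RAID0 over 4 stripes, 512 MiB for RAID10 over 4 stripes, 512 MiB for RAID5 over 3 stripes, 512 MiB for RAID6 over 4 stripes, and the full 1 GiB for RAID1 or DUP.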
Miao Xie
2013-Jul-03 13:25 UTC
[PATCH 12/12] Btrfs-progs: recover raid0/raid10/raid5/raid6 metadata chunk
From the bytenr of an extent buffer record we can calculate the index of the stripe that holds it, and we also know which device, and which offset on that device, the record was read from. Together these give the relationship between the device extents and the stripes in the chunk, and with that relationship we can recover raid0/raid10/raid5/raid6 metadata chunks.
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
--- cmds-chunk.c | 289 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++--- 1 file changed, 279 insertions(+), 10 deletions(-) diff --git a/cmds-chunk.c b/cmds-chunk.c index 7b740a3..03314de 100644 --- a/cmds-chunk.c +++ b/cmds-chunk.c @@ -43,6 +43,7 @@ #define BTRFS_CHUNK_TREE_REBUILD_ABORTED -7500 #define BTRFS_STRIPE_LEN (64 * 1024) +#define BTRFS_NUM_MIRRORS 2 struct recover_control { int verbose; @@ -59,11 +60,102 @@ struct recover_control { struct cache_tree chunk; struct block_group_tree bg; struct device_extent_tree devext; + struct cache_tree eb_cache; struct list_head good_chunks; struct list_head bad_chunks; + struct list_head unrepaired_chunks; }; +struct extent_record { + struct cache_extent cache; + u64 generation; + u8 csum[BTRFS_CSUM_SIZE]; + struct btrfs_device *devices[BTRFS_NUM_MIRRORS]; + u64 offsets[BTRFS_NUM_MIRRORS]; + int nmirrors; +}; + +static struct extent_record *btrfs_new_extent_record(struct extent_buffer *eb) +{ + struct extent_record *rec; + + rec = malloc(sizeof(*rec)); + if (!rec) { + fprintf(stderr, "Fail to allocate memory for extent record.\n"); + exit(1); + } + + memset(rec, 0, sizeof(*rec)); + rec->cache.start = btrfs_header_bytenr(eb); + rec->cache.size = eb->len; + rec->generation = btrfs_header_generation(eb); + read_extent_buffer(eb, rec->csum, (unsigned long)btrfs_header_csum(eb), + BTRFS_CSUM_SIZE); + return rec; +} + +static int process_extent_buffer(struct cache_tree *eb_cache, + struct extent_buffer *eb, + struct btrfs_device *device, u64 offset) +{ + struct extent_record *rec; + struct extent_record *exist; + struct
cache_extent *cache; + int ret = 0; + + rec = btrfs_new_extent_record(eb); + if (!rec->cache.size) + goto free_out; +again: + cache = lookup_cache_extent(eb_cache, + rec->cache.start, + rec->cache.size); + if (cache) { + exist = container_of(cache, struct extent_record, cache); + + if (exist->generation > rec->generation) + goto free_out; + if (exist->generation == rec->generation) { + if (exist->cache.start != rec->cache.start || + exist->cache.size != rec->cache.size || + memcmp(exist->csum, rec->csum, BTRFS_CSUM_SIZE)) { + ret = -EEXIST; + } else { + BUG_ON(exist->nmirrors >= BTRFS_NUM_MIRRORS); + exist->devices[exist->nmirrors] = device; + exist->offsets[exist->nmirrors] = offset; + exist->nmirrors++; + } + goto free_out; + } + remove_cache_extent(eb_cache, cache); + free(exist); + goto again; + } + + rec->devices[0] = device; + rec->offsets[0] = offset; + rec->nmirrors++; + ret = insert_cache_extent(eb_cache, &rec->cache); + BUG_ON(ret); +out: + return ret; +free_out: + free(rec); + goto out; +} + +static void free_extent_record(struct cache_extent *cache) +{ + struct extent_record *er; + + er = container_of(cache, struct extent_record, cache); + free(er); +} + +FREE_EXTENT_CACHE_BASED_TREE(extent_record, free_extent_record); + static struct btrfs_chunk *create_chunk_item(struct chunk_record *record) { struct btrfs_chunk *ret; @@ -100,11 +192,13 @@ void init_recover_control(struct recover_control *rc, int verbose, int yes) { memset(rc, 0, sizeof(struct recover_control)); cache_tree_init(&rc->chunk); + cache_tree_init(&rc->eb_cache); block_group_tree_init(&rc->bg); device_extent_tree_init(&rc->devext); INIT_LIST_HEAD(&rc->good_chunks); INIT_LIST_HEAD(&rc->bad_chunks); + INIT_LIST_HEAD(&rc->unrepaired_chunks); rc->verbose = verbose; rc->yes = yes; @@ -115,6 +209,7 @@ void free_recover_control(struct recover_control *rc) free_block_group_tree(&rc->bg); free_chunk_cache_tree(&rc->chunk); free_device_extent_tree(&rc->devext); + 
 	free_extent_record_tree(&rc->eb_cache);
 }

 static int process_block_group_item(struct block_group_tree *bg_cache,
@@ -554,11 +649,12 @@ static int check_all_chunks_by_metadata(struct recover_control *rc,
 					struct btrfs_root *root)
 {
 	struct chunk_record *chunk;
+	struct chunk_record *next;
 	LIST_HEAD(orphan_chunks);
 	int ret = 0;
 	int err;

-	list_for_each_entry(chunk, &rc->good_chunks, list) {
+	list_for_each_entry_safe(chunk, next, &rc->good_chunks, list) {
 		err = check_chunk_by_metadata(rc, root, chunk, 0);
 		if (err) {
 			if (err == -ENOENT)
@@ -568,6 +664,14 @@ static int check_all_chunks_by_metadata(struct recover_control *rc,
 		}
 	}

+	list_for_each_entry_safe(chunk, next, &rc->unrepaired_chunks, list) {
+		err = check_chunk_by_metadata(rc, root, chunk, 1);
+		if (err == -ENOENT)
+			list_move_tail(&chunk->list, &orphan_chunks);
+		else if (err && !ret)
+			ret = err;
+	}
+
 	list_for_each_entry(chunk, &rc->bad_chunks, list) {
 		err = check_chunk_by_metadata(rc, root, chunk, 1);
 		if (err != -ENOENT && !ret)
@@ -617,7 +721,8 @@ static inline int is_super_block_address(u64 offset)
 	return 0;
 }

-static int scan_one_device(struct recover_control *rc, int fd)
+static int scan_one_device(struct recover_control *rc, int fd,
+			   struct btrfs_device *device)
 {
 	struct extent_buffer *buf;
 	u64 bytenr;
@@ -649,6 +754,10 @@ static int scan_one_device(struct recover_control *rc, int fd)
 			continue;
 		}

+		ret = process_extent_buffer(&rc->eb_cache, buf, device, bytenr);
+		if (ret)
+			goto out;
+
 		if (btrfs_header_level(buf) != 0)
 			goto next_node;

@@ -692,7 +801,7 @@ static int scan_devices(struct recover_control *rc)
 				dev->name);
 			return -1;
 		}
-		ret = scan_one_device(rc, fd);
+		ret = scan_one_device(rc, fd, dev);
 		close(fd);
 		if (ret)
 			return ret;
@@ -1299,7 +1408,7 @@ static int btrfs_verify_device_extents(struct block_group_record *bg,
 	int expected_num_stripes;

 	expected_num_stripes = calc_num_stripes(bg->flags);
-	if (!expected_num_stripes && expected_num_stripes != ndevexts)
+	if (expected_num_stripes && expected_num_stripes != ndevexts)
 		return 1;

 	strpie_length = calc_stripe_length(bg->flags, bg->offset, ndevexts);
@@ -1337,16 +1446,174 @@ static int btrfs_rebuild_unordered_chunk_stripes(struct recover_control *rc,
 	return 0;
 }

+static int btrfs_calc_stripe_index(struct chunk_record *chunk, u64 logical)
+{
+	u64 offset = logical - chunk->offset;
+	int stripe_nr;
+	int nr_data_stripes;
+	int index;
+
+	stripe_nr = offset / chunk->stripe_len;
+	if (chunk->type_flags & BTRFS_BLOCK_GROUP_RAID0) {
+		index = stripe_nr % chunk->num_stripes;
+	} else if (chunk->type_flags & BTRFS_BLOCK_GROUP_RAID10) {
+		index = stripe_nr % (chunk->num_stripes / chunk->sub_stripes);
+		index *= chunk->sub_stripes;
+	} else if (chunk->type_flags & BTRFS_BLOCK_GROUP_RAID5) {
+		nr_data_stripes = chunk->num_stripes - 1;
+		index = stripe_nr % nr_data_stripes;
+		stripe_nr /= nr_data_stripes;
+		index = (index + stripe_nr) % chunk->num_stripes;
+	} else if (chunk->type_flags & BTRFS_BLOCK_GROUP_RAID6) {
+		nr_data_stripes = chunk->num_stripes - 2;
+		index = stripe_nr % nr_data_stripes;
+		stripe_nr /= nr_data_stripes;
+		index = (index + stripe_nr) % chunk->num_stripes;
+	} else {
+		BUG_ON(1);
+	}
+	return index;
+}
+
+/* calc the logical offset which is the start of the next stripe. */
+static inline u64 btrfs_next_stripe_logical_offset(struct chunk_record *chunk,
+						   u64 logical)
+{
+	u64 offset = logical - chunk->offset;
+
+	offset /= chunk->stripe_len;
+	offset *= chunk->stripe_len;
+	offset += chunk->stripe_len;
+
+	return offset + chunk->offset;
+}
+
+static int is_extent_record_in_device_extent(struct extent_record *er,
+					     struct device_extent_record *dext,
+					     int *mirror)
+{
+	int i;
+
+	for (i = 0; i < er->nmirrors; i++) {
+		if (er->devices[i]->devid == dext->objectid &&
+		    er->offsets[i] >= dext->offset &&
+		    er->offsets[i] < dext->offset + dext->length) {
+			*mirror = i;
+			return 1;
+		}
+	}
+	return 0;
+}
+
+static int
+btrfs_rebuild_ordered_meta_chunk_stripes(struct recover_control *rc,
+					 struct chunk_record *chunk)
+{
+	u64 start = chunk->offset;
+	u64 end = chunk->offset + chunk->length;
+	struct cache_extent *cache;
+	struct extent_record *er;
+	struct device_extent_record *devext;
+	struct device_extent_record *next;
+	struct btrfs_device *device;
+	LIST_HEAD(devexts);
+	int index;
+	int mirror;
+	int ret;
+
+	cache = lookup_cache_extent(&rc->eb_cache,
+				    start, chunk->length);
+	if (!cache) {
+		/* No used space, we can reorder the stripes freely. */
+		ret = btrfs_rebuild_unordered_chunk_stripes(rc, chunk);
+		return ret;
+	}
+
+	list_splice_init(&chunk->dextents, &devexts);
+again:
+	er = container_of(cache, struct extent_record, cache);
+	index = btrfs_calc_stripe_index(chunk, er->cache.start);
+	if (chunk->stripes[index].devid)
+		goto next;
+	list_for_each_entry_safe(devext, next, &devexts, chunk_list) {
+		if (is_extent_record_in_device_extent(er, devext, &mirror)) {
+			chunk->stripes[index].devid = devext->objectid;
+			chunk->stripes[index].offset = devext->offset;
+			memcpy(chunk->stripes[index].dev_uuid,
+			       er->devices[mirror]->uuid,
+			       BTRFS_UUID_SIZE);
+			index++;
+			list_move(&devext->chunk_list, &chunk->dextents);
+		}
+	}
+next:
+	start = btrfs_next_stripe_logical_offset(chunk, er->cache.start);
+	if (start >= end)
+		goto no_extent_record;
+
+	cache = lookup_cache_extent(&rc->eb_cache, start, end - start);
+	if (cache)
+		goto again;
+no_extent_record:
+	if (list_empty(&devexts))
+		return 0;
+
+	if (chunk->type_flags & (BTRFS_BLOCK_GROUP_RAID5 |
+				 BTRFS_BLOCK_GROUP_RAID6)) {
+		/* Fixme: try to recover the order by the parity block. */
+		list_splice_tail(&devexts, &chunk->dextents);
+		return -EINVAL;
+	}
+
+	/* There is no data on the lost stripes, we can reorder them freely. */
+	for (index = 0; index < chunk->num_stripes; index++) {
+		if (chunk->stripes[index].devid)
+			continue;
+
+		devext = list_first_entry(&devexts,
+					  struct device_extent_record,
+					  chunk_list);
+		list_move(&devext->chunk_list, &chunk->dextents);
+
+		chunk->stripes[index].devid = devext->objectid;
+		chunk->stripes[index].offset = devext->offset;
+		device = btrfs_find_device_by_devid(rc->fs_devices,
+						    devext->objectid,
+						    0);
+		if (!device) {
+			list_splice_tail(&devexts, &chunk->dextents);
+			return -EINVAL;
+		}
+		BUG_ON(btrfs_find_device_by_devid(rc->fs_devices,
+						  devext->objectid,
+						  1));
+		memcpy(chunk->stripes[index].dev_uuid, device->uuid,
+		       BTRFS_UUID_SIZE);
+	}
+	return 0;
+}
+
+#define BTRFS_ORDERED_RAID	(BTRFS_BLOCK_GROUP_RAID0 |	\
+				 BTRFS_BLOCK_GROUP_RAID10 |	\
+				 BTRFS_BLOCK_GROUP_RAID5 |	\
+				 BTRFS_BLOCK_GROUP_RAID6)
+
 static int btrfs_rebuild_chunk_stripes(struct recover_control *rc,
 				       struct chunk_record *chunk)
 {
 	int ret;

-	if (chunk->type_flags & (BTRFS_BLOCK_GROUP_RAID10 |
-				 BTRFS_BLOCK_GROUP_RAID0 |
-				 BTRFS_BLOCK_GROUP_RAID5 |
-				 BTRFS_BLOCK_GROUP_RAID6))
-		BUG_ON(1);	/* Fixme: implement in the next patch */
+	/*
+	 * All the data in the system metadata chunk will be dropped,
+	 * so we need not guarantee that the data is right or not, that
+	 * is we can reorder the stripes in the system metadata chunk.
+	 */
+	if ((chunk->type_flags & BTRFS_BLOCK_GROUP_METADATA) &&
+	    (chunk->type_flags & BTRFS_ORDERED_RAID))
+		ret =btrfs_rebuild_ordered_meta_chunk_stripes(rc, chunk);
+	else if ((chunk->type_flags & BTRFS_BLOCK_GROUP_DATA) &&
+		 (chunk->type_flags & BTRFS_ORDERED_RAID))
+		ret = 1;	/* Be handled after the fs is opened. */
 	else
 		ret = btrfs_rebuild_unordered_chunk_stripes(rc, chunk);

@@ -1407,7 +1674,9 @@ static int btrfs_recover_chunks(struct recover_control *rc)
 		chunk->num_stripes = nstripes;

 		ret = btrfs_rebuild_chunk_stripes(rc, chunk);
-		if (ret)
+		if (ret > 0)
+			list_add_tail(&chunk->list, &rc->unrepaired_chunks);
+		else if (ret < 0)
 			list_add_tail(&chunk->list, &rc->bad_chunks);
 		else
 			list_add_tail(&chunk->list, &rc->good_chunks);
--
1.8.1.4
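The stripe arithmetic in btrfs_calc_stripe_index() and btrfs_next_stripe_logical_offset() above can be tried outside btrfs-progs. The following is only a minimal standalone sketch: struct chunk_layout, enum profile and both function names are hypothetical stand-ins for struct chunk_record and the BTRFS_BLOCK_GROUP_* flags, not btrfs-progs API.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the BTRFS_BLOCK_GROUP_* profile flags. */
enum profile { PROFILE_RAID0, PROFILE_RAID10, PROFILE_RAID5, PROFILE_RAID6 };

/* Hypothetical, trimmed-down version of struct chunk_record. */
struct chunk_layout {
	uint64_t offset;	/* logical start of the chunk */
	uint64_t stripe_len;	/* bytes per stripe */
	int num_stripes;
	int sub_stripes;	/* only meaningful for RAID10 */
	enum profile profile;
};

/* Map a logical address inside the chunk to the index of its stripe. */
static int calc_stripe_index(const struct chunk_layout *c, uint64_t logical)
{
	uint64_t stripe_nr;
	int nr_data;
	int index;

	assert(logical >= c->offset && c->stripe_len > 0);
	stripe_nr = (logical - c->offset) / c->stripe_len;

	switch (c->profile) {
	case PROFILE_RAID0:
		return (int)(stripe_nr % c->num_stripes);
	case PROFILE_RAID10:
		index = (int)(stripe_nr % (c->num_stripes / c->sub_stripes));
		return index * c->sub_stripes;
	case PROFILE_RAID5:
	case PROFILE_RAID6:
		nr_data = c->num_stripes - (c->profile == PROFILE_RAID5 ? 1 : 2);
		index = (int)(stripe_nr % nr_data);
		/* Rotate by the row number, mirroring the patch's arithmetic. */
		return (int)((index + stripe_nr / nr_data) % c->num_stripes);
	}
	return -1;
}

/* Logical address where the stripe after the one holding `logical` starts. */
static uint64_t next_stripe_logical(const struct chunk_layout *c,
				    uint64_t logical)
{
	uint64_t off = logical - c->offset;

	return c->offset + (off / c->stripe_len + 1) * c->stripe_len;
}
```

For example, with a 3-stripe RAID5 chunk and 64KiB stripes, logical stripe 3 is data slot 1 of row 1, so the rotated index is (1 + 1) % 3 = 2.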
Filipe David Manana
2013-Jul-03 14:17 UTC
Re: [PATCH 02/12] Btrfs-progs: don't close the file descriptor 0 when closing a device
On Wed, Jul 3, 2013 at 2:25 PM, Miao Xie <miaox@cn.fujitsu.com> wrote:
>
> +++ b/disk-io.c
> @@ -1270,12 +1270,13 @@ static int close_all_devices(struct btrfs_fs_info *fs_info)
>  	while (!list_empty(list)) {
>  		device = list_entry(list->next, struct btrfs_device, dev_list);
>  		list_del_init(&device->dev_list);
> -		if (device->fd) {
> +		if (device->fd != -1) {
>  			fsync(device->fd);
>  			if (posix_fadvise(device->fd, 0, 0, POSIX_FADV_DONTNEED))
>  				fprintf(stderr, "Warning, could not drop caches\n");
> +			close(device->fd);
> +			device->fd = -1;
>  		}
> -		close(device->fd);
>  		kfree(device->name);
>  		kfree(device->label);
>  		kfree(device);

I deal with this part too at https://patchwork.kernel.org/patch/2787291/

Is there any reason to set device->fd to -1 if we just kfree(device)
shortly after?

thanks

--
Filipe David Manana,

"Reasonable men adapt themselves to the world.
 Unreasonable men adapt the world to themselves.
 That's why all progress depends on unreasonable men."
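The point of the quoted hunk is that 0 is a valid file descriptor, so "not open" needs its own sentinel and close() must only run on a descriptor that is really open. A minimal sketch of that idea follows, assuming a hypothetical struct sk_dev and helper sk_dev_close() that are not part of btrfs-progs. Resetting the field to -1 only matters when the object outlives the close, which is the case Filipe is asking about.

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

#define SK_FD_CLOSED (-1)	/* hypothetical name for the "not open" sentinel */

/* Hypothetical, trimmed-down device record. */
struct sk_dev {
	int fd;
};

/*
 * Close the descriptor only if the device is really open.
 * Returns 1 if a descriptor was closed, 0 if there was nothing to close.
 */
static int sk_dev_close(struct sk_dev *d)
{
	assert(d != NULL);
	if (d->fd == SK_FD_CLOSED)
		return 0;
	fsync(d->fd);		/* best effort, errors ignored in this sketch */
	close(d->fd);
	d->fd = SK_FD_CLOSED;	/* makes a second call a harmless no-op */
	return 1;
}
```

With this shape a never-opened device is skipped, and calling the helper twice on the same object does not close an unrelated descriptor that happens to reuse the number.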
Chris Mason
2013-Jul-03 20:36 UTC
Re: [RFC PATCH 00/12] Btrfs-progs: introduce chunk recover function
Quoting Miao Xie (2013-07-03 09:25:08)
> This patchset introduced chunk recover function, which was implemented by
> scanning the whole disks in the filesystem. Now, we can recover Single,
> Dup, RAID1 chunks, and RAID0, RAID10, RAID5, RAID6 metadata chunks.

Really nice. I've integrated this with Liu Bo's btrfs-image fixes and put it into a branch called integration.

I've tested both repair and image here, but if you could please double check the merge I'd appreciate it.

-chris
Miao Xie
2013-Jul-04 01:30 UTC
Re: [PATCH 02/12] Btrfs-progs: don't close the file descriptor 0 when closing a device
On Wed, 3 Jul 2013 15:17:02 +0100, Filipe David Manana wrote:
> On Wed, Jul 3, 2013 at 2:25 PM, Miao Xie <miaox@cn.fujitsu.com> wrote:
>>
>> +++ b/disk-io.c
>> @@ -1270,12 +1270,13 @@ static int close_all_devices(struct btrfs_fs_info *fs_info)
>>  	while (!list_empty(list)) {
>>  		device = list_entry(list->next, struct btrfs_device, dev_list);
>>  		list_del_init(&device->dev_list);
>> -		if (device->fd) {
>> +		if (device->fd != -1) {
>>  			fsync(device->fd);
>>  			if (posix_fadvise(device->fd, 0, 0, POSIX_FADV_DONTNEED))
>>  				fprintf(stderr, "Warning, could not drop caches\n");
>> +			close(device->fd);
>> +			device->fd = -1;
>>  		}
>> -		close(device->fd);
>>  		kfree(device->name);
>>  		kfree(device->label);
>>  		kfree(device);
>
> I deal with this part too at https://patchwork.kernel.org/patch/2787291/

Sorry, I didn't know you had dealt with it. But your patch didn't fix the problem completely; there are still some functions that you didn't deal with.

> Is there any reason to set device->fd to -1 if we just kfree(device)
> shortly after?

Right, I will update my patch.

Thanks
Miao

> thanks
>
>
> --
> Filipe David Manana,
>
> "Reasonable men adapt themselves to the world.
> Unreasonable men adapt the world to themselves.
> That's why all progress depends on unreasonable men."
Liu Bo
2013-Jul-04 04:06 UTC
Re: [RFC PATCH 00/12] Btrfs-progs: introduce chunk recover function
On Wed, Jul 03, 2013 at 04:36:44PM -0400, Chris Mason wrote:
> Quoting Miao Xie (2013-07-03 09:25:08)
> > This patchset introduced chunk recover function, which was implemented by
> > scanning the whole disks in the filesystem. Now, we can recover Single,
> > Dup, RAID1 chunks, and RAID0, RAID10, RAID5, RAID6 metadata chunks.
>
> Really nice. I've integrated this with Liu Bo's btrfs-image fixes and
> put it into a branch called integration. I've tested both repair and
> image here, but if you could please double check the merge I'd
> appreciate it.

We still need another patch to make image work (actually we need to initialize
missing device's uuid to NULL).

It's the read_one_dev() part in

https://patchwork.kernel.org/patch/2787291/

diff --git a/volumes.c b/volumes.c
index d6f81f8..061f094 100644
--- a/volumes.c
+++ b/volumes.c
@@ -116,6 +116,7 @@ static int device_list_add(const char *path,
 			/* we can safely leave the fs_devices entry around */
 			return -ENOMEM;
 		}
+		device->fd = -1;
 		device->devid = devid;
 		memcpy(device->uuid, disk_super->dev_item.uuid,
 		       BTRFS_UUID_SIZE);
@@ -1628,10 +1629,10 @@ static int read_one_dev(struct btrfs_root *root,
 	if (!device) {
 		printk("warning devid %llu not found already\n",
 			(unsigned long long)devid);
-		device = kmalloc(sizeof(*device), GFP_NOFS);
+		device = kzalloc(sizeof(*device), GFP_NOFS);
 		if (!device)
 			return -ENOMEM;
-		device->total_ios = 0;
+		device->fd = -1;
 		list_add(&device->dev_list,
 			 &root->fs_info->fs_devices->devices);
 	}

-liubo
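The read_one_dev() hunk swaps a plain allocation for a zeroing one and then sets the only field whose "empty" value is not zero. A small userspace sketch of that pattern is given below. It uses calloc() in place of kzalloc(), and struct sk_missing_dev, SK_UUID_SIZE and both helpers are hypothetical illustrations, not the btrfs-progs definitions.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define SK_UUID_SIZE 16	/* hypothetical stand-in for BTRFS_UUID_SIZE */

/* Hypothetical, trimmed-down record for a device that was not found. */
struct sk_missing_dev {
	int fd;
	unsigned long long devid;
	unsigned char uuid[SK_UUID_SIZE];
};

/* Allocate a zeroed placeholder, then fix up the non-zero sentinel. */
static struct sk_missing_dev *sk_alloc_missing_dev(unsigned long long devid)
{
	struct sk_missing_dev *d = calloc(1, sizeof(*d));

	if (!d)
		return NULL;
	d->fd = -1;		/* 0 would be a valid descriptor */
	d->devid = devid;
	return d;		/* uuid stays all-zero, i.e. "unset" */
}

/* Report whether the uuid was left at its zeroed, unset state. */
static int sk_uuid_is_unset(const struct sk_missing_dev *d)
{
	static const unsigned char zero[SK_UUID_SIZE];

	assert(d != NULL);
	return memcmp(d->uuid, zero, SK_UUID_SIZE) == 0;
}
```

Zeroing the whole object up front means a later reader sees a well-defined "unset" uuid instead of whatever the allocator returned.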
Filipe David Manana
2013-Jul-04 08:30 UTC
Re: [PATCH 02/12] Btrfs-progs: don't close the file descriptor 0 when closing a device
On Thu, Jul 4, 2013 at 2:30 AM, Miao Xie <miaox@cn.fujitsu.com> wrote:
> On Wed, 3 Jul 2013 15:17:02 +0100, Filipe David Manana wrote:
>> On Wed, Jul 3, 2013 at 2:25 PM, Miao Xie <miaox@cn.fujitsu.com> wrote:
>>>
>>> +++ b/disk-io.c
>>> @@ -1270,12 +1270,13 @@ static int close_all_devices(struct btrfs_fs_info *fs_info)
>>>  	while (!list_empty(list)) {
>>>  		device = list_entry(list->next, struct btrfs_device, dev_list);
>>>  		list_del_init(&device->dev_list);
>>> -		if (device->fd) {
>>> +		if (device->fd != -1) {
>>>  			fsync(device->fd);
>>>  			if (posix_fadvise(device->fd, 0, 0, POSIX_FADV_DONTNEED))
>>>  				fprintf(stderr, "Warning, could not drop caches\n");
>>> +			close(device->fd);
>>> +			device->fd = -1;
>>>  		}
>>> -		close(device->fd);
>>>  		kfree(device->name);
>>>  		kfree(device->label);
>>>  		kfree(device);
>>
>> I deal with this part too at https://patchwork.kernel.org/patch/2787291/
>
> Sorry, I didn't know you had dealt with it. But your patch didn't fix the problem completely;
> there are still some functions that you didn't deal with.

Yes, I was addressing a different problem. My comment was relative only to this change in disk-io.c:close_all_devices().

thanks

>
>> Is there any reason to set device->fd to -1 if we just kfree(device)
>> shortly after?
>
> Right, I will update my patch.
>
> Thanks
> Miao
>
>> thanks
>>
>>
>> --
>> Filipe David Manana,
>>
>> "Reasonable men adapt themselves to the world.
>> Unreasonable men adapt the world to themselves.
>> That's why all progress depends on unreasonable men."
>

--
Filipe David Manana,

"Reasonable men adapt themselves to the world.
 Unreasonable men adapt the world to themselves.
 That's why all progress depends on unreasonable men."
-- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Chris Mason
2013-Jul-05 14:00 UTC
Re: [RFC PATCH 00/12] Btrfs-progs: introduce chunk recover function
Quoting Liu Bo (2013-07-04 00:06:47)
> On Wed, Jul 03, 2013 at 04:36:44PM -0400, Chris Mason wrote:
> > Quoting Miao Xie (2013-07-03 09:25:08)
> > > This patchset introduced chunk recover function, which was implemented by
> > > scanning the whole disks in the filesystem. Now, we can recover Single,
> > > Dup, RAID1 chunks, and RAID0, RAID10, RAID5, RAID6 metadata chunks.
> >
> > Really nice. I've integrated this with Liu Bo's btrfs-image fixes and
> > put it into a branch called integration. I've tested both repair and
> > image here, but if you could please double check the merge I'd
> > appreciate it.
>
> We still need another patch to make image work (actually we need to initialize
> missing device's uuid to NULL).
>
> It's the read_one_dev() part in
>
> https://patchwork.kernel.org/patch/2787291/

Got it, thanks!

-chris
Anand Jain
2013-Jul-08 04:59 UTC
Re: [PATCH 03/12] Btrfs-progs: Don't free the devices when close the ctree
btrfs_close_devices() should reset fs_devices->latest_bdev and fs_devices->lowest_bdev as well, since they hold the fd of an open device in the list that is being closed.

On 07/03/2013 09:25 PM, Miao Xie wrote:
> Some commands(such as btrfs-convert) access the devices again after we close
> the ctree, so it is better that we don't free the devices objects when the ctree
> is closed, or we need re-allocate the memory for the devices.

> We needn't worry
> the memory leak problem, because all the memory will be freed after the tasks
> die.

The same applies to close as well, right? However, from the point of view of debugging and of converting these functions into library functions, it's better to call close/free explicitly where possible.

I am sending a patch to fix the memory leak. So it's fine if this patch just addresses the close issue.

Thanks, Anand

> Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
> ---
>  btrfs-find-root.c | 21 +--------------------
>  disk-io.c         | 30 ++----------------------------
>  volumes.c         |  3 +++
>  3 files changed, 6 insertions(+), 48 deletions(-)
>
> diff --git a/btrfs-find-root.c b/btrfs-find-root.c
> index 3e1396d..da22c1d 100644
> --- a/btrfs-find-root.c
> +++ b/btrfs-find-root.c
> @@ -65,25 +65,6 @@ int csum_block(void *buf, u32 len)
>  	return ret;
>  }
>
> -static int close_all_devices(struct btrfs_fs_info *fs_info)
> -{
> -	struct list_head *list;
> -	struct list_head *next;
> -	struct btrfs_device *device;
> -
> -	return 0;
> -
> -	list = &fs_info->fs_devices->devices;
> -	list_for_each(next, list) {
> -		device = list_entry(next, struct btrfs_device, dev_list);
> -		if (device->fd != -1) {
> -			close(device->fd);
> -			device->fd = -1;
> -		}
> -	}
> -	return 0;
> -}
> -
>  static struct btrfs_root *open_ctree_broken(int fd, const char *device)
>  {
>  	u32 sectorsize;
> @@ -217,7 +198,7 @@ static struct btrfs_root *open_ctree_broken(int fd, const char *device)
>  out_chunk:
>  	free_extent_buffer(fs_info->chunk_root->node);
>  out_devices:
> -	close_all_devices(fs_info);
> +	btrfs_close_devices(fs_info->fs_devices);
>  out_cleanup:
>  	extent_io_tree_cleanup(&fs_info->extent_cache);
>  	extent_io_tree_cleanup(&fs_info->free_space_cache);
> diff --git a/disk-io.c b/disk-io.c
> index 4003636..a8176a5 100644
> --- a/disk-io.c
> +++ b/disk-io.c
> @@ -35,8 +35,6 @@
>  #include "utils.h"
>  #include "print-tree.h"
>
> -static int close_all_devices(struct btrfs_fs_info *fs_info);
> -
>  static int check_tree_block(struct btrfs_root *root, struct extent_buffer *buf)
>  {
>
> @@ -1028,7 +1026,7 @@ out_chunk:
>  	if (fs_info->chunk_root)
>  		free_extent_buffer(fs_info->chunk_root->node);
>  out_devices:
> -	close_all_devices(fs_info);
> +	btrfs_close_devices(fs_info->fs_devices);
>  out_cleanup:
>  	extent_io_tree_cleanup(&fs_info->extent_cache);
>  	extent_io_tree_cleanup(&fs_info->free_space_cache);
> @@ -1261,30 +1259,6 @@ int write_ctree_super(struct btrfs_trans_handle *trans,
>  	return ret;
>  }
>
> -static int close_all_devices(struct btrfs_fs_info *fs_info)
> -{
> -	struct list_head *list;
> -	struct btrfs_device *device;
> -
> -	list = &fs_info->fs_devices->devices;
> -	while (!list_empty(list)) {
> -		device = list_entry(list->next, struct btrfs_device, dev_list);
> -		list_del_init(&device->dev_list);
> -		if (device->fd != -1) {
> -			fsync(device->fd);
> -			if (posix_fadvise(device->fd, 0, 0, POSIX_FADV_DONTNEED))
> -				fprintf(stderr, "Warning, could not drop caches\n");
> -			close(device->fd);
> -			device->fd = -1;
> -		}
> -		kfree(device->name);
> -		kfree(device->label);
> -		kfree(device);
> -	}
> -	kfree(fs_info->fs_devices);
> -	return 0;
> -}
> -
>  static void free_mapping_cache(struct btrfs_fs_info *fs_info)
>  {
>  	struct cache_tree *cache_tree = &fs_info->mapping_tree.cache_tree;
> @@ -1337,7 +1311,7 @@ int close_ctree(struct btrfs_root *root)
>  		free(fs_info->log_root_tree);
>  	}
>
> -	close_all_devices(fs_info);
> +	btrfs_close_devices(fs_info->fs_devices);
>  	free_mapping_cache(fs_info);
>  	extent_io_tree_cleanup(&fs_info->extent_cache);
>  	extent_io_tree_cleanup(&fs_info->free_space_cache);
> diff --git a/volumes.c b/volumes.c
> index b88385b..0f6a35b 100644
> --- a/volumes.c
> +++ b/volumes.c
> @@ -163,6 +163,9 @@ again:
>  	list_for_each(cur, &fs_devices->devices) {
>  		device = list_entry(cur, struct btrfs_device, dev_list);
>  		if (device->fd != -1) {
> +			fsync(device->fd);
> +			if (posix_fadvise(device->fd, 0, 0, POSIX_FADV_DONTNEED))
> +				fprintf(stderr, "Warning, could not drop caches\n");
>  			close(device->fd);
>  			device->fd = -1;
>  		}
>
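Both points in this message, closing descriptors without freeing the device objects and invalidating any cached copy of a descriptor, can be sketched in a few lines. This is an illustration under stated assumptions only: struct sk_device, struct sk_fs_devices, the latest_fd field and sk_close_devices() are hypothetical and merely mimic the roles of btrfs_device, btrfs_fs_devices, latest_bdev and btrfs_close_devices().

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical singly linked device record. */
struct sk_device {
	int fd;
	struct sk_device *next;
};

/* Hypothetical container that also caches one device's descriptor. */
struct sk_fs_devices {
	struct sk_device *devices;
	int latest_fd;
};

/*
 * Close every open descriptor but keep the device objects so that a
 * later reopen can reuse them. Returns how many descriptors were closed.
 */
static int sk_close_devices(struct sk_fs_devices *fsd)
{
	struct sk_device *d;
	int closed = 0;

	assert(fsd != NULL);
	for (d = fsd->devices; d; d = d->next) {
		if (d->fd == -1)
			continue;
		close(d->fd);
		d->fd = -1;
		closed++;
	}
	/* A cached copy is stale once the descriptor behind it is closed. */
	fsd->latest_fd = -1;
	return closed;
}
```

Leaving the cached field untouched would let a later caller use a descriptor number that the kernel may already have handed out for a different file.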
Anand Jain
2013-Jul-15 04:58 UTC
Re: [PATCH 03/12] Btrfs-progs: Don't free the devices when close the ctree
> I am sending a patch to fix the memory leak. So
> it's fine if this patch just addresses the close issue.

This is more complicated than initially thought and it isn't ready, as explained in the other thread.

Thanks, Anand
David Sterba
2013-Aug-01 20:30 UTC
Re: [PATCH 09/12] Btrfs-progs: Add chunk recover function - using old chunk items
On Wed, Jul 03, 2013 at 09:25:17PM +0800, Miao Xie wrote:
> --- a/btrfs.c
> +++ b/btrfs.c
> @@ -247,6 +247,7 @@ const struct cmd_group btrfs_cmd_group = {
>  		{ "device", cmd_device, NULL, &device_cmd_group, 0 },
>  		{ "scrub", cmd_scrub, NULL, &scrub_cmd_group, 0 },
>  		{ "check", cmd_check, cmd_check_usage, NULL, 0 },
> +		{ "chunk-recover", cmd_chunk_recover, cmd_chunk_recover_usage, NULL, 0},

Better late than never, though the patches are already in master branch and I am horribly late.

I don't like to see this very specific command in the first level of command namespace. In the past we've proposed a group named 'rescue' that would collect functions that are potentially dangerous but perform certain tasks that can make a filesystem usable again. Examples are select-super or zero-log that are now separate utilities.

As a related topic, I was thinking about introducing a separate namespace that would be declared unstable and any feature in development would be free to use it and add/modify/delete commands and params as needed. That way developers can focus on the feature itself and leave the user interface polishing for later.

And now the name of the namespace: how about _ ? It's short, will never clash with any other command.

david
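To make the 'rescue' group idea concrete, here is a minimal sketch of a two-level command table in which chunk-recover is only reachable through a group. Everything in it (struct sk_cmd, struct sk_group, the tables and sk_lookup()) is hypothetical and deliberately simpler than the real cmd_group machinery in btrfs-progs. It only illustrates the namespace layout being proposed.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef int (*sk_fn)(void);

struct sk_group;

/* Hypothetical command entry: either a handler or a nested group. */
struct sk_cmd {
	const char *token;
	sk_fn fn;
	const struct sk_group *sub;
};

struct sk_group {
	const struct sk_cmd *cmds;	/* terminated by a NULL token */
};

static int sk_chunk_recover(void) { return 0; }	/* placeholder handler */
static int sk_check(void) { return 0; }		/* placeholder handler */

static const struct sk_cmd sk_rescue_cmds[] = {
	{ "chunk-recover", sk_chunk_recover, NULL },
	{ NULL, NULL, NULL },
};
static const struct sk_group sk_rescue_group = { sk_rescue_cmds };

static const struct sk_cmd sk_top_cmds[] = {
	{ "check", sk_check, NULL },
	{ "rescue", NULL, &sk_rescue_group },
	{ NULL, NULL, NULL },
};
static const struct sk_group sk_top_group = { sk_top_cmds };

/* Resolve argv-style tokens to a handler, descending into groups. */
static sk_fn sk_lookup(const struct sk_group *g, int argc,
		       const char *const *argv)
{
	const struct sk_cmd *c;

	assert(g != NULL);
	if (argc < 1)
		return NULL;
	for (c = g->cmds; c->token; c++) {
		if (strcmp(c->token, argv[0]) != 0)
			continue;
		if (c->sub)
			return sk_lookup(c->sub, argc - 1, argv + 1);
		return c->fn;
	}
	return NULL;
}
```

In this layout "rescue chunk-recover" resolves to the handler, while a bare "chunk-recover" at the top level does not, which keeps the first level of the namespace small.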
Eric Sandeen
2013-Aug-04 16:04 UTC
Re: [PATCH 04/12] Btrfs-progs: cleanup similar code in open_ctree_* and close_ctree
On 7/3/13 8:25 AM, Miao Xie wrote:
> Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
> ---
>  btrfs-find-root.c | 137 +++-------------
>  disk-io.c         | 473 +++++++++++++++++++++++++++++++-----------------------
>  disk-io.h         |  12 ++
>  3 files changed, 307 insertions(+), 315 deletions(-)

This broke at least btrfs-convert:

> ./btrfs-convert fsfile
> No valid Btrfs found on fsfile
> unable to open ctree
> conversion aborted.

I read the detailed changelog carefully but it didn't help me understand the change, or how it might have broken. (yes, that's sarcasm ;) ).

Can you take a look & see what is wrong with your change?

Also, I guess we need a regression test for btrfs-convert.

-Eric

> diff --git a/btrfs-find-root.c b/btrfs-find-root.c
> index da22c1d..f2cc1bf 100644
> --- a/btrfs-find-root.c
> +++ b/btrfs-find-root.c
> @@ -67,74 +67,31 @@ int csum_block(void *buf, u32 len)
>
>  static struct btrfs_root *open_ctree_broken(int fd, const char *device)
>  {
> -	u32 sectorsize;
> -	u32 nodesize;
> -	u32 leafsize;
> -	u32 blocksize;
> -	u32 stripesize;
> -	u64 generation;
> -	struct btrfs_root *tree_root = malloc(sizeof(struct btrfs_root));
> -	struct btrfs_root *extent_root = malloc(sizeof(struct btrfs_root));
> -	struct btrfs_root *chunk_root = malloc(sizeof(struct btrfs_root));
> -	struct btrfs_root *dev_root = malloc(sizeof(struct btrfs_root));
> -	struct btrfs_root *csum_root = malloc(sizeof(struct btrfs_root));
> -	struct btrfs_fs_info *fs_info = malloc(sizeof(*fs_info));
> -	int ret;

<giant snip>
Wang Shilong
2013-Aug-04 23:24 UTC
Re: [PATCH 04/12] Btrfs-progs: cleanup similar code in open_ctree_* and close_ctree
Hello Eric,

I have sent a patch to fix up this regression:
https://patchwork.kernel.org/patch/2828820/

Would you please try and see if this can solve problems.

Thanks,
Wang

> On 7/3/13 8:25 AM, Miao Xie wrote:
>> Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
>> ---
>>  btrfs-find-root.c | 137 +++-------------
>>  disk-io.c         | 473 +++++++++++++++++++++++++++++++-----------------------
>>  disk-io.h         |  12 ++
>>  3 files changed, 307 insertions(+), 315 deletions(-)
>
> This broke at least btrfs-convert:
>
>> ./btrfs-convert fsfile
>> No valid Btrfs found on fsfile
>> unable to open ctree
>> conversion aborted.
>
> I read the detailed changelog carefully but it didn't help me
> understand the change, or how it might have broken. (yes,
> that's sarcasm ;) ).
>
> Can you take a look & see what is wrong with your change?
>
> Also, I guess we need a regression test for btrfs-convert.
>
> -Eric
>
>> diff --git a/btrfs-find-root.c b/btrfs-find-root.c
>> index da22c1d..f2cc1bf 100644
>> --- a/btrfs-find-root.c
>> +++ b/btrfs-find-root.c
>> @@ -67,74 +67,31 @@ int csum_block(void *buf, u32 len)
>>
>>  static struct btrfs_root *open_ctree_broken(int fd, const char *device)
>>  {
>> -	u32 sectorsize;
>> -	u32 nodesize;
>> -	u32 leafsize;
>> -	u32 blocksize;
>> -	u32 stripesize;
>> -	u64 generation;
>> -	struct btrfs_root *tree_root = malloc(sizeof(struct btrfs_root));
>> -	struct btrfs_root *extent_root = malloc(sizeof(struct btrfs_root));
>> -	struct btrfs_root *chunk_root = malloc(sizeof(struct btrfs_root));
>> -	struct btrfs_root *dev_root = malloc(sizeof(struct btrfs_root));
>> -	struct btrfs_root *csum_root = malloc(sizeof(struct btrfs_root));
>> -	struct btrfs_fs_info *fs_info = malloc(sizeof(*fs_info));
>> -	int ret;
> <giant snip>
Eric Sandeen
2013-Aug-04 23:43 UTC
Re: [PATCH 04/12] Btrfs-progs: cleanup similar code in open_ctree_* and close_ctree
On 8/4/13 6:24 PM, Wang Shilong wrote:
> Hello Eric,
>
> I have sent a patch to fix up this regression:
> https://patchwork.kernel.org/patch/2828820/
>
> Would you please try and see if this can solve problems.

Ah, thanks. I missed that, I'll try it.

Chris, maybe one to pick up sooner than later!

-Eric

> Thanks,
> Wang
>> On 7/3/13 8:25 AM, Miao Xie wrote:
>>> Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
>>> ---
>>>  btrfs-find-root.c | 137 +++-------------
>>>  disk-io.c         | 473 +++++++++++++++++++++++++++++++-----------------------
>>>  disk-io.h         |  12 ++
>>>  3 files changed, 307 insertions(+), 315 deletions(-)
>>
>> This broke at least btrfs-convert:
>>
>>> ./btrfs-convert fsfile
>>> No valid Btrfs found on fsfile
>>> unable to open ctree
>>> conversion aborted.
>>
>> I read the detailed changelog carefully but it didn't help me
>> understand the change, or how it might have broken. (yes,
>> that's sarcasm ;) ).
>>
>> Can you take a look & see what is wrong with your change?
>>
>> Also, I guess we need a regression test for btrfs-convert.
>>
>> -Eric
>>
>>> diff --git a/btrfs-find-root.c b/btrfs-find-root.c
>>> index da22c1d..f2cc1bf 100644
>>> --- a/btrfs-find-root.c
>>> +++ b/btrfs-find-root.c
>>> @@ -67,74 +67,31 @@ int csum_block(void *buf, u32 len)
>>>
>>>  static struct btrfs_root *open_ctree_broken(int fd, const char *device)
>>>  {
>>> -	u32 sectorsize;
>>> -	u32 nodesize;
>>> -	u32 leafsize;
>>> -	u32 blocksize;
>>> -	u32 stripesize;
>>> -	u64 generation;
>>> -	struct btrfs_root *tree_root = malloc(sizeof(struct btrfs_root));
>>> -	struct btrfs_root *extent_root = malloc(sizeof(struct btrfs_root));
>>> -	struct btrfs_root *chunk_root = malloc(sizeof(struct btrfs_root));
>>> -	struct btrfs_root *dev_root = malloc(sizeof(struct btrfs_root));
>>> -	struct btrfs_root *csum_root = malloc(sizeof(struct btrfs_root));
>>> -	struct btrfs_fs_info *fs_info = malloc(sizeof(*fs_info));
>>> -	int ret;
>> <giant snip>