Stefan Behrens
2013-May-16 14:45 UTC
[PATCH v3 0/4] Btrfs-progs: speedup btrfs send/receive
Btrfs send/receive is not usable in its current form when a large number of subvolumes exist. This series changes the btrfs send/receive commands to use the UUID tree to map UUIDs to subvolumes, and to use the root tree to map subvolume IDs to paths. With this change the tools start quickly, and their startup time is independent of the number of subvolumes/snapshots that exist.

Before this series, mapping UUIDs to subvolume IDs was an expensive operation. The algorithm was quadratic in the number of existing subvolumes. For example, with 15,000 subvolumes it took much more than 5 minutes on a state-of-the-art Xeon CPU before btrfs send or receive was able to send or receive the first byte. Even linear effort instead of the current quadratic effort would be wasteful, because the data structures that map UUIDs to subvolume IDs were rebuilt every time a btrfs send/receive instance was started.

It is much more efficient to maintain a searchable persistent data structure in the filesystem, one that is updated whenever a subvolume/snapshot is created or deleted, and when the received subvolume UUID is set by the btrfs-receive tool. The user mode tools can then just use the tree-search ioctl to quickly retrieve all information.

A recent commit added kernel code that maintains data structures in the filesystem which make it possible to quickly search for a given UUID and to retrieve the data assigned to it, such as the subvolume ID related to this UUID.

This patch series adds support for the UUID tree to Btrfs-progs and changes the send/receive tools to use it. Additionally, the btrfs-show-super tool is updated to print a new field.

v1 -> v2:
- Addressed the review comments from David Sterba.
- The v2 of the kernel patch adds a uuid_tree_generation field to the superblock; the v2 of the user mode patch adds this field to the btrfs-show-super tool.
- uuid-tree.o is added to libbtrfs_objects since it is used by send-utils.o, which is part of the exported libbtrfs.

v2 -> v3:
- Shrunk the uuid_item (this was a review comment from Liu Bo).

Stefan Behrens (4):
  Btrfs-progs: Support UUID tree and UUID items in btrfs-debug-tree
  Btrfs-progs: add UUID tree lookup methods
  Btrfs-progs: use UUID tree for send/receive
  Btrfs-progs: add uuid_tree_gen field to btrfs-show-super

 Makefile           |   5 +-
 btrfs-show-super.c |   2 +
 cmds-receive.c     |  23 ++-
 cmds-send.c        |  53 +++++-
 ctree.h            |  39 ++++-
 print-tree.c       |  95 ++++++++++-
 send-utils.c       | 477 +++++++++++++++++++++--------------------------------
 send-utils.h       |   9 +-
 uuid-tree.c        | 174 +++++++++++++++++++
 9 files changed, 562 insertions(+), 315 deletions(-)
 create mode 100644 uuid-tree.c

-- 
1.8.2.3

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Stefan Behrens
2013-May-16 14:45 UTC
[PATCH v3 1/4] Btrfs-progs: Support UUID tree and UUID items in btrfs-debug-tree
Support printing these things.

Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de>
---
 ctree.h      | 29 +++++++++++++++++++
 print-tree.c | 95 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++--
 2 files changed, 121 insertions(+), 3 deletions(-)

diff --git a/ctree.h b/ctree.h
index 4ea37ac..0f14b41 100644
--- a/ctree.h
+++ b/ctree.h
@@ -71,6 +71,8 @@ struct btrfs_free_space_ctl;
 #define BTRFS_CSUM_TREE_OBJECTID 7ULL
 #define BTRFS_QUOTA_TREE_OBJECTID 8ULL
+/* for storing items that use the BTRFS_UUID_KEY */
+#define BTRFS_UUID_TREE_OBJECTID 9ULL
 /* for storing balance parameters in the root tree */
 #define BTRFS_BALANCE_OBJECTID -4ULL
@@ -811,6 +813,16 @@ struct btrfs_csum_item {
 	u8 csum;
 } __attribute__ ((__packed__));
+/* for items that use the BTRFS_UUID_KEY */
+#define BTRFS_UUID_ITEM_TYPE_SUBVOL	0 /* for UUIDs assigned to subvols */
+#define BTRFS_UUID_ITEM_TYPE_RECEIVED_SUBVOL	1 /* for UUIDs assigned to
+						   * received subvols */
+struct btrfs_uuid_item {
+	__le16 type;	/* refer to BTRFS_UUID_ITEM_TYPE* defines above */
+	__le32 len;	/* number of following 64bit values */
+	__le64 data[0];	/* data aligned to 64bit */
+} __attribute__ ((__packed__));
+
 /* tag for the radix tree of block groups in ram */
 #define BTRFS_BLOCK_GROUP_DATA (1ULL << 0)
 #define BTRFS_BLOCK_GROUP_SYSTEM (1ULL << 1)
@@ -1107,6 +1119,17 @@ struct btrfs_root {
 #define BTRFS_DEV_REPLACE_KEY 250
 /*
+ * Stores items that allow to quickly map UUIDs to something else.
+ * These items are part of the filesystem UUID tree.
+ * The key is built like this:
+ * (UUID_upper_64_bits, BTRFS_UUID_KEY, UUID_lower_64_bits).
+ */
+#if BTRFS_UUID_SIZE != 16
+#error "UUID items require BTRFS_UUID_SIZE == 16!"
+#endif
+#define BTRFS_UUID_KEY 251
+
+/*
  * string items are for debugging. They just store a short string of
  * data in the FS
  */
@@ -2046,6 +2069,12 @@ static inline u32 btrfs_file_extent_inline_item_len(struct extent_buffer *eb,
 	return btrfs_item_size(eb, e) - offset;
 }
+/* btrfs_uuid_item */
+BTRFS_SETGET_FUNCS(uuid_type, struct btrfs_uuid_item, type, 16);
+BTRFS_SETGET_FUNCS(uuid_len, struct btrfs_uuid_item, len, 32);
+BTRFS_SETGET_STACK_FUNCS(stack_uuid_type, struct btrfs_uuid_item, type, 16);
+BTRFS_SETGET_STACK_FUNCS(stack_uuid_len, struct btrfs_uuid_item, len, 32);
+
 static inline u32 btrfs_level_size(struct btrfs_root *root, int level) {
 	if (level == 0)
 		return root->leafsize;
diff --git a/print-tree.c b/print-tree.c
index aae47a9..c49a189 100644
--- a/print-tree.c
+++ b/print-tree.c
@@ -509,6 +509,9 @@ static void print_key_type(u64 objectid, u8 type)
 	case BTRFS_DEV_STATS_KEY:
 		printf("DEV_STATS_ITEM");
 		break;
+	case BTRFS_UUID_KEY:
+		printf("BTRFS_UUID_KEY");
+		break;
 	default:
 		printf("UNKNOWN.%d", type);
 	};
@@ -516,15 +519,17 @@ static void print_key_type(u64 objectid, u8 type)
 static void print_objectid(u64 objectid, u8 type)
 {
-	if (type == BTRFS_DEV_EXTENT_KEY) {
+	switch (type) {
+	case BTRFS_DEV_EXTENT_KEY:
 		printf("%llu", (unsigned long long)objectid); /* device id */
 		return;
-	}
-	switch (type) {
 	case BTRFS_QGROUP_RELATION_KEY:
 		printf("%llu/%llu", objectid >> 48,
 			objectid & ((1ll << 48) - 1));
 		return;
+	case BTRFS_UUID_KEY:
+		printf("0x%016llx", (unsigned long long)objectid);
+		return;
 	}
 	switch (objectid) {
@@ -582,6 +587,9 @@ static void print_objectid(u64 objectid, u8 type)
 	case BTRFS_QUOTA_TREE_OBJECTID:
 		printf("QUOTA_TREE");
 		break;
+	case BTRFS_UUID_TREE_OBJECTID:
+		printf("UUID_TREE");
+		break;
 	case BTRFS_MULTIPLE_OBJECTIDS:
 		printf("MULTIPLE");
 		break;
@@ -616,6 +624,9 @@ void btrfs_print_key(struct btrfs_disk_key *disk_key)
 		printf(" %llu/%llu)", (unsigned long long)(offset >> 48),
 			(unsigned long long)(offset & ((1ll << 48) - 1)));
 		break;
+	case BTRFS_UUID_KEY:
+		printf(" 0x%016llx)", (unsigned long long)offset);
+		break;
 	default:
 		if (offset == (u64)-1)
 			printf(" -1)");
@@ -625,6 +636,78 @@ void btrfs_print_key(struct btrfs_disk_key *disk_key)
 	}
 }
+static void print_uuid_item(struct extent_buffer *l,
+			    struct btrfs_uuid_item *ptr,
+			    u32 item_size)
+{
+	do {
+		u16 sub_item_type;
+		u64 sub_item_len;
+		u64 subvol_id;
+		unsigned long offset;
+
+		if (item_size < sizeof(*ptr)) {
+			printf("btrfs: uuid item too short (%lu < %d)!\n",
+			       (unsigned long)item_size, (int)sizeof(*ptr));
+			return;
+		}
+		sub_item_type = btrfs_uuid_type(l, ptr);
+		sub_item_len = btrfs_uuid_len(l, ptr);
+		ptr++;
+		item_size -= sizeof(*ptr);
+		if (sub_item_len * sizeof(u64) > item_size) {
+			printf("btrfs: uuid item too short (%llu > %lu)!\n",
+			       (unsigned long long)(sub_item_len * sizeof(u64)),
+			       (unsigned long)item_size);
+
+			return;
+		}
+
+		offset = (unsigned long)ptr;
+		ptr = (struct btrfs_uuid_item *)
+			(((char *)ptr) + sub_item_len * sizeof(u64));
+		item_size -= sub_item_len * sizeof(u64);
+		switch (sub_item_type) {
+		case BTRFS_UUID_ITEM_TYPE_SUBVOL:
+			while (sub_item_len) {
+				read_extent_buffer(l, &subvol_id, offset,
+						   sizeof(u64));
+				printf("\t\tsubvol_id %llu\n",
+				       (unsigned long long)
+				       le64_to_cpu(subvol_id));
+				sub_item_len--;
+				offset += sizeof(u64);
+			}
+			break;
+		case BTRFS_UUID_ITEM_TYPE_RECEIVED_SUBVOL:
+			while (sub_item_len) {
+				read_extent_buffer(l, &subvol_id, offset,
+						   sizeof(u64));
+				printf("\t\treceived_subvol_id %llu\n",
+				       (unsigned long long)
+				       le64_to_cpu(subvol_id));
+				sub_item_len--;
+				offset += sizeof(u64);
+			}
+			break;
+		default:
+			printf("\t\tunknown type=%llu, len=8*%llu\n",
+			       (unsigned long long)sub_item_type,
+			       (unsigned long long)sub_item_len);
+			while (sub_item_len) {
+				read_extent_buffer(l, &subvol_id, offset,
+						   sizeof(u64));
+				printf("\t\tid %llu\n",
+				       (unsigned long long)
+				       le64_to_cpu(subvol_id));
+				sub_item_len--;
+				offset += sizeof(u64);
+			}
+			break;
+		}
+	} while (item_size);
+}
+
 void btrfs_print_leaf(struct btrfs_root *root, struct extent_buffer *l)
 {
 	int i;
@@ -645,6 +728,7 @@ void btrfs_print_leaf(struct btrfs_root *root, struct extent_buffer *l)
 	struct btrfs_qgroup_info_item *qg_info;
 	struct btrfs_qgroup_limit_item *qg_limit;
 	struct btrfs_qgroup_status_item *qg_status;
+	struct btrfs_uuid_item *uuid_item;
 	u32 nr = btrfs_header_nritems(l);
 	u64 objectid;
 	u32 type;
@@ -841,6 +925,11 @@ void btrfs_print_leaf(struct btrfs_root *root, struct extent_buffer *l)
 				(long long)
 				btrfs_qgroup_limit_rsv_exclusive(l, qg_limit));
 			break;
+		case BTRFS_UUID_KEY:
+			uuid_item = btrfs_item_ptr(l, i,
+						   struct btrfs_uuid_item);
+			print_uuid_item(l, uuid_item, btrfs_item_size_nr(l, i));
+			break;
 		case BTRFS_STRING_ITEM_KEY:
 			/* dirty, but it's simple */
 			str = l->data + btrfs_item_ptr_offset(l, i);
-- 
1.8.2.3
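As a reading aid for the item layout handled in this patch: a UUID item is a sequence of sub-items, each consisting of a 16-bit type, a 32-bit count, and that many little-endian 64-bit ids. The following is a minimal sketch of walking such a buffer in plain C. The helper names (load_le16, load_le32, load_le64, uuid_item_first_id) and the flat byte buffer are assumptions made for illustration only; the patch itself reads from an extent buffer through the btrfs accessors.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Header size of one sub-item: packed __le16 type + __le32 len (per the patch). */
#define UUID_SUB_HDR_SIZE 6u

static uint16_t load_le16(const uint8_t *p)
{
	return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t load_le32(const uint8_t *p)
{
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

static uint64_t load_le64(const uint8_t *p)
{
	uint64_t v = 0;
	int i;

	for (i = 7; i >= 0; i--)
		v = (v << 8) | p[i];
	return v;
}

/*
 * Hypothetical helper (not in the patch): return the first id stored for
 * the sub-item of the given type. Returns 0 on success, -1 if the type is
 * not present, has no ids, or the buffer is too short.
 */
static int uuid_item_first_id(const uint8_t *buf, size_t size, uint16_t type,
			      uint64_t *id)
{
	assert(buf != NULL && id != NULL);

	while (size > 0) {
		uint16_t sub_type;
		uint32_t sub_len;

		if (size < UUID_SUB_HDR_SIZE)
			return -1;	/* truncated header */
		sub_type = load_le16(buf);
		sub_len = load_le32(buf + 2);
		buf += UUID_SUB_HDR_SIZE;
		size -= UUID_SUB_HDR_SIZE;

		if ((uint64_t)sub_len * 8 > size)
			return -1;	/* truncated id list */
		if (sub_type == type) {
			if (sub_len == 0)
				return -1;
			*id = load_le64(buf);
			return 0;
		}
		/* skip this sub-item's ids and continue with the next one */
		buf += (size_t)sub_len * 8;
		size -= (size_t)sub_len * 8;
	}
	return -1;
}
```

The bounds checks mirror the two "uuid item too short" checks in print_uuid_item(); the sketch just returns an error instead of printing.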
Stefan Behrens
2013-May-16 14:45 UTC
[PATCH v3 2/4] Btrfs-progs: add UUID tree lookup methods
This commit adds UUID tree lookup methods that make use of the search ioctl. The code is based on the kernel code.

Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de>
---
 Makefile    |   5 +-
 ctree.h     |   5 ++
 uuid-tree.c | 174 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 182 insertions(+), 2 deletions(-)

diff --git a/Makefile b/Makefile
index 7d4faba..66eefb7 100644
--- a/Makefile
+++ b/Makefile
@@ -6,12 +6,13 @@ CFLAGS = -g -O1
 objects = ctree.o disk-io.o radix-tree.o extent-tree.o print-tree.o \
 	  root-tree.o dir-item.o file-item.o inode-item.o inode-map.o \
 	  extent-cache.o extent_io.o volumes.o utils.o repair.o \
-	  qgroup.o raid6.o free-space-cache.o
+	  qgroup.o raid6.o free-space-cache.o uuid-tree.o
 cmds_objects = cmds-subvolume.o cmds-filesystem.o cmds-device.o cmds-scrub.o \
 	       cmds-inspect.o cmds-balance.o cmds-send.o cmds-receive.o \
 	       cmds-quota.o cmds-qgroup.o cmds-replace.o cmds-check.o \
 	       cmds-restore.o
-libbtrfs_objects = send-stream.o send-utils.o rbtree.o btrfs-list.o crc32c.o
+libbtrfs_objects = send-stream.o send-utils.o rbtree.o btrfs-list.o crc32c.o \
+		   uuid-tree.o
 libbtrfs_headers = send-stream.h send-utils.h send.h rbtree.h btrfs-list.h \
 	       crc32c.h list.h kerncompat.h radix-tree.h extent-cache.h \
 	       extent_io.h ioctl.h ctree.h
diff --git a/ctree.h b/ctree.h
index 0f14b41..e36cb17 100644
--- a/ctree.h
+++ b/ctree.h
@@ -2355,4 +2355,9 @@ struct btrfs_csum_item *btrfs_lookup_csum(struct btrfs_trans_handle *trans,
 int btrfs_csum_truncate(struct btrfs_trans_handle *trans,
 			struct btrfs_root *root, struct btrfs_path *path,
 			u64 isize);
+
+/* uuid-tree.c */
+int btrfs_lookup_uuid_subvol_item(int fd, const u8 *uuid, u64 *subvol_id);
+int btrfs_lookup_uuid_received_subvol_item(int fd, const u8 *uuid,
+					   u64 *subvol_id);
 #endif
diff --git a/uuid-tree.c b/uuid-tree.c
new file mode 100644
index 0000000..fcd270e
--- /dev/null
+++ b/uuid-tree.c
@@ -0,0 +1,174 @@
+/*
+ * Copyright (C) STRATO AG 2013. All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public
+ * License v2 as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public
+ * License along with this program; if not, write to the
+ * Free Software Foundation, Inc., 59 Temple Place - Suite 330,
+ * Boston, MA 021110-1307, USA.
+ */
+#include <stdio.h>
+#include <stdlib.h>
+#include <uuid/uuid.h>
+#include <sys/ioctl.h>
+#include "ctree.h"
+#include "transaction.h"
+#include "disk-io.h"
+#include "print-tree.h"
+
+
+/*
+ * One key is used to store a sequence of btrfs_uuid_item items.
+ * Each item in the sequence contains a type information and a sequence of
+ * ids (together with the information about the size of the sequence of ids).
+ * {[btrfs_uuid_item type0 {id0, id1, ..., idN}],
+ * ...,
+ * [btrfs_uuid_item typeZ {id0, id1, ..., idN}]}
+ *
+ * It is forbidden to put multiple items with the same type under the same key.
+ * Instead the sequence of ids is extended and used to store any additional
+ * ids for the same item type.
+ */
+
+static void btrfs_uuid_to_key(const u8 *uuid, u64 *key_objectid,
+			      u64 *key_offset)
+{
+	*key_objectid = get_unaligned_le64(uuid);
+	*key_offset = get_unaligned_le64(uuid + sizeof(u64));
+}
+
+static struct btrfs_uuid_item *btrfs_match_uuid_item_type(
+	struct btrfs_uuid_item *ptr, u32 item_size, u16 type)
+{
+	do {
+		u16 sub_item_type;
+		u64 sub_item_len;
+
+		if (item_size < sizeof(*ptr)) {
+			fprintf(stderr,
+				"btrfs: uuid item too short (%lu < %d)!\n",
+				(unsigned long)item_size, (int)sizeof(*ptr));
+			return NULL;
+		}
+		item_size -= sizeof(*ptr);
+		sub_item_type = btrfs_stack_uuid_type(ptr);
+		sub_item_len = btrfs_stack_uuid_len(ptr);
+		if (sub_item_len * sizeof(u64) > item_size) {
+			fprintf(stderr,
+				"btrfs: uuid item too short (%llu > %lu)!\n",
+				(unsigned long long)
+				(sub_item_len * sizeof(u64)),
+				(unsigned long)item_size);
+			return NULL;
+		}
+		if (sub_item_type == type)
+			return ptr;
+		item_size -= sub_item_len * sizeof(u64);
+		ptr = 1 + (struct btrfs_uuid_item *)
+			(((char *)ptr) + (sub_item_len * sizeof(u64)));
+	} while (item_size);
+
+	return NULL;
+}
+
+static int btrfs_uuid_tree_lookup_prepare(int fd, const u8 *uuid, u16 type,
+					  struct btrfs_ioctl_search_args
+					  *search_arg,
+					  struct btrfs_uuid_item **ptr)
+{
+	int ret;
+	u64 key_objectid = 0;
+	u64 key_offset;
+	struct btrfs_uuid_item *uuid_item;
+	struct btrfs_ioctl_search_header *search_header;
+	u32 item_size;
+
+	btrfs_uuid_to_key(uuid, &key_objectid, &key_offset);
+
+	memset(search_arg, 0, sizeof(*search_arg));
+	search_arg->key.tree_id = BTRFS_UUID_TREE_OBJECTID;
+	search_arg->key.min_objectid = key_objectid;
+	search_arg->key.max_objectid = key_objectid;
+	search_arg->key.min_type = BTRFS_UUID_KEY;
+	search_arg->key.max_type = BTRFS_UUID_KEY;
+	search_arg->key.min_offset = key_offset;
+	search_arg->key.max_offset = key_offset;
+	search_arg->key.max_transid = (u64)-1;
+	search_arg->key.nr_items = 1;
+	ret = ioctl(fd, BTRFS_IOC_TREE_SEARCH, search_arg);
+	if (ret < 0) {
+		fprintf(stderr,
+			"ioctl(BTRFS_IOC_TREE_SEARCH, uuid, key %016llx, UUID_KEY, %016llx) ret=%d, error: %s\n",
+			(unsigned long long)key_objectid,
+			(unsigned long long)key_offset, ret, strerror(errno));
+		ret = -ENOENT;
+		goto out;
+	}
+
+	if (search_arg->key.nr_items < 1) {
+		ret = -ENOENT;
+		goto out;
+	}
+	search_header = (struct btrfs_ioctl_search_header *)(search_arg->buf);
+	uuid_item = (struct btrfs_uuid_item *)(search_header + 1);
+	item_size = search_header->len;
+
+	*ptr = btrfs_match_uuid_item_type(uuid_item, item_size, type);
+	if (!*ptr) {
+		ret = -ENOENT;
+		goto out;
+	}
+
+	ret = 0;
+
+out:
+	return ret;
+}
+
+/* return -ENOENT for !found, < 0 for errors, or 0 if an item was found */
+static int btrfs_uuid_tree_lookup_any(int fd, const u8 *uuid, u16 type,
+				      u64 *subid)
+{
+	int ret;
+	struct btrfs_ioctl_search_args search_arg;
+	struct btrfs_uuid_item *ptr;
+	u32 sub_item_len;
+
+	ret = btrfs_uuid_tree_lookup_prepare(fd, uuid, type, &search_arg, &ptr);
+	if (ret)
+		goto out;
+
+	sub_item_len = btrfs_stack_uuid_len(ptr);
+	if (sub_item_len > 0) {
+		/* return first stored id */
+		memcpy(subid, ptr + 1, sizeof(*subid));
+		*subid = le64_to_cpu(*subid);
+	} else {
+		ret = -ENOENT;
+	}
+
+out:
+	return ret;
+}
+
+int btrfs_lookup_uuid_subvol_item(int fd, const u8 *uuid, u64 *subvol_id)
+{
+	return btrfs_uuid_tree_lookup_any(fd, uuid, BTRFS_UUID_ITEM_TYPE_SUBVOL,
+					  subvol_id);
+}
+
+int btrfs_lookup_uuid_received_subvol_item(int fd, const u8 *uuid,
+					   u64 *subvol_id)
+{
+	return btrfs_uuid_tree_lookup_any(fd, uuid,
+					  BTRFS_UUID_ITEM_TYPE_RECEIVED_SUBVOL,
+					  subvol_id);
+}
-- 
1.8.2.3
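To illustrate how btrfs_uuid_to_key() in this patch derives the search key: the first eight bytes of the 16-byte UUID are read as a little-endian 64-bit objectid and the last eight bytes as the key offset. The following is a minimal standalone sketch of that split. The names struct uuid_key, load_le64 and uuid_to_key are assumptions for illustration; the patch uses get_unaligned_le64() and fills struct btrfs_ioctl_search_key directly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define UUID_SIZE 16	/* mirrors BTRFS_UUID_SIZE, which the patch requires to be 16 */

/* Hypothetical container for the two 64-bit halves of the search key. */
struct uuid_key {
	uint64_t objectid;	/* from UUID bytes 0..7 */
	uint64_t offset;	/* from UUID bytes 8..15 */
};

/* Portable little-endian load; works for unaligned input on any host. */
static uint64_t load_le64(const uint8_t *p)
{
	uint64_t v = 0;
	int i;

	for (i = 7; i >= 0; i--)
		v = (v << 8) | p[i];
	return v;
}

/* Split a UUID into (objectid, offset); the key type would be BTRFS_UUID_KEY. */
static struct uuid_key uuid_to_key(const uint8_t uuid[UUID_SIZE])
{
	struct uuid_key key;

	assert(uuid != NULL);
	key.objectid = load_le64(uuid);
	key.offset = load_le64(uuid + 8);
	return key;
}
```

Because the whole UUID is encoded in the key, the lookup can set the minimum and maximum of objectid, type and offset to the same values and request a single item, which is what btrfs_uuid_tree_lookup_prepare() does.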
Stefan Behrens
2013-May-16 14:45 UTC
[PATCH v3 3/4] Btrfs-progs: use UUID tree for send/receive
This commit changes the btrfs send/receive commands to use the UUID tree to map UUIDs to subvolumes, and to use the root tree to map subvolume IDs to paths. With this change the tools start quickly, and their startup time is independent of the number of subvolumes/snapshots that exist.

Before this commit, mapping UUIDs to subvolume IDs was an expensive operation. The algorithm was quadratic in the number of existing subvolumes. For example, with 15,000 subvolumes it took much more than 5 minutes on a state-of-the-art Xeon CPU before btrfs send or receive was able to send or receive the first byte. Even linear effort instead of the current quadratic effort would be wasteful, because the data structures that map UUIDs to subvolume IDs were rebuilt every time a btrfs send/receive instance was started.

It is much more efficient to maintain a searchable persistent data structure in the filesystem, one that is updated whenever a subvolume/snapshot is created or deleted, and when the received subvolume UUID is set by the btrfs-receive tool. Therefore kernel code was added that maintains data structures in the filesystem which make it possible to quickly search for a given UUID and to retrieve the data assigned to it, such as the subvolume ID related to this UUID.
Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de> --- cmds-receive.c | 23 ++- cmds-send.c | 53 +++++-- send-utils.c | 477 +++++++++++++++++++++++---------------------------------- send-utils.h | 9 +- 4 files changed, 253 insertions(+), 309 deletions(-) diff --git a/cmds-receive.c b/cmds-receive.c index c2fa2e1..4e480f9 100644 --- a/cmds-receive.c +++ b/cmds-receive.c @@ -31,6 +31,7 @@ #include <math.h> #include <ftw.h> #include <wait.h> +#include <assert.h> #include <sys/stat.h> #include <sys/types.h> @@ -129,14 +130,14 @@ static int finish_subvol(struct btrfs_receive *r) goto out; } - ret = btrfs_list_get_path_rootid(subvol_fd, &r->cur_subvol->root_id); - if (ret < 0) - goto out; - subvol_uuid_search_add(&r->sus, r->cur_subvol); - r->cur_subvol = NULL; ret = 0; out: + if (r->cur_subvol) { + free(r->cur_subvol->path); + free(r->cur_subvol); + r->cur_subvol = NULL; + } if (subvol_fd != -1) close(subvol_fd); return ret; @@ -197,7 +198,7 @@ static int process_snapshot(const char *path, const u8 *uuid, u64 ctransid, struct btrfs_receive *r = user; char uuid_str[128]; struct btrfs_ioctl_vol_args_v2 args_v2; - struct subvol_info *parent_subvol; + struct subvol_info *parent_subvol = NULL; ret = finish_subvol(r); if (ret < 0) @@ -268,6 +269,10 @@ static int process_snapshot(const char *path, const u8 *uuid, u64 ctransid, } out: + if (parent_subvol) { + free(parent_subvol->path); + free(parent_subvol); + } return ret; } @@ -557,7 +562,7 @@ static int process_clone(const char *path, u64 offset, u64 len, const char *clone_path, u64 clone_offset, void *user) { - int ret = 0; + int ret; struct btrfs_receive *r = user; struct btrfs_ioctl_clone_range_args clone_args; struct subvol_info *si = NULL; @@ -624,6 +629,10 @@ static int process_clone(const char *path, u64 offset, u64 len, } out: + if (si) { + free(si->path); + free(si); + } free(full_path); free(full_clone_path); free(subvol_path); diff --git a/cmds-send.c b/cmds-send.c index 72804a9..7209aba 100644 --- 
a/cmds-send.c +++ b/cmds-send.c @@ -31,6 +31,7 @@ #include <sys/ioctl.h> #include <libgen.h> #include <mntent.h> +#include <assert.h> #include <uuid/uuid.h> @@ -106,30 +107,33 @@ static int get_root_id(struct btrfs_send *s, const char *path, u64 *root_id) if (!si) return -ENOENT; *root_id = si->root_id; + free(si->path); + free(si); return 0; } static struct subvol_info *get_parent(struct btrfs_send *s, u64 root_id) { + struct subvol_info *si_tmp; struct subvol_info *si; - si = subvol_uuid_search(&s->sus, root_id, NULL, 0, NULL, + si_tmp = subvol_uuid_search(&s->sus, root_id, NULL, 0, NULL, subvol_search_by_root_id); - if (!si) + if (!si_tmp) return NULL; - si = subvol_uuid_search(&s->sus, 0, si->parent_uuid, 0, NULL, + si = subvol_uuid_search(&s->sus, 0, si_tmp->parent_uuid, 0, NULL, subvol_search_by_uuid); - if (!si) - return NULL; + free(si_tmp->path); + free(si_tmp); return si; } static int find_good_parent(struct btrfs_send *s, u64 root_id, u64 *found) { int ret; - struct subvol_info *parent; - struct subvol_info *parent2; + struct subvol_info *parent = NULL; + struct subvol_info *parent2 = NULL; struct subvol_info *best_parent = NULL; __s64 tmp; u64 best_diff = (u64)-1; @@ -144,24 +148,43 @@ static int find_good_parent(struct btrfs_send *s, u64 root_id, u64 *found) for (i = 0; i < s->clone_sources_count; i++) { if (s->clone_sources[i] == parent->root_id) { best_parent = parent; + parent = NULL; goto out_found; } } for (i = 0; i < s->clone_sources_count; i++) { parent2 = get_parent(s, s->clone_sources[i]); - if (parent2 != parent) + if (!parent2) continue; + if (parent2->root_id != parent->root_id) { + free(parent2->path); + free(parent2); + parent2 = NULL; + continue; + } + free(parent2->path); + free(parent2); parent2 = subvol_uuid_search(&s->sus, s->clone_sources[i], NULL, 0, NULL, subvol_search_by_root_id); + assert(parent2); tmp = parent2->ctransid - parent->ctransid; if (tmp < 0) tmp *= -1; if (tmp < best_diff) { + if (best_parent) { + 
free(best_parent->path); + free(best_parent); + } best_parent = parent2; + parent2 = NULL; best_diff = tmp; + } else { + free(parent2->path); + free(parent2); + parent2 = NULL; } } @@ -175,6 +198,14 @@ out_found: ret = 0; out: + if (parent) { + free(parent->path); + free(parent); + } + if (best_parent) { + free(best_parent->path); + free(best_parent); + } return ret; } @@ -320,7 +351,7 @@ static int do_send(struct btrfs_send *send, u64 root_id, u64 parent_root_id, fprintf(stderr, "joining genl thread\n"); close(pipefd[1]); - pipefd[1] = 0; + pipefd[1] = -1; ret = pthread_join(t_read, &t_err); if (ret) { @@ -347,6 +378,10 @@ out: close(pipefd[0]); if (pipefd[1] != -1) close(pipefd[1]); + if (si) { + free(si->path); + free(si); + } return ret; } diff --git a/send-utils.c b/send-utils.c index bacd47e..874f8a5 100644 --- a/send-utils.c +++ b/send-utils.c @@ -16,7 +16,10 @@ * Boston, MA 021110-1307, USA. */ +#include <unistd.h> +#include <fcntl.h> #include <sys/ioctl.h> +#include <uuid/uuid.h> #include "ctree.h" #include "send-utils.h" @@ -26,6 +29,136 @@ static int btrfs_subvolid_resolve_sub(int fd, char *path, size_t *path_len, u64 subvol_id); +static int btrfs_get_root_id_by_sub_path(int mnt_fd, const char *sub_path, + u64 *root_id) +{ + int ret; + int subvol_fd; + + subvol_fd = openat(mnt_fd, sub_path, O_RDONLY); + if (subvol_fd < 0) { + ret = -errno; + fprintf(stderr, "ERROR: open %s failed. 
%s\n", sub_path, + strerror(-ret)); + return ret; + } + + ret = btrfs_list_get_path_rootid(subvol_fd, root_id); + close(subvol_fd); + return ret; +} + +static int btrfs_read_root_item_raw(int mnt_fd, u64 root_id, size_t buf_len, + u32 *read_len, void *buf) +{ + int ret; + struct btrfs_ioctl_search_args args; + struct btrfs_ioctl_search_key *sk = &args.key; + struct btrfs_ioctl_search_header *sh; + unsigned long off = 0; + int found = 0; + int i; + + *read_len = 0; + memset(&args, 0, sizeof(args)); + + sk->tree_id = BTRFS_ROOT_TREE_OBJECTID; + + /* + * there may be more than one ROOT_ITEM key if there are + * snapshots pending deletion, we have to loop through + * them. + */ + sk->min_objectid = root_id; + sk->max_objectid = root_id; + sk->max_type = BTRFS_ROOT_ITEM_KEY; + sk->min_type = BTRFS_ROOT_ITEM_KEY; + sk->max_offset = (u64)-1; + sk->max_transid = (u64)-1; + sk->nr_items = 4096; + + while (1) { + ret = ioctl(mnt_fd, BTRFS_IOC_TREE_SEARCH, &args); + if (ret < 0) { + fprintf(stderr, + "ERROR: can''t perform the search - %s\n", + strerror(errno)); + return 0; + } + /* the ioctl returns the number of item it found in nr_items */ + if (sk->nr_items == 0) + break; + + off = 0; + for (i = 0; i < sk->nr_items; i++) { + struct btrfs_root_item *item; + sh = (struct btrfs_ioctl_search_header *)(args.buf + + off); + + off += sizeof(*sh); + item = (struct btrfs_root_item *)(args.buf + off); + off += sh->len; + + sk->min_objectid = sh->objectid; + sk->min_type = sh->type; + sk->min_offset = sh->offset; + + if (sh->objectid > root_id) + break; + + if (sh->objectid == root_id && + sh->type == BTRFS_ROOT_ITEM_KEY) { + if (sh->len > buf_len) { + /* btrfs-progs is too old for kernel */ + fprintf(stderr, + "ERROR: buf for read_root_item_raw() is too small, get newer btrfs tools!\n"); + return -EOVERFLOW; + } + memcpy(buf, item, sh->len); + *read_len = sh->len; + found = 1; + } + } + if (sk->min_offset < (u64)-1) + sk->min_offset++; + else + break; + + if (sk->min_type != 
BTRFS_ROOT_ITEM_KEY || + sk->min_objectid != root_id) + break; + } + + return found ? 0 : -ENOENT; +} + +/* + * Read a root item from the tree. In case we detect a root item smaller then + * sizeof(root_item), we know it''s an old version of the root structure and + * initialize all new fields to zero. The same happens if we detect mismatching + * generation numbers as then we know the root was once mounted with an older + * kernel that was not aware of the root item structure change. + */ +static int btrfs_read_root_item(int mnt_fd, u64 root_id, + struct btrfs_root_item *item) +{ + int ret; + u32 read_len; + + ret = btrfs_read_root_item_raw(mnt_fd, root_id, sizeof(*item), + &read_len, item); + if (ret) + return ret; + + if (read_len < sizeof(*item) || + btrfs_root_generation(item) != btrfs_root_generation_v2(item)) + memset(&item->generation_v2, 0, + sizeof(*item) - offsetof(struct btrfs_root_item, + generation_v2)); + + return 0; +} + int btrfs_subvolid_resolve(int fd, char *path, size_t path_len, u64 subvol_id) { if (path_len < 1) @@ -122,142 +255,13 @@ static int btrfs_subvolid_resolve_sub(int fd, char *path, size_t *path_len, return 0; } -static struct rb_node *tree_insert(struct rb_root *root, - struct subvol_info *si, - enum subvol_search_type type) -{ - struct rb_node ** p = &root->rb_node; - struct rb_node * parent = NULL; - struct subvol_info *entry; - __s64 comp; - - while(*p) { - parent = *p; - if (type == subvol_search_by_received_uuid) { - entry = rb_entry(parent, struct subvol_info, - rb_received_node); - - comp = memcmp(entry->received_uuid, si->received_uuid, - BTRFS_UUID_SIZE); - if (!comp) { - if (entry->stransid < si->stransid) - comp = -1; - else if (entry->stransid > si->stransid) - comp = 1; - else - comp = 0; - } - } else if (type == subvol_search_by_uuid) { - entry = rb_entry(parent, struct subvol_info, - rb_local_node); - comp = memcmp(entry->uuid, si->uuid, BTRFS_UUID_SIZE); - } else if (type == subvol_search_by_root_id) { - entry = 
rb_entry(parent, struct subvol_info, - rb_root_id_node); - comp = entry->root_id - si->root_id; - } else if (type == subvol_search_by_path) { - entry = rb_entry(parent, struct subvol_info, - rb_path_node); - comp = strcmp(entry->path, si->path); - } else { - BUG(); - } - - if (comp < 0) - p = &(*p)->rb_left; - else if (comp > 0) - p = &(*p)->rb_right; - else - return parent; - } - - if (type == subvol_search_by_received_uuid) { - rb_link_node(&si->rb_received_node, parent, p); - rb_insert_color(&si->rb_received_node, root); - } else if (type == subvol_search_by_uuid) { - rb_link_node(&si->rb_local_node, parent, p); - rb_insert_color(&si->rb_local_node, root); - } else if (type == subvol_search_by_root_id) { - rb_link_node(&si->rb_root_id_node, parent, p); - rb_insert_color(&si->rb_root_id_node, root); - } else if (type == subvol_search_by_path) { - rb_link_node(&si->rb_path_node, parent, p); - rb_insert_color(&si->rb_path_node, root); - } - return NULL; -} - -static struct subvol_info *tree_search(struct rb_root *root, - u64 root_id, const u8 *uuid, - u64 stransid, const char *path, - enum subvol_search_type type) -{ - struct rb_node * n = root->rb_node; - struct subvol_info *entry; - __s64 comp; - - while(n) { - if (type == subvol_search_by_received_uuid) { - entry = rb_entry(n, struct subvol_info, - rb_received_node); - comp = memcmp(entry->received_uuid, uuid, - BTRFS_UUID_SIZE); - if (!comp) { - if (entry->stransid < stransid) - comp = -1; - else if (entry->stransid > stransid) - comp = 1; - else - comp = 0; - } - } else if (type == subvol_search_by_uuid) { - entry = rb_entry(n, struct subvol_info, rb_local_node); - comp = memcmp(entry->uuid, uuid, BTRFS_UUID_SIZE); - } else if (type == subvol_search_by_root_id) { - entry = rb_entry(n, struct subvol_info, rb_root_id_node); - comp = entry->root_id - root_id; - } else if (type == subvol_search_by_path) { - entry = rb_entry(n, struct subvol_info, rb_path_node); - comp = strcmp(entry->path, path); - } else { - 
BUG();
-		}
-		if (comp < 0)
-			n = n->rb_left;
-		else if (comp > 0)
-			n = n->rb_right;
-		else
-			return entry;
-	}
-	return NULL;
-}
-
-static int count_bytes(void *buf, int len, char b)
-{
-	int cnt = 0;
-	int i;
-	for (i = 0; i < len; i++) {
-		if (((char*)buf)[i] == b)
-			cnt++;
-	}
-	return cnt;
-}
-
 void subvol_uuid_search_add(struct subvol_uuid_search *s,
 			    struct subvol_info *si)
 {
-	int cnt;
-
-	tree_insert(&s->root_id_subvols, si, subvol_search_by_root_id);
-	tree_insert(&s->path_subvols, si, subvol_search_by_path);
-
-	cnt = count_bytes(si->uuid, BTRFS_UUID_SIZE, 0);
-	if (cnt != BTRFS_UUID_SIZE)
-		tree_insert(&s->local_subvols, si, subvol_search_by_uuid);
-	cnt = count_bytes(si->received_uuid, BTRFS_UUID_SIZE, 0);
-	if (cnt != BTRFS_UUID_SIZE)
-		tree_insert(&s->received_subvols, si,
-				subvol_search_by_received_uuid);
+	if (si) {
+		free(si->path);
+		free(si);
+	}
 }
 
 struct subvol_info *subvol_uuid_search(struct subvol_uuid_search *s,
@@ -265,166 +269,71 @@ struct subvol_info *subvol_uuid_search(struct subvol_uuid_search *s,
 				       const char *path,
 				       enum subvol_search_type type)
 {
-	struct rb_root *root;
-	if (type == subvol_search_by_received_uuid)
-		root = &s->received_subvols;
-	else if (type == subvol_search_by_uuid)
-		root = &s->local_subvols;
-	else if (type == subvol_search_by_root_id)
-		root = &s->root_id_subvols;
-	else if (type == subvol_search_by_path)
-		root = &s->path_subvols;
-	else
-		return NULL;
-	return tree_search(root, root_id, uuid, transid, path, type);
-}
-
-int subvol_uuid_search_init(int mnt_fd, struct subvol_uuid_search *s)
-{
-	int ret;
-	struct btrfs_ioctl_search_args args;
-	struct btrfs_ioctl_search_key *sk = &args.key;
-	struct btrfs_ioctl_search_header *sh;
-	struct btrfs_root_item *root_item_ptr;
+	int ret = 0;
 	struct btrfs_root_item root_item;
-	struct subvol_info *si = NULL;
-	int root_item_valid = 0;
-	unsigned long off = 0;
-	int i;
-	int e;
-	char *path;
-
-	memset(&args, 0, sizeof(args));
-
-	sk->tree_id = BTRFS_ROOT_TREE_OBJECTID;
-
-	sk->max_objectid = (u64)-1;
-	sk->max_offset = (u64)-1;
-	sk->max_transid = (u64)-1;
-	sk->min_type = BTRFS_ROOT_ITEM_KEY;
-	sk->max_type = BTRFS_ROOT_BACKREF_KEY;
-	sk->nr_items = 4096;
-
-	while(1) {
-		ret = ioctl(mnt_fd, BTRFS_IOC_TREE_SEARCH, &args);
-		e = errno;
-		if (ret < 0) {
-			fprintf(stderr, "ERROR: can't perform the search- %s\n",
-				strerror(e));
-			return ret;
-		}
-		if (sk->nr_items == 0)
-			break;
-
-		off = 0;
-
-		for (i = 0; i < sk->nr_items; i++) {
-			sh = (struct btrfs_ioctl_search_header *)(args.buf +
-								  off);
-			off += sizeof(*sh);
-
-			if ((sh->objectid != 5 &&
-			    sh->objectid < BTRFS_FIRST_FREE_OBJECTID) ||
-			    sh->objectid > BTRFS_LAST_FREE_OBJECTID)
-				goto skip;
-
-			if (sh->type == BTRFS_ROOT_ITEM_KEY) {
-				/* older kernels don't have uuids+times */
-				if (sh->len < sizeof(root_item)) {
-					root_item_valid = 0;
-					goto skip;
-				}
-				root_item_ptr = (struct btrfs_root_item *)
-						(args.buf + off);
-				memcpy(&root_item, root_item_ptr,
-						sizeof(root_item));
-				root_item_valid = 1;
-			} else if (sh->type == BTRFS_ROOT_BACKREF_KEY ||
-				   root_item_valid) {
-				if (!root_item_valid)
-					goto skip;
-
-				path = btrfs_list_path_for_root(mnt_fd,
-								sh->objectid);
-				if (!path)
-					path = strdup("");
-				if (IS_ERR(path)) {
-					ret = PTR_ERR(path);
-					fprintf(stderr, "ERROR: unable to "
-							"resolve path "
-							"for root %llu\n",
-							sh->objectid);
-					goto out;
-				}
-
-				si = calloc(1, sizeof(*si));
-				si->root_id = sh->objectid;
-				memcpy(si->uuid, root_item.uuid,
-						BTRFS_UUID_SIZE);
-				memcpy(si->parent_uuid, root_item.parent_uuid,
-						BTRFS_UUID_SIZE);
-				memcpy(si->received_uuid,
-						root_item.received_uuid,
-						BTRFS_UUID_SIZE);
-				si->ctransid = btrfs_root_ctransid(&root_item);
-				si->otransid = btrfs_root_otransid(&root_item);
-				si->stransid = btrfs_root_stransid(&root_item);
-				si->rtransid = btrfs_root_rtransid(&root_item);
-				si->path = path;
-				subvol_uuid_search_add(s, si);
-				root_item_valid = 0;
-			} else {
-				goto skip;
-			}
-
-skip:
-			off += sh->len;
+	struct subvol_info *info = NULL;
+
+	switch (type) {
+	case subvol_search_by_received_uuid:
+		ret = btrfs_lookup_uuid_received_subvol_item(s->mnt_fd, uuid,
+							     &root_id);
+		break;
+	case subvol_search_by_uuid:
+		ret = btrfs_lookup_uuid_subvol_item(s->mnt_fd, uuid, &root_id);
+		break;
+	case subvol_search_by_root_id:
+		break;
+	case subvol_search_by_path:
+		ret = btrfs_get_root_id_by_sub_path(s->mnt_fd, path, &root_id);
+		break;
+	default:
+		ret = -EINVAL;
+		break;
+	}
 
-			/*
-			 * record the mins in sk so we can make sure the
-			 * next search doesn't repeat this root
-			 */
-			sk->min_objectid = sh->objectid;
-			sk->min_offset = sh->offset;
-			sk->min_type = sh->type;
-		}
-		sk->nr_items = 4096;
-		if (sk->min_offset < (u64)-1)
-			sk->min_offset++;
-		else if (sk->min_objectid < (u64)-1) {
-			sk->min_objectid++;
-			sk->min_offset = 0;
-			sk->min_type = 0;
-		} else
-			break;
+	if (ret)
+		goto out;
+
+	ret = btrfs_read_root_item(s->mnt_fd, root_id, &root_item);
+	if (ret)
+		goto out;
+
+	info = calloc(1, sizeof(*info));
+	info->root_id = root_id;
+	memcpy(info->uuid, root_item.uuid, BTRFS_UUID_SIZE);
+	memcpy(info->received_uuid, root_item.received_uuid, BTRFS_UUID_SIZE);
+	memcpy(info->parent_uuid, root_item.parent_uuid, BTRFS_UUID_SIZE);
+	info->ctransid = btrfs_root_ctransid(&root_item);
+	info->otransid = btrfs_root_otransid(&root_item);
+	info->stransid = btrfs_root_stransid(&root_item);
+	info->rtransid = btrfs_root_rtransid(&root_item);
+	if (type == subvol_search_by_path) {
+		info->path = strdup(path);
+	} else {
+		info->path = malloc(BTRFS_PATH_NAME_MAX);
+		ret = btrfs_subvolid_resolve(s->mnt_fd, info->path,
+					     BTRFS_PATH_NAME_MAX, root_id);
 	}
 
 out:
-	return ret;
+	if (ret && info) {
+		free(info->path);
+		free(info);
+		info = NULL;
+	}
+
+	return info;
 }
 
-/*
- * It's safe to call this function even without the subvol_uuid_search_init()
- * call before as long as the subvol_uuid_search structure is all-zero.
- */
-void subvol_uuid_search_finit(struct subvol_uuid_search *s)
+int subvol_uuid_search_init(int mnt_fd, struct subvol_uuid_search *s)
 {
-	struct rb_root *root = &s->root_id_subvols;
-	struct rb_node *node;
-
-	while ((node = rb_first(root))) {
-		struct subvol_info *entry =
-			rb_entry(node, struct subvol_info, rb_root_id_node);
+	s->mnt_fd = mnt_fd;
 
-		free(entry->path);
-		rb_erase(node, root);
-		free(entry);
-	}
+	return 0;
+}
 
-	s->root_id_subvols = RB_ROOT;
-	s->local_subvols = RB_ROOT;
-	s->received_subvols = RB_ROOT;
-	s->path_subvols = RB_ROOT;
+void subvol_uuid_search_finit(struct subvol_uuid_search *s)
+{
 }
 
 char *path_cat(const char *p1, const char *p2)
@@ -441,7 +350,6 @@ char *path_cat(const char *p1, const char *p2)
 	return new;
 }
 
-
 char *path_cat3(const char *p1, const char *p2, const char *p3)
 {
 	int p1_len = strlen(p1);
@@ -458,4 +366,3 @@ char *path_cat3(const char *p1, const char *p2, const char *p3)
 	sprintf(new, "%.*s/%.*s/%.*s", p1_len, p1, p2_len, p2, p3_len, p3);
 	return new;
 }
-
diff --git a/send-utils.h b/send-utils.h
index 06af75f..ed1a40e 100644
--- a/send-utils.h
+++ b/send-utils.h
@@ -38,10 +38,6 @@ enum subvol_search_type {
 };
 
 struct subvol_info {
-	struct rb_node rb_root_id_node;
-	struct rb_node rb_local_node;
-	struct rb_node rb_received_node;
-	struct rb_node rb_path_node;
 	u64 root_id;
 	u8 uuid[BTRFS_UUID_SIZE];
 	u8 parent_uuid[BTRFS_UUID_SIZE];
@@ -55,10 +51,7 @@ struct subvol_info {
 };
 
 struct subvol_uuid_search {
-	struct rb_root root_id_subvols;
-	struct rb_root local_subvols;
-	struct rb_root received_subvols;
-	struct rb_root path_subvols;
+	int mnt_fd;
 };
 
 int subvol_uuid_search_init(int mnt_fd, struct subvol_uuid_search *s);
-- 
1.8.2.3

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
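The shape of the new subvol_uuid_search() in the patch above (resolve the search key to a root ID, then build one result on demand instead of consulting a prebuilt rbtree) can be sketched in isolation. This is only an illustration under stated assumptions: every toy_* name is hypothetical, and the small in-memory table stands in for the UUID tree and root tree lookups that the real code performs through ioctls. The allocation checks are an addition of this sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define TOY_UUID_SIZE 16

enum toy_search_type { toy_by_uuid, toy_by_root_id };

struct toy_info {
	uint64_t root_id;
	uint8_t uuid[TOY_UUID_SIZE];
	char *path;
};

/* Hypothetical stand-in for the filesystem: one row per subvolume. */
struct toy_row {
	uint64_t root_id;
	uint8_t uuid[TOY_UUID_SIZE];
	const char *path;
};

static const struct toy_row toy_fs[] = {
	{ 256, { 0xaa }, "snapshots/a" },
	{ 257, { 0xbb }, "snapshots/b" },
};

#define TOY_NR_ROWS (sizeof(toy_fs) / sizeof(toy_fs[0]))

/* Stand-in for the UUID tree: map a UUID to a root ID. */
static int toy_lookup_uuid(const uint8_t *uuid, uint64_t *root_id)
{
	size_t i;

	assert(uuid && root_id);
	for (i = 0; i < TOY_NR_ROWS; i++) {
		if (!memcmp(toy_fs[i].uuid, uuid, TOY_UUID_SIZE)) {
			*root_id = toy_fs[i].root_id;
			return 0;
		}
	}
	return -ENOENT;
}

/* Stand-in for reading the root item and resolving the path. */
static const struct toy_row *toy_row_by_root_id(uint64_t root_id)
{
	size_t i;

	for (i = 0; i < TOY_NR_ROWS; i++)
		if (toy_fs[i].root_id == root_id)
			return &toy_fs[i];
	return NULL;
}

/* Step 1: turn the key into a root ID. Step 2: build a single result. */
static struct toy_info *toy_search(uint64_t root_id, const uint8_t *uuid,
				   enum toy_search_type type)
{
	const struct toy_row *row;
	struct toy_info *info;
	size_t n;

	switch (type) {
	case toy_by_uuid:
		if (toy_lookup_uuid(uuid, &root_id))
			return NULL;
		break;
	case toy_by_root_id:
		break;
	default:
		return NULL;
	}

	row = toy_row_by_root_id(root_id);
	if (!row)
		return NULL;

	info = calloc(1, sizeof(*info));
	if (!info)
		return NULL;
	info->root_id = row->root_id;
	memcpy(info->uuid, row->uuid, TOY_UUID_SIZE);

	n = strlen(row->path) + 1;
	info->path = malloc(n);
	if (!info->path) {
		free(info);
		return NULL;
	}
	memcpy(info->path, row->path, n);
	return info;
}

/* The caller owns each result and releases it when done. */
static void toy_info_free(struct toy_info *info)
{
	if (info) {
		free(info->path);
		free(info);
	}
}
```

A caller would pass a 16-byte UUID with toy_by_uuid, use the returned toy_info, and release it with toy_info_free(). Nothing is cached between calls, so startup cost does not grow with the number of subvolumes.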
Stefan Behrens
2013-May-16 14:45 UTC
[PATCH v3 4/4] Btrfs-progs: add uuid_tree_gen field to btrfs-show-super
Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de>
---
 btrfs-show-super.c | 2 ++
 ctree.h            | 5 ++++-
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/btrfs-show-super.c b/btrfs-show-super.c
index f587f10..c815469 100644
--- a/btrfs-show-super.c
+++ b/btrfs-show-super.c
@@ -247,6 +247,8 @@ static void dump_superblock(struct btrfs_super_block *sb)
 			(unsigned long long)btrfs_super_csum_size(sb));
 	printf("cache_generation\t%llu\n",
 	       (unsigned long long)btrfs_super_cache_generation(sb));
+	printf("uuid_tree_generation\t%llu\n",
+	       (unsigned long long)btrfs_super_uuid_tree_generation(sb));
 
 	uuid_unparse(sb->dev_item.uuid, buf);
 	printf("dev_item.uuid\t\t%s\n", buf);
diff --git a/ctree.h b/ctree.h
index e36cb17..cf068b1 100644
--- a/ctree.h
+++ b/ctree.h
@@ -437,9 +437,10 @@ struct btrfs_super_block {
 	char label[BTRFS_LABEL_SIZE];
 
 	__le64 cache_generation;
+	__le64 uuid_tree_generation;
 
 	/* future expansion */
-	__le64 reserved[31];
+	__le64 reserved[30];
 	u8 sys_chunk_array[BTRFS_SYSTEM_CHUNK_ARRAY_SIZE];
 	struct btrfs_root_backup super_roots[BTRFS_NUM_BACKUP_ROOTS];
 } __attribute__ ((__packed__));
@@ -1941,6 +1942,8 @@ BTRFS_SETGET_STACK_FUNCS(super_csum_type, struct btrfs_super_block,
 			 csum_type, 16);
 BTRFS_SETGET_STACK_FUNCS(super_cache_generation, struct btrfs_super_block,
 			 cache_generation, 64);
+BTRFS_SETGET_STACK_FUNCS(super_uuid_tree_generation, struct btrfs_super_block,
+			 uuid_tree_generation, 64);
 
 static inline int btrfs_super_csum_size(struct btrfs_super_block *s)
 {
-- 
1.8.2.3
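The ctree.h hunk above adds one __le64 field and shrinks reserved[] by one element, which keeps the superblock size and the offsets of all later fields unchanged. A minimal sketch of that invariant follows. The toy_super_tail_* structs are hypothetical, heavily shortened stand-ins for the affected region, with uint64_t in place of __le64.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Region of the superblock before the patch (shortened, hypothetical). */
struct toy_super_tail_old {
	uint64_t cache_generation;
	uint64_t reserved[31];
} __attribute__ ((__packed__));

/* Same region after the patch: one reserved slot becomes a real field. */
struct toy_super_tail_new {
	uint64_t cache_generation;
	uint64_t uuid_tree_generation;
	uint64_t reserved[30];
} __attribute__ ((__packed__));

/* The on-disk size must not change. */
static_assert(sizeof(struct toy_super_tail_old) ==
	      sizeof(struct toy_super_tail_new),
	      "superblock region changed size");

/* The new field occupies exactly the first former reserved slot. */
static_assert(offsetof(struct toy_super_tail_new, uuid_tree_generation) ==
	      offsetof(struct toy_super_tail_old, reserved),
	      "new field does not overlay reserved[0]");

static size_t toy_super_tail_size(void)
{
	return sizeof(struct toy_super_tail_new);
}
```

Because the checks are compile-time assertions, a mismatch between the added field and the reduced reserved[] count would fail the build rather than silently shift sys_chunk_array and everything after it.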
David Sterba
2013-May-16 16:19 UTC
Re: [PATCH v3 1/4] Btrfs-progs: Support UUID tree and UUID items in btrfs-debug-tree
On Thu, May 16, 2013 at 04:45:55PM +0200, Stefan Behrens wrote:
> +struct btrfs_uuid_item {
> +	__le16 type;	/* refer to BTRFS_UUID_ITEM_TYPE* defines above */
> +	__le32 len;	/* number of following 64bit values */
> +	__le64 data[0];	/* data aligned to 64bit */
> +} __attribute__ ((__packed__));

With __packed__ (which is preferably written as __packed) the data is
not aligned to u64 as the comment says. Aligning u64's is a good thing,
so (for example) pad the space after type (I don't think we need more
than u16 here).

david
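To make the alignment point concrete, the following sketch compares the offset of data in the struct as posted with one possible padded layout. The padded variant is only a hypothetical reading of the suggestion, not something proposed verbatim in the thread. The toy_* names are illustrative, uint16_t/uint32_t/uint64_t stand in for the __le* types, and a C99 flexible array member replaces data[0].

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Layout as posted in the patch. */
struct toy_uuid_item_packed {
	uint16_t type;
	uint32_t len;
	uint64_t data[];	/* starts at byte 6 within the item */
} __attribute__ ((__packed__));

/* Hypothetical variant with explicit padding after type. */
struct toy_uuid_item_padded {
	uint16_t type;
	uint16_t pad;
	uint32_t len;
	uint64_t data[];	/* starts at byte 8 within the item */
} __attribute__ ((__packed__));

static_assert(sizeof(struct toy_uuid_item_packed) == 6,
	      "packed header is 2 + 4 bytes");
static_assert(sizeof(struct toy_uuid_item_padded) == 8,
	      "padded header is 2 + 2 + 4 bytes");

static size_t toy_packed_data_offset(void)
{
	return offsetof(struct toy_uuid_item_packed, data);
}

static size_t toy_padded_data_offset(void)
{
	return offsetof(struct toy_uuid_item_padded, data);
}
```

These offsets are relative to the start of the item only. Where an item lands inside a tree block is a separate question that the struct definition does not decide.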
Stefan Behrens
2013-May-16 21:44 UTC
Re: [PATCH v3 1/4] Btrfs-progs: Support UUID tree and UUID items in btrfs-debug-tree
On 05/16/2013 18:19, David Sterba wrote:
> On Thu, May 16, 2013 at 04:45:55PM +0200, Stefan Behrens wrote:
>> +struct btrfs_uuid_item {
>> +	__le16 type;	/* refer to BTRFS_UUID_ITEM_TYPE* defines above */
>> +	__le32 len;	/* number of following 64bit values */
>> +	__le64 data[0];	/* data aligned to 64bit */
>> +} __attribute__ ((__packed__));
>
> With __packed__ (which is preferably written as __packed) the data is
> not aligned to u64 as the comment says. Aligning u64's is a good thing,
> so (for example) pad the space after type (I don't think we need more
> than u16 here).

The on-disk format is in general not aligned (it is "packed") and is
stored at any byte-aligned position on the disk. In the source code, you
can use the struct for sizeof(); otherwise you use the access functions
from ctree.h and struct-funcs.c. I know that you know this already :)
Maybe I am not understanding your review comment?

The "data" part in the btrfs_uuid_item (which represents the "value" of
the (type, length, value) triple) is a multiple of 64 bits; that's what
I mean by the "data aligned to 64bit" comment.
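The idea of reading packed on-disk fields through access functions instead of dereferencing struct members can be sketched as follows. This is not the code from ctree.h or struct-funcs.c. The toy_* helpers are hypothetical and simply assemble little-endian values byte by byte, so that they work at any byte offset. The field offsets 0, 2 and 6 follow from the packed layout quoted above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Assemble an nbytes-wide little-endian integer one byte at a time, so the
 * source pointer needs no particular alignment.
 */
static uint64_t toy_read_le(const uint8_t *p, size_t nbytes)
{
	uint64_t v = 0;
	size_t i;

	assert(nbytes <= sizeof(v));
	for (i = 0; i < nbytes; i++)
		v |= (uint64_t)p[i] << (8 * i);
	return v;
}

/* Byte offsets of the packed fields: type (2 bytes), len (4 bytes), data. */
enum { TOY_OFF_TYPE = 0, TOY_OFF_LEN = 2, TOY_OFF_DATA = 6 };

static uint16_t toy_uuid_item_type(const uint8_t *item)
{
	return (uint16_t)toy_read_le(item + TOY_OFF_TYPE, 2);
}

static uint32_t toy_uuid_item_len(const uint8_t *item)
{
	return (uint32_t)toy_read_le(item + TOY_OFF_LEN, 4);
}

/* Return the idx-th 64-bit value that follows the header. */
static uint64_t toy_uuid_item_data(const uint8_t *item, uint32_t idx)
{
	return toy_read_le(item + TOY_OFF_DATA + 8 * (size_t)idx, 8);
}
```

With helpers of this kind the item can start at an odd address inside a buffer and each field is still read correctly, which is why the alignment of data within the struct matters less than it would for direct member access.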