Dear list, This is V4 of batched discard support, now we will get full mapping of the free space on each device for RAID0/1/10/DUP instead of just a single stripe length, and tested with xfsstests 251, Thanks. Changelog V4: *make btrfs_map_block() return full mapping. Changelog V3: *fix style problems. *rebase to 2.6.38-rc7. Changelog V2: *Check if we have devices support trim before trying to trim the fs, also adjust minlen according to the discard_granularity. *Update reserved extent calculations in btrfs_trim_block_group(). *Call cond_resched() without checking need_resched() *Use bitmap_clear_bits() and unlink_free_space() instead of btrfs_remove_free_space(), so we won''t search the same extent for twice. *Try harder in btrfs_discard_extent(), now we won''t report errors if it''s not a EOPNOTSUPP. *make sure the block group is cached before trimming it,or we''ll see an empty caching tree if the block group is not cached. *Minor return value fix in btrfs_discard_block_group(). -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Li Dongyang
2011-Mar-24 10:24 UTC
[PATCH V4 1/4] Btrfs: make update_reserved_bytes() public
Make the function public as we should update the reserved extents calculations after taking out an extent for trimming. Signed-off-by: Li Dongyang <lidongyang@novell.com> --- fs/btrfs/ctree.h | 2 ++ fs/btrfs/extent-tree.c | 16 +++++++--------- 2 files changed, 9 insertions(+), 9 deletions(-) create mode 100644 fs/btrfs/Module.symvers diff --git a/fs/btrfs/Module.symvers b/fs/btrfs/Module.symvers new file mode 100644 index 0000000..e69de29 diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h index 7f78cc7..2c84551 100644 --- a/fs/btrfs/ctree.h +++ b/fs/btrfs/ctree.h @@ -2157,6 +2157,8 @@ int btrfs_free_extent(struct btrfs_trans_handle *trans, u64 root_objectid, u64 owner, u64 offset); int btrfs_free_reserved_extent(struct btrfs_root *root, u64 start, u64 len); +int btrfs_update_reserved_bytes(struct btrfs_block_group_cache *cache, + u64 num_bytes, int reserve, int sinfo); int btrfs_prepare_extent_commit(struct btrfs_trans_handle *trans, struct btrfs_root *root); int btrfs_finish_extent_commit(struct btrfs_trans_handle *trans, diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c index 7b3089b..caa4254 100644 --- a/fs/btrfs/extent-tree.c +++ b/fs/btrfs/extent-tree.c @@ -36,8 +36,6 @@ static int update_block_group(struct btrfs_trans_handle *trans, struct btrfs_root *root, u64 bytenr, u64 num_bytes, int alloc); -static int update_reserved_bytes(struct btrfs_block_group_cache *cache, - u64 num_bytes, int reserve, int sinfo); static int __btrfs_free_extent(struct btrfs_trans_handle *trans, struct btrfs_root *root, u64 bytenr, u64 num_bytes, u64 parent, @@ -4223,8 +4221,8 @@ int btrfs_pin_extent(struct btrfs_root *root, * update size of reserved extents. this function may return -EAGAIN * if ''reserve'' is true or ''sinfo'' is false. */ -static int update_reserved_bytes(struct btrfs_block_group_cache *cache, - u64 num_bytes, int reserve, int sinfo) +int btrfs_update_reserved_bytes(struct btrfs_block_group_cache *cache, + u64 num_bytes, int reserve, int sinfo) { int ret = 0; if (sinfo) { @@ -4704,10 +4702,10 @@ void btrfs_free_tree_block(struct btrfs_trans_handle *trans, WARN_ON(test_bit(EXTENT_BUFFER_DIRTY, &buf->bflags)); btrfs_add_free_space(cache, buf->start, buf->len); - ret = update_reserved_bytes(cache, buf->len, 0, 0); + ret = btrfs_update_reserved_bytes(cache, buf->len, 0, 0); if (ret == -EAGAIN) { /* block group became read-only */ - update_reserved_bytes(cache, buf->len, 0, 1); + btrfs_update_reserved_bytes(cache, buf->len, 0, 1); goto out; } @@ -5191,7 +5189,7 @@ checks: search_start - offset); BUG_ON(offset > search_start); - ret = update_reserved_bytes(block_group, num_bytes, 1, + ret = btrfs_update_reserved_bytes(block_group, num_bytes, 1, (data & BTRFS_BLOCK_GROUP_DATA)); if (ret == -EAGAIN) { btrfs_add_free_space(block_group, offset, num_bytes); @@ -5415,7 +5413,7 @@ int btrfs_free_reserved_extent(struct btrfs_root *root, u64 start, u64 len) ret = btrfs_discard_extent(root, start, len); btrfs_add_free_space(cache, start, len); - update_reserved_bytes(cache, len, 0, 1); + btrfs_update_reserved_bytes(cache, len, 0, 1); btrfs_put_block_group(cache); return ret; @@ -5614,7 +5612,7 @@ int btrfs_alloc_logged_file_extent(struct btrfs_trans_handle *trans, put_caching_control(caching_ctl); } - ret = update_reserved_bytes(block_group, ins->offset, 1, 1); + ret = btrfs_update_reserved_bytes(block_group, ins->offset, 1, 1); BUG_ON(ret); btrfs_put_block_group(block_group); ret = alloc_reserved_file_extent(trans, root, 0, root_objectid, -- 1.7.4.1 -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Li Dongyang
2011-Mar-24 10:24 UTC
[PATCH V4 2/4] Btrfs: make btrfs_map_block() return entire free extent for each device of RAID0/1/10/DUP
btrfs_map_block() will only return a single stripe length, but we want the full extent be mapped to each disk when we are trimming the extent, so we add length to btrfs_bio_stripe and fill it if we are mapping for REQ_DISCARD. Signed-off-by: Li Dongyang <lidongyang@novell.com> --- fs/btrfs/volumes.c | 150 ++++++++++++++++++++++++++++++++++++++++++++-------- fs/btrfs/volumes.h | 1 + 2 files changed, 129 insertions(+), 22 deletions(-) diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c index dd13eb8..e81cce6 100644 --- a/fs/btrfs/volumes.c +++ b/fs/btrfs/volumes.c @@ -2962,7 +2962,10 @@ static int __btrfs_map_block(struct btrfs_mapping_tree *map_tree, int rw, struct extent_map_tree *em_tree = &map_tree->map_tree; u64 offset; u64 stripe_offset; + u64 stripe_end_offset; u64 stripe_nr; + u64 stripe_nr_orig; + u64 stripe_nr_end; int stripes_allocated = 8; int stripes_required = 1; int stripe_index; @@ -2971,7 +2974,7 @@ static int __btrfs_map_block(struct btrfs_mapping_tree *map_tree, int rw, int max_errors = 0; struct btrfs_multi_bio *multi = NULL; - if (multi_ret && !(rw & REQ_WRITE)) + if (multi_ret && !(rw & (REQ_WRITE | REQ_DISCARD))) stripes_allocated = 1; again: if (multi_ret) { @@ -3017,7 +3020,15 @@ again: max_errors = 1; } } - if (multi_ret && (rw & REQ_WRITE) && + if (rw & REQ_DISCARD) { + if (map->type & (BTRFS_BLOCK_GROUP_RAID0 | + BTRFS_BLOCK_GROUP_RAID1 | + BTRFS_BLOCK_GROUP_DUP | + BTRFS_BLOCK_GROUP_RAID10)) { + stripes_required = map->num_stripes; + } + } + if (multi_ret && (rw & (REQ_WRITE | REQ_DISCARD)) && stripes_allocated < stripes_required) { stripes_allocated = map->num_stripes; free_extent_map(em); @@ -3037,12 +3048,15 @@ again: /* stripe_offset is the offset of this block in its stripe*/ stripe_offset = offset - stripe_offset; - if (map->type & (BTRFS_BLOCK_GROUP_RAID0 | BTRFS_BLOCK_GROUP_RAID1 | - BTRFS_BLOCK_GROUP_RAID10 | - BTRFS_BLOCK_GROUP_DUP)) { + if (rw & REQ_DISCARD) + *length = min_t(u64, em->len - offset, *length); + else if (map->type & (BTRFS_BLOCK_GROUP_RAID0 | + BTRFS_BLOCK_GROUP_RAID1 | + BTRFS_BLOCK_GROUP_RAID10 | + BTRFS_BLOCK_GROUP_DUP)) { /* we limit the length of each bio to what fits in a stripe */ *length = min_t(u64, em->len - offset, - map->stripe_len - stripe_offset); + map->stripe_len - stripe_offset); } else { *length = em->len - offset; } @@ -3052,8 +3066,19 @@ again: num_stripes = 1; stripe_index = 0; - if (map->type & BTRFS_BLOCK_GROUP_RAID1) { - if (unplug_page || (rw & REQ_WRITE)) + stripe_nr_orig = stripe_nr; + stripe_nr_end = (offset + *length + map->stripe_len - 1) & + (~(map->stripe_len - 1)); + do_div(stripe_nr_end, map->stripe_len); + stripe_end_offset = stripe_nr_end * map->stripe_len - + (offset + *length); + if (map->type & BTRFS_BLOCK_GROUP_RAID0) { + if (rw & REQ_DISCARD) + num_stripes = min_t(u64, map->num_stripes, + stripe_nr_end - stripe_nr_orig); + stripe_index = do_div(stripe_nr, map->num_stripes); + } else if (map->type & BTRFS_BLOCK_GROUP_RAID1) { + if (unplug_page || (rw & (REQ_WRITE | REQ_DISCARD))) num_stripes = map->num_stripes; else if (mirror_num) stripe_index = mirror_num - 1; @@ -3064,7 +3089,7 @@ again: } } else if (map->type & BTRFS_BLOCK_GROUP_DUP) { - if (rw & REQ_WRITE) + if (rw & (REQ_WRITE | REQ_DISCARD)) num_stripes = map->num_stripes; else if (mirror_num) stripe_index = mirror_num - 1; @@ -3077,6 +3102,10 @@ again: if (unplug_page || (rw & REQ_WRITE)) num_stripes = map->sub_stripes; + else if (rw & REQ_DISCARD) + num_stripes = min_t(u64, map->sub_stripes * + (stripe_nr_end - stripe_nr_orig), + map->num_stripes); else if (mirror_num) stripe_index += mirror_num - 1; else { @@ -3094,24 +3123,101 @@ again: } BUG_ON(stripe_index >= map->num_stripes); - for (i = 0; i < num_stripes; i++) { - if (unplug_page) { - struct btrfs_device *device; - struct backing_dev_info *bdi; - - device = map->stripes[stripe_index].dev; - if (device->bdev) { - bdi = blk_get_backing_dev_info(device->bdev); - if (bdi->unplug_io_fn) - bdi->unplug_io_fn(bdi, unplug_page); - } - } else { + if (rw & REQ_DISCARD) { + for (i = 0; i < num_stripes; i++) { multi->stripes[i].physical map->stripes[stripe_index].physical + stripe_offset + stripe_nr * map->stripe_len; multi->stripes[i].dev = map->stripes[stripe_index].dev; + + if (map->type & BTRFS_BLOCK_GROUP_RAID0) { + u64 stripes; + int last_stripe = (stripe_nr_end - 1) % + map->num_stripes; + int j; + + for (j = 0; j < map->num_stripes; j++) { + if ((stripe_nr_end - 1 - j) % + map->num_stripes == stripe_index) + break; + } + stripes = stripe_nr_end - 1 - j; + do_div(stripes, map->num_stripes); + multi->stripes[i].length = map->stripe_len * + (stripes - stripe_nr + 1); + + if (i == 0) { + multi->stripes[i].length -+ stripe_offset; + stripe_offset = 0; + } + if (stripe_index == last_stripe) + multi->stripes[i].length -+ stripe_end_offset; + } else if (map->type & BTRFS_BLOCK_GROUP_RAID10) { + u64 stripes; + int j; + int factor = map->num_stripes / + map->sub_stripes; + int last_stripe = (stripe_nr_end - 1) % factor; + last_stripe *= map->sub_stripes; + + for (j = 0; j < factor; j++) { + if ((stripe_nr_end - 1 - j) % factor =+ stripe_index / map->sub_stripes) + break; + } + stripes = stripe_nr_end - 1 - j; + do_div(stripes, factor); + multi->stripes[i].length = map->stripe_len * + (stripes - stripe_nr + 1); + + if (i < map->sub_stripes) { + multi->stripes[i].length -+ stripe_offset; + if (i == map->sub_stripes - 1) + stripe_offset = 0; + } + if (stripe_index >= last_stripe && + stripe_index <= (last_stripe + + map->sub_stripes - 1)) { + multi->stripes[i].length -+ stripe_end_offset; + } + } else + multi->stripes[i].length = *length; + + stripe_index++; + if (stripe_index == map->num_stripes) { + /* This could only happen for RAID0/10 */ + stripe_index = 0; + stripe_nr++; + } + } + } else { + for (i = 0; i < num_stripes; i++) { + if (unplug_page) { + struct btrfs_device *device; + struct backing_dev_info *bdi; + + device = map->stripes[stripe_index].dev; + if (device->bdev) { + bdi = blk_get_backing_dev_info(device-> + bdev); + if (bdi->unplug_io_fn) + bdi->unplug_io_fn(bdi, + unplug_page); + } + } else { + multi->stripes[i].physical + map->stripes[stripe_index].physical + + stripe_offset + + stripe_nr * map->stripe_len; + multi->stripes[i].dev + map->stripes[stripe_index].dev; + } + stripe_index++; } - stripe_index++; } if (multi_ret) { *multi_ret = multi; diff --git a/fs/btrfs/volumes.h b/fs/btrfs/volumes.h index 7fb59d4..5ae2569 100644 --- a/fs/btrfs/volumes.h +++ b/fs/btrfs/volumes.h @@ -126,6 +126,7 @@ struct btrfs_fs_devices { struct btrfs_bio_stripe { struct btrfs_device *dev; u64 physical; + u64 length; /* only used for discard mappings */ }; struct btrfs_multi_bio { -- 1.7.4.1 -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Li Dongyang
2011-Mar-24 10:24 UTC
[PATCH V4 3/4] Btrfs: adjust btrfs_discard_extent() return errors and trimmed bytes
Callers of btrfs_discard_extent() should check if we are mounted with -o discard, as we want to make fitrim to work even the fs is not mounted with -o discard. Also we should use REQ_DISCARD to map the free extent to get a full mapping, last we only return errors if 1. the error is not a EOPNOTSUPP 2. no device supports discard Signed-off-by: Li Dongyang <lidongyang@novell.com> --- fs/btrfs/ctree.h | 2 +- fs/btrfs/disk-io.c | 5 ++++- fs/btrfs/extent-tree.c | 45 ++++++++++++++++++++++++++------------------- 3 files changed, 31 insertions(+), 21 deletions(-) diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h index 2c84551..94bb772 100644 --- a/fs/btrfs/ctree.h +++ b/fs/btrfs/ctree.h @@ -2229,7 +2229,7 @@ u64 btrfs_account_ro_block_groups_free_space(struct btrfs_space_info *sinfo); int btrfs_error_unpin_extent_range(struct btrfs_root *root, u64 start, u64 end); int btrfs_error_discard_extent(struct btrfs_root *root, u64 bytenr, - u64 num_bytes); + u64 num_bytes, u64 *actual_bytes); int btrfs_force_chunk_alloc(struct btrfs_trans_handle *trans, struct btrfs_root *root, u64 type); diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c index 100b07f..98b60b0 100644 --- a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c @@ -2947,7 +2947,10 @@ static int btrfs_destroy_pinned_extent(struct btrfs_root *root, break; /* opt_discard */ - ret = btrfs_error_discard_extent(root, start, end + 1 - start); + if (btrfs_test_opt(root, DISCARD)) + ret = btrfs_error_discard_extent(root, start, + end + 1 - start, + NULL); clear_extent_dirty(unpin, start, end, GFP_NOFS); btrfs_error_unpin_extent_range(root, start, end); diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c index caa4254..10e542a 100644 --- a/fs/btrfs/extent-tree.c +++ b/fs/btrfs/extent-tree.c @@ -1738,40 +1738,44 @@ static int remove_extent_backref(struct btrfs_trans_handle *trans, return ret; } -static void btrfs_issue_discard(struct block_device *bdev, +static int btrfs_issue_discard(struct block_device *bdev, u64 start, u64 len) { - blkdev_issue_discard(bdev, start >> 9, len >> 9, GFP_KERNEL, 0); + return blkdev_issue_discard(bdev, start >> 9, len >> 9, GFP_KERNEL, 0); } static int btrfs_discard_extent(struct btrfs_root *root, u64 bytenr, - u64 num_bytes) + u64 num_bytes, u64 *actual_bytes) { int ret; - u64 map_length = num_bytes; + u64 discarded_bytes = 0; struct btrfs_multi_bio *multi = NULL; - if (!btrfs_test_opt(root, DISCARD)) - return 0; - /* Tell the block device(s) that the sectors can be discarded */ - ret = btrfs_map_block(&root->fs_info->mapping_tree, READ, - bytenr, &map_length, &multi, 0); + ret = btrfs_map_block(&root->fs_info->mapping_tree, REQ_DISCARD, + bytenr, &num_bytes, &multi, 0); if (!ret) { struct btrfs_bio_stripe *stripe = multi->stripes; int i; - if (map_length > num_bytes) - map_length = num_bytes; - for (i = 0; i < multi->num_stripes; i++, stripe++) { - btrfs_issue_discard(stripe->dev->bdev, - stripe->physical, - map_length); + ret = btrfs_issue_discard(stripe->dev->bdev, + stripe->physical, + stripe->length); + if (!ret) + discarded_bytes += stripe->length; + else if (ret != -EOPNOTSUPP) + break; } kfree(multi); } + if (discarded_bytes && ret == -EOPNOTSUPP) + ret = 0; + + if (actual_bytes) + *actual_bytes = discarded_bytes; + return ret; } @@ -4361,7 +4365,9 @@ int btrfs_finish_extent_commit(struct btrfs_trans_handle *trans, if (ret) break; - ret = btrfs_discard_extent(root, start, end + 1 - start); + if (btrfs_test_opt(root, DISCARD)) + ret = btrfs_discard_extent(root, start, + end + 1 - start, NULL); clear_extent_dirty(unpin, start, end, GFP_NOFS); unpin_extent_range(root, start, end); @@ -5410,7 +5416,8 @@ int btrfs_free_reserved_extent(struct btrfs_root *root, u64 start, u64 len) return -ENOSPC; } - ret = btrfs_discard_extent(root, start, len); + if (btrfs_test_opt(root, DISCARD)) + ret = btrfs_discard_extent(root, start, len, NULL); btrfs_add_free_space(cache, start, len); btrfs_update_reserved_bytes(cache, len, 0, 1); @@ -8728,7 +8735,7 @@ int btrfs_error_unpin_extent_range(struct btrfs_root *root, u64 start, u64 end) } int btrfs_error_discard_extent(struct btrfs_root *root, u64 bytenr, - u64 num_bytes) + u64 num_bytes, u64 *actual_bytes) { - return btrfs_discard_extent(root, bytenr, num_bytes); + return btrfs_discard_extent(root, bytenr, num_bytes, actual_bytes); } -- 1.7.4.1 -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Li Dongyang
2011-Mar-24 10:24 UTC
[PATCH V4 4/4] Btrfs: add btrfs_trim_fs() to handle FITRIM
We take an free extent out from allocator, trim it, then put it back, but before we trim the block group, we should make sure the block group is cached, so plus a little change to make cache_block_group() run without a transaction. Signed-off-by: Li Dongyang <lidongyang@novell.com> --- fs/btrfs/ctree.h | 1 + fs/btrfs/extent-tree.c | 50 +++++++++++++++++++++++- fs/btrfs/free-space-cache.c | 92 +++++++++++++++++++++++++++++++++++++++++++ fs/btrfs/free-space-cache.h | 2 + fs/btrfs/ioctl.c | 46 +++++++++++++++++++++ 5 files changed, 190 insertions(+), 1 deletions(-) diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h index 94bb772..df206c1 100644 --- a/fs/btrfs/ctree.h +++ b/fs/btrfs/ctree.h @@ -2232,6 +2232,7 @@ int btrfs_error_discard_extent(struct btrfs_root *root, u64 bytenr, u64 num_bytes, u64 *actual_bytes); int btrfs_force_chunk_alloc(struct btrfs_trans_handle *trans, struct btrfs_root *root, u64 type); +int btrfs_trim_fs(struct btrfs_root *root, struct fstrim_range *range); /* ctree.c */ int btrfs_bin_search(struct extent_buffer *eb, struct btrfs_key *key, diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c index 10e542a..d876759 100644 --- a/fs/btrfs/extent-tree.c +++ b/fs/btrfs/extent-tree.c @@ -440,7 +440,7 @@ static int cache_block_group(struct btrfs_block_group_cache *cache, * allocate blocks for the tree root we can''t do the fast caching since * we likely hold important locks. */ - if (!trans->transaction->in_commit && + if (trans && (!trans->transaction->in_commit) && (root && root != root->fs_info->tree_root)) { spin_lock(&cache->lock); if (cache->cached != BTRFS_CACHE_NO) { @@ -8739,3 +8739,51 @@ int btrfs_error_discard_extent(struct btrfs_root *root, u64 bytenr, { return btrfs_discard_extent(root, bytenr, num_bytes, actual_bytes); } + +int btrfs_trim_fs(struct btrfs_root *root, struct fstrim_range *range) +{ + struct btrfs_fs_info *fs_info = root->fs_info; + struct btrfs_block_group_cache *cache = NULL; + u64 group_trimmed; + u64 start; + u64 end; + u64 trimmed = 0; + int ret = 0; + + cache = btrfs_lookup_block_group(fs_info, range->start); + + while (cache) { + if (cache->key.objectid >= (range->start + range->len)) { + btrfs_put_block_group(cache); + break; + } + + start = max(range->start, cache->key.objectid); + end = min(range->start + range->len, + cache->key.objectid + cache->key.offset); + + if (end - start >= range->minlen) { + if (!block_group_cache_done(cache)) { + ret = cache_block_group(cache, NULL, root, 0); + if (!ret) + wait_block_group_cache_done(cache); + } + ret = btrfs_trim_block_group(cache, + &group_trimmed, + start, + end, + range->minlen); + + trimmed += group_trimmed; + if (ret) { + btrfs_put_block_group(cache); + break; + } + } + + cache = next_block_group(fs_info->tree_root, cache); + } + + range->len = trimmed; + return ret; +} diff --git a/fs/btrfs/free-space-cache.c b/fs/btrfs/free-space-cache.c index a039065..d0dc812 100644 --- a/fs/btrfs/free-space-cache.c +++ b/fs/btrfs/free-space-cache.c @@ -2154,3 +2154,95 @@ void btrfs_init_free_cluster(struct btrfs_free_cluster *cluster) cluster->block_group = NULL; } +int btrfs_trim_block_group(struct btrfs_block_group_cache *block_group, + u64 *trimmed, u64 start, u64 end, u64 minlen) +{ + struct btrfs_free_space *entry = NULL; + struct btrfs_fs_info *fs_info = block_group->fs_info; + u64 bytes = 0; + u64 actually_trimmed; + int ret = 0; + + *trimmed = 0; + + while (start < end) { + spin_lock(&block_group->tree_lock); + + if (block_group->free_space < minlen) { + spin_unlock(&block_group->tree_lock); + break; + } + + entry = tree_search_offset(block_group, start, 0, 1); + if (!entry) + entry = tree_search_offset(block_group, + offset_to_bitmap(block_group, + start), + 1, 1); + + if (!entry || entry->offset >= end) { + spin_unlock(&block_group->tree_lock); + break; + } + + if (entry->bitmap) { + ret = search_bitmap(block_group, entry, &start, &bytes); + if (!ret) { + if (start >= end) { + spin_unlock(&block_group->tree_lock); + break; + } + bytes = min(bytes, end - start); + bitmap_clear_bits(block_group, entry, + start, bytes); + if (entry->bytes == 0) + free_bitmap(block_group, entry); + } else { + start = entry->offset + BITS_PER_BITMAP * + block_group->sectorsize; + spin_unlock(&block_group->tree_lock); + ret = 0; + continue; + } + } else { + start = entry->offset; + bytes = min(entry->bytes, end - start); + unlink_free_space(block_group, entry); + kfree(entry); + } + + spin_unlock(&block_group->tree_lock); + + if (bytes >= minlen) { + int update_ret; + update_ret = btrfs_update_reserved_bytes(block_group, + bytes, 1, 1); + + ret = btrfs_error_discard_extent(fs_info->extent_root, + start, + bytes, + &actually_trimmed); + + btrfs_add_free_space(block_group, + start, bytes); + if (!update_ret) + btrfs_update_reserved_bytes(block_group, + bytes, 0, 1); + + if (ret) + break; + *trimmed += actually_trimmed; + } + start += bytes; + bytes = 0; + + if (fatal_signal_pending(current)) { + ret = -ERESTARTSYS; + break; + } + + cond_resched(); + } + + return ret; +} diff --git a/fs/btrfs/free-space-cache.h b/fs/btrfs/free-space-cache.h index e49ca5c..65c3b93 100644 --- a/fs/btrfs/free-space-cache.h +++ b/fs/btrfs/free-space-cache.h @@ -68,4 +68,6 @@ u64 btrfs_alloc_from_cluster(struct btrfs_block_group_cache *block_group, int btrfs_return_cluster_to_free_space( struct btrfs_block_group_cache *block_group, struct btrfs_free_cluster *cluster); +int btrfs_trim_block_group(struct btrfs_block_group_cache *block_group, + u64 *trimmed, u64 start, u64 end, u64 minlen); #endif diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c index 5fdb2ab..eff9228 100644 --- a/fs/btrfs/ioctl.c +++ b/fs/btrfs/ioctl.c @@ -40,6 +40,7 @@ #include <linux/xattr.h> #include <linux/vmalloc.h> #include <linux/slab.h> +#include <linux/blkdev.h> #include "compat.h" #include "ctree.h" #include "disk-io.h" @@ -225,6 +226,49 @@ static int btrfs_ioctl_getversion(struct file *file, int __user *arg) return put_user(inode->i_generation, arg); } +static noinline int btrfs_ioctl_fitrim(struct file *file, void __user *arg) +{ + struct btrfs_root *root = fdentry(file)->d_sb->s_fs_info; + struct btrfs_fs_info *fs_info = root->fs_info; + struct btrfs_device *device; + struct request_queue *q; + struct fstrim_range range; + u64 minlen = ULLONG_MAX; + u64 num_devices = 0; + int ret; + + if (!capable(CAP_SYS_ADMIN)) + return -EPERM; + + mutex_lock(&fs_info->fs_devices->device_list_mutex); + list_for_each_entry(device, &fs_info->fs_devices->devices, dev_list) { + if (!device->bdev) + continue; + q = bdev_get_queue(device->bdev); + if (blk_queue_discard(q)) { + num_devices++; + minlen = min((u64)q->limits.discard_granularity, + minlen); + } + } + mutex_unlock(&fs_info->fs_devices->device_list_mutex); + if (!num_devices) + return -EOPNOTSUPP; + + if (copy_from_user(&range, arg, sizeof(range))) + return -EFAULT; + + range.minlen = max(range.minlen, minlen); + ret = btrfs_trim_fs(root, &range); + if (ret < 0) + return ret; + + if (copy_to_user(arg, &range, sizeof(range))) + return -EFAULT; + + return 0; +} + static noinline int create_subvol(struct btrfs_root *root, struct dentry *dentry, char *name, int namelen, @@ -2388,6 +2432,8 @@ long btrfs_ioctl(struct file *file, unsigned int return btrfs_ioctl_setflags(file, argp); case FS_IOC_GETVERSION: return btrfs_ioctl_getversion(file, argp); + case FITRIM: + return btrfs_ioctl_fitrim(file, argp); case BTRFS_IOC_SNAP_CREATE: return btrfs_ioctl_snap_create(file, argp, 0); case BTRFS_IOC_SNAP_CREATE_V2: -- 1.7.4.1 -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Chris Mason
2011-Mar-27 18:10 UTC
Re: [PATCH V4 0/4] Btrfs: batched discard support for btrfs
Excerpts from Li Dongyang''s message of 2011-03-24 06:24:24 -0400:> Dear list, > This is V4 of batched discard support, now we will get full mapping of > the free space on each device for RAID0/1/10/DUP instead of just a single > stripe length, and tested with xfsstests 251, Thanks.I''ve pushed this out into the for-linus branch, along with a full merge to 2.6.39 current git. Please take a look and make sure I''ve merged it correctly. Thanks! -chris> Changelog V4: > *make btrfs_map_block() return full mapping. > Changelog V3: > *fix style problems. > *rebase to 2.6.38-rc7. > Changelog V2: > *Check if we have devices support trim before trying to trim the fs, also adjust > minlen according to the discard_granularity. > *Update reserved extent calculations in btrfs_trim_block_group(). > *Call cond_resched() without checking need_resched() > *Use bitmap_clear_bits() and unlink_free_space() instead of btrfs_remove_free_space(), > so we won''t search the same extent for twice. > *Try harder in btrfs_discard_extent(), now we won''t report errors > if it''s not a EOPNOTSUPP. > *make sure the block group is cached before trimming it,or we''ll see an empty caching > tree if the block group is not cached. > *Minor return value fix in btrfs_discard_block_group().-- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Chris Mason
2011-Mar-28 01:30 UTC
Re: [PATCH V4 0/4] Btrfs: batched discard support for btrfs
Excerpts from Chris Mason''s message of 2011-03-27 14:10:46 -0400:> Excerpts from Li Dongyang''s message of 2011-03-24 06:24:24 -0400: > > Dear list, > > This is V4 of batched discard support, now we will get full mapping of > > the free space on each device for RAID0/1/10/DUP instead of just a single > > stripe length, and tested with xfsstests 251, Thanks. > > I''ve pushed this out into the for-linus branch, along with a full merge > to 2.6.39 current git. > > Please take a look and make sure I''ve merged it correctly.Hmmm, this was doing mod operations on 64 bit numbers, so it didn''t compile at all on 32 bit machines. I''ve fixed it up and pushed the result out to for-linus. Please check the math ;) -chris> > Thanks! > > -chris > > > Changelog V4: > > *make btrfs_map_block() return full mapping. > > Changelog V3: > > *fix style problems. > > *rebase to 2.6.38-rc7. > > Changelog V2: > > *Check if we have devices support trim before trying to trim the fs, also adjust > > minlen according to the discard_granularity. > > *Update reserved extent calculations in btrfs_trim_block_group(). > > *Call cond_resched() without checking need_resched() > > *Use bitmap_clear_bits() and unlink_free_space() instead of btrfs_remove_free_space(), > > so we won''t search the same extent for twice. > > *Try harder in btrfs_discard_extent(), now we won''t report errors > > if it''s not a EOPNOTSUPP. > > *make sure the block group is cached before trimming it,or we''ll see an empty caching > > tree if the block group is not cached. > > *Minor return value fix in btrfs_discard_block_group().-- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Chris Mason
2011-Mar-28 01:39 UTC
Re: [PATCH V4 0/4] Btrfs: batched discard support for btrfs
Excerpts from Chris Mason''s message of 2011-03-27 21:30:20 -0400:> Excerpts from Chris Mason''s message of 2011-03-27 14:10:46 -0400: > > Excerpts from Li Dongyang''s message of 2011-03-24 06:24:24 -0400: > > > Dear list, > > > This is V4 of batched discard support, now we will get full mapping of > > > the free space on each device for RAID0/1/10/DUP instead of just a single > > > stripe length, and tested with xfsstests 251, Thanks. > > > > I''ve pushed this out into the for-linus branch, along with a full merge > > to 2.6.39 current git. > > > > Please take a look and make sure I''ve merged it correctly. > > Hmmm, this was doing mod operations on 64 bit numbers, so it didn''t > compile at all on 32 bit machines. I''ve fixed it up and pushed the > result out to for-linus. Please check the math ;)BTW, I just rebased this so the incremental fix was before merging into Linus'' tree. -chris> > -chris > > > > > Thanks! > > > > -chris > > > > > Changelog V4: > > > *make btrfs_map_block() return full mapping. > > > Changelog V3: > > > *fix style problems. > > > *rebase to 2.6.38-rc7. > > > Changelog V2: > > > *Check if we have devices support trim before trying to trim the fs, also adjust > > > minlen according to the discard_granularity. > > > *Update reserved extent calculations in btrfs_trim_block_group(). > > > *Call cond_resched() without checking need_resched() > > > *Use bitmap_clear_bits() and unlink_free_space() instead of btrfs_remove_free_space(), > > > so we won''t search the same extent for twice. > > > *Try harder in btrfs_discard_extent(), now we won''t report errors > > > if it''s not a EOPNOTSUPP. > > > *make sure the block group is cached before trimming it,or we''ll see an empty caching > > > tree if the block group is not cached. > > > *Minor return value fix in btrfs_discard_block_group().-- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Li Dongyang
2011-Mar-28 09:25 UTC
Re: [PATCH V4 0/4] Btrfs: batched discard support for btrfs
On Monday, March 28, 2011 09:39:26 AM Chris Mason wrote:> Excerpts from Chris Mason''s message of 2011-03-27 21:30:20 -0400: > > Excerpts from Chris Mason''s message of 2011-03-27 14:10:46 -0400: > > > Excerpts from Li Dongyang''s message of 2011-03-24 06:24:24 -0400: > > > > Dear list, > > > > This is V4 of batched discard support, now we will get full mapping of > > > > the free space on each device for RAID0/1/10/DUP instead of just a single > > > > stripe length, and tested with xfsstests 251, Thanks. > > > > > > I''ve pushed this out into the for-linus branch, along with a full merge > > > to 2.6.39 current git. > > > > > > Please take a look and make sure I''ve merged it correctly.Looks good to me.> > > > Hmmm, this was doing mod operations on 64 bit numbers, so it didn''t > > compile at all on 32 bit machines. I''ve fixed it up and pushed the > > result out to for-linus. Please check the math ;) >sorry for being so stupid, thanks for fixing ;-) Br, Li Dongyang> BTW, I just rebased this so the incremental fix was before merging into > Linus'' tree. > > -chris > > > > > -chris > > > > > > > > Thanks! > > > > > > -chris > > > > > > > Changelog V4: > > > > *make btrfs_map_block() return full mapping. > > > > Changelog V3: > > > > *fix style problems. > > > > *rebase to 2.6.38-rc7. > > > > Changelog V2: > > > > *Check if we have devices support trim before trying to trim the fs, also adjust > > > > minlen according to the discard_granularity. > > > > *Update reserved extent calculations in btrfs_trim_block_group(). > > > > *Call cond_resched() without checking need_resched() > > > > *Use bitmap_clear_bits() and unlink_free_space() instead of btrfs_remove_free_space(), > > > > so we won''t search the same extent for twice. > > > > *Try harder in btrfs_discard_extent(), now we won''t report errors > > > > if it''s not a EOPNOTSUPP. > > > > *make sure the block group is cached before trimming it,or we''ll see an empty caching > > > > tree if the block group is not cached. > > > > *Minor return value fix in btrfs_discard_block_group(). >-- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html