Hi everyone, This patchset is the latest approach I''m using for the Ceph storage daemon to keep track of which data has safely committed to disk. The basic idea is to not use the (problematic) user transaction ioctls at all. Instead, the daemon quiesces its own write requests, initiates an async snapshot, and then continues. The snapshot approach is nice because it provides rollback. If something goes wrong, we can cleanly go back to the most recent consistent commit. The performance is also very similar to what I was doing before (using the ''flushoncommit'' mount option and tiggering a sync_fs to flush data). The only difference is the old snapshots stick around for a bit longer before I delete them and the references get dropped. The first patch introduces a generic btrfs_commit_transaction_async() helper, which starts btrfs_commit_transaction asynchronously and returns either when the commit starts (blocked=1) or when it has done it''s dirty work (blocked=0). The second patch adds ioctls that let you start and wait for an asynchronous commit. The third introduces a SNAP_CREATE_ASYNC ioctl that creates a snap but returns before it hits disk. The fourth patch returns the commiting transid to userspace, so that it can be fed to the WAIT_SYNC ioctl. I''m not that happy with the interface, though; any suggestions for alternatives would be great. Alternatively, I could get by without knowing the exact transid and it wouldn''t be the end of the world. The final patch lets you delete a snapshot/subvol reference without doing an immediate commit (btrfs_end_transaction instead of btrfs_commit_transaction). AFAICS there''s no reason the commit has to happen immediately (user expectations aside). Overall I like this much better than the various user transaction proposals. It''s simpler, does the job, and the primitives should be useful for other applications. Let me know what you think! I''m doing more testing this week, but so far I haven''t seen any problems with these changes. Thanks- sage Sage Weil (5): Btrfs: async transaction commit Btrfs: add START_SYNC, WAIT_SYNC ioctls Btrfs: add SNAP_CREATE_ASYNC ioctl Btrfs: return transid to userspace from SNAP_CREATE_ASYNC ioctl btrfs: add SNAP_DESTROY_ASYNC ioctl fs/btrfs/ctree.h | 1 + fs/btrfs/disk-io.c | 1 + fs/btrfs/ioctl.c | 94 ++++++++++++++++++++++---- fs/btrfs/ioctl.h | 10 +++- fs/btrfs/transaction.c | 171 ++++++++++++++++++++++++++++++++++++++++++++++++ fs/btrfs/transaction.h | 4 + 6 files changed, 265 insertions(+), 16 deletions(-) -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Add support for an async transaction commit that is ordered such that any subsequent operations will join the following transaction, but does not wait until the current commit is fully on disk. This avoids much of the latency associated with the btrfs_commit_transaction for callers concerned with serialization and not safety. The wait_for_unblock flag controls whether we wait for the ''middle'' portion of commit_transaction to complete, which is necessary if the caller expects some of the modifications contained in the commit to be available (this is the case for subvol/snapshot creation). Signed-off-by: Sage Weil <sage@newdream.net> --- fs/btrfs/ctree.h | 1 + fs/btrfs/disk-io.c | 1 + fs/btrfs/transaction.c | 119 ++++++++++++++++++++++++++++++++++++++++++++++++ fs/btrfs/transaction.h | 3 + 4 files changed, 124 insertions(+), 0 deletions(-) diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h index 1111584..1d0d5f0 100644 --- a/fs/btrfs/ctree.h +++ b/fs/btrfs/ctree.h @@ -840,6 +840,7 @@ struct btrfs_fs_info { struct btrfs_transaction *running_transaction; wait_queue_head_t transaction_throttle; wait_queue_head_t transaction_wait; + wait_queue_head_t transaction_blocked_wait; wait_queue_head_t async_submit_wait; struct btrfs_super_block super_copy; diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c index 11d0ad3..6f3cb33 100644 --- a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c @@ -1701,6 +1701,7 @@ struct btrfs_root *open_ctree(struct super_block *sb, init_waitqueue_head(&fs_info->transaction_throttle); init_waitqueue_head(&fs_info->transaction_wait); + init_waitqueue_head(&fs_info->transaction_blocked_wait); init_waitqueue_head(&fs_info->async_submit_wait); __setup_root(4096, 4096, 4096, 4096, tree_root, diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c index e75cc84..52b5488 100644 --- a/fs/btrfs/transaction.c +++ b/fs/btrfs/transaction.c @@ -882,6 +882,123 @@ int btrfs_transaction_in_commit(struct btrfs_fs_info *info) return ret; } +/* + * wait for the current transaction commit to start and block subsequent + * transaction joins + */ +static void wait_current_trans_commit_start(struct btrfs_root *root, + struct btrfs_transaction *trans) +{ + DEFINE_WAIT(wait); + + if (trans->in_commit) + return; + + while (1) { + prepare_to_wait(&root->fs_info->transaction_blocked_wait, &wait, + TASK_UNINTERRUPTIBLE); + if (trans->in_commit) { + finish_wait(&root->fs_info->transaction_blocked_wait, + &wait); + break; + } + mutex_unlock(&root->fs_info->trans_mutex); + schedule(); + mutex_lock(&root->fs_info->trans_mutex); + finish_wait(&root->fs_info->transaction_blocked_wait, &wait); + } +} + +/* + * wait for the current transaction to start and then become unblocked. + * caller holds ref. + */ +static void wait_current_trans_commit_start_and_unblock(struct btrfs_root *root, + struct btrfs_transaction *trans) +{ + DEFINE_WAIT(wait); + + if (trans->commit_done || (trans->in_commit && !trans->blocked)) + return; + + while (1) { + prepare_to_wait(&root->fs_info->transaction_wait, &wait, + TASK_UNINTERRUPTIBLE); + if (trans->commit_done || + (trans->in_commit && !trans->blocked)) { + finish_wait(&root->fs_info->transaction_wait, + &wait); + break; + } + mutex_unlock(&root->fs_info->trans_mutex); + schedule(); + mutex_lock(&root->fs_info->trans_mutex); + finish_wait(&root->fs_info->transaction_wait, + &wait); + } +} + +/* + * commit transactions asynchronously. once btrfs_commit_transaction_async + * returns, any subsequent transaction will not be allowed to join. + */ +struct btrfs_async_commit { + struct btrfs_trans_handle *newtrans; + struct btrfs_root *root; + struct delayed_work work; +}; + +static void do_async_commit(struct work_struct *work) +{ + struct btrfs_async_commit *ac + container_of(work, struct btrfs_async_commit, work.work); + + btrfs_commit_transaction(ac->newtrans, ac->root); + kfree(ac); +} + +int btrfs_commit_transaction_async(struct btrfs_trans_handle *trans, + struct btrfs_root *root, + int wait_for_unblock) +{ + struct btrfs_async_commit *ac; + struct btrfs_transaction *cur_trans; + + ac = kmalloc(sizeof(*ac), GFP_NOFS); + BUG_ON(!ac); + + INIT_DELAYED_WORK(&ac->work, do_async_commit); + ac->root = root; + ac->newtrans = btrfs_join_transaction(root, 0); + + /* take transaction reference */ + mutex_lock(&root->fs_info->trans_mutex); + cur_trans = trans->transaction; + cur_trans->use_count++; + mutex_unlock(&root->fs_info->trans_mutex); + + btrfs_end_transaction(trans, root); + schedule_delayed_work(&ac->work, 0); + + /* wait for transaction to start and unblock */ + mutex_lock(&root->fs_info->trans_mutex); + if (wait_for_unblock) + wait_current_trans_commit_start_and_unblock(root, cur_trans); + else + wait_current_trans_commit_start(root, cur_trans); + put_transaction(cur_trans); + mutex_unlock(&root->fs_info->trans_mutex); + + return 0; +} + +/* + * btrfs_transaction state sequence: + * in_commit = 0, blocked = 0 (initial) + * in_commit = 1, blocked = 1 + * blocked = 0 + * commit_done = 1 + */ int btrfs_commit_transaction(struct btrfs_trans_handle *trans, struct btrfs_root *root) { @@ -931,6 +1048,8 @@ int btrfs_commit_transaction(struct btrfs_trans_handle *trans, trans->transaction->in_commit = 1; trans->transaction->blocked = 1; + wake_up(&root->fs_info->transaction_blocked_wait); + if (cur_trans->list.prev != &root->fs_info->trans_list) { prev_trans = list_entry(cur_trans->list.prev, struct btrfs_transaction, list); diff --git a/fs/btrfs/transaction.h b/fs/btrfs/transaction.h index 93c7ccb..f9c52f2 100644 --- a/fs/btrfs/transaction.h +++ b/fs/btrfs/transaction.h @@ -101,6 +101,9 @@ int btrfs_defrag_root(struct btrfs_root *root, int cacheonly); int btrfs_clean_old_snapshots(struct btrfs_root *root); int btrfs_commit_transaction(struct btrfs_trans_handle *trans, struct btrfs_root *root); +int btrfs_commit_transaction_async(struct btrfs_trans_handle *trans, + struct btrfs_root *root, + int wait_for_unblock); int btrfs_end_transaction_throttle(struct btrfs_trans_handle *trans, struct btrfs_root *root); void btrfs_throttle(struct btrfs_root *root); -- 1.6.6.1 -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
START_SYNC will start a sync/commit, but not wait for it to complete. Any modification started after the ioctl returns is guaranteed not to be included in the commit. If a non-NULL pointer is passed, the transaction id will be returned to userspace. WAIT_SYNC will wait for any in-progress commit to complete. If a transaction id is specified, the ioctl will block and then return (success) when the specified transaction has committed. If it has already committed when we call the ioctl, it returns immediately. If the specified transaction doesn''t exist, it returns EINVAL. If no transaction id is specified, WAIT_SYNC will wait for the currently committing transaction to finish it''s commit to disk. If there is no currently committing transaction, it returns success. These ioctls are useful for applications which want to impose an ordering on when fs modifications reach disk, but do not want to wait for the full (slow) commit process to do so. Picky callers can take the transid returned by START_SYNC and feed it to WAIT_SYNC, and be certain to wait only as long as necessary for the transaction _they_ started to reach disk. Sloppy callers can START_SYNC and WAIT_SYNC without a transid, and provided they didn''t wait too long between the calls, they will get the same result. However, if a second commit starts before they call WAIT_SYNC, they may end up waiting longer for it to commit as well. Even so, a START_SYNC+WAIT_SYNC still guarantees that any operation completed before the START_SYNC reaches disk. Signed-off-by: Sage Weil <sage@newdream.net> --- fs/btrfs/ioctl.c | 34 +++++++++++++++++++++++++++++++ fs/btrfs/ioctl.h | 2 + fs/btrfs/transaction.c | 52 ++++++++++++++++++++++++++++++++++++++++++++++++ fs/btrfs/transaction.h | 1 + 4 files changed, 89 insertions(+), 0 deletions(-) diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c index 2845c6c..8e29712 100644 --- a/fs/btrfs/ioctl.c +++ b/fs/btrfs/ioctl.c @@ -1965,6 +1965,36 @@ long btrfs_ioctl_trans_end(struct file *file) return 0; } +static long btrfs_ioctl_start_sync(struct file *file, void __user *argp) +{ + struct btrfs_root *root = BTRFS_I(file->f_dentry->d_inode)->root; + struct btrfs_trans_handle *trans; + u64 transid; + + trans = btrfs_start_transaction(root, 0); + transid = trans->transid; + btrfs_commit_transaction_async(trans, root, 0); + + if (argp) + if (copy_to_user(argp, &transid, sizeof(transid))) + return -EFAULT; + return 0; +} + +static long btrfs_ioctl_wait_sync(struct file *file, void __user *argp) +{ + struct btrfs_root *root = BTRFS_I(file->f_dentry->d_inode)->root; + u64 transid; + + if (argp) { + if (copy_from_user(&transid, argp, sizeof(transid))) + return -EFAULT; + } else { + transid = 0; /* current trans */ + } + return btrfs_wait_for_commit(root, transid); +} + long btrfs_ioctl(struct file *file, unsigned int cmd, unsigned long arg) { @@ -2015,6 +2045,10 @@ long btrfs_ioctl(struct file *file, unsigned int case BTRFS_IOC_SYNC: btrfs_sync_fs(file->f_dentry->d_sb, 1); return 0; + case BTRFS_IOC_START_SYNC: + return btrfs_ioctl_start_sync(file, argp); + case BTRFS_IOC_WAIT_SYNC: + return btrfs_ioctl_wait_sync(file, argp); } return -ENOTTY; diff --git a/fs/btrfs/ioctl.h b/fs/btrfs/ioctl.h index 424694a..e9a2f7e 100644 --- a/fs/btrfs/ioctl.h +++ b/fs/btrfs/ioctl.h @@ -178,4 +178,6 @@ struct btrfs_ioctl_space_args { #define BTRFS_IOC_DEFAULT_SUBVOL _IOW(BTRFS_IOCTL_MAGIC, 19, u64) #define BTRFS_IOC_SPACE_INFO _IOWR(BTRFS_IOCTL_MAGIC, 20, \ struct btrfs_ioctl_space_args) +#define BTRFS_IOC_START_SYNC _IOR(BTRFS_IOCTL_MAGIC, 21, __u64) +#define BTRFS_IOC_WAIT_SYNC _IOW(BTRFS_IOCTL_MAGIC, 22, __u64) #endif diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c index 52b5488..6de9983 100644 --- a/fs/btrfs/transaction.c +++ b/fs/btrfs/transaction.c @@ -239,6 +239,58 @@ static noinline int wait_for_commit(struct btrfs_root *root, return 0; } +int btrfs_wait_for_commit(struct btrfs_root *root, u64 transid) +{ + struct btrfs_transaction *cur_trans = NULL, *t; + int ret; + + mutex_lock(&root->fs_info->trans_mutex); + + ret = 0; + if (transid) { + if (transid <= root->fs_info->last_trans_committed) + goto out_unlock; + + /* find specified transaction */ + list_for_each_entry(t, &root->fs_info->trans_list, list) { + if (t->transid == transid) { + cur_trans = t; + break; + } + if (t->transid > transid) + break; + } + ret = -EINVAL; + if (!cur_trans) + goto out_unlock; /* bad transid */ + } else { + /* find newest transaction that is committing | committed */ + list_for_each_entry_reverse(t, &root->fs_info->trans_list, + list) { + if (t->in_commit) { + if (t->commit_done) + goto out_unlock; + cur_trans = t; + break; + } + } + if (!cur_trans) + goto out_unlock; /* nothing committing|committed */ + } + + cur_trans->use_count++; + mutex_unlock(&root->fs_info->trans_mutex); + + wait_for_commit(root, cur_trans); + + mutex_lock(&root->fs_info->trans_mutex); + put_transaction(cur_trans); + ret = 0; +out_unlock: + mutex_unlock(&root->fs_info->trans_mutex); + return ret; +} + #if 0 /* * rate limit against the drop_snapshot code. This helps to slow down new diff --git a/fs/btrfs/transaction.h b/fs/btrfs/transaction.h index f9c52f2..d4f7b66 100644 --- a/fs/btrfs/transaction.h +++ b/fs/btrfs/transaction.h @@ -90,6 +90,7 @@ struct btrfs_trans_handle *btrfs_join_transaction(struct btrfs_root *root, int num_blocks); struct btrfs_trans_handle *btrfs_start_ioctl_transaction(struct btrfs_root *r, int num_blocks); +int btrfs_wait_for_commit(struct btrfs_root *root, u64 transid); int btrfs_write_and_wait_transaction(struct btrfs_trans_handle *trans, struct btrfs_root *root); int btrfs_commit_tree_roots(struct btrfs_trans_handle *trans, -- 1.6.6.1 -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Create a snap without waiting for it to commit to disk. The ioctl is ordered such that subsequent operations will not be contained by the created snapshot, and the commit is initiated, but the ioctl does not wait for the snapshot to commit to disk. Signed-off-by: Sage Weil <sage@newdream.net> --- fs/btrfs/ioctl.c | 41 +++++++++++++++++++++++++++++------------ fs/btrfs/ioctl.h | 2 ++ 2 files changed, 31 insertions(+), 12 deletions(-) diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c index 8e29712..7ea4ff0 100644 --- a/fs/btrfs/ioctl.c +++ b/fs/btrfs/ioctl.c @@ -224,7 +224,8 @@ static int btrfs_ioctl_getversion(struct file *file, int __user *arg) static noinline int create_subvol(struct btrfs_root *root, struct dentry *dentry, - char *name, int namelen) + char *name, int namelen, + u64 *async_transid) { struct btrfs_trans_handle *trans; struct btrfs_key key; @@ -342,7 +343,12 @@ static noinline int create_subvol(struct btrfs_root *root, d_instantiate(dentry, btrfs_lookup_dentry(dir, dentry)); fail: - err = btrfs_commit_transaction(trans, root); + if (async_transid) { + *async_transid = trans->transid; + err = btrfs_commit_transaction_async(trans, root, 1); + } else { + err = btrfs_commit_transaction(trans, root); + } if (err && !ret) ret = err; @@ -351,7 +357,7 @@ fail: } static int create_snapshot(struct btrfs_root *root, struct dentry *dentry, - char *name, int namelen) + char *name, int namelen, u64 *async_transid) { struct inode *inode; struct btrfs_pending_snapshot *pending_snapshot; @@ -392,7 +398,12 @@ static int create_snapshot(struct btrfs_root *root, struct dentry *dentry, pending_snapshot->root = root; list_add(&pending_snapshot->list, &trans->transaction->pending_snapshots); - ret = btrfs_commit_transaction(trans, root); + if (async_transid) { + *async_transid = trans->transid; + ret = btrfs_commit_transaction_async(trans, root, 1); + } else { + ret = btrfs_commit_transaction(trans, root); + } BUG_ON(ret); btrfs_unreserve_metadata_space(root, 6); @@ -425,7 +436,8 @@ static inline int btrfs_may_create(struct inode *dir, struct dentry *child) */ static noinline int btrfs_mksubvol(struct path *parent, char *name, int namelen, - struct btrfs_root *snap_src) + struct btrfs_root *snap_src, + u64 *async_transid) { struct inode *dir = parent->dentry->d_inode; struct dentry *dentry; @@ -457,10 +469,10 @@ static noinline int btrfs_mksubvol(struct path *parent, if (snap_src) { error = create_snapshot(snap_src, dentry, - name, namelen); + name, namelen, async_transid); } else { error = create_subvol(BTRFS_I(dir)->root, dentry, - name, namelen); + name, namelen, async_transid); } if (!error) fsnotify_mkdir(dir, dentry); @@ -825,12 +837,14 @@ out_unlock: } static noinline int btrfs_ioctl_snap_create(struct file *file, - void __user *arg, int subvol) + void __user *arg, int subvol, + int async) { struct btrfs_root *root = BTRFS_I(fdentry(file)->d_inode)->root; struct btrfs_ioctl_vol_args *vol_args; struct file *src_file; int namelen; + u64 transid = 0; int ret = 0; if (root->fs_info->sb->s_flags & MS_RDONLY) @@ -849,7 +863,7 @@ static noinline int btrfs_ioctl_snap_create(struct file *file, if (subvol) { ret = btrfs_mksubvol(&file->f_path, vol_args->name, namelen, - NULL); + NULL, async ? &transid : NULL); } else { struct inode *src_inode; src_file = fget(vol_args->fd); @@ -867,7 +881,8 @@ static noinline int btrfs_ioctl_snap_create(struct file *file, goto out; } ret = btrfs_mksubvol(&file->f_path, vol_args->name, namelen, - BTRFS_I(src_inode)->root); + BTRFS_I(src_inode)->root, + async ? &transid : NULL); fput(src_file); } out: @@ -2009,9 +2024,11 @@ long btrfs_ioctl(struct file *file, unsigned int case FS_IOC_GETVERSION: return btrfs_ioctl_getversion(file, argp); case BTRFS_IOC_SNAP_CREATE: - return btrfs_ioctl_snap_create(file, argp, 0); + return btrfs_ioctl_snap_create(file, argp, 0, 0); + case BTRFS_IOC_SNAP_CREATE_ASYNC: + return btrfs_ioctl_snap_create(file, argp, 0, 1); case BTRFS_IOC_SUBVOL_CREATE: - return btrfs_ioctl_snap_create(file, argp, 1); + return btrfs_ioctl_snap_create(file, argp, 1, 0); case BTRFS_IOC_SNAP_DESTROY: return btrfs_ioctl_snap_destroy(file, argp); case BTRFS_IOC_DEFAULT_SUBVOL: diff --git a/fs/btrfs/ioctl.h b/fs/btrfs/ioctl.h index e9a2f7e..d9169d8 100644 --- a/fs/btrfs/ioctl.h +++ b/fs/btrfs/ioctl.h @@ -180,4 +180,6 @@ struct btrfs_ioctl_space_args { struct btrfs_ioctl_space_args) #define BTRFS_IOC_START_SYNC _IOR(BTRFS_IOCTL_MAGIC, 21, __u64) #define BTRFS_IOC_WAIT_SYNC _IOW(BTRFS_IOCTL_MAGIC, 22, __u64) +#define BTRFS_IOC_SNAP_CREATE_ASYNC _IOW(BTRFS_IOCTL_MAGIC, 23, \ + struct btrfs_ioctl_vol_args) #endif -- 1.6.6.1 -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Sage Weil
2010-Mar-22 19:13 UTC
[PATCH 4/5] Btrfs: return transid to userspace from SNAP_CREATE_ASYNC ioctl
I''m not sure about this one. The idea is to start a snapshot commit with the SNAP_CREATE_ASYNC ioctl, and wait for it with the WAIT_SYNC ioctl. I''m not sure that modifying btrfs_ioctl_vol_args is the way to do that. That technically changes the ABI (albeit in a way that nobody will actually notice). I''m not sure it''s worth creating a second, different ioctl_async_vol_args struct and complicating ioctl.c funcs. OTOH, a WAIT_SYNC without an explicit trans will almost always accomplish the same thing. Worst case we wait a bit longer than needed (for a second racing commit to complete). Signed-off-by: Sage Weil <sage@newdream.net> --- fs/btrfs/ioctl.c | 7 +++++++ fs/btrfs/ioctl.h | 4 +++- 2 files changed, 10 insertions(+), 1 deletions(-) diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c index 7ea4ff0..fd824a7 100644 --- a/fs/btrfs/ioctl.c +++ b/fs/btrfs/ioctl.c @@ -885,6 +885,13 @@ static noinline int btrfs_ioctl_snap_create(struct file *file, async ? &transid : NULL); fput(src_file); } + if (!ret && async) { + struct btrfs_ioctl_vol_args __user *user_vol_args = arg; + + if (copy_to_user(&user_vol_args->transid, &transid, + sizeof(transid))) + return -EFAULT; + } out: kfree(vol_args); return ret; diff --git a/fs/btrfs/ioctl.h b/fs/btrfs/ioctl.h index d9169d8..4f7fe37 100644 --- a/fs/btrfs/ioctl.h +++ b/fs/btrfs/ioctl.h @@ -22,12 +22,14 @@ #define BTRFS_IOCTL_MAGIC 0x94 #define BTRFS_VOL_NAME_MAX 255 -#define BTRFS_PATH_NAME_MAX 4087 +#define BTRFS_PATH_NAME_MAX 4071 /* this should be 4k */ struct btrfs_ioctl_vol_args { __s64 fd; char name[BTRFS_PATH_NAME_MAX + 1]; + __u64 transid; + __u64 reserved; }; #define BTRFS_INO_LOOKUP_PATH_MAX 4080 -- 1.6.6.1 -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
There is no reason (aside from user expectations that space be freed RIGHT NOW) that transaction removing the root reference commit immediately. Add an alternative ioctl that does not commit immediately for those who don''t need it to. Signed-off-by: Sage Weil <sage@newdream.net> --- fs/btrfs/ioctl.c | 12 +++++++++--- fs/btrfs/ioctl.h | 2 ++ 2 files changed, 11 insertions(+), 3 deletions(-) diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c index fd824a7..c8e6470 100644 --- a/fs/btrfs/ioctl.c +++ b/fs/btrfs/ioctl.c @@ -1255,7 +1255,8 @@ static noinline int btrfs_ioctl_ino_lookup(struct file *file, } static noinline int btrfs_ioctl_snap_destroy(struct file *file, - void __user *arg) + void __user *arg, + int async) { struct dentry *parent = fdentry(file); struct dentry *dentry; @@ -1338,7 +1339,10 @@ static noinline int btrfs_ioctl_snap_destroy(struct file *file, dest->root_key.objectid); BUG_ON(ret); - ret = btrfs_commit_transaction(trans, root); + if (async) + ret = btrfs_end_transaction(trans, root); + else + ret = btrfs_commit_transaction(trans, root); BUG_ON(ret); inode->i_flags |= S_DEAD; out_up_write: @@ -2037,7 +2041,9 @@ long btrfs_ioctl(struct file *file, unsigned int case BTRFS_IOC_SUBVOL_CREATE: return btrfs_ioctl_snap_create(file, argp, 1, 0); case BTRFS_IOC_SNAP_DESTROY: - return btrfs_ioctl_snap_destroy(file, argp); + return btrfs_ioctl_snap_destroy(file, argp, 0); + case BTRFS_IOC_SNAP_DESTROY_ASYNC: + return btrfs_ioctl_snap_destroy(file, argp, 1); case BTRFS_IOC_DEFAULT_SUBVOL: return btrfs_ioctl_default_subvol(file, argp); case BTRFS_IOC_DEFRAG: diff --git a/fs/btrfs/ioctl.h b/fs/btrfs/ioctl.h index 4f7fe37..8109257 100644 --- a/fs/btrfs/ioctl.h +++ b/fs/btrfs/ioctl.h @@ -184,4 +184,6 @@ struct btrfs_ioctl_space_args { #define BTRFS_IOC_WAIT_SYNC _IOW(BTRFS_IOCTL_MAGIC, 22, __u64) #define BTRFS_IOC_SNAP_CREATE_ASYNC _IOW(BTRFS_IOCTL_MAGIC, 23, \ struct btrfs_ioctl_vol_args) +#define BTRFS_IOC_SNAP_DESTROY_ASYNC _IOW(BTRFS_IOCTL_MAGIC, 24, \ + struct btrfs_ioctl_vol_args) #endif -- 1.6.6.1 -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Reasonably Related Threads
- [PATCH 0/6] Btrfs commit fixes, async subvol operations
- [PATCH v0 00/18] btfs: Subvolume Quota Groups
- [GIT PULL v3] Btrfs: improve write ahead log with sub transaction
- [RFC] All my fsync changes
- [PATCH 1/2] Btrfs: fix wrong handle at error path of create_snapshot() when the commit fails