This is my reworked series of transaction abort fixes. The only ones that
have changed since yesterday are patches 5 and 6. Now we use the fs_state
flag to tell if our transaction aborted and we make sure to actually call
the transaction abort stuff if we have a commit error so the error gets set
properly. Patch 6 was fixed up to get rid of a memory leak we had when we'd
abort a transaction. With these patches I've been able to do all sorts of
horrible things and have the transactions abort properly and still have a
nice clean file system left over. Thanks,

Josef
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Josef Bacik
2012-Jun-01 13:55 UTC
[PATCH 1/7] Btrfs: pass locked_page into extent_clear_unlock_delalloc if theres an error
While doing my enospc work I got a transaction abort that resulted in a
panic when we tried to unlock_page() an already unlocked page. This is
because we aren't calling extent_clear_unlock_delalloc with the locked page
so it was unlocking all the pages in the range. This is wrong since
__extent_writepage expects to have the page locked still unless we return
*page_started as 1. This should keep us from panicking. Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>
---
fs/btrfs/inode.c | 4 ++--
1 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 46d8732..e91f985 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -830,7 +830,7 @@ static noinline int cow_file_range(struct inode *inode,
if (IS_ERR(trans)) {
extent_clear_unlock_delalloc(inode,
&BTRFS_I(inode)->io_tree,
- start, end, NULL,
+ start, end, locked_page,
EXTENT_CLEAR_UNLOCK_PAGE |
EXTENT_CLEAR_UNLOCK |
EXTENT_CLEAR_DELALLOC |
@@ -963,7 +963,7 @@ out:
out_unlock:
extent_clear_unlock_delalloc(inode,
&BTRFS_I(inode)->io_tree,
- start, end, NULL,
+ start, end, locked_page,
EXTENT_CLEAR_UNLOCK_PAGE |
EXTENT_CLEAR_UNLOCK |
EXTENT_CLEAR_DELALLOC |
--
1.7.7.6
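To make the role of the locked_page argument concrete, here is a minimal
userspace sketch, not the kernel code. The names toy_page, toy_unlock_page
and toy_clear_unlock_range are hypothetical, and the sketch assumes (as the
commit message describes) that the range helper unlocks every page in the
range except the one passed as locked_page.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct page: it only tracks the lock bit. */
struct toy_page {
	bool locked;
};

/* Like unlock_page(): unlocking an already unlocked page is a bug. */
static void toy_unlock_page(struct toy_page *page)
{
	assert(page->locked);
	page->locked = false;
}

/*
 * Unlock every page in pages[0..nr) except locked_page, which stays
 * locked for the caller. Passing NULL unlocks the whole range.
 */
static void toy_clear_unlock_range(struct toy_page *pages, size_t nr,
				   struct toy_page *locked_page)
{
	for (size_t i = 0; i < nr; i++) {
		if (&pages[i] == locked_page)
			continue;
		toy_unlock_page(&pages[i]);
	}
}
```

If the helper were called with NULL instead, the caller's own later
toy_unlock_page() on its page would trip the assertion, which is the toy
analogue of the panic described in the commit message.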
Josef Bacik
2012-Jun-01 13:55 UTC
[PATCH 2/7] Btrfs: fix locking in btrfs_destroy_delayed_refs
The transaction abort stuff was throwing warnings from the list debugging
code because we do a list_del_init outside of the delayed_refs spin lock.
The delayed refs locking makes baby Jesus cry so it's not hard to get wrong,
but we need to take the ref head mutex to make sure it's not being processed
currently, and so if it is we need to drop the spin lock and then take and
drop the mutex and do the search again. If we can take the mutex then we
can safely remove the head from the list and carry on. Now when the
transaction aborts I don't get the list debugging warnings. Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>
---
fs/btrfs/disk-io.c | 30 +++++++++++++++++-------------
1 files changed, 17 insertions(+), 13 deletions(-)
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index b0d49e2..0224c25 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -3395,7 +3395,6 @@ int btrfs_destroy_delayed_refs(struct btrfs_transaction *trans,
delayed_refs = &trans->delayed_refs;
-again:
spin_lock(&delayed_refs->lock);
if (delayed_refs->num_entries == 0) {
spin_unlock(&delayed_refs->lock);
@@ -3403,31 +3402,36 @@ again:
return ret;
}
- node = rb_first(&delayed_refs->root);
- while (node) {
+ while ((node = rb_first(&delayed_refs->root)) != NULL) {
ref = rb_entry(node, struct btrfs_delayed_ref_node, rb_node);
- node = rb_next(node);
-
- ref->in_tree = 0;
- rb_erase(&ref->rb_node, &delayed_refs->root);
- delayed_refs->num_entries--;
atomic_set(&ref->refs, 1);
if (btrfs_delayed_ref_is_head(ref)) {
struct btrfs_delayed_ref_head *head;
head = btrfs_delayed_node_to_head(ref);
- spin_unlock(&delayed_refs->lock);
- mutex_lock(&head->mutex);
+ if (!mutex_trylock(&head->mutex)) {
+ atomic_inc(&ref->refs);
+ spin_unlock(&delayed_refs->lock);
+
+ /* Need to wait for the delayed ref to run */
+ mutex_lock(&head->mutex);
+ mutex_unlock(&head->mutex);
+ btrfs_put_delayed_ref(ref);
+
+ continue;
+ }
+
kfree(head->extent_op);
delayed_refs->num_heads--;
if (list_empty(&head->cluster))
delayed_refs->num_heads_ready--;
list_del_init(&head->cluster);
- mutex_unlock(&head->mutex);
- btrfs_put_delayed_ref(ref);
- goto again;
}
+ ref->in_tree = 0;
+ rb_erase(&ref->rb_node, &delayed_refs->root);
+ delayed_refs->num_entries--;
+
spin_unlock(&delayed_refs->lock);
btrfs_put_delayed_ref(ref);
--
1.7.7.6
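The locking rule in the commit message (try the head mutex while holding the
outer lock; if it is busy, pin the head, drop the outer lock, wait on the
mutex, then search again) can be sketched in userspace with pthreads. This
is only an illustration under stated assumptions: toy_head, toy_refs,
toy_first and toy_destroy_all are hypothetical names, a pthread mutex stands
in for the delayed_refs spin lock, and an array replaces the rbtree.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for a delayed ref head. */
struct toy_head {
	pthread_mutex_t mutex;	/* held while the head is being processed */
	int refs;		/* protected by toy_refs.lock */
	int on_list;		/* protected by toy_refs.lock */
};

/* Hypothetical stand-in for the delayed_refs container. */
struct toy_refs {
	pthread_mutex_t lock;	/* plays the role of the delayed_refs lock */
	struct toy_head *heads;
	size_t nr;
	size_t remaining;	/* heads still on the list */
};

/* Return the first head still on the list; caller holds refs->lock. */
static struct toy_head *toy_first(struct toy_refs *refs)
{
	for (size_t i = 0; i < refs->nr; i++)
		if (refs->heads[i].on_list)
			return &refs->heads[i];
	return NULL;
}

static void toy_destroy_all(struct toy_refs *refs)
{
	struct toy_head *head;

	pthread_mutex_lock(&refs->lock);
	while ((head = toy_first(refs)) != NULL) {
		if (pthread_mutex_trylock(&head->mutex) != 0) {
			/* Busy: pin the head and drop the outer lock. */
			head->refs++;
			pthread_mutex_unlock(&refs->lock);

			/* Wait for whoever is processing the head. */
			pthread_mutex_lock(&head->mutex);
			pthread_mutex_unlock(&head->mutex);

			/* Retake the outer lock, unpin, search again. */
			pthread_mutex_lock(&refs->lock);
			head->refs--;
			continue;
		}

		/* We own the head and hold the outer lock: safe to unlink. */
		assert(refs->remaining > 0);
		head->on_list = 0;
		refs->remaining--;
		pthread_mutex_unlock(&head->mutex);
	}
	pthread_mutex_unlock(&refs->lock);
}
```

The sketch never blocks on the head mutex while the outer lock is held, and
the list is only modified with the outer lock held. Those are the two
properties the commit message is after.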
Josef Bacik
2012-Jun-01 13:55 UTC
[PATCH 3/7] Btrfs: check the return code of btrfs_save_ino_cache
In doing my enospc work I would sometimes error out in btrfs_save_ino_cache
which would abort the transaction but we'd still end up with a corrupted
file system. This is because we don't actually check the return value and
so if something goes wrong we just exit out and screw everything up. This
fixes this particular part. Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>
---
fs/btrfs/transaction.c | 5 ++++-
1 files changed, 4 insertions(+), 1 deletions(-)
diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c
index 82b03af..7aed0e8 100644
--- a/fs/btrfs/transaction.c
+++ b/fs/btrfs/transaction.c
@@ -823,7 +823,9 @@ static noinline int commit_fs_roots(struct btrfs_trans_handle *trans,
btrfs_update_reloc_root(trans, root);
btrfs_orphan_commit_root(trans, root);
- btrfs_save_ino_cache(root, trans);
+ err = btrfs_save_ino_cache(root, trans);
+ if (err)
+ goto out;
/* see comments in should_cow_block() */
root->force_cow = 0;
@@ -848,6 +850,7 @@ static noinline int commit_fs_roots(struct btrfs_trans_handle *trans,
}
}
spin_unlock(&fs_info->fs_roots_radix_lock);
+out:
return err;
}
--
1.7.7.6
Josef Bacik
2012-Jun-01 13:55 UTC
[PATCH 4/7] Btrfs: wake up transaction waiters when aborting a transaction
I was getting lots of hung tasks and a NULL pointer dereference because we
are not cleaning up the transaction properly when it aborts. First we need
to reset the running_transaction to NULL so we don't get a bad dereference
for any start_transaction callers after this. Also we cannot rely on
waitqueue_active() since it's just a list_empty(), so just call wake_up()
directly since that will do the barrier for us and such. Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>
---
fs/btrfs/disk-io.c | 9 +++------
fs/btrfs/transaction.c | 4 ++++
2 files changed, 7 insertions(+), 6 deletions(-)
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 0224c25..050db9b 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -3584,16 +3584,13 @@ void btrfs_cleanup_one_transaction(struct btrfs_transaction *cur_trans,
/* FIXME: cleanup wait for commit */
cur_trans->in_commit = 1;
cur_trans->blocked = 1;
- if (waitqueue_active(&root->fs_info->transaction_blocked_wait))
- wake_up(&root->fs_info->transaction_blocked_wait);
+ wake_up(&root->fs_info->transaction_blocked_wait);
cur_trans->blocked = 0;
- if (waitqueue_active(&root->fs_info->transaction_wait))
- wake_up(&root->fs_info->transaction_wait);
+ wake_up(&root->fs_info->transaction_wait);
cur_trans->commit_done = 1;
- if (waitqueue_active(&cur_trans->commit_wait))
- wake_up(&cur_trans->commit_wait);
+ wake_up(&cur_trans->commit_wait);
btrfs_destroy_pending_snapshots(cur_trans);
diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c
index 7aed0e8..4e6f63e 100644
--- a/fs/btrfs/transaction.c
+++ b/fs/btrfs/transaction.c
@@ -1205,6 +1205,10 @@ static void cleanup_transaction(struct btrfs_trans_handle *trans,
spin_lock(&root->fs_info->trans_lock);
list_del_init(&cur_trans->list);
+ if (cur_trans == root->fs_info->running_transaction) {
+ root->fs_info->running_transaction = NULL;
+ root->fs_info->trans_no_join = 0;
+ }
spin_unlock(&root->fs_info->trans_lock);
btrfs_cleanup_one_transaction(trans->transaction, root);
--
1.7.7.6
Josef Bacik
2012-Jun-01 13:55 UTC
[PATCH 5/7] Btrfs: abort the transaction if the commit fails
If a transaction commit fails we don't abort it so we don't set an error on
the file system. This patch fixes that by actually calling the abort stuff
and then adding a check for a fs error in the transaction start stuff to
make sure it is caught properly. Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>
---
fs/btrfs/transaction.c | 10 ++++++++--
1 files changed, 8 insertions(+), 2 deletions(-)
diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c
index 4e6f63e..ead64e1 100644
--- a/fs/btrfs/transaction.c
+++ b/fs/btrfs/transaction.c
@@ -99,6 +99,10 @@ loop:
kmem_cache_free(btrfs_transaction_cachep, cur_trans);
cur_trans = root->fs_info->running_transaction;
goto loop;
+ } else if (root->fs_info->fs_state & BTRFS_SUPER_FLAG_ERROR) {
+ spin_unlock(&root->fs_info->trans_lock);
+ kmem_cache_free(btrfs_transaction_cachep, cur_trans);
+ return -EROFS;
}
atomic_set(&cur_trans->num_writers, 1);
@@ -1197,12 +1201,14 @@ int btrfs_commit_transaction_async(struct btrfs_trans_handle *trans,
static void cleanup_transaction(struct btrfs_trans_handle *trans,
- struct btrfs_root *root)
+ struct btrfs_root *root, int err)
{
struct btrfs_transaction *cur_trans = trans->transaction;
WARN_ON(trans->use_count > 1);
+ btrfs_abort_transaction(trans, root, err);
+
spin_lock(&root->fs_info->trans_lock);
list_del_init(&cur_trans->list);
if (cur_trans == root->fs_info->running_transaction) {
@@ -1514,7 +1520,7 @@ cleanup_transaction:
// WARN_ON(1);
if (current->journal_info == trans)
current->journal_info = NULL;
- cleanup_transaction(trans, root);
+ cleanup_transaction(trans, root, ret);
return ret;
}
--
1.7.7.6
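The interaction between the commit-failure path and the transaction start
path can be sketched as a small userspace model. Everything here is
hypothetical (toy_fs_info, toy_transaction, TOY_FS_STATE_ERROR and both
functions), and it assumes that aborting a transaction is what sets the
error bit in fs_state, as the commit message implies.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical bit standing in for the file system error flag. */
#define TOY_FS_STATE_ERROR (1UL << 0)

struct toy_transaction {
	int num_writers;
};

struct toy_fs_info {
	unsigned long fs_state;
	struct toy_transaction *running_transaction;
};

/* Commit-failure path: record the error, then detach the transaction. */
static void toy_cleanup_transaction(struct toy_fs_info *fs_info,
				    struct toy_transaction *cur_trans, int err)
{
	assert(err < 0);
	/* Assumed effect of the abort call: flag the fs as errored. */
	fs_info->fs_state |= TOY_FS_STATE_ERROR;
	if (cur_trans == fs_info->running_transaction)
		fs_info->running_transaction = NULL;
}

/* Start path: refuse to set up a transaction once the flag is set. */
static int toy_join_transaction(struct toy_fs_info *fs_info,
				struct toy_transaction *new_trans)
{
	if (fs_info->fs_state & TOY_FS_STATE_ERROR)
		return -EROFS;
	if (fs_info->running_transaction == NULL) {
		new_trans->num_writers = 0;
		fs_info->running_transaction = new_trans;
	}
	fs_info->running_transaction->num_writers++;
	return 0;
}
```

In this model, without the flag check a caller arriving after the failed
commit would quietly install a fresh transaction on a file system that has
already hit an error. With the check it gets -EROFS instead.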
So we're forcing the eb's to have their ref count set to 1 so invalidatepage
works but this breaks lots of things, for example root nodes, and is just
plain wrong, we don't need to just evict all of this stuff. Also drop the
invalidatepage altogether and add a page_cache_release(). With this patch
we no longer hang when trying to access the root nodes after an aborted
transaction and we no longer leak memory. Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>
---
fs/btrfs/disk-io.c | 6 ++----
1 files changed, 2 insertions(+), 4 deletions(-)
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 050db9b..ea2b1d2 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -3519,11 +3519,9 @@ static int btrfs_destroy_marked_extents(struct btrfs_root *root,
&(&BTRFS_I(page->mapping->host)->io_tree)->buffer,
offset >> PAGE_CACHE_SHIFT);
spin_unlock(&dirty_pages->buffer_lock);
- if (eb) {
+ if (eb)
ret = test_and_clear_bit(EXTENT_BUFFER_DIRTY,
&eb->bflags);
- atomic_set(&eb->refs, 1);
- }
if (PageWriteback(page))
end_page_writeback(page);
@@ -3537,8 +3535,8 @@ static int btrfs_destroy_marked_extents(struct btrfs_root *root,
spin_unlock_irq(&page->mapping->tree_lock);
}
- page->mapping->a_ops->invalidatepage(page, 0);
unlock_page(page);
+ page_cache_release(page);
}
}
--
1.7.7.6
Josef Bacik
2012-Jun-01 13:55 UTC
[PATCH 7/7] Btrfs: unlock everything properly in the error case for nocow
I was getting hung on umount when a transaction was aborted because a range
of one of the free space inodes was still locked. This is because the nocow
stuff doesn't unlock anything on error. This fixed the problem and I
verified that is what was happening. Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>
---
fs/btrfs/inode.c | 37 +++++++++++++++++++++++++++++++++++--
1 files changed, 35 insertions(+), 2 deletions(-)
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index e91f985..96b841d 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -1136,8 +1136,18 @@ static noinline int run_delalloc_nocow(struct inode *inode,
u64 ino = btrfs_ino(inode);
path = btrfs_alloc_path();
- if (!path)
+ if (!path) {
+ extent_clear_unlock_delalloc(inode,
+ &BTRFS_I(inode)->io_tree,
+ start, end, locked_page,
+ EXTENT_CLEAR_UNLOCK_PAGE |
+ EXTENT_CLEAR_UNLOCK |
+ EXTENT_CLEAR_DELALLOC |
+ EXTENT_CLEAR_DIRTY |
+ EXTENT_SET_WRITEBACK |
+ EXTENT_END_WRITEBACK);
return -ENOMEM;
+ }
nolock = btrfs_is_free_space_inode(root, inode);
@@ -1147,6 +1157,15 @@ static noinline int run_delalloc_nocow(struct inode *inode,
trans = btrfs_join_transaction(root);
if (IS_ERR(trans)) {
+ extent_clear_unlock_delalloc(inode,
+ &BTRFS_I(inode)->io_tree,
+ start, end, locked_page,
+ EXTENT_CLEAR_UNLOCK_PAGE |
+ EXTENT_CLEAR_UNLOCK |
+ EXTENT_CLEAR_DELALLOC |
+ EXTENT_CLEAR_DIRTY |
+ EXTENT_SET_WRITEBACK |
+ EXTENT_END_WRITEBACK);
btrfs_free_path(path);
return PTR_ERR(trans);
}
@@ -1327,8 +1346,11 @@ out_check:
}
btrfs_release_path(path);
- if (cur_offset <= end && cow_start == (u64)-1)
+ if (cur_offset <= end && cow_start == (u64)-1) {
cow_start = cur_offset;
+ cur_offset = end;
+ }
+
if (cow_start != (u64)-1) {
ret = cow_file_range(inode, locked_page, cow_start, end,
page_started, nr_written, 1);
@@ -1347,6 +1369,17 @@ error:
if (!ret)
ret = err;
+ if (ret && cur_offset < end)
+ extent_clear_unlock_delalloc(inode,
+ &BTRFS_I(inode)->io_tree,
+ cur_offset, end, locked_page,
+ EXTENT_CLEAR_UNLOCK_PAGE |
+ EXTENT_CLEAR_UNLOCK |
+ EXTENT_CLEAR_DELALLOC |
+ EXTENT_CLEAR_DIRTY |
+ EXTENT_SET_WRITEBACK |
+ EXTENT_END_WRITEBACK);
+
btrfs_free_path(path);
return ret;
}
--
1.7.7.6
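The idea behind the error path (whatever part of the range has not been
handled yet is still locked, so it has to be released before returning) can
be shown with a toy model. The names toy_range, toy_unlock and
toy_run_range are hypothetical, and fail_at is an injected failure point
that exists only for the illustration. Offsets are block indexes, not bytes.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NR_BLOCKS 8

/* Hypothetical per-block lock state for one inode range. */
struct toy_range {
	bool locked[TOY_NR_BLOCKS];
};

/* Unlock blocks start..end, both inclusive. */
static void toy_unlock(struct toy_range *r, size_t start, size_t end)
{
	assert(start <= end && end < TOY_NR_BLOCKS);
	for (size_t i = start; i <= end; i++)
		r->locked[i] = false;
}

/*
 * Handle blocks start..end one at a time, unlocking each as it
 * completes. Pass TOY_NR_BLOCKS as fail_at for "never fail". On
 * error, cur_offset..end is still locked and is released here so the
 * caller never sees a partially locked range.
 */
static int toy_run_range(struct toy_range *r, size_t start, size_t end,
			 size_t fail_at)
{
	size_t cur_offset = start;
	int ret = 0;

	while (cur_offset <= end) {
		if (cur_offset == fail_at) {
			ret = -EIO;
			break;
		}
		toy_unlock(r, cur_offset, cur_offset);
		cur_offset++;
	}

	if (ret)
		toy_unlock(r, cur_offset, end);
	return ret;
}
```

Dropping the final toy_unlock() call leaves blocks fail_at..end locked
forever, which is the toy equivalent of the hang on umount described above.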
Josef Bacik
2012-Jun-04 18:07 UTC
Re: [PATCH 3/7] Btrfs: check the return code of btrfs_save_ino_cache
On Fri, Jun 01, 2012 at 09:55:51AM -0400, Josef Bacik wrote:
> In doing my enospc work I would sometimes error out in btrfs_save_ino_cache
> which would abort the transaction but we'd still end up with a corrupted
> file system. This is because we don't actually check the return value and
> so if something goes wrong we just exit out and screw everything up. This
> fixes this particular part. Thanks,

Dropping this patch, it doesn't actually matter if the space cache gets
written out or not and it actually fails if the caching has not finished
which can lead to a transaction being aborted for no reason. Thanks,

Josef