We all keep getting those stupid warnings from use_block_rsv when running
stress.sh, and it''s because the delayed insertion stuff is being
stupid. It''s
not the delayed insertion stuffs fault, it''s all just stupid. When
marking an
inode dirty for oh say updating the time on it, we just do a
btrfs_join_transaction, which doesn''t reserve any space. This is
stupid because
we''re going to have to have space reserve to make this change, but we
do it
because it''s fast because chances are we''re going to call it
over and over again
and it doesn''t matter. Well thanks to the delayed insertion stuff this
is
mostly the case, so we do actually need to make this reservation. So if
trans->bytes_reserved is 0 then try to do a normal reservation. If not
return
ENOSPC which will make the btrfs_dirty_inode start a proper transaction which
will let it do the whole ENOSPC dance and reserve enough space for the delayed
insertion to steal the reservation from the transaction.
The other stupid thing we do is not reserve space for the inode when writing to
the thing. Usually this is ok since we have to update the time so we''d
have
already done all this work before we get to the endio stuff, so it
doesn''t
matter. But this is stupid because we could write the data after the
transaction commits where we changed the mtime of the inode so we have to cow
all the way down to the inode anyway. This used to be masked by the delalloc
reservation stuff, but because we delay the update it doesn''t get
masked in this
case. So again the delayed insertion stuff bites us in the ass. So if our
trans->block_rsv is delalloc, just steal the reservation from the delalloc
reserve. Hopefully this won''t bite us in the ass, but I''ve
said that before.
With this patch stress.sh no longer spits out those stupid warnings (famous last
words). Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>
---
V1->V2: I stupidly thought I could get away with some flushing if we needed
space but I was wrong, we could deadlock, so add a btrfs_add_bytes_noflush
variant that will not do any flushing and will just return ENOSPC which will let
us fallback and do our full flushing.
fs/btrfs/ctree.h | 3 +++
fs/btrfs/delayed-inode.c | 36 ++++++++++++++++++++++++++++--------
fs/btrfs/extent-tree.c | 18 ++++++++++++++++++
3 files changed, 49 insertions(+), 8 deletions(-)
diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 3b5c04f..f956bbf 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -2258,6 +2258,9 @@ void btrfs_free_block_rsv(struct btrfs_root *root,
int btrfs_block_rsv_add(struct btrfs_root *root,
struct btrfs_block_rsv *block_rsv,
u64 num_bytes);
+int btrfs_block_rsv_add_noflush(struct btrfs_root *root,
+ struct btrfs_block_rsv *block_rsv,
+ u64 num_bytes);
int btrfs_block_rsv_check(struct btrfs_root *root,
struct btrfs_block_rsv *block_rsv, int min_factor);
int btrfs_block_rsv_refill(struct btrfs_root *root,
diff --git a/fs/btrfs/delayed-inode.c b/fs/btrfs/delayed-inode.c
index fc4026a..bbe8496 100644
--- a/fs/btrfs/delayed-inode.c
+++ b/fs/btrfs/delayed-inode.c
@@ -624,13 +624,36 @@ static int btrfs_delayed_inode_reserve_metadata(
u64 num_bytes;
int ret;
- if (!trans->bytes_reserved)
- return 0;
-
src_rsv = trans->block_rsv;
dst_rsv = &root->fs_info->delayed_block_rsv;
num_bytes = btrfs_calc_trans_metadata_size(root, 1);
+
+ /*
+ * btrfs_dirty_inode will update the inode under btrfs_join_transaction
+ * which doesn''t reserve space for speed. This is a problem since we
+ * still need to reserve space for this update, so try to reserve the
+ * space.
+ *
+ * Now if src_rsv == delalloc_block_rsv we''ll let it just steal since
+ * we''re accounted for.
+ */
+ if (!trans->bytes_reserved &&
+ src_rsv != &root->fs_info->delalloc_block_rsv) {
+ ret = btrfs_block_rsv_add_noflush(root, dst_rsv, num_bytes);
+ /*
+ * Since we''re under a transaction reserve_metadata_bytes could
+ * try to commit the transaction which will make it return
+ * EAGAIN to make us stop the transaction we have, so return
+ * ENOSPC instead so that btrfs_dirty_inode knows what to do.
+ */
+ if (ret == -EAGAIN)
+ ret = -ENOSPC;
+ if (!ret)
+ node->bytes_reserved = num_bytes;
+ return ret;
+ }
+
ret = btrfs_block_rsv_migrate(src_rsv, dst_rsv, num_bytes);
if (!ret)
node->bytes_reserved = num_bytes;
@@ -1686,11 +1709,8 @@ int btrfs_delayed_update_inode(struct btrfs_trans_handle
*trans,
}
ret = btrfs_delayed_inode_reserve_metadata(trans, root, delayed_node);
- /*
- * we must reserve enough space when we start a new transaction,
- * so reserving metadata failure is impossible
- */
- BUG_ON(ret);
+ if (ret)
+ goto release_node;
fill_stack_inode_item(trans, &delayed_node->inode_item, inode);
delayed_node->inode_dirty = 1;
diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 19c2d48..8dfe2e0 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -3814,6 +3814,24 @@ int btrfs_block_rsv_add(struct btrfs_root *root,
return ret;
}
+int btrfs_block_rsv_add_noflush(struct btrfs_root *root,
+ struct btrfs_block_rsv *block_rsv,
+ u64 num_bytes)
+{
+ int ret;
+
+ if (num_bytes == 0)
+ return 0;
+
+ ret = reserve_metadata_bytes(root, block_rsv, num_bytes, 0);
+ if (!ret) {
+ block_rsv_add_bytes(block_rsv, num_bytes, 1);
+ return 0;
+ }
+
+ return ret;
+}
+
int btrfs_block_rsv_check(struct btrfs_root *root,
struct btrfs_block_rsv *block_rsv, int min_factor)
{
--
1.7.5.2
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs"
in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Chris Mason
2011-Nov-05 00:47 UTC
Re: [PATCH] Btrfs: fix delayed insertion reservations V2
On Fri, Nov 04, 2011 at 05:18:54PM -0400, Josef Bacik wrote:> V1->V2: I stupidly thought I could get away with some flushing if we needed > space but I was wrong, we could deadlock, so add a btrfs_add_bytes_noflush > variant that will not do any flushing and will just return ENOSPC which will let > us fallback and do our full flushing. > fs/btrfs/ctree.h | 3 +++ > fs/btrfs/delayed-inode.c | 36 ++++++++++++++++++++++++++++-------- > fs/btrfs/extent-tree.c | 18 ++++++++++++++++++ > 3 files changed, 49 insertions(+), 8 deletions(-)This helps but it I''ve got a bunch of these in the log now: btrfs_dirty_inode: 205 callbacks suppressed btrfs: fail to dirty inode 1294 error -28 btrfs: fail to dirty inode 1208 error -28 btrfs: fail to dirty inode 714 error -28 btrfs: fail to dirty inode 1772 error -28 btrfs: fail to dirty inode 1389 error -28 btrfs: fail to dirty inode 770 error -28 btrfs: fail to dirty inode 345 error -28 btrfs: fail to dirty inode 1533 error -28 btrfs: fail to dirty inode 624 error -28 btrfs: fail to dirty inode 1085 error -28 -chris -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Chris Mason
2011-Nov-05 11:46 UTC
Re: [PATCH] Btrfs: fix delayed insertion reservations V2
On Fri, Nov 04, 2011 at 08:47:56PM -0400, Chris Mason wrote:> On Fri, Nov 04, 2011 at 05:18:54PM -0400, Josef Bacik wrote: > > V1->V2: I stupidly thought I could get away with some flushing if we needed > > space but I was wrong, we could deadlock, so add a btrfs_add_bytes_noflush > > variant that will not do any flushing and will just return ENOSPC which will let > > us fallback and do our full flushing. > > fs/btrfs/ctree.h | 3 +++ > > fs/btrfs/delayed-inode.c | 36 ++++++++++++++++++++++++++++-------- > > fs/btrfs/extent-tree.c | 18 ++++++++++++++++++ > > 3 files changed, 49 insertions(+), 8 deletions(-) > > This helps but it I''ve got a bunch of these in the log now: > > btrfs_dirty_inode: 205 callbacks suppressed > btrfs: fail to dirty inode 1294 error -28 > btrfs: fail to dirty inode 1208 error -28 > btrfs: fail to dirty inode 714 error -28 > btrfs: fail to dirty inode 1772 error -28 > btrfs: fail to dirty inode 1389 error -28 > btrfs: fail to dirty inode 770 error -28 > btrfs: fail to dirty inode 345 error -28 > btrfs: fail to dirty inode 1533 error -28 > btrfs: fail to dirty inode 624 error -28 > btrfs: fail to dirty inode 1085 error -28I should add this only seems to happen during xfstest 083. -chris -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Mitch Harder
2011-Nov-05 12:23 UTC
Re: [PATCH] Btrfs: fix delayed insertion reservations V2
On Sat, Nov 5, 2011 at 6:46 AM, Chris Mason <chris.mason@oracle.com> wrote:> On Fri, Nov 04, 2011 at 08:47:56PM -0400, Chris Mason wrote: >> On Fri, Nov 04, 2011 at 05:18:54PM -0400, Josef Bacik wrote: >> > V1->V2: I stupidly thought I could get away with some flushing if we needed >> > space but I was wrong, we could deadlock, so add a btrfs_add_bytes_noflush >> > variant that will not do any flushing and will just return ENOSPC which will let >> > us fallback and do our full flushing. >> > fs/btrfs/ctree.h | 3 +++ >> > fs/btrfs/delayed-inode.c | 36 ++++++++++++++++++++++++++++-------- >> > fs/btrfs/extent-tree.c | 18 ++++++++++++++++++ >> > 3 files changed, 49 insertions(+), 8 deletions(-) >> >> This helps but it I''ve got a bunch of these in the log now: >> >> btrfs_dirty_inode: 205 callbacks suppressed >> btrfs: fail to dirty inode 1294 error -28 >> btrfs: fail to dirty inode 1208 error -28 >> btrfs: fail to dirty inode 714 error -28 >> btrfs: fail to dirty inode 1772 error -28 >> btrfs: fail to dirty inode 1389 error -28 >> btrfs: fail to dirty inode 770 error -28 >> btrfs: fail to dirty inode 345 error -28 >> btrfs: fail to dirty inode 1533 error -28 >> btrfs: fail to dirty inode 624 error -28 >> btrfs: fail to dirty inode 1085 error -28 > > I should add this only seems to happen during xfstest 083. >This version of the patch clears the deadlock issue I was seeing with version 1 of this patch. -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html