The first patch aims to fix the bug of repeatedly building the inode cache.
The next two patches fix problems that show up once the first one is applied.

Liu Bo (3):
  Btrfs: avoid building inode cache repeatly
  Btrfs: don't build inode cache for orphan root
  Btrfs: fix EEXIST error when creating new file in subvolume/snapshot

 fs/btrfs/disk-io.c   |  3 +++
 fs/btrfs/inode-map.c | 25 +++++++++++++++++++++----
 fs/btrfs/inode-map.h |  1 +
 fs/btrfs/root-tree.c |  3 +++
 4 files changed, 28 insertions(+), 4 deletions(-)

--
1.8.2.1

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
The inode cache is similar to the free space cache and in fact shares the
same code. However, we don't load the inode cache unless we're about to
allocate an inode id. There is then a case where we only commit the
transaction during other operations, such as snapshot creation: we update
the fs roots' generation to the new transaction id, and afterwards, when we
want to load the inode cache, we find that it is not valid because of the
generation mismatch. We then have to push the btrfs-ino-cache thread to
build the inode cache from disk, and this operation is sometimes
time-consuming.

So to fix the above, we load the inode cache into memory while reading the
fs root.

Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
---
v2: fix race issue pointed out by Miao.

 fs/btrfs/disk-io.c   | 3 +++
 fs/btrfs/inode-map.c | 6 ++++++
 fs/btrfs/inode-map.h | 1 +
 fs/btrfs/root-tree.c | 3 +++
 4 files changed, 13 insertions(+)

diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 8072cfa..59af2aa 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -1630,6 +1630,9 @@ again:
 		}
 		goto fail;
 	}
+
+	btrfs_start_ino_caching(root);
+
 	return root;
 fail:
 	free_fs_root(root);
diff --git a/fs/btrfs/inode-map.c b/fs/btrfs/inode-map.c
index ab485e5..f23b0df 100644
--- a/fs/btrfs/inode-map.c
+++ b/fs/btrfs/inode-map.c
@@ -179,6 +179,12 @@ static void start_caching(struct btrfs_root *root)
 	BUG_ON(IS_ERR(tsk)); /* -ENOMEM */
 }
 
+void btrfs_start_ino_caching(struct btrfs_root *root)
+{
+	if (root)
+		start_caching(root);
+}
+
 int btrfs_find_free_ino(struct btrfs_root *root, u64 *objectid)
 {
 	if (!btrfs_test_opt(root, INODE_MAP_CACHE))
diff --git a/fs/btrfs/inode-map.h b/fs/btrfs/inode-map.h
index ddb347b..5acf943 100644
--- a/fs/btrfs/inode-map.h
+++ b/fs/btrfs/inode-map.h
@@ -4,6 +4,7 @@
 void btrfs_init_free_ino_ctl(struct btrfs_root *root);
 void btrfs_unpin_free_ino(struct btrfs_root *root);
 void btrfs_return_ino(struct btrfs_root *root, u64 objectid);
+void btrfs_start_ino_caching(struct btrfs_root *root);
 int btrfs_find_free_ino(struct btrfs_root *root, u64 *objectid);
 int btrfs_save_ino_cache(struct btrfs_root *root,
 			 struct btrfs_trans_handle *trans);
diff --git a/fs/btrfs/root-tree.c b/fs/btrfs/root-tree.c
index ec71ea4..d4b6cfc 100644
--- a/fs/btrfs/root-tree.c
+++ b/fs/btrfs/root-tree.c
@@ -21,6 +21,7 @@
 #include "transaction.h"
 #include "disk-io.h"
 #include "print-tree.h"
+#include "inode-map.h"
 
 /*
  * Read a root item from the tree. In case we detect a root item smaller then
@@ -316,6 +317,8 @@ int btrfs_find_orphan_roots(struct btrfs_root *tree_root)
 
 		if (btrfs_root_refs(&root->root_item) == 0)
 			btrfs_add_dead_root(root);
+		else
+			btrfs_start_ino_caching(root);
 	}
 
 	btrfs_free_path(path);
--
1.8.2.1
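The generation mismatch described in the commit message can be illustrated with a small user-space model. This is only a sketch under stated assumptions: `struct model_root` and every `model_*` function are hypothetical names invented here, not btrfs code, and the model assumes a commit restamps the saved cache only when the cache was loaded during that transaction.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical user-space model; none of these names are btrfs APIs. */
struct model_root {
	uint64_t root_generation;  /* generation recorded for the fs root */
	uint64_t cache_generation; /* generation in the saved cache header */
	bool cache_loaded;         /* cache is in memory for this mount */
	unsigned int rebuilds;     /* full rebuilds from disk so far */
};

/* The saved cache is usable only when its generation matches the root's. */
static bool model_cache_valid(const struct model_root *r)
{
	return r->cache_generation == r->root_generation;
}

/* Load the cache once; a stale cache costs one rebuild from disk. */
static void model_load_cache(struct model_root *r)
{
	if (r->cache_loaded)
		return;
	if (!model_cache_valid(r))
		r->rebuilds++;
	r->cache_loaded = true;
}

/*
 * Commit a transaction: the root generation always advances, but the
 * saved cache is restamped only if it was loaded (assumption of this model).
 */
static void model_commit(struct model_root *r, uint64_t transid)
{
	assert(transid > r->root_generation);
	r->root_generation = transid;
	if (r->cache_loaded)
		r->cache_generation = transid;
}

/* Lazy policy: commit first (e.g. a snapshot), load at first allocation. */
static unsigned int model_rebuilds_lazy(void)
{
	struct model_root r = { 10, 10, false, 0 };

	model_commit(&r, 11);
	model_load_cache(&r);
	return r.rebuilds;
}

/* Eager policy: load while reading the root, before any commit. */
static unsigned int model_rebuilds_eager(void)
{
	struct model_root r = { 10, 10, false, 0 };

	model_load_cache(&r);
	model_commit(&r, 11);
	model_load_cache(&r);
	return r.rebuilds;
}
```

In this model the lazy policy pays for one rebuild after a commit that never touched the cache, while the eager policy pays for none. Whether the real cost difference is as clean depends on details the model leaves out.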
Liu Bo
2013-Dec-16 07:25 UTC
[PATCH v2 2/3] Btrfs: don't build inode cache for orphan root
We check whether we have orphan roots when mounting btrfs, but orphan roots
are those that are already dead and about to be freed, so don't start
building the inode cache for them, otherwise we'll get an ugly crash.

Acked-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
---
 fs/btrfs/inode-map.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/inode-map.c b/fs/btrfs/inode-map.c
index f23b0df..b7fb1a8 100644
--- a/fs/btrfs/inode-map.c
+++ b/fs/btrfs/inode-map.c
@@ -141,7 +141,9 @@ static void start_caching(struct btrfs_root *root)
 	int ret;
 	u64 objectid;
 
-	if (!btrfs_test_opt(root, INODE_MAP_CACHE))
+	/* Don't even start if this is an orphan root. */
+	if (!btrfs_test_opt(root, INODE_MAP_CACHE) ||
+	    btrfs_root_refs(&root->root_item) == 0)
 		return;
 
 	spin_lock(&root->cache_lock);
--
1.8.2.1
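The guard added to start_caching() can be restated as a stand-alone predicate. The following is a minimal sketch, assuming a simplified root that carries only a reference count and a flag for the inode cache mount option; `struct model_root` and `model_should_start_caching` are hypothetical names, not kernel symbols.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the two facts the guard looks at. */
struct model_root {
	uint32_t refs;          /* 0 means the root is an orphan (dead) */
	bool ino_cache_enabled; /* models the INODE_MAP_CACHE mount option */
};

/* Caching starts only for live roots with the option enabled. */
static bool model_should_start_caching(const struct model_root *root)
{
	assert(root);
	if (!root->ino_cache_enabled || root->refs == 0)
		return false;
	return true;
}
```

A root with refs == 0 is rejected even when the option is on, which mirrors the condition the patch adds.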
Liu Bo
2013-Dec-16 07:25 UTC
[PATCH v2 3/3] Btrfs: fix EEXIST error when creating new file in subvolume/snapshot
While creating a subvolume/snapshot, we don't use the inode cache to
allocate an inode id for the root dir ".", so the inode cache doesn't mark
that id as used, and when we create a new file, it may be handed that same
id and fail with -EEXIST.

Reviewed-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
---
 fs/btrfs/inode-map.c | 15 ++++++++++++---
 1 file changed, 12 insertions(+), 3 deletions(-)

diff --git a/fs/btrfs/inode-map.c b/fs/btrfs/inode-map.c
index b7fb1a8..77cb72a 100644
--- a/fs/btrfs/inode-map.c
+++ b/fs/btrfs/inode-map.c
@@ -532,6 +532,16 @@ static int btrfs_find_highest_objectid(struct btrfs_root *root, u64 *objectid)
 	struct btrfs_key search_key;
 	struct btrfs_key found_key;
 	int slot;
+	u64 min_objectid;
+
+	/*
+	 * For fs/file tree, FIRST_FREE_OBJECTID is reserved for
+	 * root dir "."
+	 */
+	if (is_fstree(root->root_key.objectid))
+		min_objectid = BTRFS_FIRST_FREE_OBJECTID;
+	else
+		min_objectid = BTRFS_FIRST_FREE_OBJECTID - 1;
 
 	path = btrfs_alloc_path();
 	if (!path)
@@ -548,10 +558,9 @@ static int btrfs_find_highest_objectid(struct btrfs_root *root, u64 *objectid)
 		slot = path->slots[0] - 1;
 		l = path->nodes[0];
 		btrfs_item_key_to_cpu(l, &found_key, slot);
-		*objectid = max_t(u64, found_key.objectid,
-				  BTRFS_FIRST_FREE_OBJECTID - 1);
+		*objectid = max_t(u64, found_key.objectid, min_objectid);
 	} else {
-		*objectid = BTRFS_FIRST_FREE_OBJECTID - 1;
+		*objectid = min_objectid;
 	}
 	ret = 0;
 error:
--
1.8.2.1
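The effect of the new floor can be checked with a user-space restatement of the patched logic. This is a sketch under assumptions: the constants are taken to be 5 for the fs tree objectid and 256 for the first free objectid, `model_is_fstree` is a simplified stand-in for is_fstree(), and the sketch assumes the allocator hands out the highest objectid plus one. All `model_*` and `MODEL_*` names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed values for the btrfs constants referenced by the patch. */
#define MODEL_FS_TREE_OBJECTID    5ULL
#define MODEL_FIRST_FREE_OBJECTID 256ULL

/* Simplified stand-in for is_fstree(): the fs tree or a subvolume/snapshot. */
static bool model_is_fstree(uint64_t rootid)
{
	return rootid == MODEL_FS_TREE_OBJECTID ||
	       (int64_t)rootid >= (int64_t)MODEL_FIRST_FREE_OBJECTID;
}

/*
 * Mirror of the patched floor: for fs/file trees the first free objectid
 * is already taken by the root dir ".", so the floor is that id itself.
 */
static uint64_t model_highest_objectid(uint64_t rootid, bool has_items,
				       uint64_t highest_found)
{
	uint64_t min_objectid;

	if (model_is_fstree(rootid))
		min_objectid = MODEL_FIRST_FREE_OBJECTID;
	else
		min_objectid = MODEL_FIRST_FREE_OBJECTID - 1;

	if (!has_items)
		return min_objectid;
	return highest_found > min_objectid ? highest_found : min_objectid;
}

/* Assumption: the next id handed out is the highest objectid plus one. */
static uint64_t model_next_ino(uint64_t rootid, bool has_items,
			       uint64_t highest_found)
{
	return model_highest_objectid(rootid, has_items, highest_found) + 1;
}
```

Under these assumptions, the first id handed out in a fs/file tree is 257, so it cannot collide with the root dir at 256.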
Miao Xie
2013-Dec-16 09:04 UTC
Re: [PATCH v2 1/3] Btrfs: avoid building inode cache repeatly
On Mon, 16 Dec 2013 15:25:33 +0800, Liu Bo wrote:
> The inode cache is similar to the free space cache and in fact shares the
> same code. However, we don't load the inode cache unless we're about to
> allocate an inode id. There is then a case where we only commit the
> transaction during other operations, such as snapshot creation: we update
> the fs roots' generation to the new transaction id, and afterwards, when we
> want to load the inode cache, we find that it is not valid because of the
> generation mismatch. We then have to push the btrfs-ino-cache thread to
> build the inode cache from disk, and this operation is sometimes
> time-consuming.
>
> So to fix the above, we load the inode cache into memory while reading the
> fs root.
>
> Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
> ---
> v2: fix race issue pointed out by Miao.
>
>  fs/btrfs/disk-io.c   | 3 +++
>  fs/btrfs/inode-map.c | 6 ++++++
>  fs/btrfs/inode-map.h | 1 +
>  fs/btrfs/root-tree.c | 3 +++
>  4 files changed, 13 insertions(+)
>
> diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
> index 8072cfa..59af2aa 100644
> --- a/fs/btrfs/disk-io.c
> +++ b/fs/btrfs/disk-io.c
> @@ -1630,6 +1630,9 @@ again:
>  		}
>  		goto fail;
>  	}
> +
> +	btrfs_start_ino_caching(root);
> +
>  	return root;
>  fail:
>  	free_fs_root(root);
> diff --git a/fs/btrfs/inode-map.c b/fs/btrfs/inode-map.c
> index ab485e5..f23b0df 100644
> --- a/fs/btrfs/inode-map.c
> +++ b/fs/btrfs/inode-map.c
> @@ -179,6 +179,12 @@ static void start_caching(struct btrfs_root *root)
>  	BUG_ON(IS_ERR(tsk)); /* -ENOMEM */
>  }
>
> +void btrfs_start_ino_caching(struct btrfs_root *root)
> +{
> +	if (root)
> +		start_caching(root);
> +}

We are sure root is not NULL, so this check is unnecessary.

I dug into the problem. I don't think loading the inode cache while reading
the fs root is a good way to fix it, because in some cases we read the
fs/file root but don't want to allocate/free inode ids.

I think we can add a flag that marks whether the fs/file root has an inode
id cache. We can set the flag when we read the fs/file root. If the flag is
set but we don't allocate/free inode ids from/to the inode id cache, we set
the generation in the space cache header to 0, which avoids loading an
invalid inode cache, and then clear the flag.

How about this idea?

Thanks
Miao

> +
>  int btrfs_find_free_ino(struct btrfs_root *root, u64 *objectid)
>  {
>  	if (!btrfs_test_opt(root, INODE_MAP_CACHE))
> diff --git a/fs/btrfs/inode-map.h b/fs/btrfs/inode-map.h
> index ddb347b..5acf943 100644
> --- a/fs/btrfs/inode-map.h
> +++ b/fs/btrfs/inode-map.h
> @@ -4,6 +4,7 @@
>  void btrfs_init_free_ino_ctl(struct btrfs_root *root);
>  void btrfs_unpin_free_ino(struct btrfs_root *root);
>  void btrfs_return_ino(struct btrfs_root *root, u64 objectid);
> +void btrfs_start_ino_caching(struct btrfs_root *root);
>  int btrfs_find_free_ino(struct btrfs_root *root, u64 *objectid);
>  int btrfs_save_ino_cache(struct btrfs_root *root,
>  			 struct btrfs_trans_handle *trans);
> diff --git a/fs/btrfs/root-tree.c b/fs/btrfs/root-tree.c
> index ec71ea4..d4b6cfc 100644
> --- a/fs/btrfs/root-tree.c
> +++ b/fs/btrfs/root-tree.c
> @@ -21,6 +21,7 @@
>  #include "transaction.h"
>  #include "disk-io.h"
>  #include "print-tree.h"
> +#include "inode-map.h"
>
>  /*
>   * Read a root item from the tree. In case we detect a root item smaller then
> @@ -316,6 +317,8 @@ int btrfs_find_orphan_roots(struct btrfs_root *tree_root)
>
>  		if (btrfs_root_refs(&root->root_item) == 0)
>  			btrfs_add_dead_root(root);
> +		else
> +			btrfs_start_ino_caching(root);
>  	}
>
>  	btrfs_free_path(path);
>
Liu Bo
2013-Dec-16 10:26 UTC
Re: [PATCH v2 1/3] Btrfs: avoid building inode cache repeatly
On Mon, Dec 16, 2013 at 05:04:33PM +0800, Miao Xie wrote:
> On Mon, 16 Dec 2013 15:25:33 +0800, Liu Bo wrote:
> > The inode cache is similar to the free space cache and in fact shares the
> > same code. However, we don't load the inode cache unless we're about to
> > allocate an inode id. There is then a case where we only commit the
> > transaction during other operations, such as snapshot creation: we update
> > the fs roots' generation to the new transaction id, and afterwards, when we
> > want to load the inode cache, we find that it is not valid because of the
> > generation mismatch. We then have to push the btrfs-ino-cache thread to
> > build the inode cache from disk, and this operation is sometimes
> > time-consuming.
> >
> > So to fix the above, we load the inode cache into memory while reading the
> > fs root.
> >
> > Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
> > ---
> > v2: fix race issue pointed out by Miao.
> >
> >  fs/btrfs/disk-io.c   | 3 +++
> >  fs/btrfs/inode-map.c | 6 ++++++
> >  fs/btrfs/inode-map.h | 1 +
> >  fs/btrfs/root-tree.c | 3 +++
> >  4 files changed, 13 insertions(+)
> >
> > diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
> > index 8072cfa..59af2aa 100644
> > --- a/fs/btrfs/disk-io.c
> > +++ b/fs/btrfs/disk-io.c
> > @@ -1630,6 +1630,9 @@ again:
> >  		}
> >  		goto fail;
> >  	}
> > +
> > +	btrfs_start_ino_caching(root);
> > +
> >  	return root;
> >  fail:
> >  	free_fs_root(root);
> > diff --git a/fs/btrfs/inode-map.c b/fs/btrfs/inode-map.c
> > index ab485e5..f23b0df 100644
> > --- a/fs/btrfs/inode-map.c
> > +++ b/fs/btrfs/inode-map.c
> > @@ -179,6 +179,12 @@ static void start_caching(struct btrfs_root *root)
> >  	BUG_ON(IS_ERR(tsk)); /* -ENOMEM */
> >  }
> >
> > +void btrfs_start_ino_caching(struct btrfs_root *root)
> > +{
> > +	if (root)
> > +		start_caching(root);
> > +}
>
> We are sure root is not NULL, so this check is unnecessary.
>
> I dug into the problem. I don't think loading the inode cache while reading
> the fs root is a good way to fix it, because in some cases we read the
> fs/file root but don't want to allocate/free inode ids.
>
> I think we can add a flag that marks whether the fs/file root has an inode
> id cache. We can set the flag when we read the fs/file root. If the flag is
> set but we don't allocate/free inode ids from/to the inode id cache, we set
> the generation in the space cache header to 0, which avoids loading an
> invalid inode cache, and then clear the flag.
>
> How about this idea?

That's the same as the current code.

If we don't allocate/free inode ids, @root->cached remains BTRFS_CACHE_NO,
and btrfs_save_ino_cache() will set the inode cache's generation to 0.

So the problem of rebuilding the inode cache repeatedly is not about loading
an invalid ino-cache.

btrfs_save_ino_cache() cleans up a valid inode cache during transaction
commit because of the INODE_MAP_CACHE option: it finds that the inode cache
was not even loaded during the committed transaction, and it just skips
writing out the inode cache. The next time we allocate inode ids, we load
the inode cache and find that it is there but already outdated, so we have
to rebuild another one identical to the previous cache.

To address your concern, btrfs_save_ino_cache() reminds us that only the fs
tree and subvol/snap need an ino cache, so I think adding a check like that
is enough to filter out those fs/file roots where we don't want to
allocate/free inode ids, e.g. the data reloc root.

-liubo
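The filter mentioned at the end of this reply (only the fs tree and subvolumes/snapshots need an ino cache) could look roughly like the following user-space predicate. This is a sketch, not the kernel's check: the objectid values (5 for the fs tree, 256 and (u64)-256 for the free objectid range, (u64)-9 for the data reloc tree) are assumptions about the btrfs constants, and every `MODEL_*` and `model_*` name is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed values for the relevant btrfs objectids. */
#define MODEL_FS_TREE_OBJECTID         5ULL
#define MODEL_FIRST_FREE_OBJECTID      256ULL
#define MODEL_LAST_FREE_OBJECTID       ((uint64_t)-256)
#define MODEL_DATA_RELOC_TREE_OBJECTID ((uint64_t)-9)

/*
 * Only the fs tree and subvolume/snapshot roots, whose ids fall in the
 * free objectid range, are treated as needing an inode cache.
 */
static bool model_root_needs_ino_cache(uint64_t root_objectid)
{
	if (root_objectid == MODEL_FS_TREE_OBJECTID)
		return true;
	return root_objectid >= MODEL_FIRST_FREE_OBJECTID &&
	       root_objectid <= MODEL_LAST_FREE_OBJECTID;
}
```

With these assumed values, the data reloc root falls outside the range and is filtered out, which is the example the reply gives.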
Miao Xie
2013-Dec-16 10:54 UTC
Re: [PATCH v2 1/3] Btrfs: avoid building inode cache repeatly
On Mon, 16 Dec 2013 18:26:02 +0800, Liu Bo wrote:
> On Mon, Dec 16, 2013 at 05:04:33PM +0800, Miao Xie wrote:
>> On Mon, 16 Dec 2013 15:25:33 +0800, Liu Bo wrote:
>>> The inode cache is similar to the free space cache and in fact shares the
>>> same code. However, we don't load the inode cache unless we're about to
>>> allocate an inode id. There is then a case where we only commit the
>>> transaction during other operations, such as snapshot creation: we update
>>> the fs roots' generation to the new transaction id, and afterwards, when we
>>> want to load the inode cache, we find that it is not valid because of the
>>> generation mismatch. We then have to push the btrfs-ino-cache thread to
>>> build the inode cache from disk, and this operation is sometimes
>>> time-consuming.
>>>
>>> So to fix the above, we load the inode cache into memory while reading the
>>> fs root.
>>>
>>> Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
>>> ---
>>> v2: fix race issue pointed out by Miao.
>>>
>>>  fs/btrfs/disk-io.c   | 3 +++
>>>  fs/btrfs/inode-map.c | 6 ++++++
>>>  fs/btrfs/inode-map.h | 1 +
>>>  fs/btrfs/root-tree.c | 3 +++
>>>  4 files changed, 13 insertions(+)
>>>
>>> diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
>>> index 8072cfa..59af2aa 100644
>>> --- a/fs/btrfs/disk-io.c
>>> +++ b/fs/btrfs/disk-io.c
>>> @@ -1630,6 +1630,9 @@ again:
>>>  		}
>>>  		goto fail;
>>>  	}
>>> +
>>> +	btrfs_start_ino_caching(root);
>>> +
>>>  	return root;
>>>  fail:
>>>  	free_fs_root(root);
>>> diff --git a/fs/btrfs/inode-map.c b/fs/btrfs/inode-map.c
>>> index ab485e5..f23b0df 100644
>>> --- a/fs/btrfs/inode-map.c
>>> +++ b/fs/btrfs/inode-map.c
>>> @@ -179,6 +179,12 @@ static void start_caching(struct btrfs_root *root)
>>>  	BUG_ON(IS_ERR(tsk)); /* -ENOMEM */
>>>  }
>>>
>>> +void btrfs_start_ino_caching(struct btrfs_root *root)
>>> +{
>>> +	if (root)
>>> +		start_caching(root);
>>> +}
>>
>> We are sure root is not NULL, so this check is unnecessary.
>>
>> I dug into the problem. I don't think loading the inode cache while reading
>> the fs root is a good way to fix it, because in some cases we read the
>> fs/file root but don't want to allocate/free inode ids.
>>
>> I think we can add a flag that marks whether the fs/file root has an inode
>> id cache. We can set the flag when we read the fs/file root. If the flag is
>> set but we don't allocate/free inode ids from/to the inode id cache, we set
>> the generation in the space cache header to 0, which avoids loading an
>> invalid inode cache, and then clear the flag.
>>
>> How about this idea?
>
> That's the same as the current code.

One important point I forgot: use the generation in the space cache header
to check the inode cache generation; don't use the root generation.

> If we don't allocate/free inode ids, @root->cached remains BTRFS_CACHE_NO,
> and btrfs_save_ino_cache() will set the inode cache's generation to 0.
>
> So the problem of rebuilding the inode cache repeatedly is not about loading
> an invalid ino-cache.
>
> btrfs_save_ino_cache() cleans up a valid inode cache during transaction
> commit because of the INODE_MAP_CACHE option: it finds that the inode cache
> was not even loaded during the committed transaction, and it just skips
> writing out the inode cache. The next time we allocate inode ids, we load
> the inode cache and find that it is there but already outdated, so we have
> to rebuild another one identical to the previous cache.
>
> To address your concern, btrfs_save_ino_cache() reminds us that only the fs
> tree and subvol/snap need an ino cache, so I think adding a check like that
> is enough to filter out those fs/file roots where we don't want to
> allocate/free inode ids, e.g. the data reloc root.

Use a flag to indicate whether the ino cache is dirty, just like the free
space cache. If it is not dirty, skip the save process.

Thanks
Miao
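One possible reading of the dirty-flag suggestion is sketched below as a user-space model. It is an interpretation, not a proposed kernel change: `struct model_ino_cache` and the `model_*` functions are hypothetical, and the sketch assumes that a clean cache keeps its saved header generation untouched across commits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a saved inode cache with a dirty flag. */
struct model_ino_cache {
	bool dirty;                 /* set when an inode id is allocated/freed */
	uint64_t header_generation; /* generation stamped at the last save */
	unsigned int writes;        /* number of times the cache was written */
};

/* Allocating or freeing an inode id changes the cache contents. */
static void model_mark_dirty(struct model_ino_cache *c)
{
	assert(c);
	c->dirty = true;
}

/*
 * Save at commit time only when dirty; a clean cache is left as-is,
 * so its header generation still describes valid contents.
 * Returns true if the cache was written out.
 */
static bool model_save(struct model_ino_cache *c, uint64_t transid)
{
	assert(c);
	if (!c->dirty)
		return false;
	c->header_generation = transid;
	c->writes++;
	c->dirty = false;
	return true;
}
```

In this reading, commits that never touch inode ids leave the saved cache alone, so a later load would not need to treat it as outdated, provided validity is judged by the header generation as suggested in the same reply.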
Chris Mason
2013-Dec-16 15:09 UTC
Re: [PATCH v2 1/3] Btrfs: avoid building inode cache repeatly
On Mon, 2013-12-16 at 15:25 +0800, Liu Bo wrote:
> The inode cache is similar to the free space cache and in fact shares the
> same code. However, we don't load the inode cache unless we're about to
> allocate an inode id. There is then a case where we only commit the
> transaction during other operations, such as snapshot creation: we update
> the fs roots' generation to the new transaction id, and afterwards, when we
> want to load the inode cache, we find that it is not valid because of the
> generation mismatch. We then have to push the btrfs-ino-cache thread to
> build the inode cache from disk, and this operation is sometimes
> time-consuming.
>
> So to fix the above, we load the inode cache into memory while reading the
> fs root.

Please reorder these so the patch that causes problems comes after the
patches that fix the problems ;)

IOW, please make it bisect friendly.

-chris