Hi, Chris When I investigated the performance problem of file creation/deletion, I found btrfs spends lots of time in the b-tree search, so I consider whether we can use the latest search result in the same transaction or not. My idea follows: we can add mask or time stamp into b-tree''s node and leaf, then we know whether the node/leaf is COWed by the other task. If not, we check if the node/leaf of the latest search result contains the key that we want to search. By this way, we can reuse the latest search result in the same transaction and reduce the CPU time spent in the b-tree search. Chris, how do you think about it? Regards Miao -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Chris Mason
2010-Oct-13 11:24 UTC
Re: the idea for improving the performance of b-tree search
On Wed, Oct 13, 2010 at 05:00:56PM +0800, Miao Xie wrote:> Hi, Chris > > When I investigated the performance problem of file creation/deletion, I found > btrfs spends lots of time in the b-tree search, so I consider whether we can use > the latest search result in the same transaction or not. > > My idea follows: > we can add mask or time stamp into b-tree''s node and leaf, then we know whether > the node/leaf is COWed by the other task. If not, we check if the node/leaf of > the latest search result contains the key that we want to search. By this way, > we can reuse the latest search result in the same transaction and reduce the CPU > time spent in the b-tree search. > > Chris, how do you think about it?Do your results show which btree searches are happening most often? My impression is that we are usually searching for inodes, and I see 3 different optimizations that are likely to help: 1) cache the block number of the leaf where the inode lives in the inode, or perhaps the whole path down from the root. 2) Do delayed insertion. Each time we create a file we do a number of btree insertions. One for the inode+name backrefs, one for any xattrs, two for the directory entries, one for the directory inode update. The directory name index is especially CPU intensive because it goes in hash order, which will mean a lot of btree balances. Batching these insertions, even in groups of 4 or 8 will make a big difference. 3) Do delayed inode deletion. See above, but change the word insert with delete ;) -chris -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
David Nicol
2010-Oct-13 15:22 UTC
Re: the idea for improving the performance of b-tree search
On Wed, Oct 13, 2010 at 6:24 AM, Chris Mason <chris.mason@oracle.com> wrote:> 3) Do delayed inode deletion. See above, but change the word insert > with delete ;)And the reserved-for-the-future-but-not-used-yet flags field in the ioctl#21 control structure gets a use: bits to indicate for the completion of which of the potential delayed things the caller is waiting. -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Chris Mason
2010-Oct-13 15:45 UTC
Re: the idea for improving the performance of b-tree search
On Wed, Oct 13, 2010 at 10:22:30AM -0500, David Nicol wrote:> On Wed, Oct 13, 2010 at 6:24 AM, Chris Mason <chris.mason@oracle.com> wrote: > > > 3) Do delayed inode deletion. See above, but change the word insert > > with delete ;) > > And the reserved-for-the-future-but-not-used-yet flags field in the > ioctl#21 control structure gets a use: bits to indicate for the > completion of which of the potential delayed things the caller is > waiting.This is similar to the delayed allocation we already do for file data extents. It is supposed to be transparent to userland, and just a way to more efficiently poke the on disk structures. -chris -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
David Nicol
2010-Oct-13 20:38 UTC
Re: the idea for improving the performance of b-tree search
>> > 3) Do delayed inode deletion. See above, but change the word insert >> > with delete ;) >> >> And the reserved-for-the-future-but-not-used-yet flags field in the >> ioctl#21 control structure gets a use: bits to indicate for the >> completion of which of the potential delayed things the caller is >> waiting. > > This is similar to the delayed allocation we already do for file data > extents. It is supposed to be transparent to userland, and just a way > to more efficiently poke the on disk structures. > > -chrisWhat I mean is, if we want to give a userland way to see when any deferred anything is done, using one wait queue that wakes all waiters on all completion events will be simpler than having a separate wait queue structure for every class of event, at the cost of unnecessary tests when a waiter is woken on the wrong event. That is what is traded off when deciding how many wait queues -- just like locks -- to have; it''s a granularity question. Now that there is rmdir of empty subvolumes, though, ioctl#21 might be redundant; I suppose that''s your call. Dave -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Shaohua Li
2010-Oct-18 02:41 UTC
Re: the idea for improving the performance of b-tree search
Hi Miao & Chris, On Wed, Oct 13, 2010 at 05:00:56PM +0800, Miao Xie wrote:> When I investigated the performance problem of file creation/deletion, I found > btrfs spends lots of time in the b-tree search, so I consider whether we can use > the latest search result in the same transaction or not. > > My idea follows: > we can add mask or time stamp into b-tree''s node and leaf, then we know whether > the node/leaf is COWed by the other task. If not, we check if the node/leaf of > the latest search result contains the key that we want to search. By this way, > we can reuse the latest search result in the same transaction and reduce the CPU > time spent in the b-tree search.Does this patch help a little for the tree search? http://marc.info/?l=linux-btrfs&m=127175532803943&w=2 It doesn''t solve all the issues, but in my test, it does help. Thanks, Shaohua -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Miao Xie
2010-Oct-18 09:36 UTC
Re: the idea for improving the performance of b-tree search
On Mon, 18 Oct 2010 10:41:41 +0800, Shaohua Li wrote:> Hi Miao& Chris, > On Wed, Oct 13, 2010 at 05:00:56PM +0800, Miao Xie wrote: >> When I investigated the performance problem of file creation/deletion, I found >> btrfs spends lots of time in the b-tree search, so I consider whether we can use >> the latest search result in the same transaction or not. >> >> My idea follows: >> we can add mask or time stamp into b-tree''s node and leaf, then we know whether >> the node/leaf is COWed by the other task. If not, we check if the node/leaf of >> the latest search result contains the key that we want to search. By this way, >> we can reuse the latest search result in the same transaction and reduce the CPU >> time spent in the b-tree search. > Does this patch help a little for the tree search? > http://marc.info/?l=linux-btrfs&m=127175532803943&w=2 > It doesn''t solve all the issues, but in my test, it does help.Thanks for your patch, but according to my test result, the time of b-tree search has not been reduced. I think the reason is that extent-state is used when btrfs writes data into the disk, so your patch is helpful when the disk I/O happens, but my test just measures the cost of the b-tree searchs/insertions/removals, the disk I/O doesn''t happen. Regards Miao -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html