Based on observed behavior while measuring dedup performance, I would say a chunk of data and its associated metadata seem to have approximately the same "warmness" in the cache. So when the data gets evicted, the associated metadata tends to be evicted too, and whenever you have a cache miss, instead of needing to fetch one thing from disk (the data) you need to fetch N things from disk (the data plus its metadata).

I would say simply giving a bias to the metadata would be useful, so the metadata would tend to stay in cache even when the data itself is evicted. Useful because the metadata is so *darn* small by comparison with the actual data... it carries a relatively small footprint in RAM, but on a cache miss it more than doubles the disk fetch penalty.

If you consider the extreme bias... if the system would never give up metadata in cache until all the cached data were gone... then it would be similar to the current primarycache=metadata, except that the system would be willing to cache data too, whenever there was available cache otherwise going to waste.
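To make the proposed bias concrete, here is a minimal sketch, assuming hypothetical names (buf_t, metadata_weight, evict_before); this is not the actual ARC code, just one way a relative metadata weight could be folded into an eviction comparison:

    /*
     * Illustrative sketch only -- not ZFS code.  When two buffers are
     * otherwise equally cold, a tunable weight makes the eviction logic
     * drop user data before metadata such as DDT and indirect blocks.
     */
    #include <stdint.h>

    typedef struct buf {
        int      is_metadata;   /* 1 for metadata, 0 for user data */
        uint64_t last_access;   /* coarse recency stamp */
    } buf_t;

    static uint64_t metadata_weight = 128;  /* 1 = no bias; larger = stronger bias */

    /* Return nonzero if 'a' should be evicted before 'b'. */
    static int
    evict_before(const buf_t *a, const buf_t *b, uint64_t now)
    {
        /* The bias scales down the apparent age of metadata buffers. */
        uint64_t age_a = (now - a->last_access) / (a->is_metadata ? metadata_weight : 1);
        uint64_t age_b = (now - b->last_access) / (b->is_metadata ? metadata_weight : 1);
        return (age_a > age_b);
    }

With the weight pushed toward infinity this degenerates into "evict all user data before any metadata", which is the primarycache=metadata-plus-spare-room behavior described above.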
Edward Ned Harvey writes:
> Based on observed behavior while measuring dedup performance, I would say a
> chunk of data and its associated metadata seem to have approximately the
> same "warmness" in the cache. So when the data gets evicted, the associated
> metadata tends to be evicted too, and whenever you have a cache miss,
> instead of needing to fetch one thing from disk (the data) you need to
> fetch N things from disk (the data plus its metadata).
>
> I would say simply giving a bias to the metadata would be useful, so the
> metadata would tend to stay in cache even when the data itself is evicted.
> Useful because the metadata is so *darn* small by comparison with the
> actual data... it carries a relatively small footprint in RAM, but on a
> cache miss it more than doubles the disk fetch penalty.
>
> If you consider the extreme bias... if the system would never give up
> metadata in cache until all the cached data were gone... then it would be
> similar to the current primarycache=metadata, except that the system would
> be willing to cache data too, whenever there was available cache otherwise
> going to waste.

Interesting. Now consider this:

We have an indirect block in memory (those are 16K, referencing 128 individual data blocks). We also have an unrelated data block, say 16K. Neither is currently being referenced, nor has either been for a long time (otherwise it would move up to the head of the cache lists). They reach the tail of the primary cache together. I have room for one of them in the secondary cache.

Absent other information, do we think that the indirect block is more valuable than the data block? At first I also wanted to say that metadata should be favored. Now I can't come up with an argument to favor either one. Therefore I think we need to include more information than just data vs. metadata in the decision process.

Instant Poll: Yes/No?

-r
Edward Ned Harvey writes:
> > If you consider the extreme bias... if the system would never give up
> > metadata in cache until all the cached data were gone... then it would be
> > similar to the current primarycache=metadata, except that the system would
> > be willing to cache data too, whenever there was available cache otherwise
> > going to waste.

I like this, and it could be another value for the same property: metabias, metadata-bias, prefer-metadata, whatever.

On Fri, Jun 03, 2011 at 06:25:45AM -0700, Roch wrote:
> Interesting. Now consider this:
>
> We have an indirect block in memory (those are 16K, referencing 128
> individual data blocks). We also have an unrelated data block, say 16K.
> Neither is currently being referenced, nor has either been for a long time
> (otherwise it would move up to the head of the cache lists). They reach
> the tail of the primary cache together. I have room for one of them in the
> secondary cache.
>
> Absent other information, do we think that the indirect block is more
> valuable than the data block? At first I also wanted to say that metadata
> should be favored. Now I can't come up with an argument to favor either
> one.

The effectiveness of a cache depends on the likelihood of a hit against a cached value versus the cost of keeping it. Including data that may allow us to predict this future likelihood based on past access patterns can improve this immensely. This is what the ARC algorithm does quite well.

Absent this information, we assume the probability of future access to all data blocks not currently in ARC is approximately equal. The indirect metadata block is therefore 127x as likely to be needed as the one data block, since if any of the data blocks it references is needed, so is the indirect block to find it.

> Therefore I think we need to include more information than just data vs.
> metadata in the decision process.

If we have the information to hand, it may help - but we don't. The only thing I can think of that we may have is whether either block was ever on the "frequent" list, or only on the "recent" list, to catch the single-pass sequential access pattern and make it the lower priority for cache residence.

I don't know how feasible it is to check whether any of the blocks referenced by the indirect block are themselves in ARC, nor what that might imply about the future likelihood of further accesses to other blocks indirectly referenced by this one.

> Instant Poll: Yes/No?

Yes for this as an RFE, or at least as a q&d implementation to measure potential benefit.

-- Dan.
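A quick back-of-the-envelope illustration of that "absent other information" argument, with the fan-out and the per-block probability as stated assumptions (only the ratio matters):

    /*
     * Sketch of the reasoning above.  Assume every cold block is equally
     * likely to be the next one read, with probability p.  The indirect block
     * is needed whenever any of the FANOUT data blocks it points to is read;
     * the lone unrelated data block is needed only for itself.
     */
    #include <stdio.h>

    int
    main(void)
    {
        const int    fanout = 128;   /* data blocks covered by one indirect block */
        const double p      = 1e-6;  /* assumed chance a given cold block is read next */

        double p_indirect = fanout * p;  /* approx. chance the indirect block is needed */
        double p_data     = p;           /* chance the single data block is needed */

        printf("indirect block is ~%.0fx as likely to be needed\n",
            p_indirect / p_data);        /* prints 128 under these assumptions */
        return (0);
    }

If some of the referenced blocks are already resident in the ARC (or will never be read again), the real multiplier is smaller, which is why checking their residency, as suggested above, could refine the estimate.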
On Jun 3, 2011, at 6:25 AM, Roch wrote:
>
> Edward Ned Harvey writes:
>> Based on observed behavior while measuring dedup performance, I would say a
>> chunk of data and its associated metadata seem to have approximately the
>> same "warmness" in the cache. So when the data gets evicted, the associated
>> metadata tends to be evicted too, and whenever you have a cache miss,
>> instead of needing to fetch one thing from disk (the data) you need to
>> fetch N things from disk (the data plus its metadata).
>>
>> I would say simply giving a bias to the metadata would be useful, so the
>> metadata would tend to stay in cache even when the data itself is evicted.
>> Useful because the metadata is so *darn* small by comparison with the
>> actual data... it carries a relatively small footprint in RAM, but on a
>> cache miss it more than doubles the disk fetch penalty.
>>
>> If you consider the extreme bias... if the system would never give up
>> metadata in cache until all the cached data were gone... then it would be
>> similar to the current primarycache=metadata, except that the system would
>> be willing to cache data too, whenever there was available cache otherwise
>> going to waste.
>
> Interesting. Now consider this:
>
> We have an indirect block in memory (those are 16K, referencing 128
> individual data blocks). We also have an unrelated data block, say 16K.
> Neither is currently being referenced, nor has either been for a long time
> (otherwise it would move up to the head of the cache lists). They reach
> the tail of the primary cache together. I have room for one of them in the
> secondary cache.
>
> Absent other information, do we think that the indirect block is more
> valuable than the data block? At first I also wanted to say that metadata
> should be favored. Now I can't come up with an argument to favor either
> one. Therefore I think we need to include more information than just data
> vs. metadata in the decision process.
>
> Instant Poll: Yes/No?

No.

Methinks the MRU/MFU balance algorithm adjustment is more fruitful.
 -- richard
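For readers not steeped in ARC internals: the balance Richard refers to is the adaptive split between the cache's recency (MRU) and frequency (MFU) sides. A hit in the MRU ghost list means a recently evicted, once-used block was wanted again, so the MRU target grows; a hit in the MFU ghost list shrinks it. A simplified sketch of that adaptation (names are illustrative, not the actual ZFS code):

    /*
     * Simplified sketch of how the ARC adapts 'p', the target size of the
     * recency (MRU) side; the frequency (MFU) side gets the rest (c - p).
     * The real code scales the step by ghost-list sizes; this shows the idea.
     */
    #include <stdint.h>

    static uint64_t arc_c;  /* overall cache target size */
    static uint64_t arc_p;  /* portion of arc_c reserved for the MRU side */

    static void
    arc_adapt(int hit_mru_ghost, int hit_mfu_ghost, uint64_t step)
    {
        if (hit_mru_ghost) {
            /* We evicted from the MRU side too eagerly: grow its target. */
            arc_p = (arc_p + step > arc_c) ? arc_c : arc_p + step;
        } else if (hit_mfu_ghost) {
            /* We evicted from the MFU side too eagerly: shrink the MRU target. */
            arc_p = (arc_p > step) ? arc_p - step : 0;
        }
    }

Tuning how aggressively p swings, or how ghost hits on metadata versus data are weighted, is one way to influence the metadata/data mix without a hard bias.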
> From: Richard Elling [mailto:richard.elling at gmail.com]
> Sent: Saturday, June 04, 2011 9:10 PM
> > Instant Poll: Yes/No?
>
> No.
>
> Methinks the MRU/MFU balance algorithm adjustment is more fruitful.

Operating under the assumption that cache hits can be predicted, I agree with RE. However, that's not always the case, and if you have a random workload with enough RAM to hold the whole DDT, but not enough RAM to hold your whole storage pool, then dedup hurts your performance dramatically. Your only option is to set primarycache=metadata and simply give up hope that you could *ever* have a userdata cache hit.

The purpose of starting this thread was to suggest it might be worthwhile (particularly with dedup enabled) to at least have the *option* of always keeping the metadata in cache, while still allowing userdata to be cached too, up to the size of c_max. Just in case you might ever see a userdata cache hit. ;-)

And as long as we're taking a moment to think outside the box, it might as well be suggested that this doesn't have to be a binary, all-or-nothing decision. One way to implement such an idea would be to assign a relative weight to metadata versus userdata; Dan and Roch suggested that a value of 128x seems appropriate. I'm sure some people would suggest infinite metadata weight (which is equivalent to the aforementioned primarycache=metadata, plus the ability to cache userdata in the remaining unused ARC space).
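As a rough sense of scale for the "RAM for the DDT but not the whole pool" scenario: the figures below are assumptions (the ~320 bytes per DDT entry is the ballpark often quoted on this list, and 128K is just an assumed average block size), not exact numbers:

    /*
     * Rough illustration of why "RAM for the DDT but not the whole pool"
     * is a common situation.  All sizes here are assumed ballpark figures.
     */
    #include <stdio.h>

    int
    main(void)
    {
        const double pool_bytes      = 10e12;          /* 10 TB of unique data */
        const double avg_block_bytes = 128.0 * 1024;   /* assumed 128K average block */
        const double ddt_entry_bytes = 320.0;          /* assumed per-entry cost */

        double blocks = pool_bytes / avg_block_bytes;          /* ~76 million */
        double ddt    = blocks * ddt_entry_bytes / (1 << 30);  /* in GiB */

        printf("unique blocks: %.0f, DDT: ~%.0f GiB\n", blocks, ddt);
        /* ~76 million blocks, ~23 GiB of DDT: tiny next to 10 TB of data,
         * but bigger than many ARCs unless the box was sized for it. */
        return (0);
    }

So a box with, say, 32 GB of RAM can hold the whole DDT comfortably while the 10 TB of user data obviously cannot fit, which is exactly the case where an always-resident-metadata option would pay off.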
On Sun, Jun 5, 2011 at 9:56 AM, Edward Ned Harvey <opensolarisisdeadlongliveopensolaris at nedharvey.com> wrote:

> > From: Richard Elling [mailto:richard.elling at gmail.com]
> > Sent: Saturday, June 04, 2011 9:10 PM
> > > Instant Poll: Yes/No?
> >
> > No.
> >
> > Methinks the MRU/MFU balance algorithm adjustment is more fruitful.
>
> Operating under the assumption that cache hits can be predicted, I agree
> with RE. However, that's not always the case, and if you have a random
> workload with enough RAM to hold the whole DDT, but not enough RAM to hold
> your whole storage pool, then dedup hurts your performance dramatically.
> Your only option is to set primarycache=metadata and simply give up hope
> that you could *ever* have a userdata cache hit.
>
> The purpose of starting this thread was to suggest it might be worthwhile
> (particularly with dedup enabled) to at least have the *option* of always
> keeping the metadata in cache, while still allowing userdata to be cached
> too, up to the size of c_max. Just in case you might ever see a userdata
> cache hit. ;-)
>
> And as long as we're taking a moment to think outside the box, it might as
> well be suggested that this doesn't have to be a binary, all-or-nothing
> decision. One way to implement such an idea would be to assign a relative
> weight to metadata versus userdata; Dan and Roch suggested that a value of
> 128x seems appropriate. I'm sure some people would suggest infinite
> metadata weight (which is equivalent to the aforementioned
> primarycache=metadata, plus the ability to cache userdata in the remaining
> unused ARC space).
>

I'd go with offering both a weighted option and a forced option. I agree, though, that if you do primarycache=metadata, the system should still attempt to cache userdata if there is additional space remaining.

--Tim
On Sun, Jun 05, 2011 at 01:26:20PM -0500, Tim Cook wrote:
> I'd go with offering both a weighted option and a forced option. I agree,
> though, that if you do primarycache=metadata, the system should still
> attempt to cache userdata if there is additional space remaining.

I think I disagree. Remember that this is a per-dataset attribute/option. One of the reasons to set it on a particular dataset is precisely to leave room in the cache for other datasets, because I know something about the access pattern, desired service level, or underlying storage capability.

For example, for a pool on SSD, I will set secondarycache=none (since L2ARC offers no benefit, only cost in overhead and SSD wear). I may also set primarycache=<something less than data>, since a data miss is still pretty fast and I will get more value using my L1/L2 cache resources for other datasets on slower media.

This is starting to point out that these tunables are a blunt instrument. Perhaps what may be useful is some kind of service-level priority attribute (default 0, values +/- small ints). This could be used in a number of places, including when deciding which of two otherwise-equal pages to evict/demote in cache.

That's effectively what happens anyway, since the blocks do go into the ARC while in use; they're just freed immediately after.

-- Dan.
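For the priority-attribute idea, a minimal sketch of how a small per-dataset integer might break ties at eviction time (all names and the value range are hypothetical; nothing like this exists in ZFS today):

    /*
     * Hypothetical per-dataset service-level priority used as a tie-breaker:
     * between two otherwise equally evictable buffers, drop the one from the
     * lower-priority dataset first, and fall back to recency otherwise.
     */
    #include <stdint.h>

    typedef struct cached_buf {
        uint64_t last_access;   /* recency, as before */
        int8_t   ds_priority;   /* per-dataset hint, e.g. -8 (evict first) .. +8 (keep) */
    } cached_buf_t;

    /* Return nonzero if 'a' should be evicted before 'b'. */
    static int
    evict_before(const cached_buf_t *a, const cached_buf_t *b)
    {
        if (a->ds_priority != b->ds_priority)
            return (a->ds_priority < b->ds_priority);  /* lower priority leaves first */
        return (a->last_access < b->last_access);      /* otherwise plain LRU order */
    }

Exposed as a dataset property, something like this could subsume both the metadata bias (give metadata an implicit bump) and per-dataset intent such as "this filesystem lives on SSD, don't spend cache on it".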