Hi. We're running OmniOS as a ZFS storage server. For some reason, our ARC
grows to a certain point and then suddenly drops. I used arcstat to catch it
in action, but I was not able to capture what else was going on in the system
at the time. I'll do that next.

read  hits  miss  hit%  l2read  l2hits  l2miss  l2hit%  arcsz  l2size
 166   166     0   100       0       0       0       0    85G    225G
5.9K  5.9K     0   100       0       0       0       0    85G    225G
 755   715    40    94      40       0      40       0    84G    225G
 17K   17K     0   100       0       0       0       0    67G    225G
 409   395    14    96      14       0      14       0    49G    225G
 388   364    24    93      24       0      24       0    41G    225G
 37K   37K    20    99      20       6      14      30    40G    225G

For reference, it's a 12TB pool with a 512GB SSD L2ARC and 198GB of RAM. We
have nothing else running on the system except NFS, and we are not using
dedup. Here is the output of ::memstat at one point:

# echo ::memstat | mdb -k
Page Summary                Pages                MB  %Tot
------------     ----------------  ----------------  ----
Kernel                   19061902             74460   38%
ZFS File Data            28237282            110301   56%
Anon                        43112               168    0%
Exec and libs                1522                 5    0%
Page cache                  13509                52    0%
Free (cachelist)             6366                24    0%
Free (freelist)           2958527             11556    6%

Total                    50322220            196571
Physical                 50322219            196571

According to "prstat -s rss", nothing else is consuming the memory:

 592 root       33M   26M sleep   59    0   0:00:33 0.0% fmd/27
  12 root       13M   11M sleep   59    0   0:00:08 0.0% svc.configd/21
 641 root       12M   11M sleep   59    0   0:04:48 0.0% snmpd/1
  10 root       14M   10M sleep   59    0   0:00:03 0.0% svc.startd/16
 342 root       12M 9084K sleep   59    0   0:00:15 0.0% hald/5
 321 root       14M 8652K sleep   59    0   0:03:00 0.0% nscd/52

So far I can't figure out what could be causing this. The only other thing I
can think of is that we have a bunch of zfs send/receive operations running
as backups across 10 datasets in the pool. I am not sure how snapshots and
send/receive affect the ARC. Does anyone else have any ideas?

Thanks,
Chris
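Something like the following logging loop should be enough to capture the next drop with context (a rough sketch; it assumes the ARC counters live under the zfs:0:arcstats kstat, which is where stock illumos puts them, and the log path is arbitrary):

  #!/bin/sh
  # Record ARC size plus memory and process state once a minute so that
  # whatever is happening around a sudden ARC drop gets captured.
  OUT=/var/tmp/arcwatch.log
  while true; do
      date >> $OUT
      # Current ARC size and target size, in bytes.
      kstat -p zfs:0:arcstats:size zfs:0:arcstats:c >> $OUT
      # Kernel vs. ZFS file data vs. free memory.
      echo ::memstat | mdb -k >> $OUT
      # Top memory consumers at that moment.
      prstat -s rss -n 10 1 1 >> $OUT
      echo "----" >> $OUT
      sleep 60
  done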
Hi,

If it stays at the smaller size after it decreases, it might be similar to:

  7111576 arc shrinks in the absence of memory pressure

Also, see the document:

  ZFS ARC can shrink down without memory pressure result in slow
  performance [ID 1404581.1]

Specifically, check whether arc_no_grow is set to 1 after the cache size has
decreased, and whether it stays that way.

The fix is in one of the SRUs, and I think it should be in 11.1. I don't know
if it was fixed in Illumos, or even if Illumos was affected by this at all.

--
Robert Milkowski
http://milek.blogspot.com
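One quick way to check that flag on a live system is with mdb (a sketch; it assumes the arc_no_grow kernel symbol is visible to mdb -k, as it normally is on Solaris/illumos, and that you run it periodically so you see the value both before and after a drop):

  # Print arc_no_grow as a decimal: 0 means the ARC may grow, 1 means growth
  # is disabled.
  echo "arc_no_grow/D" | mdb -k

  # For context, also dump the ARC target size and its limits from the
  # arcstats kstat.
  kstat -p zfs:0:arcstats:c zfs:0:arcstats:c_max zfs:0:arcstats:c_min

If arc_no_grow stays at 1 even though ::memstat shows plenty of free memory, that matches the behaviour described in the bug above.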
On 22 October, 2012 - Robert Milkowski sent me these 3,6K bytes:

> The fix is in one of the SRUs and I think it should be in 11.1.
> I don't know if it was fixed in Illumos or even if Illumos was affected by
> this at all.

The code that affects bug 7111576 was introduced between s10 and s11.

/Tomas
--
Tomas Forsman, stric at acc.umu.se, http://www.acc.umu.se/~stric/
|- Student at Computing Science, University of Umeå
`- Sysadmin at {cs,acc}.umu.se
> If after it decreases in size it stays there it might be similar to:
>
> 7111576 arc shrinks in the absence of memory pressure

After it dropped, it did build back up. Today is the first day that these
servers are working under real production load, and it is looking much
better. arcstat is showing some nice numbers for the ARC, but the L2ARC is
still building.

read  hits  miss  hit%  l2read  l2hits  l2miss  l2hit%  arcsz  l2size
 19K   17K  2.5K    87    2.5K     490    2.0K      19   148G    371G
 41K   39K  2.3K    94    2.3K     184    2.1K       7   148G    371G
 34K   34K   694    98     694      17     677       2   148G    371G
 16K   15K  1.0K    93    1.0K      16    1.0K       1   148G    371G
 39K   36K  2.3K    94    2.3K      20    2.3K       0   148G    371G
 23K   22K   746    96     746      76     670      10   148G    371G
 49K   47K  1.7K    96    1.7K     249    1.5K      14   148G    371G
 23K   21K  1.4K    93    1.4K      38    1.4K       2   148G    371G

My only guess is that the large zfs send/recv streams were affecting the
cache when they started and finished.

Thanks for the responses and help.

Chris
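One way to confirm that guess would be to log each ARC shrink as it happens and line it up with when the backup jobs start and finish. A rough DTrace sketch, assuming the running illumos kernel still has an arc_shrink() function for the fbt provider to hook (the probe name would need adjusting otherwise):

  # Print a timestamp and kernel stack every time the ARC is asked to shrink,
  # and note whenever a zfs command starts, to correlate the two.
  dtrace -qn '
  fbt::arc_shrink:entry
  {
      printf("%Y  arc_shrink\n", walltimestamp);
      stack();
  }
  proc:::exec-success
  /execname == "zfs"/
  {
      printf("%Y  zfs started: %s\n", walltimestamp,
          stringof(curpsinfo->pr_psargs));
  }'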
On Oct 22, 2012, at 6:52 AM, Chris Nagele <nagele at wildbit.com> wrote:

> My only guess is that the large zfs send/recv streams were affecting the
> cache when they started and finished.

There are other cases where data is evicted from the ARC, though I don't
have a complete list at my fingertips. For example, if a zvol is closed,
then the data for that zvol is evicted.
 -- richard

--
Richard.Elling at RichardElling.com
+1-760-896-4422
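To see what kind of eviction activity goes with a drop, without having to guess at individual counter names, it is enough to dump the whole arcstats kstat periodically and diff the snapshots (a sketch; the file paths are arbitrary):

  # Take two snapshots of all ARC statistics a minute apart and compare them;
  # any counters related to eviction or deletion that moved will show up in
  # the diff.
  kstat -p zfs:0:arcstats > /var/tmp/arcstats.1
  sleep 60
  kstat -p zfs:0:arcstats > /var/tmp/arcstats.2
  diff /var/tmp/arcstats.1 /var/tmp/arcstats.2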