Hi all,
Since we've started running 2009.06 on a few servers, we seem to be
hitting a problem with the l2arc that causes it to stop receiving
evicted arc pages. Has anyone else seen this kind of problem?
The filesystem contains about 130G of compressed (lzjb) data, and looks
like:
$ zpool status -v data
  pool: data
 state: ONLINE
 scrub: none requested
config:

        NAME           STATE     READ WRITE CKSUM
        data           ONLINE       0     0     0
          mirror       ONLINE       0     0     0
            c1t1d0p0   ONLINE       0     0     0
            c1t9d0p0   ONLINE       0     0     0
          mirror       ONLINE       0     0     0
            c1t2d0p0   ONLINE       0     0     0
            c1t10d0p0  ONLINE       0     0     0
          mirror       ONLINE       0     0     0
            c1t3d0p0   ONLINE       0     0     0
            c1t11d0p0  ONLINE       0     0     0
        logs           ONLINE       0     0     0
          c1t7d0p0     ONLINE       0     0     0
          c1t15d0p0    ONLINE       0     0     0
        cache
          c1t14d0p0    ONLINE       0     0     0
          c1t6d0p0     ONLINE       0     0     0
$ zpool iostat -v data
                  capacity     operations    bandwidth
pool            used  avail   read  write   read  write
-------------  -----  -----  -----  -----  -----  -----
data            133G   275G    334    926  2.35M  8.62M
  mirror       44.4G  91.6G    111    257   799K  1.60M
    c1t1d0p0       -      -     55    145   979K  1.61M
    c1t9d0p0       -      -     54    145   970K  1.61M
  mirror       44.3G  91.7G    111    258   804K  1.61M
    c1t2d0p0       -      -     55    140   979K  1.61M
    c1t10d0p0      -      -     55    140   973K  1.61M
  mirror       44.4G  91.6G    111    258   801K  1.61M
    c1t3d0p0       -      -     55    145   982K  1.61M
    c1t11d0p0      -      -     55    145   975K  1.61M
  c1t7d0p0        12K  29.7G      0     76     71  1.90M
  c1t15d0p0      152K  29.7G      0     78     11  1.96M
cache              -      -      -      -      -      -
  c1t14d0p0    51.3G  23.2G     51     35   835K  4.07M
  c1t6d0p0     48.7G  25.9G     45     34   750K  3.86M
-------------  -----  -----  -----  -----  -----  -----
After quite a bit of data has been added to the l2arc, it quits getting
new writes, and read traffic from the cache devices stays quite low even
though arc misses are quite high (a small kstat loop for confirming this
follows the iostat output below):
                  capacity     operations    bandwidth
pool            used  avail   read  write   read  write
-------------  -----  -----  -----  -----  -----  -----
data            133G   275G    550    263  3.85M  1.57M
  mirror       44.4G  91.6G    180      0  1.18M      0
    c1t1d0p0       -      -     88      0  3.22M      0
    c1t9d0p0       -      -     91      0  3.36M      0
  mirror       44.3G  91.7G    196      0  1.29M      0
    c1t2d0p0       -      -     95      0  2.74M      0
    c1t10d0p0      -      -    100      0  3.60M      0
  mirror       44.4G  91.6G    174      0  1.38M      0
    c1t3d0p0       -      -     85      0  2.71M      0
    c1t11d0p0      -      -     88      0  3.34M      0
  c1t7d0p0         8K  29.7G      0    131      0   790K
  c1t15d0p0      156K  29.7G      0    131      0   816K
cache              -      -      -      -      -      -
  c1t14d0p0    51.3G  23.2G     16      0   271K      0
  c1t6d0p0     48.7G  25.9G     14      0   224K      0
-------------  -----  -----  -----  -----  -----  -----
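As a quick check that the feed has actually stalled (and not just iostat
sampling noise), the l2arc write counters can be watched directly. A
minimal loop, assuming the stock kstat(1M) arcstats names shown further
below:

# Sample l2arc feed activity every 10 seconds; l2_writes_sent and
# l2_write_bytes should keep climbing while the feed thread is working.
while true; do
        kstat -p zfs:0:arcstats:l2_writes_sent \
                 zfs:0:arcstats:l2_write_bytes \
                 zfs:0:arcstats:l2_size
        sleep 10
done

kstat -p prints parsable module:instance:name:statistic lines, which
makes diffing successive samples easy to script.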
$ perl arcstat.pl
    Time  read  miss  miss%  dmis  dm%  pmis  pm%  mmis  mm%  arcsz     c
21:21:31   10M    5M     53    5M   53     0    0    2M   31   857M    1G
21:21:32   209    84     40    84   40     0    0    60   32   833M    1G
21:21:33   255    57     22    57   22     0    0     9    4   832M    1G
21:21:34   630   483     76   483   76     0    0   232   63   831M    1G
Arcstats output, just for completeness:
$ kstat -n arcstats
module: zfs                             instance: 0
name:   arcstats                        class:    misc
        c                               1610325248
        c_max                           2147483648
        c_min                           1610325248
        crtime                          129.137246015
        data_size                       528762880
        deleted                         14452910
        demand_data_hits                589823
        demand_data_misses              3812972
        demand_metadata_hits            4477921
        demand_metadata_misses          2069450
        evict_skip                      5347558
        hash_chain_max                  13
        hash_chains                     521232
        hash_collisions                 9991276
        hash_elements                   1750708
        hash_elements_max               2627838
        hdr_size                        25463208
        hits                            5067744
        l2_abort_lowmem                 3225
        l2_cksum_bad                    0
        l2_evict_lock_retry             0
        l2_evict_reading                0
        l2_feeds                        14531
        l2_free_on_write                106576
        l2_hdr_size                     297244272
        l2_hits                         1255730
        l2_io_error                     0
        l2_misses                       4625372
        l2_read_bytes                   21028000256
        l2_rw_clash                     37
        l2_size                         26367979008
        l2_write_bytes                  107297759744
        l2_writes_done                  14354
        l2_writes_error                 0
        l2_writes_hdr_miss              295
        l2_writes_sent                  14354
        memory_throttle_count           3597
        mfu_ghost_hits                  376756
        mfu_hits                        4486621
        misses                          5882422
        mru_ghost_hits                  1183606
        mru_hits                        581123
        mutex_miss                      73477
        other_size                      15518616
        p                               489115556
        prefetch_data_hits              0
        prefetch_data_misses            0
        prefetch_metadata_hits          0
        prefetch_metadata_misses        0
        recycle_miss                    2052735
        size                            841141648
        snaptime                        13279.450336199
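One thing that stands out in there: l2_abort_lowmem is 3225 and
memory_throttle_count is 3597, so it looks like the feed thread may be
backing off under memory pressure. In case it helps anyone reproduce
this, here is a rough DTrace sketch for watching feed cycles against
low-memory signals. I am assuming the fbt probes for
l2arc_write_buffers and arc_reclaim_needed exist on snv_111 (both are
static functions in arc.c, so they could be inlined away on some
builds):

# Count l2arc feed cycles vs. arc_reclaim_needed() returning true,
# printed every 10 seconds.
dtrace -n '
fbt::l2arc_write_buffers:entry { @feeds = count(); }
fbt::arc_reclaim_needed:return /arg1 != 0/ { @lowmem = count(); }
tick-10s
{
        printa("l2arc feed cycles : %@d\n", @feeds);
        printa("reclaim needed    : %@d\n", @lowmem);
        trunc(@feeds);
        trunc(@lowmem);
}'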
OS details:
$ uname -a
SunOS dbhost 5.11 snv_111b i86pc i386 i86pc Solaris
So, from everything I can see, no new data is being pushed to the l2arc
SSDs, even though the system is taking arc misses quite heavily.
This is a MySQL database server, so if you are wondering about the
smallish arc size, it's being artificially limited by "set
zfs:zfs_arc_max = 0x80000000" in /etc/system, so that the majority of
RAM can be allocated to InnoDB.
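For reference, that cap matches the c_max of 2147483648 (0x80000000) in
the arcstats above, and it can be double-checked at runtime like so:

$ grep zfs_arc_max /etc/system
set zfs:zfs_arc_max = 0x80000000
$ kstat -p zfs:0:arcstats:c_max
zfs:0:arcstats:c_max    2147483648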
Any help would be much appreciated.
Thank you,
Ethan