You may want to have an async zfs-get program/script that runs
zfs get -Ho at a set interval and stores the output in a local cache
(redis or your own program); the API can then hit the cache instead of
running get or list directly.
- Some silly person will try to benchmark your zfs web API and overload
your server with zfs processes.
- Example: let me run [ ab -c 10 -n 10000 http://yourserver/zfs-api/list ]
-- that is, 10 concurrent connections with a total of 10k requests to your
API (it's a simple one-liner -- people will be tempted to benchmark like
this).
Example:
https://github.com/tlhakhan/ideal-potato/blob/master/zdux/routers/zfs/service.js#L9
- This is a JS example, but you can easily script it in another language
(golang), with one program doing the cache refresh and another serving the
API; a rough sketch follows.
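Very rough sketch of the refresher idea in node (untested; the interval,
the in-memory cache object, and the exact zfs command are placeholders
for whatever fits your setup):

  // zfs-cache.js -- refresh the listing on a timer so the HTTP
  // handlers never fork zfs themselves.
  const { execFile } = require('child_process');

  let cache = { updatedAt: 0, text: '' };

  function refresh() {
    // -H no headers, -p machine-parseable, -r recursive
    execFile('zfs', ['list', '-Hpr', '-o', 'space'],
      { maxBuffer: 64 * 1024 * 1024 }, // listings can be several MB
      (err, stdout) => {
        if (err) return console.error('zfs list failed:', err.message);
        cache = { updatedAt: Date.now(), text: stdout };
      });
  }

  refresh();
  setInterval(refresh, 60 * 1000); // at most one zfs fork per minute

  // API handlers read cache.text; ab can hammer them all day and the
  // box still only runs one zfs process per interval.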
Also, zfs does have a -c flag to get cached values -- these values are
stored in an internal zfs process cache. The -c doesn't help if you have
1000(0)s of filesystems: a single list can still take minutes, and the
listing itself is several megabytes to send.
zfs list -Hrpc -o space
zfs get -Hrpc space all
- H = no headers
- r = recursive
- p = machine parseable
- c = hit cached entries
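With -H and -p the output is plain tab-separated text, so turning it into
structures is trivial -- e.g. something like this (column names assumed
from the 'space' shorthand):

  // split `zfs list -Hp -o space` output into one object per dataset
  const COLS = ['name', 'avail', 'used', 'usedsnap',
                'usedds', 'usedrefreserv', 'usedchild'];

  function parseSpace(text) {
    return text.trim().split('\n').map(line => {
      const fields = line.split('\t');
      const row = {};
      COLS.forEach((col, i) => { row[col] = fields[i]; });
      return row;
    });
  }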
Fixes: if acceptable, it may be good to stop the API and slowly kill off
the stacked zfs list -t all processes.
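And if the API keeps any direct zfs calls, a one-at-a-time guard keeps
them from stacking -- a crude sketch, continuing from the cache object
above:

  // at most one zfs child at a time; concurrent requests get
  // whatever is already cached instead of a fresh fork
  let zfsBusy = false;

  function listDatasets(cb) {
    if (zfsBusy) return cb(null, cache.text); // serve stale data
    zfsBusy = true;
    execFile('zfs', ['list', '-Hpr', '-o', 'space'],
      { maxBuffer: 64 * 1024 * 1024 },
      (err, stdout) => {
        zfsBusy = false;
        if (err) return cb(err);
        cache = { updatedAt: Date.now(), text: stdout };
        cb(null, stdout);
      });
  }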
@ Eugene:
- I have seen a single zfs list -Ho space -rt all take about 4-5 minutes
on a zpool with 8000+ datasets.
---
Notes: my knowledge is from the illumos zfs man pages, but it should
apply here.
On Sun, Aug 13, 2017 at 6:09 AM, Eugene M. Zheganin <emz at norma.perm.ru>
wrote:
> Hi,
>
> On 12.08.2017 20:50, Paul Kraus wrote:
>
>> On Aug 11, 2017, at 2:28 AM, Eugene M. Zheganin <emz at norma.perm.ru> wrote:
>>>
>>> Why does the zfs listing eat so much of the CPU?
>>> 47114 root  1  20  0  40432K  3840K  db->db   4  0:05  26.84% zfs
>>> 47099 root  1  20  0  40432K  3840K  zio->i  17  0:05  26.83% zfs
>>> 47106 root  1  20  0  40432K  3840K  db->db  21  0:05  26.81% zfs
>>> 47150 root  1  20  0  40432K  3428K  db->db  13  0:03  26.31% zfs
>>> 47141 root  1  20  0  40432K  3428K  zio->i  28  0:03  26.31% zfs
>>> 47135 root  1  20  0  40432K  3312K  g_wait   9  0:03  25.51% zfs
>>> This is from winter 2017 11-STABLE (r310734); one of the 'zfs'es is
>>> cloning, and all the others are 'zfs list -t all'. I have like 25 gigs
>>> of free RAM, do I have any chance of speeding this up using maybe some
>>> caching or some sysctl tuning? We are using a simple ZFS web API that
>>> may issue concurrent or sequential listing requests, so as you can see
>>> they sometimes do stack.
>>>
>> How many snapshots do you have? I have only seen this behavior with
>> LOTS (not hundreds, but thousands) of snapshots.
>>
> [root at san1:~]# zfs list -t snapshot | wc -l
> 88
>
>> What does your `iostat -x 1` look like? I expect that you are probably
>> saturating your drives with random I/O.
>>
>
> Well, it's really long, and the disks are busy with random I/O indeed,
> but only busy for 20-30%. As for iostat - it's really long, because I
> have hundreds (not thousands) of zvols, and they do show up in iostat
> -x. But nothing unusual besides that.
>
>
> Thanks.
>
> Eugene.
>