On 7/13/2015 12:29, Adrian Chadd wrote:
> hi,
>
> With that much storage and that many snapshots, I do think you need
> more than 96 GB of RAM in the box. I'm hoping someone doing active ZFS
> work can comment.
>
> I don't think the ZFS code is completely "memory usage" safe. The
> "old" Sun suggestion when I started using ZFS was "if your server
> panics due to out of memory with ZFS, buy more memory."
>
> That said, there doesn't look like there's a leak anywhere - those
> dumps show you're using at least 32 GiB on each just in ZFS data
> buffers.
That's normal.
>
> Try tuning the ARC down a little?
The ARC is supposed to auto-size and use all available free memory. The
problem is that the VM system and the ARC each make assumptions that,
under certain load patterns, fight with one another - and when that
happens and the ARC wins, the system gets into trouble FAST. The pattern
is that the system starts to page RSS out rather than evict ARC, the ARC
fills the freed space, more RSS gets paged out... you can see where this
winds up heading.
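As a rough way to watch that feedback loop in action (my suggestion here,
not something from the patch), compare the ARC size against the free page
count on a spare console; both sysctls exist on stock 10.x:

    # print ARC size (bytes) and free page count every 5 seconds
    while :; do
        sysctl -n kstat.zfs.misc.arcstats.size vm.stats.vm.v_free_count
        sleep 5
    done

When the pathology hits, you see the first number hold steady or grow
while the second collapses.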
UMA contributes substantially to the problem: when ZFS grabs chunks of
RAM of a given size and then frees them, UMA holds that RAM in reserve
against a subsequent allocation of the same size. For certain work
patterns this gets really ugly, as you can wind up with huge amounts of
RAM held by the UMA system and unavailable, yet not in actual use.
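You can get a rough figure for how much RAM is parked on UMA free lists
from vmstat -z; a sketch, assuming the 10.x column order of size, limit,
used, free:

    # sum (item size * free items) across all UMA zones
    vmstat -z | awk -F '[:,] *' '$2 > 0 { held += $2 * $5 }
        END { printf "%.1f MiB cached on UMA free lists\n", held / 1048576 }'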
The patch I posted a link to addresses both of these issues, on my
systems here and on a number of other people's as well; I continue to
run it in production and have been extremely happy with it.
> -adrian
>
>
> On 13 July 2015 at 04:48, Christopher Forgeron <csforgeron at gmail.com> wrote:
>>
>> TL;DR Summary: I can run FreeBSD out of memory quite consistently, and
>> it's not a TSO/mbuf exhaustion issue. It's quite possible that ZFS is
>> the culprit, but shouldn't the pager be able to handle aggressive
>> memory requests in a low-memory situation gracefully, without needing
>> custom tuning of ZFS / the VM?
>>
>>
>> Hello,
>>
>> I've been dealing with some instability on my 10.1-RELEASE and
>> 10-STABLE (r282701M) machines for the last few months.
>>
>> These machines are NFS/iSCSI storage servers, running on Dell M610x or
>> similar hardware: 96 GiB of memory, 10 GbE network cards, dual Xeon
>> processors - fairly beefy stuff.
>>
>> Initially I thought it was more issues with TSO / jumbo mbufs, as I
>> had this problem last year. I had thought that was properly resolved,
>> but setting my MTU to 1500 and turning off TSO did give me a bit more
>> stability. Currently all my machines are set this way.
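>>
>> (For reference, that setting amounts to something like the following -
>> ix0 is an assumed interface name, adjust for your NIC:)
>>
>>   # one-off, from the shell:
>>   ifconfig ix0 mtu 1500 -tso
>>   # persistent, in /etc/rc.conf (address is a placeholder):
>>   ifconfig_ix0="inet <addr>/24 mtu 1500 -tso"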
>>
>> Crashes usually showed up as a loss of network connectivity, with the
>> ctld daemon scrolling messages across the screen at full speed about
>> lost connections.
>>
>> All of this did seem like more network-stack problems, but with each
>> crash I'd be able to learn a bit more.
>>
>> Usually there was nothing of any use in the logfile, but every now and
>> then I'd get this:
>>
>> Jun 3 13:02:04 san0 kernel: WARNING: 172.16.0.97 (iqn.1998-01.com.vmware:esx5a-3387a188): failed to allocate memory
>> Jun 3 13:02:04 san0 kernel: WARNING: icl_pdu_new: failed to allocate 80 bytes
>> Jun 3 13:02:04 san0 kernel: WARNING: 172.16.0.97 (iqn.1998-01.com.vmware:esx5a-3387a188): failed to allocate memory
>> Jun 3 13:02:04 san0 kernel: WARNING: icl_pdu_new: failed to allocate 80 bytes
>> Jun 3 13:02:04 san0 kernel: WARNING: 172.16.0.97 (iqn.1998-01.com.vmware:esx5a-3387a188): failed to allocate memory
>> ---------
>> Jun 4 03:03:09 san0 kernel: WARNING: icl_pdu_new: failed to allocate 80 bytes
>> Jun 4 03:03:09 san0 kernel: WARNING: icl_pdu_new: failed to allocate 80 bytes
>> Jun 4 03:03:09 san0 kernel: WARNING: 172.16.0.97 (iqn.1998-01.com.vmware:esx5a-3387a188): failed to allocate memory
>> Jun 4 03:03:09 san0 kernel: WARNING: 172.16.0.97 (iqn.1998-01.com.vmware:esx5a-3387a188): connection error; dropping connection
>> Jun 4 03:03:09 san0 kernel: WARNING: 172.16.0.97 (iqn.1998-01.com.vmware:esx5a-3387a188): connection error; dropping connection
>> Jun 4 03:03:10 san0 kernel: WARNING: 172.16.0.97 (iqn.1998-01.com.vmware:esx5a-3387a188): waiting for CTL to terminate tasks, 1 remaining
>> Jun 4 06:04:27 san0 syslogd: kernel boot file is /boot/kernel/kernel
>>
>> So, knowing that it seemed to be running out of memory, I started
>> leaving 'vmstat 5' running on a console, to see what it was displaying
>> during the crash.
>>
>> It was always the same thing:
>>
>> 0 0 0 1520M 4408M 15  0 0 0 25 19 0 0 21962 1667 91390 0 33  67
>> 0 0 0 1520M 4310M  9  0 0 0  2 15 3 0 21527 1385 95165 0 31  69
>> 0 0 0 1520M 4254M  7  0 0 0 14 19 0 0 17664 1739 72873 0 18  82
>> 0 0 0 1520M 4145M  2  0 0 0  0 19 0 0 23557 1447 96941 0 36  64
>> 0 0 0 1520M 4013M  4  0 0 0 14 19 0 0  4288  490 34685 0 72  28
>> 0 0 0 1520M 3885M  2  0 0 0  0 19 0 0 11141 1038 69242 0 52  48
>> 0 0 0 1520M 3803M 10  0 0 0 14 19 0 0 24102 1834 91050 0 33  67
>> 0 0 0 1520M 8192B  2  0 0 0  2 15 1 0 19037 1131 77470 0 45  55
>> 0 0 0 1520M 8192B  0 22 0 0  2  0 6 0   146   82   578 0  0 100
>> 0 0 0 1520M 8192B  1  0 0 0  0  0 0 0   130   40   510 0  0 100
>> 0 0 0 1520M 8192B  0  0 0 0  0  0 0 0   143   40   501 0  0 100
>> 0 0 0 1520M 8192B  0  0 0 0  0  0 0 0   201   62   660 0  0 100
>> 0 0 0 1520M 8192B  0  0 0 0  0  0 0 0   101   28   404 0  0 100
>> 0 0 0 1520M 8192B  0  0 0 0  0  0 0 0    97   27   398 0  0 100
>> 0 0 0 1520M 8192B  0  0 0 0  0  0 0 0    93   28   377 0  0 100
>> 0 0 0 1520M 8192B  0  0 0 0  0  0 0 0    92   27   373 0  0 100
>>
>>
>> I'd go from a decent amount of free memory to suddenly having none.
>> vmstat would stop outputting, console commands would hang, etc. The
>> whole system would be useless.
>>
>> Looking into this, I came across a similar issue:
>>
>> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=199189
>>
>> I started increasing vm.v_free_min, and it helped - my crashes went
>> from ~every 6 hours to every few days.
>>
>> Currently I'm running with vm.v_free_min=1254507 - that's 1254507
>> pages * 4 KiB, or 4.78 GiB of reserve. The vmstat above is from a
>> machine with that setting, still run all the way down to 8192 B of
>> free memory.
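>>
>> (To watch how close the box is to that floor, one can compare the two
>> counters directly - both sysctls exist on stock 10.x, and both are
>> page counts:)
>>
>>   sysctl vm.stats.vm.v_free_count vm.v_free_min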
>>
>> I have two issues here:
>>
>> 1) I don't think I should ever be able to run the system into the
>> ground on memory. Deny me new memory until the pager can free more.
>> 2) Setting "min" doesn't really mean "min", as it can obviously go
>> below that threshold.
>>
>>
>> I have plenty of local UFS swap (non-ZFS drives).
>>
>> Adrian requested that I output a few more diagnostic items, and this
>> is what I'm running on a console now, in a loop:
>>
>> while true; do
>>     vmstat
>>     netstat -m
>>     vmstat -z
>>     sleep 1
>> done
>>
>> The output of four crashes is attached, as it can be a bit long. Let
>> me know if that's not a good way to report them. Each capture starts
>> mid-way through a vmstat -z output, as that's as far back as my
>> terminal buffer allows.
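>>
>> (One way around the terminal-buffer limit for future reports would be
>> to run the same loop under script(1), which logs everything to a file
>> - the path here is arbitrary:)
>>
>>   script /var/tmp/diag.log sh -c \
>>     'while true; do vmstat; netstat -m; vmstat -z; sleep 1; done'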
>>
>>
>>
>> Now, I have a good idea of the conditions that cause this: ZFS
>> snapshots, run by cron, during times of heavy ZFS writes.
>>
>> The crashes are all nearly on the hour, as that's when crontab
>> triggers my python scripts to make new snapshots and delete old ones.
>>
>> My average FreeBSD machine has ~30 ZFS datasets, with each pool having
>> ~20 TiB used. These all need to snapshot on the hour.
>>
>> By staggering the snapshots by a few minutes, I have been able to
>> reduce crashing from every other day to perhaps once a week if I'm
>> lucky - but if I start moving a lot of data around, I can cause daily
>> crashes again.
>>
>> It looks to be the memory demand of snapshotting lots of ZFS datasets
>> at the same time while accepting a lot of write traffic.
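>>
>> (A sketch of the staggering as hypothetical /etc/crontab entries - the
>> script name and pool arguments stand in for my python scripts:)
>>
>>   0 * * * * root /usr/local/bin/zfs_snap.py tank0
>>   4 * * * * root /usr/local/bin/zfs_snap.py tank1
>>   8 * * * * root /usr/local/bin/zfs_snap.py tank2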
>>
>> Now, perhaps the answer is "don't do that", but I feel that FreeBSD
>> should be robust enough to handle this. I don't mind tuning for now to
>> reduce or eliminate it, but others shouldn't run into this pain just
>> because they heavily load their machines - there must be a way of
>> avoiding this condition.
>>
>> Here are the contents of my /boot/loader.conf and sysctl.conf, to show
>> the minimal tuning that makes this problem a little more bearable:
>>
>> /boot/loader.conf
>> vfs.zfs.arc_meta_limit=49656727553
>> vfs.zfs.arc_max=91489280512
>>
>> /etc/sysctl.conf
>> vm.v_free_min=1254507
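>>
>> (For scale: that arc_meta_limit is ~46.2 GiB and that arc_max is
>> ~85.2 GiB of the 96 GiB in the box, and v_free_min is the ~4.78 GiB
>> reserve computed above.)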
>>
>>
>> Any suggestions/help is appreciated.
>>
>> Thank you.
>>
> _______________________________________________
> freebsd-stable at freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-stable
> To unsubscribe, send any mail to "freebsd-stable-unsubscribe at freebsd.org"
>
--
Karl Denninger
karl at denninger.net
/The Market Ticker/
/[S/MIME encrypted email preferred]/