Hi,

I have a FreeBSD/amd64 8.2 server that has a few ZFS file systems served
over NFS. It has 8 GB of memory. There are 6 disks of 1.5 TB each
forming a pool with raidz2.

From time to time it crashes with a stack backtrace (included below).
This already happened before the upgrade to 8.2.

Now a crash of a file server is annoying, but if it reboots
automatically, there are just a few minutes of downtime (most of it is
even spent by the BIOS before it gets to boot the OS).

However, it doesn't automatically reboot in 15 seconds, as promised.
It just sits there the whole weekend, until I log onto the IPMI console
and press the virtual reset button.

This was visible before I did that (4-finger copy):

panic: kmem_alloc(131072): kmem_map too small: 3428782080 total allocated
cpuid = 0
KDB: stack backtrace:
#0 0xffffffff805f4e0e at kdb_backtrace+0x5e
#1 0xffffffff805c2d07 at panic+0x187
#2 0xffffffff80816830 at kmem_alloc+0
#3 0xffffffff8080e3ba at uma_large_malloc+0x4a
#4 0xffffffff805b0167 at malloc+0xd7
#5 0xffffffff80e87849 at zil_lwb_write_start+0x289
#6 0xffffffff80e87b92 at zil_commit+0x242
#7 0xffffffff80ea035d at zfs_sync+0xcd
#8 0xffffffff8065431a at sync_fsync+0x16a
#9 0xffffffff806524be at sync_vnode+0x15e
#10 0xffffffff806527b1 at sched_sync+0x1d1
#11 0xffffffff805994f8 at fork_exit+0x118
#12 0xffffffff8089547e at fork_trampoline+0xe
Uptime: 11d12h56m20s
Cannot dump. Device not defined or unavailable.
Automatic reboot in 15 seconds - press a key on the console to abort

and that is where it sat all weekend...

Why doesn't the promised reboot happen?

The kernel was still the GENERIC one as distributed with 8.2. Because of
the reboot it will now be the stripped down one that I compiled myself.

There is some tuning in /boot/loader.conf from previous attempts to
avoid crashes.

vm.kmem_size="16G"
vfs.zfs.arc_max="4G"

Is that still useful, or does it harm by now? Real memory is 8 GB.
I note that if I look with sysctl, I see

vm.kmem_size: 3739230208
vfs.zfs.arc_max: 2665488384

which doesn't seem to match these attempted settings.

-Olaf.
-- 
Pipe rene = new PipePicture(); assert(Not rene.GetType().Equals(Pipe));
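
[Editorial note] To make the mismatch between the loader.conf settings and the sysctl readings concrete, here is a small sketch that converts both to bytes and compares them. It is only an illustration: parse_size() is a hypothetical helper (not a FreeBSD API), it assumes the "G" suffix means 1024^3 bytes, and it takes "8 GB" of real memory to mean 8 GiB.

```python
# Sketch: compare the sizes requested in /boot/loader.conf with the values
# sysctl reported. parse_size() is a hypothetical helper, not a FreeBSD API.
# Assumption: K/M/G/T suffixes are binary multiples (powers of 1024).

UNITS = {"K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}


def parse_size(text):
    """Convert a loader.conf-style size such as "16G" to a byte count."""
    text = text.strip().strip('"').upper()
    if text and text[-1] in UNITS:
        return int(text[:-1]) * UNITS[text[-1]]
    return int(text)


# Values copied from the message above.
requested = {"vm.kmem_size": "16G", "vfs.zfs.arc_max": "4G"}
observed = {"vm.kmem_size": 3739230208, "vfs.zfs.arc_max": 2665488384}
physical_ram = 8 * 1024**3  # assumption: "8 GB" taken as 8 GiB


def report():
    """Return one comparison line per tunable."""
    lines = []
    for name, wanted in requested.items():
        want = parse_size(wanted)
        got = observed[name]
        status = "matches" if want == got else "differs"
        lines.append(
            f"{name}: requested {want} bytes, observed {got} bytes "
            f"({got / 1024**3:.2f} GiB): {status}"
        )
    return lines


if __name__ == "__main__":
    for line in report():
        print(line)
    over = parse_size(requested["vm.kmem_size"]) > physical_ram
    print("requested vm.kmem_size exceeds physical RAM:", over)
    gap = observed["vm.kmem_size"] - observed["vfs.zfs.arc_max"]
    print("observed kmem_size minus arc_max:", gap, "bytes")
```

Neither observed value equals what was requested, and the two observed values differ by exactly 1 GiB. That pattern might point to automatically derived defaults rather than the loader.conf values, but that is only an inference from the numbers, not something the thread confirms.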

On 2011-May-02 16:32:30 +0200, Olaf Seibert <O.Seibert@cs.ru.nl> wrote:
>However, it doesn't automatically reboot in 15 seconds, as promised.
>It just sits there the whole weekend, until I log onto the IPMI console
>and press the virtual reset button.

Your reference to IPMI indicates this is not a consumer PC. Can you
please provide some details of the hardware? Are you running ipmitool
or similar? Does "shutdown -r" or "reboot" work normally?

>panic: kmem_alloc(131072): kmem_map too small: 3428782080 total allocated

I suggest you have a read of the thread beginning
http://lists.freebsd.org/pipermail/freebsd-fs/2011-March/010862.html
(note that mailman has split it into at least 3 threads).

-- 
Peter Jeremy

On Mon, May 02, 2011 at 04:32:30PM +0200, Olaf Seibert wrote:
> I have a FreeBSD/amd64 8.2 server that has a few ZFS file systems served
> over NFS. It has 8 GB of memory. There are 6 disks of 1.5 TB each
> forming a pool with raidz2.
>
> From time to time it crashes with a stack backtrace (included below).
> This already happened before the upgrade to 8.2.
>
> Now a crash of a file server is annoying, but if it reboots
> automatically, there are just a few minutes of downtime (most of it is
> even spent by the BIOS before it gets to boot the OS).
>
> However, it doesn't automatically reboot in 15 seconds, as promised.
> It just sits there the whole weekend, until I log onto the IPMI console
> and press the virtual reset button.

There are two things you might try fiddling with. These are sysctls, so
you can try them on the fly:

hw.acpi.disable_on_reboot
hw.acpi.handle_reboot

On our systems we set hw.acpi.handle_reboot=1 to speed up the reboot
process. I remember hearing long ago how some people had issues getting
their machines to reboot (sometimes 100% of the time, other times
occasionally); using ACPI to reboot the machine fixed their issues.

> This was visible before I did that (4-finger copy):
>
> panic: kmem_alloc(131072): kmem_map too small: 3428782080 total allocated
> cpuid = 0

Check out the thread Peter Jeremy provided. This is a near-sure
indicator of ZFS ARC exhaustion, and you seem to know of that. What's
very interesting to me is this part of your mail:

> There is some tuning in /boot/loader.conf from previous attempts to
> avoid crashes.
>
> vm.kmem_size="16G"
> vfs.zfs.arc_max="4G"
>
> Is that still useful, or does it harm by now? Real memory is 8 GB.
> I note that if I look with sysctl, I see
>
> vm.kmem_size: 3739230208
> vfs.zfs.arc_max: 2665488384
>
> which doesn't seem to match these attempted settings.

Is this box running i386 or amd64? If amd64, I can't explain why your
/boot/loader.conf settings aren't taking -- they should be for sure.
Maybe provide us a full dmesg and XXX out things you consider sensitive.
If i386, I'm not too surprised that some automatic defaults get chosen
instead of what you ask.

-- 
| Jeremy Chadwick                                   jdc@parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.               PGP 4BD6C0CB |
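
[Editorial note] For a rough sense of scale, the sketch below puts the numbers from the panic message next to the vm.kmem_size value that sysctl reported. Treat it as indicative only: the sysctl reading was taken after the reboot, on a different kernel than the one that panicked, so the sketch assumes the limit was the same at panic time. kmem_summary() is a hypothetical helper introduced for illustration.

```python
# Sketch: relate the panic message to the observed vm.kmem_size.
# Assumption: vm.kmem_size at panic time equalled the value read via
# sysctl after the reboot; the thread does not confirm this.

KMEM_SIZE = 3739230208           # vm.kmem_size as reported by sysctl
ALLOCATED_AT_PANIC = 3428782080  # "total allocated" in the panic message
FAILED_REQUEST = 131072          # size passed to kmem_alloc() in the panic


def kmem_summary(kmem_size, allocated, request):
    """Return simple arithmetic on the kmem figures (hypothetical helper)."""
    nominal_free = kmem_size - allocated
    return {
        "used_fraction": allocated / kmem_size,
        "nominal_free": nominal_free,
        "request_fits_nominally": request <= nominal_free,
    }


if __name__ == "__main__":
    summary = kmem_summary(KMEM_SIZE, ALLOCATED_AT_PANIC, FAILED_REQUEST)
    print(f"kmem_map in use: {summary['used_fraction']:.1%}")
    print(f"nominal headroom: {summary['nominal_free'] / 1024**2:.0f} MiB")
    print(f"failed request: {FAILED_REQUEST // 1024} KiB")
    print("request smaller than nominal headroom:",
          summary["request_fits_nominally"])
```

Under that assumption the map was a little over 90% allocated when a 128 KiB request failed. The arithmetic alone does not say why the allocation could not be satisfied; the freebsd-fs thread linked earlier is the place to look for that discussion.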