Marc Bevand
2008-Mar-30 11:22 UTC
[zfs-discuss] Setting 'kernelbase' to < 0xd0000000 causes init(1M) to segfault
(Keywords: solaris hang zfs scrub heap space kernelbase marvell 88sx6081)

I am experiencing system hangs on a 32-bit x86 box with 1.5 GB RAM running Solaris 10 Update 4 (with only patch 125205-07) during ZFS scrubs of an almost full 3 TB zpool (6 disks on an AOC-SAT2-MV8 controller). I found out they are caused by memory contention in the kernel heap: 'kstat vmem::heap' shows it is 97% full, and 'echo ::threadlist -v | mdb -k' shows most threads are blocked in memory allocation routines.

When I try to give the kernel more memory by passing '-B kernelbase=0x80000000' to it, the system fails to boot. The console is flooded with this line, repeating over and over:

  WARNING: init(1M) exited on fatal signal 9: restarting automatically

init segfaults the same way with any kernelbase value less than the default of 0xd0000000 (I tried 0x50000000, 0x80000000, 0x90000000, 0xc0000000, 0xcf000000). It works fine with values greater than or equal to the default (I only tried 0xd0000000 and 0xd1000000).

How can I troubleshoot this crash? Is it caused by the system being unable to access / (a standard UFS partition on a disk connected to a SATA controller supported by the marvell88sx driver)? Could some drivers, such as marvell88sx, not support non-standard kernelbase values?

Alternatively, can I make ZFS use less heap space? I don't think the ARC cache uses heap space, or does it? None of my other ZFS servers have this heap space restriction because they are 64-bit.

-marc
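For anyone who wants to track the heap fill level over time rather than eyeball it, here is a rough sketch that computes a usage percentage from 'kstat -p vmem::heap' style output. It assumes the heap arena exposes mem_inuse and mem_total statistics in the usual module:instance:name:statistic layout; the instance number and the byte counts in the sample are made up for illustration and are not taken from the machine described above.

```python
def heap_usage_percent(kstat_text):
    """Return mem_inuse as a percentage of mem_total.

    Expects lines shaped like 'module:instance:name:statistic<TAB>value',
    which is an assumption about the parseable kstat output format.
    """
    stats = {}
    for line in kstat_text.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or unexpected lines
        key, value = parts
        stat = key.split(":")[-1]
        if stat in ("mem_inuse", "mem_total"):
            stats[stat] = int(value)
    return 100.0 * stats["mem_inuse"] / stats["mem_total"]


# Hypothetical sample, NOT real output from the system in this thread.
SAMPLE = (
    "vmem:1:heap:mem_inuse\t642000000\n"
    "vmem:1:heap:mem_total\t662700032\n"
)

if __name__ == "__main__":
    print(round(heap_usage_percent(SAMPLE), 1))
```

On a live system the text would come from running the kstat command itself; the parsing is kept separate so it can be checked against saved output.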
Marc Bevand
2008-Mar-31 02:44 UTC
[zfs-discuss] Setting kernelbase=0x80000000 only works under snv_83
For the record, a parallel install of snv_83 on the same machine allows me to set kernelbase to 0x80000000 with no problem and no init crash. This increased the kernel heap size to 1912 MB (up from 632 MB with kernelbase=0xd0000000 in sol10u4), and the system doesn't hang anymore. The max heap usage I have seen so far is 1220 MB.

Is the init(1M) segfault problem known in sol10u4? Has it been, or will it be, fixed in u5?

-marc
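As a sanity check on the heap sizes quoted above, the arithmetic can be sketched as follows. It assumes the kernel's share of the 32-bit address space runs from kernelbase to the 4 GB limit; the heap figures are the ones reported in this post, and the helper name is purely illustrative.

```python
FOUR_GB = 1 << 32   # size of a 32-bit virtual address space
MB = 1 << 20


def kernel_va_mb(kernelbase):
    """Kernel virtual address space in MB, assuming it spans kernelbase..4 GB."""
    return (FOUR_GB - kernelbase) // MB


# (kernelbase, heap size in MB as reported in the post)
REPORTED = [(0xd0000000, 632), (0x80000000, 1912)]

if __name__ == "__main__":
    for kernelbase, heap_mb in REPORTED:
        va_mb = kernel_va_mb(kernelbase)
        # The remainder is kernel address space not counted as heap.
        print(hex(kernelbase), va_mb, heap_mb, va_mb - heap_mb)
```

Under that assumption the kernel gets 768 MB of address space at the default kernelbase and 2048 MB at 0x80000000. In both cases the reported heap is 136 MB smaller than that total, so the two heap figures are consistent with each other. What occupies the remaining 136 MB is not stated in the post.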