Paul Armstrong
2009-Dec-23 03:17 UTC
[zfs-discuss] Recovering ZFS stops after syseventconfd can't fork
I have a machine connected to an HDS with a corrupted pool. While running zpool import -nfFX on the pool, it spawns a large number of zfsdle processes, and eventually the machine hangs for 20-30 seconds and spits out error messages:

zfs: [ID 346414 kern.warning] WARNING: Couldn't create process for zfs pool "hds1"
syseventconfd[6885]: [ID 437529 daemon.error] cannot fork - Resource temporarily unavailable

(Tips on finding out which resource was temporarily unavailable would be appreciated too, as there seem to be plenty of resources around shortly before it dies. I'm used to only seeing that when I've run out of RAM or LWPs, but with 3 GB of RAM free according to 'echo ::memstat | mdb -k' and 2.1G LWPs available according to prctl, I'm stumped.)

The recovery then aborts with:

-bash-4.0# zpool import -fFX hds1
cannot import 'hds1': one or more devices is currently unavailable
        Destroy and re-create the pool from a backup source.

The number of zfsdle processes running while the recovery is in progress is quite high per disk:

 268 /devices/pci@0,0/pci8086,25e2@2/pci8086,3500@0/pci8086,3510@0/pci10df,fe00@0,1/fp@0,0/disk@w50060e8010037135,0:a
 269 /devices/pci@0,0/pci8086,25e2@2/pci8086,3500@0/pci8086,3510@0/pci10df,fe00@0,1/fp@0,0/disk@w50060e8010037135,10:a
 269 /devices/pci@0,0/pci8086,25e2@2/pci8086,3500@0/pci8086,3510@0/pci10df,fe00@0,1/fp@0,0/disk@w50060e8010037135,11:a
 269 /devices/pci@0,0/pci8086,25e2@2/pci8086,3500@0/pci8086,3510@0/pci10df,fe00@0,1/fp@0,0/disk@w50060e8010037135,14:a
 269 /devices/pci@0,0/pci8086,25e2@2/pci8086,3500@0/pci8086,3510@0/pci10df,fe00@0,1/fp@0,0/disk@w50060e8010037135,15:a
 269 /devices/pci@0,0/pci8086,25e2@2/pci8086,3500@0/pci8086,3510@0/pci10df,fe00@0,1/fp@0,0/disk@w50060e8010037135,18:a
 268 /devices/pci@0,0/pci8086,25e2@2/pci8086,3500@0/pci8086,3510@0/pci10df,fe00@0,1/fp@0,0/disk@w50060e8010037135,19:a
 268 /devices/pci@0,0/pci8086,25e2@2/pci8086,3500@0/pci8086,3510@0/pci10df,fe00@0,1/fp@0,0/disk@w50060e8010037135,1:a
 269 /devices/pci@0,0/pci8086,25e2@2/pci8086,3500@0/pci8086,3510@0/pci10df,fe00@0,1/fp@0,0/disk@w50060e8010037135,1c:a
 268 /devices/pci@0,0/pci8086,25e2@2/pci8086,3500@0/pci8086,3510@0/pci10df,fe00@0,1/fp@0,0/disk@w50060e8010037135,1d:a
 269 /devices/pci@0,0/pci8086,25e2@2/pci8086,3500@0/pci8086,3510@0/pci10df,fe00@0,1/fp@0,0/disk@w50060e8010037135,20:a
 269 /devices/pci@0,0/pci8086,25e2@2/pci8086,3500@0/pci8086,3510@0/pci10df,fe00@0,1/fp@0,0/disk@w50060e8010037135,21:a
 269 /devices/pci@0,0/pci8086,25e2@2/pci8086,3500@0/pci8086,3510@0/pci10df,fe00@0,1/fp@0,0/disk@w50060e8010037135,24:a
 268 /devices/pci@0,0/pci8086,25e2@2/pci8086,3500@0/pci8086,3510@0/pci10df,fe00@0,1/fp@0,0/disk@w50060e8010037135,25:a
 269 /devices/pci@0,0/pci8086,25e2@2/pci8086,3500@0/pci8086,3510@0/pci10df,fe00@0,1/fp@0,0/disk@w50060e8010037135,28:a
 269 /devices/pci@0,0/pci8086,25e2@2/pci8086,3500@0/pci8086,3510@0/pci10df,fe00@0,1/fp@0,0/disk@w50060e8010037135,29:a
 269 /devices/pci@0,0/pci8086,25e2@2/pci8086,3500@0/pci8086,3510@0/pci10df,fe00@0,1/fp@0,0/disk@w50060e8010037135,2c:a
 269 /devices/pci@0,0/pci8086,25e2@2/pci8086,3500@0/pci8086,3510@0/pci10df,fe00@0,1/fp@0,0/disk@w50060e8010037135,2d:a
 269 /devices/pci@0,0/pci8086,25e2@2/pci8086,3500@0/pci8086,3510@0/pci10df,fe00@0,1/fp@0,0/disk@w50060e8010037135,30:a
 269 /devices/pci@0,0/pci8086,25e2@2/pci8086,3500@0/pci8086,3510@0/pci10df,fe00@0,1/fp@0,0/disk@w50060e8010037135,31:a
 269 /devices/pci@0,0/pci8086,25e2@2/pci8086,3500@0/pci8086,3510@0/pci10df,fe00@0,1/fp@0,0/disk@w50060e8010037135,34:a
 269 /devices/pci@0,0/pci8086,25e2@2/pci8086,3500@0/pci8086,3510@0/pci10df,fe00@0,1/fp@0,0/disk@w50060e8010037135,35:a
 268 /devices/pci@0,0/pci8086,25e2@2/pci8086,3500@0/pci8086,3510@0/pci10df,fe00@0,1/fp@0,0/disk@w50060e8010037135,38:a
 270 /devices/pci@0,0/pci8086,25e2@2/pci8086,3500@0/pci8086,3510@0/pci10df,fe00@0,1/fp@0,0/disk@w50060e8010037135,39:a
 268 /devices/pci@0,0/pci8086,25e2@2/pci8086,3500@0/pci8086,3510@0/pci10df,fe00@0,1/fp@0,0/disk@w50060e8010037135,4:a
 269 /devices/pci@0,0/pci8086,25e2@2/pci8086,3500@0/pci8086,3510@0/pci10df,fe00@0,1/fp@0,0/disk@w50060e8010037135,5:a
 269 /devices/pci@0,0/pci8086,25e2@2/pci8086,3500@0/pci8086,3510@0/pci10df,fe00@0,1/fp@0,0/disk@w50060e8010037135,8:a
 269 /devices/pci@0,0/pci8086,25e2@2/pci8086,3500@0/pci8086,3510@0/pci10df,fe00@0,1/fp@0,0/disk@w50060e8010037135,9:a
 269 /devices/pci@0,0/pci8086,25e2@2/pci8086,3500@0/pci8086,3510@0/pci10df,fe00@0,1/fp@0,0/disk@w50060e8010037135,c:a
 269 /devices/pci@0,0/pci8086,25e2@2/pci8086,3500@0/pci8086,3510@0/pci10df,fe00@0,1/fp@0,0/disk@w50060e8010037135,d:a

The total here is a little over 8,000 (fairly close to 8K) but not over it, and there's nothing I can see in prctl that's near 8K and related (only things like max-msg-messages and max-port-ids).

The pool was corrupted under S10u6 (the machine crashed and then the filesystem died during a scrub) and I'm trying to recover it with SXCE 129. Ideas?

Here's the layout:

-bash-4.0# zpool import
  pool: hds1
    id: 15655551015334264270
 state: FAULTED
status: The pool metadata is corrupted.
action: The pool cannot be imported due to damaged devices or data.
   see: http://www.sun.com/msg/ZFS-8000-72
config:

        hds1                         FAULTED  corrupted data
          raidz1-0                   ONLINE
            c1t50060E8010037135d0p0  ONLINE
            c1t50060E8010037135d4    ONLINE
            c1t50060E8010037135d8    ONLINE
            c1t50060E8010037135d12   ONLINE
            c1t50060E8010037135d16   ONLINE
          raidz1-1                   ONLINE
            c1t50060E8010037135d40   ONLINE
            c1t50060E8010037135d44   ONLINE
            c1t50060E8010037135d48   ONLINE
            c1t50060E8010037135d52   ONLINE
            c1t50060E8010037135d56   ONLINE
          raidz1-2                   ONLINE
            c1t50060E8010037135d20   ONLINE
            c1t50060E8010037135d24   ONLINE
            c1t50060E8010037135d28   ONLINE
            c1t50060E8010037135d32   ONLINE
            c1t50060E8010037135d36   ONLINE
          raidz1-3                   ONLINE
            c1t50060E8010037135d1    ONLINE
            c1t50060E8010037135d5    ONLINE
            c1t50060E8010037135d9    ONLINE
            c1t50060E8010037135d13   ONLINE
            c1t50060E8010037135d17   ONLINE
          raidz1-4                   ONLINE
            c1t50060E8010037135d21   ONLINE
            c1t50060E8010037135d25   ONLINE
            c1t50060E8010037135d29   ONLINE
            c1t50060E8010037135d33   ONLINE
            c1t50060E8010037135d37   ONLINE
          raidz1-5                   ONLINE
            c1t50060E8010037135d41   ONLINE
            c1t50060E8010037135d45   ONLINE
            c1t50060E8010037135d49   ONLINE
            c1t50060E8010037135d53   ONLINE
            c1t50060E8010037135d57   ONLINE
        <snip several working volumes>

Thanks,
Paul
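For reference, a per-device tally like the one above can be gathered with something along these lines (a minimal sketch; it assumes each zfsdle instance carries the device path as the last word of its argument list, as in the listing above):

# count zfsdle processes per device path, then the overall total
ps -e -o args= | grep '[z]fsdle' | nawk '{print $NF}' | sort | uniq -c | sort -n
ps -e -o args= | grep -c '[z]fsdle'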
Anton B. Rang
2009-Dec-23 03:52 UTC
[zfs-discuss] Recovering ZFS stops after syseventconfd can't fork
Something over 8000 sounds vaguely like the default maximum process count. What does 'ulimit -a' show?

I don't know why you're seeing so many zfsdle processes, though; that sounds like a bug to me.
Paul Armstrong
2009-Dec-23 04:02 UTC
[zfs-discuss] Recovering ZFS stops after syseventconfd can't fork
bash-4.0# ulimit -a
core file size          (blocks, -c) unlimited
data seg size           (kbytes, -d) unlimited
file size               (blocks, -f) unlimited
open files                      (-n) 256
pipe size            (512 bytes, -p) 10
stack size              (kbytes, -s) 10240
cpu time               (seconds, -t) unlimited
max user processes              (-u) 29995
virtual memory          (kbytes, -v) unlimited
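Besides the per-shell ulimits, the kernel-wide process ceilings may be worth checking, since syseventconfd forks on behalf of the whole system rather than a single shell. A hedged sketch (these are standard Solaris tunable and kstat names, but verify them on SXCE 129):

echo 'max_nprocs/D' | mdb -k           # system-wide maximum number of processes
echo 'maxuprc/D' | mdb -k              # per-user maximum number of processes
kstat -p unix:0:system_misc:nproc      # processes currently in existence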
Paul Armstrong
2009-Dec-23 04:27 UTC
[zfs-discuss] Recovering ZFS stops after syseventconfd can't fork
I'm surprised at the number as well.

Running it again, I'm seeing it jump fairly high just before the fork errors:

bash-4.0# ps -ef | grep zfsdle | wc -l
20930

(the next run of ps failed due to the fork error). So maybe it is running out of processes.

ZFS file data from ::memstat just went down to 29 MiB (from 22 GiB) too, which may or may not be related.

Message was edited by: psa
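To correlate the process explosion with the fork failures, a loop like the one below can log the zfsdle count every few seconds while the import runs (a sketch; it assumes ps itself can still fork, which clearly stops being true right at the end):

while :; do
    echo "`date '+%H:%M:%S'` `ps -e -o args= | grep -c '[z]fsdle'`"
    sleep 5
done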
Carson Gaspar
2009-Dec-23 18:39 UTC
[zfs-discuss] Recovering ZFS stops after syseventconfd can't fork
Paul Armstrong wrote:
> I'm surprised at the number as well.
>
> Running it again, I'm seeing it jump fairly high just before the fork errors:
> bash-4.0# ps -ef | grep zfsdle | wc -l
> 20930
>
> (the next run of ps failed due to the fork error).
> So maybe it is running out of processes.
>
> ZFS file data from ::memstat just went down to 29 MiB (from 22 GiB) too, which may or may not be related.
>
> Message was edited by: psa

Note that I saw the exact same thing when my pool got trashed. My fix was to rename /etc/sysevent/config/SUNW,EC_dev_status,ESC_dev_dle,sysevent.conf

I _suspect_ the problem is that the developers don't expect zfsdle to hang, so they don't bother to use a lock or check whether one is already running. They just spawn more, and more, and more...

It would be lovely if someone who understands what this creature is were to fix this rather catastrophic bug.

-- 
Carson
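For anyone wanting to try Carson's workaround, the steps amount to something like the following (a sketch; the service FMRI and the need to restart it are my assumptions, so confirm with svcs before relying on it):

# move the zfsdle handler registration out of syseventconfd's config directory
mv /etc/sysevent/config/SUNW,EC_dev_status,ESC_dev_dle,sysevent.conf \
   /etc/sysevent/config/SUNW,EC_dev_status,ESC_dev_dle,sysevent.conf.disabled

# have syseventd/syseventconfd re-read their configuration (FMRI assumed)
svcadm restart svc:/system/sysevent:default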
Paul Armstrong
2009-Dec-28 18:17 UTC
[zfs-discuss] Recovering ZFS stops after syseventconfd can't fork
Alas, even moving the file out of the way and rebooting the box (to guarantee state) didn't work:

-bash-4.0# zpool import -nfFX hds1
echo $?
-bash-4.0# echo $?
1

Do you need to be able to read all the labels for each disk in the array in order to recover?

From zdb -l on one of the disks:
--------------------------------------------
LABEL 3
--------------------------------------------
failed to unpack label 3

Thanks,
Paul
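To gauge how widespread the label damage is, each device's four labels can be dumped and checked in a loop along these lines (a sketch; the /dev/rdsk glob and s0 slice suffix are illustrative and should be adjusted to the actual c1t...d* device names in the pool):

for d in /dev/rdsk/c1t50060E8010037135d*s0; do
    bad=`zdb -l "$d" | grep -c 'failed to unpack'`
    echo "$d: $bad of 4 labels unreadable"
done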
Cindy Swearingen
2010-Jan-05 16:46 UTC
[zfs-discuss] Recovering ZFS stops after syseventconfd can't fork
Hi Paul,

I opened 6914208 to cover the sysevent/zfsdle problem.

If the system crashed due to a power failure and the disk labels for this pool were corrupted, then I think you will need to follow the steps to get the disks relabeled correctly. You might review some previous postings by Victor Latushkin that describe these steps.

Thanks,

Cindy

On 12/28/09 11:17, Paul Armstrong wrote:
> Alas, even moving the file out of the way and rebooting the box (to guarantee state) didn't work:
>
> -bash-4.0# zpool import -nfFX hds1
> echo $?
> -bash-4.0# echo $?
> 1
>
> Do you need to be able to read all the labels for each disk in the array in order to recover?
>
> From zdb -l on one of the disks:
> --------------------------------------------
> LABEL 3
> --------------------------------------------
> failed to unpack label 3
>
> Thanks,
> Paul