Hello everybody,

I have a Solaris 11 box here (Sun X4270) that crashes with a kernel panic during the import of a zpool (some 30 TB) containing ~500 ZFS filesystems after reboot. This causes a reboot loop until I boot single user and remove /etc/zfs/zpool.cache.

From /var/adm/messages:

savecore: [ID 570001 auth.error] reboot after panic: BAD TRAP: type=e (#pf Page fault) rp=ffffff002f9cec50 addr=20 occurred in module "zfs" due to a NULL pointer dereference
savecore: [ID 882351 auth.error] Saving compressed system crash dump in /var/crash/vmdump.2

This is what mdb tells:

mdb unix.2 vmcore.2
Loading modules: [ unix genunix specfs dtrace mac cpu.generic uppc pcplusmp scsi_vhci zfs mpt sd ip hook neti arp usba uhci sockfs qlc fctl s1394 kssl lofs random fcp idm sata fcip cpc crypto ufs logindmux ptm sppp ]
$c
zap_leaf_lookup_closest+0x45(ffffff0700ca2a98, 0, 0, ffffff002f9cedb0)
fzap_cursor_retrieve+0xcd(ffffff0700ca2a98, ffffff002f9ceed0, ffffff002f9cef10)
zap_cursor_retrieve+0x195(ffffff002f9ceed0, ffffff002f9cef10)
zfs_purgedir+0x4d(ffffff0721d32c20)
zfs_rmnode+0x57(ffffff0721d32c20)
zfs_zinactive+0xb4(ffffff0721d32c20)
zfs_inactive+0x1a3(ffffff0721d3a700, ffffff07149dc1a0, 0)
fop_inactive+0xb1(ffffff0721d3a700, ffffff07149dc1a0, 0)
vn_rele+0x58(ffffff0721d3a700)
zfs_unlinked_drain+0xa7(ffffff07022dab40)
zfsvfs_setup+0xf1(ffffff07022dab40, 1)
zfs_domount+0x152(ffffff07223e3c70, ffffff0717830080)
zfs_mount+0x4e3(ffffff07223e3c70, ffffff07223e5900, ffffff002f9cfe20, ffffff07149dc1a0)
fsop_mount+0x22(ffffff07223e3c70, ffffff07223e5900, ffffff002f9cfe20, ffffff07149dc1a0)
domount+0xd2f(0, ffffff002f9cfe20, ffffff07223e5900, ffffff07149dc1a0, ffffff002f9cfe18)
mount+0xc0(ffffff0713612c78, ffffff002f9cfe98)
syscall_ap+0x92()
_sys_sysenter_post_swapgs+0x149()

I can import the pool read-only.

The server is a mirror for our primary file server and is synced via zfs send/receive.

I saw a similar effect some time ago on an OpenSolaris box (build 111b). Back then my final solution was to copy the read-only mounted data over to a newly created pool. As this is the second time this failure occurs (on different machines), I'm really concerned about overall reliability...

Any suggestions?

thx

Carsten
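P.S. For the archives, this is roughly how I break the reboot loop and get at the data (booted single user; san_pool is the affected pool):

    # keep ZFS from auto-importing the pool on the next boot
    mv /etc/zfs/zpool.cache /etc/zfs/zpool.cache.bad
    reboot

    # after the clean boot, import the pool read-only, which avoids the
    # unlinked-drain/mount path that panics
    zpool import -o readonly=on san_pool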
2012-03-27 11:14, Carsten John wrote:
> I saw a similar effect some time ago on an OpenSolaris box (build 111b). Back then my final solution was to copy the read-only mounted data over to a newly created pool. As this is the second time this failure occurs (on different machines), I'm really concerned about overall reliability...
>
> Any suggestions?

A couple of months ago I reported a similar issue (though with a different stack trace and code path). I tracked it down to the code that frees deduped blocks: a valid code path could return a NULL pointer, but the calling routines used that pointer as if it were always valid - hence a NULL dereference when the pool was imported read-write and tried to release blocks marked for deletion. Adding a check for NULL in my private rebuild of oi_151a fixed the issue.

I wouldn't be surprised to see similar sloppiness in other parts of the code. Not checking input values in routines is a mistake waiting to fire (and it did fire for us). I am not sure how to make a webrev and ultimately a signed-off contribution upstream, but I posted my patch and research on this list and in the illumos bug tracker.

I am not sure how you can fix an S11 system, though. If the pool is at zpool v28 or older, you can try to import it into an OpenIndiana installation, perhaps rebuilt with similarly patched code that checks for NULLs, fix the pool there, and then reuse it in S11 if you must. The source is there on http://src.illumos.org, and your stack trace should tell you in which functions to start looking...

Good luck,
//Jim
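P.S. If you want to chase the code paths from your stack trace yourself, something like this gets you to the relevant files (a rough sketch only; it assumes a local clone of illumos-gate, and the closed S11 code may of course have diverged):

    # grab the gate (or just browse it on http://src.illumos.org)
    git clone https://github.com/illumos/illumos-gate.git
    cd illumos-gate

    # the functions in the panic stack live under the ZFS sources
    grep -rn "zap_leaf_lookup_closest" usr/src/uts/common/fs/zfs/
    grep -rn "zfs_purgedir\|zfs_unlinked_drain" usr/src/uts/common/fs/zfs/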
On Tue, Mar 27, 2012 at 3:14 AM, Carsten John <cjohn at mpi-bremen.de> wrote:
> Hello everybody,
>
> I have a Solaris 11 box here (Sun X4270) that crashes with a kernel panic during the import of a zpool (some 30 TB) containing ~500 ZFS filesystems after reboot. This causes a reboot loop until I boot single user and remove /etc/zfs/zpool.cache.
>
> From /var/adm/messages:
>
> savecore: [ID 570001 auth.error] reboot after panic: BAD TRAP: type=e (#pf Page fault) rp=ffffff002f9cec50 addr=20 occurred in module "zfs" due to a NULL pointer dereference
> savecore: [ID 882351 auth.error] Saving compressed system crash dump in /var/crash/vmdump.2

I ran into a very similar problem with Solaris 10U9 and the replica (zfs send | zfs recv destination) of a zpool of about 25 TB of data. The problem was an incomplete snapshot (the zfs send | zfs recv had been interrupted). On boot the system was trying to import the zpool and, as part of that, trying to destroy the offending (incomplete) snapshot. This was zpool version 22, where destruction of a snapshot is handled in a single TXG. The operation was running the system out of RAM (32 GB worth). There is a fix for this in zpool version 26 (and newer), but any snapshot created while the zpool is at a version prior to 26 will still have the problem on disk. We have support with Oracle and were able to get a loaner system with 128 GB RAM to clean up the zpool (it took about 75 GB of RAM to do so).

If you are at zpool 26 or later, this is not your problem. If you are at zpool < 26, test for an incomplete snapshot by importing the pool read-only and running `zdb -d <zpool> | grep '%'`: an incomplete snapshot shows a '%' instead of a '@' as the dataset/snapshot separator. You can also run zdb against the _un_imported_ zpool using the -e option.

See the following Oracle bugs for more information:

CR# 6876953
CR# 6910767
CR# 7082249

CR# 7082249 has been marked as a duplicate of CR# 6948890.

P.S. I suspect that the incomplete snapshot was also corrupt in some strange way, but I could never make a solid determination of that. We think what caused the zfs send | zfs recv to be interrupted was hitting an e1000g Ethernet device driver bug.

--
{--------1---------2---------3---------4---------5---------6---------7---------}
Paul Kraus
-> Senior Systems Architect, Garnet River ( http://www.garnetriver.com/ )
-> Sound Coordinator, Schenectady Light Opera Company ( http://www.sloctheater.org/ )
-> Technical Advisor, Troy Civic Theatre Company
-> Technical Advisor, RPI Players
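P.P.S. Spelled out, the check is simply (pool name below is a placeholder):

    # with the pool imported read-only:
    zdb -d mypool | grep '%'

    # or against the pool while it is not imported:
    zdb -e -d mypool | grep '%'

An empty result means no half-received snapshot is hanging around.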
Hi Carsten,

This was supposed to be fixed in build 164 of Nevada (6742788). If you are still seeing this issue in S11, I think you should raise a bug with the relevant details.

As Paul has suggested, this could also be due to an incomplete snapshot. I have seen interrupted zfs recv's cause weird bugs.

Thanks,
Deepak.

On 03/27/12 12:44 PM, Carsten John wrote:
> Hello everybody,
>
> I have a Solaris 11 box here (Sun X4270) that crashes with a kernel panic during the import of a zpool (some 30 TB) containing ~500 ZFS filesystems after reboot. This causes a reboot loop until I boot single user and remove /etc/zfs/zpool.cache.
>
> From /var/adm/messages:
>
> savecore: [ID 570001 auth.error] reboot after panic: BAD TRAP: type=e (#pf Page fault) rp=ffffff002f9cec50 addr=20 occurred in module "zfs" due to a NULL pointer dereference
> savecore: [ID 882351 auth.error] Saving compressed system crash dump in /var/crash/vmdump.2
> [...]
-----Original message-----
To: ZFS Discussions <zfs-discuss at opensolaris.org>
From: Paul Kraus <paul at kraus-haus.org>
Sent: Tue 27-03-2012 15:05
Subject: Re: [zfs-discuss] kernel panic during zfs import

> On Tue, Mar 27, 2012 at 3:14 AM, Carsten John <cjohn at mpi-bremen.de> wrote:
> > I have a Solaris 11 box here (Sun X4270) that crashes with a kernel panic during the import of a zpool (some 30 TB) containing ~500 ZFS filesystems after reboot.
>
> I ran into a very similar problem with Solaris 10U9 and the replica (zfs send | zfs recv destination) of a zpool of about 25 TB of data. The problem was an incomplete snapshot (the zfs send | zfs recv had been interrupted). [...]
>
> If you are at zpool 26 or later, this is not your problem. If you are at zpool < 26, test for an incomplete snapshot by importing the pool read-only and running `zdb -d <zpool> | grep '%'`: an incomplete snapshot shows a '%' instead of a '@' as the dataset/snapshot separator. You can also run zdb against the _un_imported_ zpool using the -e option. [...]

Hi,

this scenario seems to fit. The machine that was sending the snapshot is on OpenSolaris build 111b (which is running zpool version 14). I rebooted the receiving machine because of a hanging "zfs receive" that couldn't be killed.
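For reference, checking where both ends stand version-wise is just (san_pool is the receiving pool here; the sender is still at zpool version 14):

    zpool get version san_pool
    zpool upgrade -v      # lists which features each pool version introduced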
zdb -d -e <pool> does not give any useful information:

zdb -d -e san_pool
Dataset san_pool [ZPL], ID 18, cr_txg 1, 36.0K, 11 objects

When importing the pool read-only, I get an error about two datasets:

zpool import -o readonly=on san_pool
cannot set property for 'san_pool/home/someuser': dataset is read-only
cannot set property for 'san_pool/home/someotheruser': dataset is read-only

As this is a mirror machine, I still have the option to destroy the pool and copy the data over again via send/receive from the primary. But nobody knows how long that will hold until I'm hit again...

If an interrupted send/receive can screw up a 30 TB target pool, then send/receive isn't an option for replicating data at all; at the very least it should be flagged as "don't use it if your target pool might contain any valuable data".

I will reproduce the crash once more and try to file a bug report for S11, as recommended by Deepak (not so easy these days...).

thanks

Carsten
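P.S. For anyone in the same boat, the re-seed from the primary would look roughly like this (a sketch only; the hostname, vdev layout and dataset names below are placeholders, not our real ones). Since an interrupted send can't be resumed at these pool versions, run it inside screen or similar:

    # on the replica: recreate the pool with the same layout as before
    zpool destroy san_pool
    zpool create san_pool mirror c0t2d0 c0t3d0    # placeholder vdevs

    # pull a full recursive stream from the primary and receive it unmounted
    ssh primary-server "zfs snapshot -r tank@seed && zfs send -R tank@seed" | zfs recv -Fdu san_pool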