My OpenSolaris snv_134 production system no longer boots. I get a kernel panic on boot with the following message:

panic[cpu0]/thread=ffffff000f639c60: zfs: allocating allocated segment(offset=422941467136 size=8192)

I attached the full stack trace as a file attachment.

I urgently need a way to repair the filesystem so that I can reboot the server, which hosts all of my web services!

I tried to import my zpool with zfs-fuse 0.7.0 on Linux:

root at rescue ~/tmp/official/src/cmd/zpool # ./zpool import
  pool: rpool
    id: 7742244823439206706
 state: ONLINE
status: The pool was last accessed by another system.
action: The pool can be imported using its name or numeric identifier and
        the '-f' flag.
   see: http://www.sun.com/msg/ZFS-8000-EY
config:

        rpool                                                   ONLINE
          mirror-0                                              ONLINE
            disk/by-id/ata-SAMSUNG_HD753LJ_S13UJDWZ301210-part5 ONLINE
            disk/by-id/ata-SAMSUNG_HD753LJ_S13UJDWZ301202-part5 ONLINE
root at rescue ~/tmp/official/src/cmd/zpool # ./zpool
root at rescue ~/tmp/official/src/cmd/zpool # mkdir /mnt2
root at rescue ~/tmp/official/src/cmd/zpool # ./zpool import -R /mnt2 7742244823439206706
cannot import 'rpool': pool may be in use from other system, it was last accessed by (hostid: 0xc4db24) on Mon Sep 13 04:11:47 2010
use '-f' to import anyway
root at rescue ~/tmp/official/src/cmd/zpool # ./zpool import -f -R /mnt2 7742244823439206706
zfsfuse_ioctl_read_loop(): file descriptor closed
internal error: out of memory

I urgently need a patch (preferably today). I am ready to pay for it if needed!
If I could at least repair the filesystem with zfs-fuse, I would be happy just to recover my data.

I filed a bug about this at:
https://defect.opensolaris.org/bz/show_bug.cgi?id=17026
--
This message posted from opensolaris.org
On Sep 13, 2010, at 3:22 PM, Stephan Ferraro wrote:

> My OpenSolaris snv_134 production system no longer boots. I get a kernel
> panic on boot with the following message:
>
> panic[cpu0]/thread=ffffff000f639c60: zfs: allocating allocated
> segment(offset=422941467136 size=8192)

You can import it without any risk in read-only mode (zpool import -o readonly=on ...) on build 147 or later; IIRC there is a release of SchilliX that has that code. That way you can save all your data. I think this is the relevant SchilliX announcement:

http://mail.opensolaris.org/pipermail/opensolaris-discuss/2010-September/059793.html

A somewhat riskier approach that can potentially work with your build 134 is to add

set aok=1
set zfs:zfs_recover=1

to /etc/system, reboot, and then retry the import. If it succeeds, it is best to copy all your data into a new pool.

regards
victor
I can't reboot: my whole boot partition is ZFS.
I can boot an OpenSolaris snv_111b or Linux rescue system and then try to mount my filesystems from there.

- When I execute "zpool import rpool" on OpenSolaris, the machine automatically reboots.
- When I execute "zpool import rpool" on Linux (zfs-fuse, latest version), I get:

# ./zpool import -f rpool
zfsfuse_ioctl_read_loop(): file descriptor closed
internal error: out of memory

I can't start a scrub because the import already fails. It looks like a bug in the ZFS source code that is present in all existing versions.
--
This message posted from opensolaris.org
On Sep 13, 2010, at 4:19 PM, Stephan Ferraro wrote:

> I can't reboot: my whole boot partition is ZFS.
> I can boot an OpenSolaris snv_111b or Linux rescue system and then try to mount my filesystems from there.
>
> - When I execute "zpool import rpool" on OpenSolaris, the machine automatically reboots.

When booted off the OpenSolaris LiveCD, you can run the following commands before trying to import:

echo "aok/W 1" | mdb -kw
echo "zfs_recover/W 1" | mdb -kw

and then you can try to perform the import.

Victor
It seems to hang on zfs_ioctl with the parameter ZFS_IOC_POOL_IMPORT:

	if (zfs_ioctl(hdl, ZFS_IOC_POOL_IMPORT, &zc) != 0) {

So I guess I now have to debug the zfs-fuse daemon to find out where the import problem is?

gdb trace output:

Breakpoint 5, zpool_import_props (hdl=0x685000, config=0x686e38, newname=0x0, props=0x0, importfaulted=B_FALSE) at lib/libzfs/libzfs_pool.c:1455
1455    {
(gdb)
1456            zfs_cmd_t zc = { 0 };
(gdb)
1458            nvlist_t *nvi = NULL;
(gdb)
1465            verify(nvlist_lookup_string(config, ZPOOL_CONFIG_POOL_NAME,
(gdb)
1468            (void) snprintf(errbuf, sizeof (errbuf), dgettext(TEXT_DOMAIN,
(gdb)
1471            if (newname != NULL) {
(gdb)
1478                    thename = origname;
(gdb)
1481            if (props) {
(gdb)
1496            (void) strlcpy(zc.zc_name, thename, sizeof (zc.zc_name));
(gdb)
1498            verify(nvlist_lookup_uint64(config, ZPOOL_CONFIG_POOL_GUID,
(gdb)
1501            if (zcmd_write_conf_nvlist(hdl, &zc, config) != 0) {
(gdb)
1505            returned_size = zc.zc_nvlist_conf_size + 512;
(gdb)
1506            if (zcmd_alloc_dst_nvlist(hdl, &zc, returned_size) != 0) {
(gdb)
1511            zc.zc_cookie = (uint64_t)importfaulted;
(gdb)
1512            ret = 0;
(gdb)
1513            if (zfs_ioctl(hdl, ZFS_IOC_POOL_IMPORT, &zc) != 0) {
(gdb)
zfsfuse_ioctl_read_loop(): file descriptor closed
1516            (void) zcmd_read_dst_nvlist(hdl, &zc, &nvi);
(gdb)
--
This message posted from opensolaris.org
On Sep 13, 2010, at 4:42 PM, Stephan Ferraro wrote:

> It seems to hang on zfs_ioctl with the parameter ZFS_IOC_POOL_IMPORT:
>
>	if (zfs_ioctl(hdl, ZFS_IOC_POOL_IMPORT, &zc) != 0) {

This is expected.

I think the best option for you is to pick up the latest SchilliX and import the pool read-only.

Victor
I have now reproduced the same problem with zfs-fuse as on OpenSolaris snv_134:

[New Thread 0x7fa90e838950 (LWP 29241)]
[New Thread 0x7fa55950 (LWP 29242)]

Program received signal SIGABRT, Aborted.
[Switching to Thread 0x67224950 (LWP 29041)]
0x00007fa90faa6ed5 in raise () from /lib/libc.so.6
(gdb) ba
#0  0x00007fa90faa6ed5 in raise () from /lib/libc.so.6
#1  0x00007fa90faa83f3 in abort () from /lib/libc.so.6
#2  0x00000000004fe130 in vpanic (fmt=0x528ea8 "zfs: allocating allocated segment(offset=%llu size=%llu)\n", adx=0x672239d0) at lib/libsolkerncompat/cmn_err.c:53
#3  0x00000000004fe25e in vcmn_err (ce=3, fmt=0x528ea8 "zfs: allocating allocated segment(offset=%llu size=%llu)\n", adx=0x672239d0) at lib/libsolkerncompat/cmn_err.c:73
#4  0x00000000004999cf in zfs_panic_recover (fmt=0x528ea8 "zfs: allocating allocated segment(offset=%llu size=%llu)\n") at lib/libzpool/spa_misc.c:1185
#5  0x000000000049ae07 in space_map_add (sm=0x7fa90d268650, start=422941467136, size=8192) at lib/libzpool/space_map.c:102
#6  0x000000000049bef2 in space_map_load (sm=0x7fa90d268650, ops=0x7a2860, maptype=1 '\001', smo=0x7fa90d268030, os=0x7fa910f6dc40) at lib/libzpool/space_map.c:326
#7  0x000000000048391c in metaslab_activate (msp=0x7fa90d268000, activation_weight=4611686018427387904, size=512) at lib/libzpool/metaslab.c:860
#8  0x0000000000484c59 in metaslab_group_alloc (mg=0x7fa90e868f10, size=512, txg=400585, min_distance=93759373312, dva=0x7fa90d1c8a80, d=1) at lib/libzpool/metaslab.c:1211
#9  0x00000000004850db in metaslab_alloc_dva (spa=0x7fa910f80000, mc=0x7fa910fa2e80, psize=512, dva=0x7fa90d1c8a80, d=1, hintdva=0x0, txg=400585, flags=0) at lib/libzpool/metaslab.c:1354
#10 0x0000000000485c5f in metaslab_alloc (spa=0x7fa910f80000, mc=0x7fa910fa2e80, psize=512, bp=0x7fa90d1c8a80, ndvas=2, txg=400585, hintbp=0x0, flags=0) at lib/libzpool/metaslab.c:1537
#11 0x00000000004d78db in zio_dva_allocate (zio=0x7fa8e7adfba0) at lib/libzpool/zio.c:2126
#12 0x00000000004d3b58 in zio_execute (zio=0x7fa8e7adfba0) at lib/libzpool/zio.c:1162
#13 0x0000000000505bf9 in taskq_thread (arg=0x7fa910f85120) at lib/libsolkerncompat/taskq.c:1376
#14 0x00007fa9109a6fc7 in start_thread () from /lib/libpthread.so.0
#15 0x00007fa90fb4464d in clone () from /lib/libc.so.6
#16 0x0000000000000000 in ?? ()
(gdb)
--
This message posted from opensolaris.org
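Frame #4 in this trace is the link back to Victor's earlier zfs_recover suggestion: zfs_panic_recover() only panics (or, in zfs-fuse, aborts the daemon) when the zfs_recover tunable is left at 0. A minimal sketch of the function, paraphrased from memory of the OpenSolaris-era lib/libzpool/spa_misc.c named in the trace; treat the exact body as an approximation, not a verbatim copy:

	/*
	 * Approximate sketch of zfs_panic_recover() (lib/libzpool/spa_misc.c,
	 * frame #4 above).  Relies on the usual SPA/kernel headers for
	 * va_list, vcmn_err(), CE_WARN/CE_PANIC and the zfs_recover global.
	 */
	extern int zfs_recover;		/* tunable set via /etc/system or mdb -kw */

	void
	zfs_panic_recover(const char *fmt, ...)
	{
		va_list adx;

		va_start(adx, fmt);
		/*
		 * With zfs_recover == 0 this is CE_PANIC (the ce=3 seen in
		 * frame #3), which panics the kernel or aborts the zfs-fuse
		 * daemon.  With zfs_recover == 1 the same condition degrades
		 * to a CE_WARN message and space_map_load() continues, which
		 * is why the import can proceed after the mdb/etc-system tweak.
		 */
		vcmn_err(zfs_recover ? CE_WARN : CE_PANIC, fmt, adx);
		va_end(adx);
	}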
I added a printf to get more information about the values (casts added so the %llu formats are well defined):

	if (ss != NULL && ss->ss_start <= start && ss->ss_end >= end) {
		printf("ss=%llu != NULL ss->ss_start=%llu <= start=%llu && "
		    "ss->ss_end=%llu >= end=%llu\n",
		    (u_longlong_t)ss, (u_longlong_t)ss->ss_start,
		    (u_longlong_t)start, (u_longlong_t)ss->ss_end,
		    (u_longlong_t)end);
		zfs_panic_recover("zfs: allocating allocated segment"
		    "(offset=%llu size=%llu)\n",
		    (longlong_t)start, (longlong_t)size);
		return;
	}

So when the check

	if (ss != NULL && ss->ss_start <= start && ss->ss_end >= end) {

fires, the values are:

ss=140293012085376 != NULL ss->ss_start=422941467136 <= start=422941467136 && ss->ss_end=422941561344 >= end=422941475328
--
This message posted from opensolaris.org
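Reading those numbers: the segment being added runs from offset 422941467136 to 422941467136 + 8192 = 422941475328, and that range falls entirely inside a segment the space map already holds (422941467136 .. 422941561344). In other words, the allocator is being asked to allocate space that is already recorded as allocated, which is exactly what the panic message says. A small self-contained sketch of the same arithmetic using only the values printed above (illustration code, not ZFS code; it treats end offsets as exclusive, matching end = start + size in space_map_add()):

	#include <stdio.h>
	#include <stdint.h>
	#include <inttypes.h>

	int
	main(void)
	{
		/* Values reported by the printf in space_map_add() above. */
		uint64_t ss_start = 422941467136ULL;	/* existing segment start */
		uint64_t ss_end   = 422941561344ULL;	/* existing segment end   */
		uint64_t start    = 422941467136ULL;	/* segment being added    */
		uint64_t size     = 8192ULL;
		uint64_t end      = start + size;	/* = 422941475328         */

		/*
		 * Same condition that triggers zfs_panic_recover(): the new
		 * segment lies completely within an already-allocated one.
		 */
		if (ss_start <= start && ss_end >= end) {
			printf("double allocation: [%" PRIu64 ", %" PRIu64 ") "
			    "lies inside existing [%" PRIu64 ", %" PRIu64 ")\n",
			    start, end, ss_start, ss_end);
		}
		return (0);
	}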
> I think the best option for you is to pick up the latest SchilliX and
> import the pool read-only.

My hosting provider offers no way to boot a rescue CD from a custom ISO file, so I can't boot SchilliX there.
Is there no possibility to simply fix this in space_map.c (if that is where the error is)?
--
This message posted from opensolaris.org
> When booted off the OpenSolaris LiveCD, you can run
> the following commands before trying to import:
>
> echo "aok/W 1" | mdb -kw
> echo "zfs_recover/W 1" | mdb -kw
>
> and then you can try to perform the import.

OK, THIS WORKED! A THOUSAND THANKS FOR THIS!
I have now started a scrub to see whether there is a problem. Then I will retry the reboot, and after that back up my more than 400 GB of compressed data.
--
This message posted from opensolaris.org