Jim Klimov
2011-Nov-08 18:30 UTC
[zfs-discuss] Single-disk rpool with inconsistent checksums, import fails
Hello all,

I have an oi_148a PC with a single root disk, and since recently it fails to boot - it hangs after the copyright message whenever I use any of my GRUB menu options. Booting with an oi_148a LiveUSB that I have had around since installation, I ran some zdb traversals over the rpool and several zpool import attempts.

The imports fail by running the kernel out of RAM (as recently discussed on the list with Paul Kraus's problems). However, in my current case the rpool has just 11.2GB allocated, with 8.7GB "available". So almost all of it could fit in this computer's 8GB of RAM (no more can be placed into the motherboard), and I don't believe there is so much metadata as to exhaust the RAM during an import attempt. I have also tried "rollback" imports with -F, but they have failed so far as well.

I am not ready to copy-paste the zdb/zpool outputs here (I have to get the text files off that box), but in short:

1) "zdb -bsvL -e <rpool-GUID>" showed that there are some problems:

* The "deferred free" block count is not zero, although it is small (144 blocks amounting to 1.4MB), and it remained at this value over several import attempts. I removed a swap volume some time before the failure, so this might be its leftovers.

* It also output this line:

      block traversal size 11986202624 != alloc 11986203136 (unreachable 512)

  I believe this refers to the allocated data size in bytes, and that one sector (512b) is deemed unreachable. Is that so fatal?

2) "zdb -bsvc -e <rpool-GUID>" showed that there are some consistency problems. Namely, five blocks had mismatching checksums. They were named "plain file" blocks with no further details (like which files they might be parts of). But I hope this means no metadata has been hurt so far.

3) I've tried importing the pool in several ways (including normal and rollback mounts, read-only and "-n"), but so far all attempts have led to the computer hanging within a minute ("vmstat 1" shows that free RAM plummets towards the zero mark).
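For what it's worth, the discrepancy zdb complains about really is just one sector's worth of bytes; a trivial shell check (using the two byte counts copied verbatim from the zdb line above) confirms it:

```shell
# Figures taken verbatim from zdb's complaint:
#   block traversal size 11986202624 != alloc 11986203136 (unreachable 512)
ALLOC=11986203136        # bytes the SPA believes are allocated
TRAVERSED=11986202624    # bytes actually reachable by the block traversal

echo $((ALLOC - TRAVERSED))   # prints 512, i.e. exactly one 512-byte sector
```

So the "leak" is a single ashift=9 sector, which by itself should not be fatal to an import; the hangs more likely stem from whatever makes the space maps inconsistent.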
I've tried preparing the system tunables as well:

:; echo "aok/W 1" | mdb -kw
:; echo "zfs_recover/W 1" | mdb -kw

and sometimes adding:

:; echo zfs_vdev_max_pending/W0t5 | mdb -kw
:; echo zfs_resilver_delay/W0t0 | mdb -kw
:; echo zfs_resilver_min_time_ms/W0t20000 | mdb -kw
:; echo zfs_txg_synctime/W0t1 | mdb -kw

In this case I am not very hesitant to recreate the rpool and reinstall the OS - it was mostly needed to serve the separate data pool. However, this option is not always an acceptable one, so I wonder if anything can be done to repair an inconsistent non-redundant pool - at least to make it importable again in order to evacuate some of the settings and tunings that I've made over time.

//Jim
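[Editorial note: the mdb -kw writes above only patch the running kernel and are lost on reboot. A sketch of the equivalent /etc/system fragment is below, assuming the usual Solaris "set module:variable" syntax applies to each of these tunables on this build; values mirror the mdb commands above.]

```
* /etc/system fragment (a sketch mirroring the mdb -kw tunables above)
set aok = 1
set zfs:zfs_recover = 1
set zfs:zfs_vdev_max_pending = 5
set zfs:zfs_resilver_delay = 0
set zfs:zfs_resilver_min_time_ms = 20000
set zfs:zfs_txg_synctime = 1
```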
Jim Klimov
2011-Nov-08 19:57 UTC
[zfs-discuss] Single-disk rpool with inconsistent checksums, import fails
2011-11-08 22:30, Jim Klimov wrote:
> Hello all,
>
> I have an oi_148a PC with a single root disk, and since
> recently it fails to boot - hangs after the copyright
> message whenever I use any of my GRUB menu options.

Thanks to my wife's sister, who is my hands and eyes near the problematic PC, here's some ZDB output from this rpool:

# zpool import
  pool: rpool
    id: 17995958177810353692
 state: ONLINE
status: The pool was last accessed by another system.
action: The pool can be imported using its name or numeric identifier and
        the '-f' flag.
   see: http://www.sun.com/msg/ZFS-8000-EY
config:

        rpool       ONLINE
          c4t1d0s0  ONLINE

So here it is - a single-device "rpool". There are some on-disk errors, so some of the zdb walks fail:

root@openindiana:~# time zdb -bb -e 17995958177810353692

Traversing all blocks to verify nothing leaked ...
Assertion failed: ss->ss_start <= start (0x79e22600 <= 0x79e1dc00), file ../../../uts/common/fs/zfs/space_map.c, line 173
Abort (core dumped)

real    0m12.184s
user    0m0.367s
sys     0m0.474s

root@openindiana:~# time zdb -bsvc -e 17995958177810353692

Traversing all blocks to verify checksums and verify nothing leaked ...
Assertion failed: ss->ss_start <= start (0x79e22600 <= 0x79e1dc00), file ../../../uts/common/fs/zfs/space_map.c, line 173
Abort (core dumped)

real    0m12.019s
user    0m0.360s
sys     0m0.458s

However, "-bsvL" and "-bsvcL" (with checksum checks) do finish; the results of the latter, more complete test are listed below:

root@openindiana:~# time zdb -bsvcL -e 17995958177810353692

Traversing all blocks to verify checksums ...
zdb_blkptr_cb: Got error 50 reading <182, 19177, 0, 1> DVA[0]=<0:a8c8e600:20000> [L0 ZFS plain file] fletcher4 uncompressed LE contiguous unique single size=20000L/20000P birth=82L/82P fill=1 cksum=3401f5fe522b:109ee10ba48ed38c:e7f49c220f7b8bc:ff405ef051b91e65 -- skipping
zdb_blkptr_cb: Got error 50 reading <182, 19202, 0, 1> DVA[0]=<0:a9030a00:20000> [L0 ZFS plain file] fletcher4 uncompressed LE contiguous unique single size=20000L/20000P birth=82L/82P fill=1 cksum=11c4c738b0ba:7bb81bce3313913:8f85a7abf1b9e34:58e8746d63119393 -- skipping
zdb_blkptr_cb: Got error 50 reading <182, 24924, 0, 0> DVA[0]=<0:b1aaec00:14a00> [L0 ZFS plain file] fletcher4 uncompressed LE contiguous unique single size=14a00L/14a00P birth=85L/85P fill=1 cksum=270679cd905d:6119a969a134566:6f0f7da64c4d2d90:3ab86aa985abef02 -- skipping
zdb_blkptr_cb: Got error 50 reading <182, 24944, 0, 0> DVA[0]=<0:b1cdf000:10800> [L0 ZFS plain file] fletcher4 uncompressed LE contiguous unique single size=10800L/10800P birth=85L/85P fill=1 cksum=1ebb4d1ae9f5:3cf5f42afa9a332:757613fc2d2de7b3:5f197017333a4f89 -- skipping
zdb_blkptr_cb: Got error 50 reading <493, 947, 0, 165> DVA[0]=<0:b3efc200:20000> [L0 ZFS plain file] fletcher4 uncompressed LE contiguous unique single size=20000L/20000P birth=26691L/26691P fill=1 cksum=2cdc2ae22d10:b33d31bcbc0d8da:f1571c9975e151b0:a037073594569635 -- skipping

Error counts:

        errno  count
           50      5

        block traversal size 11986202624 != alloc 11986203136 (unreachable 512)

        bp count:          405927
        bp logical:    15030449664      avg:  37027
        bp physical:   12995855872      avg:  32015     compression:  1.16
        bp allocated:  13172434944      avg:  32450     compression:  1.14
        bp deduped:     1186232320    ref>1:  12767   deduplication:  1.09
        SPA allocated: 11986203136     used: 56.17%

Blocks  LSIZE   PSIZE   ASIZE     avg    comp   %Total  Type
     -      -       -       -       -       -        -  unallocated
     2    32K      4K   12.0K   6.00K    8.00     0.00  object directory
     3  1.50K   1.50K   4.50K   1.50K    1.00     0.00  object array
     1    16K   1.50K   4.50K   4.50K   10.67     0.00  packed nvlist
     -      -       -       -       -       -        -  packed nvlist size
   197  24.2M   1.87M   5.61M   29.2K   12.92     0.04  bpobj
     -      -       -       -       -       -        -  bpobj header
     -      -       -       -       -       -        -  SPA space map header
 1.27K  6.79M   3.25M    9.8M   7.70K    2.09     0.08  SPA space map
     8   144K    144K    144K   18.0K    1.00     0.00  ZIL intent log
 26.6K   426M   91.1M    182M   6.86K    4.67     1.45  DMU dnode
    75   150K   39.0K   80.0K   1.07K    3.85     0.00  DMU objset
     -      -       -       -       -       -        -  DSL directory
    23  12.0K   11.5K   34.5K   1.50K    1.04     0.00  DSL directory child map
    21  11.5K   10.5K   31.5K   1.50K    1.10     0.00  DSL dataset snap map
    49   707K   79.5K    239K   4.87K    8.89     0.00  DSL props
     -      -       -       -       -       -        -  DSL dataset
     -      -       -       -       -       -        -  ZFS znode
     -      -       -       -       -       -        -  ZFS V0 ACL
  321K  12.0G   10.5G   10.5G   33.4K    1.14    85.46  ZFS plain file
 26.8K  41.5M   19.1M   38.2M   1.42K    2.17     0.30  ZFS directory
    18  17.5K   9.00K   18.0K      1K    1.94     0.00  ZFS master node
    50  84.5K   25.0K   50.0K      1K    3.38     0.00  ZFS delete queue
 12.1K  1.50G   1.50G   1.50G    127K    1.00    12.22  zvol object
     1     1K     512      1K      1K    2.00     0.00  zvol prop
     -      -       -       -       -       -        -  other uint8[]
     -      -       -       -       -       -        -  other uint64[]
     -      -       -       -       -       -        -  other ZAP
     -      -       -       -       -       -        -  persistent error log
     2   256K   44.0K    132K   66.0K    5.82     0.00  SPA history
     -      -       -       -       -       -        -  SPA history offsets
     1    512     512   1.50K   1.50K    1.00     0.00  Pool properties
     -      -       -       -       -       -        -  DSL permissions
     -      -       -       -       -       -        -  ZFS ACL
     -      -       -       -       -       -        -  ZFS SYSACL
     -      -       -       -       -       -        -  FUID table
     -      -       -       -       -       -        -  FUID table size
     2     2K      1K   3.00K   1.50K    2.00     0.00  DSL dataset next clones
     -      -       -       -       -       -        -  scan work queue
   146   103K   73.0K    146K      1K    1.40     0.00  ZFS user/group used
     -      -       -       -       -       -        -  ZFS user/group quota
     1    512     512   1.50K   1.50K    1.00     0.00  snapshot refcount tags
 7.14K  28.6M   17.5M   52.6M   7.37K    1.63     0.42  DDT ZAP algorithm
     2    32K      4K   12.0K   6.00K    8.00     0.00  DDT statistics
     -      -       -       -       -       -        -  System attributes
    18  9.00K   9.00K   18.0K      1K    1.00     0.00  SA master node
    18  27.0K   9.00K   18.0K      1K    3.00     0.00  SA attr registration
    44   704K   77.0K    154K   3.50K    9.14     0.00  SA attr layouts
     -      -       -       -       -       -        -  scan translations
     -      -       -       -       -       -        -  deduplicated block
   133  71.0K   66.5K    200K   1.50K    1.07     0.00  DSL deadlist map
     -      -       -       -       -       -        -  DSL deadlist map hdr
     3  2.50K   1.50K   4.50K   1.50K    1.67     0.00  DSL dir clones
    27  3.38M    122K    365K   13.5K   28.44     0.00  bpobj subobj
   144  1.42M    228K    683K   4.74K    6.37     0.01  deferred free
     4   130K    130K    130K   32.5K    1.00     0.00  dedup ditto
  396K  14.0G   12.1G   12.3G   31.7K    1.16   100.00  Total

                         capacity   operations   bandwidth  ---- errors ----
description             used avail  read write  read write  read write cksum
rpool                  11.2G 8.71G   308     0 11.2M     0     0     0     5
  /dev/dsk/c4t1d0s0    11.2G 8.71G   308     0 11.2M     0     0     0    10

real    38m56.588s
user    4m15.708s
sys     0m56.255s

I see a non-empty deferred-free list and, apparently, blocks with checksum errors. If I read this right, four blocks are from old generations (TXGs 82 and 85?), and one is more recent (26691). What else does a trained eye see that I don't? According to "zdb -l" below, current TXG numbers are in the 560-million range...

root@openindiana:~# zdb -l /dev/dsk/c4t1d0s0
--------------------------------------------
LABEL 0
--------------------------------------------
    version: 28
    name: 'rpool'
    state: 0
    txg: 560647931
    pool_guid: 17995958177810353692
    hostid: 13583512
    hostname: ''
    top_guid: 3656218981390172871
    guid: 3656218981390172871
    vdev_children: 1
    vdev_tree:
        type: 'disk'
        id: 0
        guid: 3656218981390172871
        path: '/dev/dsk/c4t1d0s0'
        devid: 'id1,sd@SATA_____ST3808110AS_________________5LR557KB/a'
        phys_path: '/pci@0,0/pci8086,2847@1c,4/pci1043,81e4@0/disk@1,0:a'
        whole_disk: 0
        metaslab_array: 30
        metaslab_shift: 27
        ashift: 9
        asize: 21430272000
        is_log: 0
        DTL: 4098
        create_txg: 4
--------------------------------------------
LABEL 1
--------------------------------------------
    version: 28
    name: 'rpool'
    state: 0
    txg: 560647931
    pool_guid: 17995958177810353692
    hostid: 13583512
    hostname: ''
    top_guid: 3656218981390172871
    guid: 3656218981390172871
    vdev_children: 1
    vdev_tree:
        type: 'disk'
        id: 0
        guid: 3656218981390172871
        path: '/dev/dsk/c4t1d0s0'
        devid: 'id1,sd@SATA_____ST3808110AS_________________5LR557KB/a'
        phys_path: '/pci@0,0/pci8086,2847@1c,4/pci1043,81e4@0/disk@1,0:a'
        whole_disk: 0
        metaslab_array: 30
        metaslab_shift: 27
        ashift: 9
        asize: 21430272000
        is_log: 0
        DTL: 4098
        create_txg: 4
--------------------------------------------
LABEL 2
--------------------------------------------
    version: 28
    name: 'rpool'
    state: 0
    txg: 560647931
    pool_guid: 17995958177810353692
    hostid: 13583512
    hostname: ''
    top_guid: 3656218981390172871
    guid: 3656218981390172871
    vdev_children: 1
    vdev_tree:
        type: 'disk'
        id: 0
        guid: 3656218981390172871
        path: '/dev/dsk/c4t1d0s0'
        devid: 'id1,sd@SATA_____ST3808110AS_________________5LR557KB/a'
        phys_path: '/pci@0,0/pci8086,2847@1c,4/pci1043,81e4@0/disk@1,0:a'
        whole_disk: 0
        metaslab_array: 30
        metaslab_shift: 27
        ashift: 9
        asize: 21430272000
        is_log: 0
        DTL: 4098
        create_txg: 4
--------------------------------------------
LABEL 3
--------------------------------------------
    version: 28
    name: 'rpool'
    state: 0
    txg: 560647931
    pool_guid: 17995958177810353692
    hostid: 13583512
    hostname: ''
    top_guid: 3656218981390172871
    guid: 3656218981390172871
    vdev_children: 1
    vdev_tree:
        type: 'disk'
        id: 0
        guid: 3656218981390172871
        path: '/dev/dsk/c4t1d0s0'
        devid: 'id1,sd@SATA_____ST3808110AS_________________5LR557KB/a'
        phys_path: '/pci@0,0/pci8086,2847@1c,4/pci1043,81e4@0/disk@1,0:a'
        whole_disk: 0
        metaslab_array: 30
        metaslab_shift: 27
        ashift: 9
        asize: 21430272000
        is_log: 0
        DTL: 4098
        create_txg: 4

Any ideas as to whether this rpool can be recovered into a mountable state, or is recreation my only option now? ;)

In particular, I'm currently testing with the LiveUSB oi_148a, as that is what they have at the broken PC. Should we expect zpool import and fixup to work better with oi_151a, oi_dev, or Solaris 11 (Express or Release)? It might be problematic to record another bootable device remotely, so if no related code has changed...

//Jim
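The birth-TXG reading above (four old blocks, one recent) can be pulled mechanically out of the zdb error lines. A small awk sketch, run here against abbreviated copies of the five zdb_blkptr_cb lines from the output above:

```shell
# Extract the logical birth TXG from each "Got error" line reported by zdb.
# The sample lines below are abbreviated ("...") copies of the real output.
awk 'match($0, /birth=[0-9]+L/) {
       # RSTART/RLENGTH span "birth=<digits>L"; keep just the digits
       print substr($0, RSTART + 6, RLENGTH - 7)
     }' <<'EOF'
zdb_blkptr_cb: Got error 50 reading <182, 19177, 0, 1> ... birth=82L/82P ... -- skipping
zdb_blkptr_cb: Got error 50 reading <182, 19202, 0, 1> ... birth=82L/82P ... -- skipping
zdb_blkptr_cb: Got error 50 reading <182, 24924, 0, 0> ... birth=85L/85P ... -- skipping
zdb_blkptr_cb: Got error 50 reading <182, 24944, 0, 0> ... birth=85L/85P ... -- skipping
zdb_blkptr_cb: Got error 50 reading <493, 947, 0, 165> ... birth=26691L/26691P ... -- skipping
EOF
```

This prints 82, 82, 85, 85 and 26691, one per line - consistent with the reading that four of the damaged blocks date from very early TXGs while one is from a later generation, though all are far below the pool's current TXG of 560647931.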