Chris Anderson
2021-Feb-23 03:25 UTC
lots of "no such file or directory" errors in zfs filesystem
On Mon, Feb 22, 2021 at 9:13 AM Andriy Gapon <avg at freebsd.org> wrote:

> On 22/02/2021 16:20, Chris Anderson wrote:
> > On Mon, Feb 22, 2021 at 1:36 AM Andriy Gapon <avg at freebsd.org> wrote:
> >
> >     On 22/02/2021 09:31, Chris Anderson wrote:
> >     > None of these files are especially important to me, however I was wondering
> >     > if there would be any benefit to the community from trying to debug this
> >     > issue further to understand what might be going wrong.
> >
> >     Yes.
> >
> > Could you offer any guidance about what kind of debugging information I could
> > collect that would be of use?
>
> You can start with picking a single file that demonstrates the problem.
> Then,
>     ls -li the-file
>     zdb -dddd file's-filesystem file's-inode-number
> The filesystem can be found out from df output, the inode number is in ls -li
> output -- if the command prints anything at all.
> If it does not, then do ls -lid on the file's directory and then zdb -dddd for
> the directory's inode number. In the output there should be the file name and
> its number (I think that it's in hex, but not sure).

so I can't ls -i the file since that triggers the no such file warning. if I run zdb -dddd on the inode of a directory which contains one of those missing files, I can get the inode of the file from that, but I don't get anything particularly interesting in the output.
most of the files that are missing are in directories with a large number of files (the largest has 180k) but I managed to find a directory which had a single file entry that is missing:

Dataset tank/home/cva [ZPL], ID 196, cr_txg 163, 109G, 908537 objects, rootbp DVA[0]=<0:13210311000:1000> DVA[1]=<0:18b9a02c000:1000> [L0 DMU objset] fletcher4 uncompressed LE contiguous unique double size=800L/800P birth=46916371L/46916371P fill=908537 cksum=11fdd21d1d:13cb24c87a6e:da0c9bf1b5df3:715ab2ec45b7b09

    Object  lvl   iblk   dblk  dsize  dnsize  lsize   %full  type
     38268    1   128K     1K      0     512     1K  100.00  ZFS directory
                                                264   bonus  ZFS znode
        dnode flags: USED_BYTES USERUSED_ACCOUNTED
        dnode maxblkid: 0
        uid     1001
        gid     1001
        atime   Sun Aug  6 02:00:41 2017
        mtime   Wed Apr 15 12:12:42 2020
        ctime   Wed Apr 15 12:12:42 2020
        crtime  Sat Aug  5 15:10:07 2017
        gen     23881023
        mode    40755
        size    3
        parent  38176
        links   2
        pflags  40800000144
        xattr   0
        rdev    0x0000000000000000
        microzap: 1024 bytes, 1 entries

                hash_test.go = 38274 (type: Regular File)

# zdb -dddd tank/home/cva 38274

Dataset tank/home/cva [ZPL], ID 196, cr_txg 163, 109G, 908537 objects, rootbp DVA[0]=<0:13210311000:1000> DVA[1]=<0:18b9a02c000:1000> [L0 DMU objset] fletcher4 uncompressed LE contiguous unique double size=800L/800P birth=46916371L/46916371P fill=908537 cksum=11fdd21d1d:13cb24c87a6e:da0c9bf1b5df3:715ab2ec45b7b09

    Object  lvl   iblk   dblk  dsize  dnsize  lsize   %full  type
zdb: dmu_bonus_hold(38274) failed, errno 2

> --
> Andriy Gapon
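For directories with many entries (such as the 180k-entry one mentioned above), reading the name-to-object mapping out of the zdb listing by eye is impractical. The following is a minimal sketch, not part of the original exchange, that pulls those pairs out of saved `zdb -dddd` directory output. The function name `parse_zap_entries` and the regular expression are hypothetical, and the sketch assumes the entry lines look exactly like the `hash_test.go = 38274 (type: Regular File)` line above, with the object number in decimal.

```python
import re

# Assumption: ZAP entry lines have the shape seen in the output above, e.g.
#   hash_test.go = 38274 (type: Regular File)
# with the object number printed in decimal.
ENTRY_RE = re.compile(
    r"^\s*(?P<name>.+?) = (?P<obj>\d+) \(type: (?P<type>[^)]+)\)\s*$"
)


def parse_zap_entries(zdb_output):
    """Return {file name: (object number, type)} for each directory entry line."""
    entries = {}
    for line in zdb_output.splitlines():
        match = ENTRY_RE.match(line)
        if match:
            entries[match.group("name")] = (
                int(match.group("obj")),
                match.group("type"),
            )
    return entries


# Sample text copied from the directory listing quoted in this message.
sample = """\
        microzap: 1024 bytes, 1 entries

                hash_test.go = 38274 (type: Regular File)
"""

for name, (obj, kind) in parse_zap_entries(sample).items():
    print(name, obj, kind)
```

Each object number recovered this way can then be passed to `zdb -dddd` for the same dataset, as was done for 38274 above.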
Andriy Gapon
2021-Feb-23 10:52 UTC
lots of "no such file or directory" errors in zfs filesystem
On 23/02/2021 05:25, Chris Anderson wrote:
> so I can't ls -i the file since that triggers the no such file warning. if I run
> zdb -dddd on the inode of a directory which contains one of those missing files,
> I can get the inode of the file from that, but I don't get anything particularly
> interesting in the output.
>
> most of the files that are missing are in directories with a large number of
> files (the largest has 180k) but I managed to find a directory which had a
> single file entry that is missing:
>
> Dataset tank/home/cva [ZPL], ID 196, cr_txg 163, 109G, 908537 objects, rootbp
> DVA[0]=<0:13210311000:1000> DVA[1]=<0:18b9a02c000:1000> [L0 DMU objset]
> fletcher4 uncompressed LE contiguous unique double size=800L/800P
> birth=46916371L/46916371P fill=908537
> cksum=11fdd21d1d:13cb24c87a6e:da0c9bf1b5df3:715ab2ec45b7b09
>
>     Object  lvl   iblk   dblk  dsize  dnsize  lsize   %full  type
>      38268    1   128K     1K      0     512     1K  100.00  ZFS directory
>                                                 264   bonus  ZFS znode
>         dnode flags: USED_BYTES USERUSED_ACCOUNTED
>         dnode maxblkid: 0
>         uid     1001
>         gid     1001
>         atime   Sun Aug  6 02:00:41 2017
>         mtime   Wed Apr 15 12:12:42 2020
>         ctime   Wed Apr 15 12:12:42 2020
>         crtime  Sat Aug  5 15:10:07 2017
>         gen     23881023
>         mode    40755
>         size    3
>         parent  38176
>         links   2
>         pflags  40800000144
>         xattr   0
>         rdev    0x0000000000000000
>         microzap: 1024 bytes, 1 entries
>
>                 hash_test.go = 38274 (type: Regular File)
>
> # zdb -dddd tank/home/cva 38274
>
> Dataset tank/home/cva [ZPL], ID 196, cr_txg 163, 109G, 908537 objects, rootbp
> DVA[0]=<0:13210311000:1000> DVA[1]=<0:18b9a02c000:1000> [L0 DMU objset]
> fletcher4 uncompressed LE contiguous unique double size=800L/800P
> birth=46916371L/46916371P fill=908537
> cksum=11fdd21d1d:13cb24c87a6e:da0c9bf1b5df3:715ab2ec45b7b09
>
>     Object  lvl   iblk   dblk  dsize  dnsize  lsize   %full  type
>
> zdb: dmu_bonus_hold(38274) failed, errno 2

So, this looks like a "simple" problem.
Unfortunately, it is very hard to tell retrospectively what bug caused it.

The directory has an entry for the file, but the file does not actually exist (or has a different ID).
This is a logical inconsistency, not a data integrity issue.
So, a scrub, being a data integrity check, would not detect such an issue.
Hypothetical zfs_fsck is needed to find and repair such logical problems.

Does that pool and filesystem have any special history?
I mean upgrades, replication via send/recv, moving between OS-s, etc.

--
Andriy Gapon
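
To gauge how many entries are affected, the symptom described above (a name that the directory lists but that cannot be stat'ed) can be enumerated from user space. The sketch below is an illustration only, not the hypothetical zfs_fsck: it repairs nothing, it sees only what `ls` would see, and a file that is deleted by another process between the listing and the stat would be reported as well. The function name `find_dangling_entries` is an assumption made for this example.

```python
import os
import stat


def find_dangling_entries(root):
    """Yield paths that a directory lists but for which lstat() fails with ENOENT."""
    stack = [root]
    while stack:
        directory = stack.pop()
        try:
            names = os.listdir(directory)
        except OSError:
            # Unreadable directory: skip it rather than abort the whole walk.
            continue
        for name in names:
            path = os.path.join(directory, name)
            try:
                info = os.lstat(path)  # lstat, so dangling symlinks are not flagged
            except FileNotFoundError:
                yield path
                continue
            except OSError:
                continue
            if stat.S_ISDIR(info.st_mode):
                stack.append(path)


if __name__ == "__main__":
    import sys

    start = sys.argv[1] if len(sys.argv) > 1 else "."
    for dangling in find_dangling_entries(start):
        print(dangling)
```

On a consistent filesystem the walk should print nothing; each path it does print is a candidate for the `ls -lid` plus `zdb -dddd` inspection described earlier in the thread.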