Jason Pepas
2015-Nov-07  06:21 UTC
[Libguestfs] mkfs.ext2 succeeds despite nbd write errors?
Hi,
So I've been hacking together an nbdkit plugin (similar to the
"file"
plugin, but it splits the file up into chunks):
https://github.com/pepaslabs/nbdkit-chunks-plugin
I got it to the point of being a working prototype.  Then I threw it
onto a raspberry pi, which it turns out only has a 50/50 shot of
fallocate() working correctly.
I'm checking the return code of fallocate(), and my chunks_pwrite()
returns -1 if it fails.  No problems there.
When I run mkfs.ext2 /dev/nbd0 on the client, I see this on the nbd-server:
nbdkit: chunks[1]: error: Unable to fallocate
'/home/cell/nbds/default/chunks/00000000000000030723'
nbdkit: chunks[1]: error: Unable to fallocate
'/home/cell/nbds/default/chunks/00000000000000030724'
nbdkit: chunks[1]: error: Unable to fallocate
'/home/cell/nbds/default/chunks/00000000000000030725'
nbdkit: chunks[1]: error: Unable to fallocate
'/home/cell/nbds/default/chunks/00000000000000030726'
nbdkit: chunks[1]: error: Unable to fallocate
'/home/cell/nbds/default/chunks/00000000000000030727'
nbdkit: chunks[1]: error: Unable to fallocate
'/home/cell/nbds/default/chunks/00000000000000030728'
nbdkit: chunks[1]: error: Unable to fallocate
'/home/cell/nbds/default/chunks/00000000000000031232'
Indeed, there is definitely a problem with fallocate, as some of the
chunks are the correct size (256k), and some are zero length:
cell@pi1$ pwd
/home/cell/nbds/default/chunks
cell@pi1$ ls -l | tail
-rw------- 1 cell cell 262144 Nov  7 06:01 00000000000000032256
-rw------- 1 cell cell 262144 Nov  7 06:01 00000000000000032257
-rw------- 1 cell cell 262144 Nov  7 06:01 00000000000000032258
-rw------- 1 cell cell 262144 Nov  7 06:01 00000000000000032259
-rw------- 1 cell cell 262144 Nov  7 06:01 00000000000000032260
-rw------- 1 cell cell 262144 Nov  7 06:01 00000000000000032261
-rw------- 1 cell cell 262144 Nov  7 06:01 00000000000000032262
-rw------- 1 cell cell 262144 Nov  7 06:01 00000000000000032263
-rw------- 1 cell cell      0 Nov  7 06:01 00000000000000032264
-rw------- 1 cell cell      0 Nov  7 06:01 00000000000000032767
But that's my concern.  The problem is that, alarmingly, mkfs.ext2
isn't phased by this at all:
root@debian:~# nbd-client pi1 10809 /dev/nbd0
Negotiation: ..size = 8192MB
bs=1024, sz=8589934592 bytes
root@debian:~# mkfs.ext2 /dev/nbd0
mke2fs 1.42.12 (29-Aug-2014)
Creating filesystem with 2097152 4k blocks and 524288 inodes
Filesystem UUID: 2230269c-6d2a-4927-93df-d9dd9f4fa40c
Superblock backups stored on blocks:
        32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632
Allocating group tables: done
Writing inode tables: done
Writing superblocks and filesystem accounting information: done
root@debian:~#
However, the nbd-client's dmesg is chock full of errors:
[ 9832.409219] block nbd0: Other side returned error (22)
[ 9832.457401] block nbd0: Other side returned error (22)
[ 9832.503100] block nbd0: Other side returned error (22)
[ 9832.542457] block nbd0: Other side returned error (22)
[ 9832.590394] block nbd0: Other side returned error (22)
[ 9832.642393] block nbd0: Other side returned error (22)
[ 9832.681455] block nbd0: Other side returned error (22)
[ 9832.721355] block nbd0: Other side returned error (22)
[ 9832.722676] quiet_error: 15129 callbacks suppressed
[ 9832.722679] Buffer I/O error on device nbd0, logical block 6293248
[ 9832.724274] lost page write due to I/O error on nbd0
[ 9832.724282] Buffer I/O error on device nbd0, logical block 6293249
[ 9832.725110] lost page write due to I/O error on nbd0
[ 9832.725110] Buffer I/O error on device nbd0, logical block 6293250
[ 9832.725110] lost page write due to I/O error on nbd0
[ 9832.725110] Buffer I/O error on device nbd0, logical block 6293251
[ 9832.725110] lost page write due to I/O error on nbd0
[ 9832.725110] Buffer I/O error on device nbd0, logical block 6293252
[ 9832.725110] lost page write due to I/O error on nbd0
[ 9832.725110] Buffer I/O error on device nbd0, logical block 6293253
[ 9832.725110] lost page write due to I/O error on nbd0
[ 9832.725110] Buffer I/O error on device nbd0, logical block 6293254
[ 9832.725110] lost page write due to I/O error on nbd0
[ 9832.725110] Buffer I/O error on device nbd0, logical block 6293255
[ 9832.725110] lost page write due to I/O error on nbd0
[ 9832.725110] Buffer I/O error on device nbd0, logical block 6293256
[ 9832.725110] lost page write due to I/O error on nbd0
[ 9832.725110] Buffer I/O error on device nbd0, logical block 6293257
[ 9832.725110] lost page write due to I/O error on nbd0
[ 9832.743111] block nbd0: Other side returned error (22)
[ 9832.744420] blk_update_request: 125 callbacks suppressed
[ 9832.744422] end_request: I/O error, dev nbd0, sector 12587008
[ 9832.758203] block nbd0: Other side returned error (22)
[ 9832.759513] end_request: I/O error, dev nbd0, sector 12845056
[ 9832.777635] block nbd0: Other side returned error (22)
[ 9832.779511] end_request: I/O error, dev nbd0, sector 12849160
[ 9832.805950] block nbd0: Other side returned error (22)
[ 9832.810278] end_request: I/O error, dev nbd0, sector 12849416
[ 9832.846880] block nbd0: Other side returned error (22)
So, my question / concern is, how is it that the nbd-client's kernel
is correctly detecting massive I/O errors, but apparently not sending
them through to mkfs.ext2?
Or perhaps mkfs.ext2 doesn't check for I/O errors?  That's a bit hard
to believe...
Anyway, I'm sure someone on this list has run into similar issues, so
I thought I'd reach out before I go too far down a rabbit hole.
Thanks,
Jason Pepas
Richard W.M. Jones
2015-Nov-07  11:03 UTC
Re: [Libguestfs] mkfs.ext2 succeeds despite nbd write errors?
On Sat, Nov 07, 2015 at 12:21:29AM -0600, Jason Pepas wrote:> Hi, > > So I've been hacking together an nbdkit plugin (similar to the "file" > plugin, but it splits the file up into chunks): > https://github.com/pepaslabs/nbdkit-chunks-plugin > > I got it to the point of being a working prototype. Then I threw it > onto a raspberry pi, which it turns out only has a 50/50 shot of > fallocate() working correctly. > > I'm checking the return code of fallocate(), and my chunks_pwrite() > returns -1 if it fails. No problems there. > > When I run mkfs.ext2 /dev/nbd0 on the client, I see this on the nbd-server: > > > nbdkit: chunks[1]: error: Unable to fallocate > '/home/cell/nbds/default/chunks/00000000000000030723' > nbdkit: chunks[1]: error: Unable to fallocate > '/home/cell/nbds/default/chunks/00000000000000030724' > nbdkit: chunks[1]: error: Unable to fallocate > '/home/cell/nbds/default/chunks/00000000000000030725' > nbdkit: chunks[1]: error: Unable to fallocate > '/home/cell/nbds/default/chunks/00000000000000030726' > nbdkit: chunks[1]: error: Unable to fallocate > '/home/cell/nbds/default/chunks/00000000000000030727' > nbdkit: chunks[1]: error: Unable to fallocate > '/home/cell/nbds/default/chunks/00000000000000030728' > nbdkit: chunks[1]: error: Unable to fallocate > '/home/cell/nbds/default/chunks/00000000000000031232' > > > Indeed, there is definitely a problem with fallocate, as some of the > chunks are the correct size (256k), and some are zero length: > > cell@pi1$ pwd > /home/cell/nbds/default/chunks > cell@pi1$ ls -l | tail > -rw------- 1 cell cell 262144 Nov 7 06:01 00000000000000032256 > -rw------- 1 cell cell 262144 Nov 7 06:01 00000000000000032257 > -rw------- 1 cell cell 262144 Nov 7 06:01 00000000000000032258 > -rw------- 1 cell cell 262144 Nov 7 06:01 00000000000000032259 > -rw------- 1 cell cell 262144 Nov 7 06:01 00000000000000032260 > -rw------- 1 cell cell 262144 Nov 7 06:01 00000000000000032261 > -rw------- 1 cell cell 262144 Nov 7 06:01 00000000000000032262 > -rw------- 1 cell cell 262144 Nov 7 06:01 00000000000000032263 > -rw------- 1 cell cell 0 Nov 7 06:01 00000000000000032264 > -rw------- 1 cell cell 0 Nov 7 06:01 00000000000000032767 > > > But that's my concern. The problem is that, alarmingly, mkfs.ext2 > isn't phased by this at all: > > > root@debian:~# nbd-client pi1 10809 /dev/nbd0 > Negotiation: ..size = 8192MB > bs=1024, sz=8589934592 bytes > root@debian:~# mkfs.ext2 /dev/nbd0 > mke2fs 1.42.12 (29-Aug-2014) > Creating filesystem with 2097152 4k blocks and 524288 inodes > Filesystem UUID: 2230269c-6d2a-4927-93df-d9dd9f4fa40c > Superblock backups stored on blocks: > 32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632 > > Allocating group tables: done > Writing inode tables: done > Writing superblocks and filesystem accounting information: done > > root@debian:~# > > > However, the nbd-client's dmesg is chock full of errors: > > > [ 9832.409219] block nbd0: Other side returned error (22) > [ 9832.457401] block nbd0: Other side returned error (22) > [ 9832.503100] block nbd0: Other side returned error (22) > [ 9832.542457] block nbd0: Other side returned error (22) > [ 9832.590394] block nbd0: Other side returned error (22) > [ 9832.642393] block nbd0: Other side returned error (22) > [ 9832.681455] block nbd0: Other side returned error (22) > [ 9832.721355] block nbd0: Other side returned error (22) > [ 9832.722676] quiet_error: 15129 callbacks suppressed > [ 9832.722679] Buffer I/O error on device nbd0, logical block 6293248 > [ 9832.724274] lost page write due to I/O error on nbd0 > [ 9832.724282] Buffer I/O error on device nbd0, logical block 6293249 > [ 9832.725110] lost page write due to I/O error on nbd0 > [ 9832.725110] Buffer I/O error on device nbd0, logical block 6293250 > [ 9832.725110] lost page write due to I/O error on nbd0 > [ 9832.725110] Buffer I/O error on device nbd0, logical block 6293251 > [ 9832.725110] lost page write due to I/O error on nbd0 > [ 9832.725110] Buffer I/O error on device nbd0, logical block 6293252 > [ 9832.725110] lost page write due to I/O error on nbd0 > [ 9832.725110] Buffer I/O error on device nbd0, logical block 6293253 > [ 9832.725110] lost page write due to I/O error on nbd0 > [ 9832.725110] Buffer I/O error on device nbd0, logical block 6293254 > [ 9832.725110] lost page write due to I/O error on nbd0 > [ 9832.725110] Buffer I/O error on device nbd0, logical block 6293255 > [ 9832.725110] lost page write due to I/O error on nbd0 > [ 9832.725110] Buffer I/O error on device nbd0, logical block 6293256 > [ 9832.725110] lost page write due to I/O error on nbd0 > [ 9832.725110] Buffer I/O error on device nbd0, logical block 6293257 > [ 9832.725110] lost page write due to I/O error on nbd0 > [ 9832.743111] block nbd0: Other side returned error (22) > [ 9832.744420] blk_update_request: 125 callbacks suppressed > [ 9832.744422] end_request: I/O error, dev nbd0, sector 12587008 > [ 9832.758203] block nbd0: Other side returned error (22) > [ 9832.759513] end_request: I/O error, dev nbd0, sector 12845056 > [ 9832.777635] block nbd0: Other side returned error (22) > [ 9832.779511] end_request: I/O error, dev nbd0, sector 12849160 > [ 9832.805950] block nbd0: Other side returned error (22) > [ 9832.810278] end_request: I/O error, dev nbd0, sector 12849416 > [ 9832.846880] block nbd0: Other side returned error (22) > > > So, my question / concern is, how is it that the nbd-client's kernel > is correctly detecting massive I/O errors, but apparently not sending > them through to mkfs.ext2?It's definitely not good, but I don't think it can be nbdkit, since nbd-client is seeing the errors.> Or perhaps mkfs.ext2 doesn't check for I/O errors? That's a bit hard > to believe...How about 'strace mkfs.ext2 ..' and see if any system calls are returning errors. That would show you whether nbd-client is throwing errors away, or whether mkfs is getting the errors and ignoring them (seems pretty unlikely, but you never know). After that, it'd be down to tracing where the errors end up in the kernel. Rich.> Anyway, I'm sure someone on this list has run into similar issues, so > I thought I'd reach out before I go too far down a rabbit hole. > > Thanks, > Jason Pepas > > _______________________________________________ > Libguestfs mailing list > Libguestfs@redhat.com > https://www.redhat.com/mailman/listinfo/libguestfs-- Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones Read my programming and virtualization blog: http://rwmj.wordpress.com virt-top is 'top' for virtual machines. Tiny program with many powerful monitoring features, net stats, disk stats, logging, etc. http://people.redhat.com/~rjones/virt-top
Jason Pepas
2015-Nov-07  19:22 UTC
Re: [Libguestfs] mkfs.ext2 succeeds despite nbd write errors?
On Sat, Nov 7, 2015 at 5:03 AM, Richard W.M. Jones <rjones@redhat.com> wrote:> How about 'strace mkfs.ext2 ..' and see if any system calls are > returning errors. That would show you whether nbd-client is throwing > errors away, or whether mkfs is getting the errors and ignoring them > (seems pretty unlikely, but you never know). > > After that, it'd be down to tracing where the errors end up in the > kernel.Thanks for the tip! The results are interesting. It looks like all of mkfs's pwrite() calls succeed, but its final fsync() calls do actually fail: root@debian:~# strace mkfs.ext2 /dev/nbd0 2>&1 | tee strace.out root@debian:~# cat strace.out | grep pwrite pwrite(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 32768, 8187379712) = 32768 pwrite(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 32768, 8187412480) = 32768 pwrite(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 32768, 8187445248) = 32768 ... root@debian:~# cat strace.out | grep fsync fsync(3) = -1 EIO (Input/output error) fsync(3) = -1 EIO (Input/output error) The fsync() calls happen just before mkfs exists success: root@debian:~# cat strace.out | tail pwrite(3, "\1\2\0\0\2\2\0\0\3\2\0\0\367{\365\37\2\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096, 6576672768) = 4096 fsync(3) = -1 EIO (Input/output error) pwrite(3, "\0\0\10\0\0\0 \0\231\231\1\0qm\37\0\365\377\7\0\0\0\0\0\2\0\0\0\2\0\0\0"..., 1024, 1024) = 1024 fsync(3) = -1 EIO (Input/output error) close(3) = 0 write(1, "done\n\n", 6done ) = 6 exit_group(0) = ? +++ exited with 0 +++ root@debian:~# I did manage to find two calls to fsync in the e2fsprogs source which are not return-value-checked: https://github.com/tytso/e2fsprogs/blob/956b0f18a5ddb6815a9dff4f10a1e3125cdca9ba/misc/filefrag.c#L303 https://github.com/tytso/e2fsprogs/blob/956b0f18a5ddb6815a9dff4f10a1e3125cdca9ba/lib/ext2fs/unix_io.c#L915 I'll see about submitting a patch there. I'm not sure where to start with hunting down why mkfs's pwrite() calls aren't failing. I'd look to the kernel source for that? -jason