Richard W.M. Jones
2022-May-03 08:07 UTC
[Libguestfs] nbdkit error: "write reply: NBD_CMD_WRITE: Broken pipe"
On Mon, May 02, 2022 at 03:36:33PM +0100, Nikolaus Rath wrote:
> On May 02 2022, Laszlo Ersek <lersek at redhat.com> wrote:
> > On 05/01/22 18:35, Nikolaus Rath wrote:
> >> Hi,
> >>
> >> I am developing a new nbdkit plugin, and occasionally I am getting
> >> errors like this:
> >>
> >> |nbdkit: s3backer.8: error: write reply: NBD_CMD_WRITE: Broken pipe
> >> nbdkit: s3backer.15: error: write reply: NBD_CMD_WRITE: Broken pipe|
> >>
> >>
> >> (where "s3backer" is the plugin name).
> >>
> >> I am not sure what to make of these. Can someone advise?
> >>
> >> Looking at the nbdkit source, it looks to me like these are generated
> >> when there is a problem sending a reply to the nbd client. On the other
> >> hand, I am using the standard 'nbd-client' program through a Unix
> >> socket, so I'd think this should not result in errors...?

So firstly, yes, we should interoperate correctly with the kernel
nbd.ko client and nbd-client. If there's a bug in interop, it's a bug
in nbdkit.

However in this case these errors could be normal if the client
disconnects suddenly. It's easy enough to simulate this even using
only the userspace client from libnbd. If we initiate NBD_CMD_WRITE
but disconnect before it finishes then:

  $ nbdkit -U - -fv memory 1M \
      --run 'nbdsh -u $uri -c "b = nbd.Buffer.from_bytearray(bytearray(512)); h.aio_pwrite(b, 0)"'
  ...
  nbdkit: memory.0: error: write reply: NBD_CMD_WRITE: Broken pipe

Note that this isn't a problem for nbdkit. It prints the error
because it cannot send the reply on this connection, but continues
processing other connections as normal. Data is potentially lost, but
there's nothing nbdkit can do about that if the client goes away
suddenly. Clients that care about data integrity should issue
NBD_CMD_FLUSH and wait for the reply before declaring that data has
been committed.

> > If your plugin managed to crash nbd-client remotely, that would be
> > consistent with this symptom.
>
> So I tried to reproduce this, and noticed something odd. It seems I can
> disconnect the nbd device (nbd-client -d) while there are still requests
> in flight:
>
> May 02 15:20:50 vostro.rath.org kernel: nbd1: detected capacity change from 0 to 52428800
> May 02 15:20:50 vostro.rath.org kernel: block nbd1: NBD_DISCONNECT
> May 02 15:20:50 vostro.rath.org kernel: block nbd1: Disconnected due to user request.
> May 02 15:20:50 vostro.rath.org kernel: block nbd1: shutting down sockets
> May 02 15:20:50 vostro.rath.org kernel: I/O error, dev nbd1, sector 776 op 0x0:(READ) flags 0x80700 phys_seg 29 prio class 0
> May 02 15:20:50 vostro.rath.org kernel: I/O error, dev nbd1, sector 776 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
> May 02 15:20:50 vostro.rath.org kernel: Buffer I/O error on dev nbd1, logical block 97, async page read
> May 02 15:20:50 vostro.rath.org kernel: block nbd1: Attempted send on invalid socket
> May 02 15:20:50 vostro.rath.org kernel: I/O error, dev nbd1, sector 0 op 0x1:(WRITE) flags 0x800 phys_seg 0 prio class 0
> May 02 15:20:50 vostro.rath.org kernel: block nbd1: Attempted send on invalid socket
> May 02 15:20:50 vostro.rath.org kernel: I/O error, dev nbd1, sector 0 op 0x1:(WRITE) flags 0x800 phys_seg 0 prio class 0
>
> This was generated by running:
>
> $ nbd-client localhost /dev/nbd1 && mkfs.ext4 /dev/nbd1 && nbd-client -d /dev/nbd1
>
> Is that expected behavior?

It's a bit unexpected to me. Adding Wouter to the thread - he might
have an idea here, especially if there's a way to have "nbd-client -d"
wait for pending requests to finish before disconnecting.

I don't use the kernel client very much myself. We mostly use either
libnbd or the qemu client.

> I would have thought that nbd-client will block until any dirty data has
> been written.
>
> Curiously enough, in this case I did *not* get the above warnings from
> nbdkit itself.

Rich.
-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-top is 'top' for virtual machines. Tiny program with many
powerful monitoring features, net stats, disk stats, logging, etc.
http://people.redhat.com/~rjones/virt-top
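The "Broken pipe" case described in this message can be illustrated without nbdkit at all. The following is a minimal sketch using only the Python standard library. It is a toy model, not nbdkit's code and not the NBD wire protocol: the function name `serve_one_write`, the 512-byte "request" and the `OK` "reply" are all made up for illustration. It shows a server end of a Unix socket trying to acknowledge a write after the client end has already closed, compared with a client that waits for the reply.

```python
import errno
import socket


def serve_one_write(client_waits_for_reply: bool) -> str:
    """Toy model of one write request over a Unix stream socket.

    Returns "reply sent" if the server could acknowledge the request,
    otherwise the symbolic errno the server saw (EPIPE is expected).
    """
    # socketpair() returns two connected stream sockets that stand in
    # for the server and client ends of a single connection.
    server, client = socket.socketpair()
    try:
        # Hypothetical 512-byte "write request" (not real NBD framing).
        client.sendall(b"W" + bytes(511))

        if not client_waits_for_reply:
            # The client goes away as soon as the request is queued.
            client.close()

        # The server reads the whole request before replying.
        request = b""
        while len(request) < 512:
            chunk = server.recv(512 - len(request))
            if not chunk:
                break
            request += chunk

        try:
            # Hypothetical acknowledgement of the write.
            server.sendall(b"OK")
        except OSError as exc:
            # The peer is gone, so the reply cannot be delivered.
            return errno.errorcode.get(exc.errno, str(exc.errno))

        if client_waits_for_reply:
            # A careful client only treats the write as done here.
            assert client.recv(2) == b"OK"
        return "reply sent"
    finally:
        server.close()
        client.close()


if __name__ == "__main__":
    print("client disconnects early:", serve_one_write(False))
    print("client waits for reply:  ", serve_one_write(True))
```

In the early-disconnect case the server still received the request, which matches the point above that the error is about the undeliverable reply and that only a client which waits for the acknowledgement knows its data arrived.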
Nikolaus Rath
2022-May-06 13:08 UTC
[Libguestfs] nbdkit error: "write reply: NBD_CMD_WRITE: Broken pipe"
On May 03 2022, rjones at redhat.com (Richard W.M. Jones) wrote:
>> So I tried to reproduce this, and noticed something odd. It seems I can
>> disconnect the nbd device (nbd-client -d) while there are still requests
>> in flight:
>>
>> May 02 15:20:50 vostro.rath.org kernel: nbd1: detected capacity change from 0 to 52428800
>> May 02 15:20:50 vostro.rath.org kernel: block nbd1: NBD_DISCONNECT
>> May 02 15:20:50 vostro.rath.org kernel: block nbd1: Disconnected due to user request.
>> May 02 15:20:50 vostro.rath.org kernel: block nbd1: shutting down sockets
>> May 02 15:20:50 vostro.rath.org kernel: I/O error, dev nbd1, sector 776 op 0x0:(READ) flags 0x80700 phys_seg 29 prio class 0
>> May 02 15:20:50 vostro.rath.org kernel: I/O error, dev nbd1, sector 776 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
>> May 02 15:20:50 vostro.rath.org kernel: Buffer I/O error on dev nbd1, logical block 97, async page read
>> May 02 15:20:50 vostro.rath.org kernel: block nbd1: Attempted send on invalid socket
>> May 02 15:20:50 vostro.rath.org kernel: I/O error, dev nbd1, sector 0 op 0x1:(WRITE) flags 0x800 phys_seg 0 prio class 0
>> May 02 15:20:50 vostro.rath.org kernel: block nbd1: Attempted send on invalid socket
>> May 02 15:20:50 vostro.rath.org kernel: I/O error, dev nbd1, sector 0 op 0x1:(WRITE) flags 0x800 phys_seg 0 prio class 0
>>
>> This was generated by running:
>>
>> $ nbd-client localhost /dev/nbd1 && mkfs.ext4 /dev/nbd1 && nbd-client -d /dev/nbd1
>>
>> Is that expected behavior?
>
> It's a bit unexpected to me. Adding Wouter to the thread - he might
> have an idea here, especially if there's a way to have "nbd-client -d"
> wait for pending requests to finish before disconnecting.
>
> I don't use the kernel client very much myself. We mostly use either
> libnbd or the qemu client.

Did you Cc Wouter on your email? I didn't see a Cc header, but perhaps
this was stripped by the mailing list software?

Or is there a more appropriate list where I could follow up on this?

Thanks!
-Nikolaus

-- 
GPG Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F

»Time flies like an arrow, fruit flies like a Banana.«