Steven Rostedt
2022-Mar-25 21:20 UTC
[PATCH] virtio: Workaround fix for hard hang on guest using fifos
This is more of a workaround patch and probably not the proper fix. But I'm
doing some work that is using fifos for guests and this is causing a hang
that is quite annoying. I currently have this patch applied to continue my
work.

I was working on analyzing data transfers between host and guests via
virtio sockets (FIFOs on the host, a /dev node on the guest), vsockets and
TCP packets. I wrote a program to test each by passing a 1GB file and timing
it. I'm using the splice system call to help move things along. In doing so,
I found that my pipe between splice calls originally used "page_size" for
data transfer, and that is not as efficient as finding out what the pipe
size is. So I changed the code to use pipe_size, and while debugging it, the
guest locked up hard.

I'm attaching the "agent-fifo" that runs on the guest, and the "client-fifo"
that runs on the host (the names may be backwards, but they make sense when
you consider how I test vsockets and network packets).

Here's what I did:

 <host>  # ./client-fifo /var/lib/virt/Guest/trace-pipe-cpu0.out /test/bigfile

where trace-pipe-cpu0.out is the receiving side of the guest's virtio pipe,
and /test/bigfile is created when data starts coming in from the guest pipe.

 <guest> # dd if=/dev/urandom of=bigfile bs=1024 count=1048576
 <guest> # ./agent-fifo /dev/virtio-ports/trace-pipe-cpu0 bigfile

With the updates to change the size being passed in the splice from
page_size to pipe_size, this never finished (it would copy around a meg or
so). And stopped. When I killed the agent-fifo task on the guest, the guest
hung hard.

Debugging this, I found that the guest is stuck in the loop in
drivers/char/virtio_console.c, in __send_control_msg():

	if (virtqueue_add_outbuf(vq, sg, 1, &portdev->cpkt, GFP_ATOMIC) == 0) {
		virtqueue_kick(vq);
		while (!virtqueue_get_buf(vq, &len)
			&& !virtqueue_is_broken(vq))
			cpu_relax();
	}

It never exits that loop. My workaround (this patch) is to put in a timeout
and break out if it spins there for more than 5 seconds. This makes the
problem go away. Below are my changes, but this is a band-aid, it is not the
cure.

Workaround-fix-by: Steven Rostedt (Google) <rostedt at goodmis.org>
---
diff --git a/drivers/char/virtio_console.c b/drivers/char/virtio_console.c
index e3c430539a17..65f259f3f8cb 100644
--- a/drivers/char/virtio_console.c
+++ b/drivers/char/virtio_console.c
@@ -551,6 +551,7 @@ static ssize_t __send_control_msg(struct ports_device *portdev, u32 port_id,
 	struct scatterlist sg[1];
 	struct virtqueue *vq;
 	unsigned int len;
+	u64 end;
 
 	if (!use_multiport(portdev))
 		return 0;
@@ -567,9 +568,15 @@ static ssize_t __send_control_msg(struct ports_device *portdev, u32 port_id,
 
 	if (virtqueue_add_outbuf(vq, sg, 1, &portdev->cpkt, GFP_ATOMIC) == 0) {
 		virtqueue_kick(vq);
+		end = jiffies + 5 * HZ;
 		while (!virtqueue_get_buf(vq, &len)
-		       && !virtqueue_is_broken(vq))
+		       && !virtqueue_is_broken(vq)) {
+			if (unlikely(end < jiffies)) {
+				dev_warn(&portdev->vdev->dev, "send_control_msg timed out!\n");
+				break;
+			}
 			cpu_relax();
+		}
 	}
 
 	spin_unlock(&portdev->c_ovq_lock);

-------------- next part --------------
A non-text attachment was scrubbed...
Name: agent-fifo.c
Type: text/x-c++src
Size: 2697 bytes
Desc: not available
URL: <http://lists.linuxfoundation.org/pipermail/virtualization/attachments/20220325/d6d9601b/attachment.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: client-fifo.c
Type: text/x-c++src
Size: 2400 bytes
Desc: not available
URL: <http://lists.linuxfoundation.org/pipermail/virtualization/attachments/20220325/d6d9601b/attachment-0001.bin>
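For illustration only -- this is not the attached agent-fifo.c, just a
minimal sketch of the splice pattern described above (input file ->
intermediate pipe -> FIFO or virtio port), with each splice sized by the
pipe's actual capacity (F_GETPIPE_SZ) instead of a single page. The flags
and error handling here are assumptions:

	#define _GNU_SOURCE
	#include <fcntl.h>
	#include <stdio.h>
	#include <unistd.h>

	int main(int argc, char **argv)
	{
		int in, out, p[2];
		long pipe_size;
		ssize_t n;

		if (argc != 3) {
			fprintf(stderr, "usage: %s <fifo-or-port> <input-file>\n", argv[0]);
			return 1;
		}
		out = open(argv[1], O_WRONLY);
		in = open(argv[2], O_RDONLY);
		if (in < 0 || out < 0 || pipe(p) < 0) {
			perror("open/pipe");
			return 1;
		}

		/* Size each transfer by the pipe's real capacity, not one page. */
		pipe_size = fcntl(p[1], F_GETPIPE_SZ);
		if (pipe_size < 0)
			pipe_size = getpagesize();

		for (;;) {
			/* input file -> intermediate pipe */
			n = splice(in, NULL, p[1], NULL, pipe_size, SPLICE_F_MOVE);
			if (n < 0) {
				perror("splice");
				return 1;
			}
			if (n == 0)		/* EOF on the input file */
				break;
			/* drain the pipe into the FIFO / virtio port */
			while (n > 0) {
				ssize_t m = splice(p[0], NULL, out, NULL, n, SPLICE_F_MOVE);
				if (m < 0) {
					perror("splice");
					return 1;
				}
				n -= m;
			}
		}
		return 0;
	}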
Linus Torvalds
2022-Mar-26 00:36 UTC
[PATCH] virtio: Workaround fix for hard hang on guest using fifos
On Fri, Mar 25, 2022 at 2:20 PM Steven Rostedt <rostedt at goodmis.org> wrote:
> With the updates to change the size being passed in the splice from
> page_size to pipe_size, this never finished (it would copy around a meg or
> so). And stopped. When I killed the agent-fifo task on the guest, the guest
> hung hard.

Without knowing (or really caring) at all how virtqueue works, this sounds
very much like the classic pipe deadlock where two processes communicate
over a pair of pipes, sending each other commands, and replying to each
other with status updates.

And you absolutely cannot do that if one side can possibly want to fill up
the whole pipe.

Deadlock:

 - process A is trying to send data to process B (on 'pipe_A'), and blocks
   because the pipe is full

 - process B reads the data and everything is fine, and A gets to continue

 - but then process B sends some status update the other way (on 'pipe_B' -
   you can't use the same pipe bidirectionally, which is why you use a pair
   of pipes or a socketpair) and waits for the result.

 - now A and B are both waiting for each other - A is waiting for B to empty
   the big bunch of data it's sending, and B is waiting for the result of
   the (small) command it sent.

and neither makes any progress.

You can find several mentions of these kinds of problems by just googling
for "bidirectional pipe deadlock" or similar.

The solution is invariably to either

 (a) make sure that nobody writes even remotely close to enough data to fill
     a pipe before reading the other pipe (you can still fill up a pipe, but
     at least somebody will always succeed, make progress, and do the read
     that lets the other side make progress too).

 (b) make sure everybody who writes to a pipe uses nonblocking IO (and is
     willing to do reads in between to satisfy the other end).

That first case is basically what one of the PIPE_BUF guarantees is all
about (the other one is the atomicity it guarantees, ie you can write a
"command packet" and be guaranteed that readers will see it without data
mixed in from other writes).

I have no idea what your client/agent does and how it interacts with the
virtio pipes, but it really _sounds_ like a very similar issue, where it
used to work (because PIPE_BUF) and now no longer does (because the pipe
filled up).

And that virtio_console __send_control_msg() pattern very much sounds like
the "send data and wait for ACK" behavior of "process B".

Linus
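To make that concrete, here is a minimal, self-contained sketch (not code
from this thread) of the failure mode in its simplest form: two processes, a
pipe in each direction, and both sides writing more than one pipe buffer's
worth before doing any reading, so both writes sleep forever. In the
scenario described above, B's write is small but B then blocks waiting for a
reply that the stuck A never sends -- the end state is the same.

	/* Bidirectional pipe deadlock, reduced to its simplest form. */
	#include <stdio.h>
	#include <string.h>
	#include <sys/types.h>
	#include <unistd.h>

	#define BIG (1 << 20)		/* well past the default 64K pipe buffer */

	static char buf[BIG];

	int main(void)
	{
		int a2b[2], b2a[2];	/* pipe_A: A -> B, pipe_B: B -> A */
		pid_t pid;

		if (pipe(a2b) < 0 || pipe(b2a) < 0) {
			perror("pipe");
			return 1;
		}

		pid = fork();
		if (pid < 0) {
			perror("fork");
			return 1;
		}

		if (pid == 0) {
			/*
			 * Process B: writes its own big chunk before draining
			 * pipe_A.  Once pipe_B's buffer fills, this write
			 * sleeps -- and A will never read pipe_B, because A is
			 * asleep in its own write.
			 */
			memset(buf, 'b', sizeof(buf));
			write(b2a[1], buf, sizeof(buf));
			read(a2b[0], buf, sizeof(buf));		/* never reached */
			_exit(0);
		}

		/*
		 * Process A: blocks once pipe_A's buffer fills, because B has
		 * not read anything yet -- and never will.  Neither side makes
		 * progress again.
		 */
		memset(buf, 'a', sizeof(buf));
		write(a2b[1], buf, sizeof(buf));
		read(b2a[0], buf, sizeof(buf));			/* never reached */
		return 0;
	}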