Michael S. Tsirkin
2022-Sep-29 07:19 UTC
[PATCH] vhost/vsock: Use kvmalloc/kvfree for larger packets.
On Thu, Sep 29, 2022 at 08:14:24AM +0900, Junichi Uekawa (????) wrote:
> 2022?9?29?(?) 0:11 Stefano Garzarella <sgarzare at redhat.com>:
> >
> > On Wed, Sep 28, 2022 at 05:31:58AM -0400, Michael S. Tsirkin wrote:
> > >On Wed, Sep 28, 2022 at 10:28:23AM +0200, Stefano Garzarella wrote:
> > >> On Wed, Sep 28, 2022 at 03:45:38PM +0900, Junichi Uekawa wrote:
> > >> > When copying a large file over sftp over vsock, data size is usually 32kB,
> > >> > and kmalloc seems to fail to try to allocate 32 32kB regions.
> > >> >
> > >> > Call Trace:
> > >> >  [<ffffffffb6a0df64>] dump_stack+0x97/0xdb
> > >> >  [<ffffffffb68d6aed>] warn_alloc_failed+0x10f/0x138
> > >> >  [<ffffffffb68d868a>] ? __alloc_pages_direct_compact+0x38/0xc8
> > >> >  [<ffffffffb664619f>] __alloc_pages_nodemask+0x84c/0x90d
> > >> >  [<ffffffffb6646e56>] alloc_kmem_pages+0x17/0x19
> > >> >  [<ffffffffb6653a26>] kmalloc_order_trace+0x2b/0xdb
> > >> >  [<ffffffffb66682f3>] __kmalloc+0x177/0x1f7
> > >> >  [<ffffffffb66e0d94>] ? copy_from_iter+0x8d/0x31d
> > >> >  [<ffffffffc0689ab7>] vhost_vsock_handle_tx_kick+0x1fa/0x301 [vhost_vsock]
> > >> >  [<ffffffffc06828d9>] vhost_worker+0xf7/0x157 [vhost]
> > >> >  [<ffffffffb683ddce>] kthread+0xfd/0x105
> > >> >  [<ffffffffc06827e2>] ? vhost_dev_set_owner+0x22e/0x22e [vhost]
> > >> >  [<ffffffffb683dcd1>] ? flush_kthread_worker+0xf3/0xf3
> > >> >  [<ffffffffb6eb332e>] ret_from_fork+0x4e/0x80
> > >> >  [<ffffffffb683dcd1>] ? flush_kthread_worker+0xf3/0xf3
> > >> >
> > >> > Work around by doing kvmalloc instead.
> > >> >
> > >> > Signed-off-by: Junichi Uekawa <uekawa at chromium.org>
> > >
> > >My worry here is that this in more of a work around.
> > >It would be better to not allocate memory so aggressively:
> > >if we are so short on memory we should probably process
> > >packets one at a time. Is that very hard to implement?
> >
> > Currently the "virtio_vsock_pkt" is allocated in the "handle_kick"
> > callback of TX virtqueue. Then the packet is multiplexed on the right
> > socket queue, then the user space can de-queue it whenever they want.
> >
> > So maybe we can stop processing the virtqueue if we are short on memory,
> > but when can we restart the TX virtqueue processing?
> >
> > I think as long as the guest used only 4K buffers we had no problem, but
> > now that it can create larger buffers the host may not be able to
> > allocate it contiguously. Since there is no need to have them contiguous
> > here, I think this patch is okay.
> >
> > However, if we switch to sk_buff (as Bobby is already doing), maybe we
> > don't have this problem because I think there is some kind of
> > pre-allocated pool.
> >
>
> Thank you for the review! I was wondering if this is a reasonable workaround (as
> we found that this patch makes a reliably crashing system into a
> reliably surviving system.)
>
>
> ... Sounds like it is a reasonable patch to use backported to older kernels?

Hmm. Good point about stable. OK.

Acked-by: Michael S. Tsirkin <mst at redhat.com>

>
> >
> > >
> > >
> > >> > ---
> > >> >
> > >> >  drivers/vhost/vsock.c                   | 2 +-
> > >> >  net/vmw_vsock/virtio_transport_common.c | 2 +-
> > >> >  2 files changed, 2 insertions(+), 2 deletions(-)
> > >> >
> > >> > diff --git a/drivers/vhost/vsock.c b/drivers/vhost/vsock.c
> > >> > index 368330417bde..5703775af129 100644
> > >> > --- a/drivers/vhost/vsock.c
> > >> > +++ b/drivers/vhost/vsock.c
> > >> > @@ -393,7 +393,7 @@ vhost_vsock_alloc_pkt(struct vhost_virtqueue *vq,
> > >> >                  return NULL;
> > >> >          }
> > >> >
> > >> > -        pkt->buf = kmalloc(pkt->len, GFP_KERNEL);
> > >> > +        pkt->buf = kvmalloc(pkt->len, GFP_KERNEL);
> > >> >          if (!pkt->buf) {
> > >> >                  kfree(pkt);
> > >> >                  return NULL;
> > >> > diff --git a/net/vmw_vsock/virtio_transport_common.c b/net/vmw_vsock/virtio_transport_common.c
> > >> > index ec2c2afbf0d0..3a12aee33e92 100644
> > >> > --- a/net/vmw_vsock/virtio_transport_common.c
> > >> > +++ b/net/vmw_vsock/virtio_transport_common.c
> > >> > @@ -1342,7 +1342,7 @@ EXPORT_SYMBOL_GPL(virtio_transport_recv_pkt);
> > >> >
> > >> >  void virtio_transport_free_pkt(struct virtio_vsock_pkt *pkt)
> > >> >  {
> > >> > -        kfree(pkt->buf);
> > >> > +        kvfree(pkt->buf);
> > >>
> > >> virtio_transport_free_pkt() is used also in virtio_transport.c and
> > >> vsock_loopback.c where pkt->buf is allocated with kmalloc(), but IIUC
> > >> kvfree() can be used with that memory, so this should be fine.
> > >>
> > >> >          kfree(pkt);
> > >> > }
> > >> > EXPORT_SYMBOL_GPL(virtio_transport_free_pkt);
> > >> > --
> > >> > 2.37.3.998.g577e59143f-goog
> > >> >
> > >>
> > >> This issue should go away with the Bobby's work about introducing sk_buff
> > >> [1], but we can queue this for now.
> > >>
> > >> I'm not sure if we should do the same also in the virtio-vsock driver
> > >> (virtio_transport.c). Here in vhost-vsock the buf allocated is only used in
> > >> the host, while in the virtio-vsock driver the buffer is exposed to the
> > >> device emulated in the host, so it should be physically contiguous (if not,
> > >> maybe we need to adjust virtio_vsock_rx_fill()).
> > >
> > >More importantly it needs to support DMA API which IIUC kvmalloc
> > >memory does not.
> > >
> >
> > Right, good point!
> >
> > Thanks,
> > Stefano
> >
>
> --
> Junichi Uekawa
> Google
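For readers skimming the archive, the change under discussion is small: vhost_vsock_alloc_pkt() switches the packet payload from kmalloc() to kvmalloc(), so a large request that cannot be satisfied with physically contiguous pages falls back to vmalloc-backed memory, and virtio_transport_free_pkt() switches to kvfree(), which accepts pointers from either allocator. The following is a minimal sketch of that allocate/free pairing, not the actual vhost-vsock code: "struct demo_pkt" and the helper names are stand-ins for struct virtio_vsock_pkt and its helpers, and the header that declares kvmalloc()/kvfree() varies between kernel versions.

    /* Illustrative sketch only -- simplified from the pattern in the patch. */
    #include <linux/mm.h>    /* kvmalloc()/kvfree() on many kernel versions */
    #include <linux/slab.h>  /* kzalloc(), kfree() */

    struct demo_pkt {
            void   *buf;
            size_t  len;
    };

    static struct demo_pkt *demo_pkt_alloc(size_t len)
    {
            struct demo_pkt *pkt = kzalloc(sizeof(*pkt), GFP_KERNEL);

            if (!pkt)
                    return NULL;

            /* kvmalloc() tries a physically contiguous kmalloc() first and
             * falls back to vmalloc() for sizes (such as 32 kB payloads)
             * that the page allocator cannot satisfy contiguously.
             */
            pkt->buf = kvmalloc(len, GFP_KERNEL);
            if (!pkt->buf) {
                    kfree(pkt);
                    return NULL;
            }
            pkt->len = len;
            return pkt;
    }

    static void demo_pkt_free(struct demo_pkt *pkt)
    {
            /* kvfree() handles both kmalloc()- and vmalloc()-backed buffers,
             * which is why the shared free path keeps working for callers
             * that still allocate with kmalloc().
             */
            kvfree(pkt->buf);
            kfree(pkt);
    }

This mirrors the reasoning quoted above: the host-side buffer never needs to be physically contiguous, so the vmalloc fallback is harmless there.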
Stefano Garzarella
2022-Sep-29 07:46 UTC
[PATCH] vhost/vsock: Use kvmalloc/kvfree for larger packets.
On Thu, Sep 29, 2022 at 03:19:14AM -0400, Michael S. Tsirkin wrote:
>On Thu, Sep 29, 2022 at 08:14:24AM +0900, Junichi Uekawa (????) wrote:
>> 2022?9?29?(?) 0:11 Stefano Garzarella <sgarzare at redhat.com>:
>> >
>> > On Wed, Sep 28, 2022 at 05:31:58AM -0400, Michael S. Tsirkin wrote:
>> > >On Wed, Sep 28, 2022 at 10:28:23AM +0200, Stefano Garzarella wrote:
>> > >> On Wed, Sep 28, 2022 at 03:45:38PM +0900, Junichi Uekawa wrote:
>> > >> > When copying a large file over sftp over vsock, data size is usually 32kB,
>> > >> > and kmalloc seems to fail to try to allocate 32 32kB regions.
>> > >> >
>> > >> > Call Trace:
>> > >> >  [<ffffffffb6a0df64>] dump_stack+0x97/0xdb
>> > >> >  [<ffffffffb68d6aed>] warn_alloc_failed+0x10f/0x138
>> > >> >  [<ffffffffb68d868a>] ? __alloc_pages_direct_compact+0x38/0xc8
>> > >> >  [<ffffffffb664619f>] __alloc_pages_nodemask+0x84c/0x90d
>> > >> >  [<ffffffffb6646e56>] alloc_kmem_pages+0x17/0x19
>> > >> >  [<ffffffffb6653a26>] kmalloc_order_trace+0x2b/0xdb
>> > >> >  [<ffffffffb66682f3>] __kmalloc+0x177/0x1f7
>> > >> >  [<ffffffffb66e0d94>] ? copy_from_iter+0x8d/0x31d
>> > >> >  [<ffffffffc0689ab7>] vhost_vsock_handle_tx_kick+0x1fa/0x301 [vhost_vsock]
>> > >> >  [<ffffffffc06828d9>] vhost_worker+0xf7/0x157 [vhost]
>> > >> >  [<ffffffffb683ddce>] kthread+0xfd/0x105
>> > >> >  [<ffffffffc06827e2>] ? vhost_dev_set_owner+0x22e/0x22e [vhost]
>> > >> >  [<ffffffffb683dcd1>] ? flush_kthread_worker+0xf3/0xf3
>> > >> >  [<ffffffffb6eb332e>] ret_from_fork+0x4e/0x80
>> > >> >  [<ffffffffb683dcd1>] ? flush_kthread_worker+0xf3/0xf3
>> > >> >
>> > >> > Work around by doing kvmalloc instead.
>> > >> >
>> > >> > Signed-off-by: Junichi Uekawa <uekawa at chromium.org>
>> > >
>> > >My worry here is that this in more of a work around.
>> > >It would be better to not allocate memory so aggressively:
>> > >if we are so short on memory we should probably process
>> > >packets one at a time. Is that very hard to implement?
>> >
>> > Currently the "virtio_vsock_pkt" is allocated in the "handle_kick"
>> > callback of TX virtqueue. Then the packet is multiplexed on the right
>> > socket queue, then the user space can de-queue it whenever they want.
>> >
>> > So maybe we can stop processing the virtqueue if we are short on memory,
>> > but when can we restart the TX virtqueue processing?
>> >
>> > I think as long as the guest used only 4K buffers we had no problem, but
>> > now that it can create larger buffers the host may not be able to
>> > allocate it contiguously. Since there is no need to have them contiguous
>> > here, I think this patch is okay.
>> >
>> > However, if we switch to sk_buff (as Bobby is already doing), maybe we
>> > don't have this problem because I think there is some kind of
>> > pre-allocated pool.
>> >
>>
>> Thank you for the review! I was wondering if this is a reasonable workaround (as
>> we found that this patch makes a reliably crashing system into a
>> reliably surviving system.)
>>
>>
>> ... Sounds like it is a reasonable patch to use backported to older kernels?
>
>Hmm. Good point about stable. OK.

Right, so in this case I think it is better to add a Fixes tag.
Since we used kmalloc from the beginning we can use the following:

Fixes: 433fc58e6bf2 ("VSOCK: Introduce vhost_vsock.ko")

>
>Acked-by: Michael S. Tsirkin <mst at redhat.com>
>

@Michael are you queueing this, or should it go through net tree?

Thanks,
Stefano
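A side note on the guest-side concern quoted above (physically contiguous buffers and the DMA API): in the virtio-vsock driver the receive buffer is handed to the device, typically by building a scatterlist around it, and that path assumes physically contiguous, linearly mapped memory. The sketch below is loosely modeled on what virtio_vsock_rx_fill() does and is not the actual driver code; the helper name and the single-descriptor layout are illustrative only.

    /* Illustrative sketch only -- not the real virtio_vsock_rx_fill().
     * It shows why the guest-side buffer stays kmalloc()-backed: the
     * buffer is exposed to the device through the scatterlist/DMA
     * machinery, which expects a physically contiguous region, so a
     * kvmalloc() fallback to vmalloc() would not be valid here.
     */
    #include <linux/scatterlist.h>
    #include <linux/slab.h>
    #include <linux/virtio.h>

    static int demo_rx_fill_one(struct virtqueue *vq, size_t buf_len)
    {
            struct scatterlist sg;
            void *buf;

            buf = kmalloc(buf_len, GFP_KERNEL);
            if (!buf)
                    return -ENOMEM;

            /* sg_init_one() describes a single contiguous region; the
             * virtio core then maps it for the device when the buffer
             * is added to the virtqueue.
             */
            sg_init_one(&sg, buf, buf_len);

            return virtqueue_add_inbuf(vq, &sg, 1, buf, GFP_KERNEL);
    }

As the thread notes, the later sk_buff conversion may change this picture, but the contiguity requirement on the driver side is why the patch only touches the vhost (host) allocation.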