Richard Yao
2013-Dec-04 20:43 UTC
[PATCH] 9p/trans_virtio.c: Fix broken zero-copy on vmalloc() buffers
The 9p-virtio transport does zero copy on things larger than 1024 bytes in size. It accomplishes this by returning the physical addresses of pages to the virtio-pci device. At present, the translation is usually a bit shift. However, that approach produces an invalid page address when we read/write to vmalloc buffers, such as those used for Linux kernle modules. This causes QEMU to die printing: qemu-system-x86_64: virtio: trying to map MMIO memory This patch enables 9p-virtio to correctly handle this case. This not only enables us to load Linux kernel modules off virtfs, but also enables ZFS file-based vdevs on virtfs to be used without killing QEMU. Also, special thanks to both Avi Kivity and Alexander Graf for their interpretation of QEMU backtraces. Without their guidence, tracking down this bug would have taken much longer. Signed-off-by: Richard Yao <ryao at gentoo.org> --- net/9p/trans_virtio.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/net/9p/trans_virtio.c b/net/9p/trans_virtio.c index 9c5a1aa..5d1d04b 100644 --- a/net/9p/trans_virtio.c +++ b/net/9p/trans_virtio.c @@ -340,7 +340,10 @@ static int p9_get_mapped_pages(struct virtio_chan *chan, int count = nr_pages; while (nr_pages) { s = rest_of_page(data); - pages[index++] = kmap_to_page(data); + if (is_vmalloc_or_module_addr(data)) + pages[index++] = vmalloc_to_page(data); + else + pages[index++] = kmap_to_page(data); data += s; nr_pages--; } -- 1.8.3.2
Alexander Graf
2013-Dec-04 20:49 UTC
[PATCH] 9p/trans_virtio.c: Fix broken zero-copy on vmalloc() buffers
On 04.12.2013, at 21:43, Richard Yao <ryao at gentoo.org> wrote:> The 9p-virtio transport does zero copy on things larger than 1024 bytes > in size. It accomplishes this by returning the physical addresses of > pages to the virtio-pci device. At present, the translation is usually a > bit shift. > > However, that approach produces an invalid page address when we > read/write to vmalloc buffers, such as those used for Linux kernle > modules. This causes QEMU to die printing: > > qemu-system-x86_64: virtio: trying to map MMIO memory > > This patch enables 9p-virtio to correctly handle this case. This not > only enables us to load Linux kernel modules off virtfs, but also > enables ZFS file-based vdevs on virtfs to be used without killing QEMU. > > Also, special thanks to both Avi Kivity and Alexander Graf for their > interpretation of QEMU backtraces. Without their guidence, tracking down > this bug would have taken much longer. > > Signed-off-by: Richard Yao <ryao at gentoo.org>Looks good and clean to me, but I'm not expert on the mm subsystem. Acked-by: Alexander Graf <agraf at suse.de> Alex
Will Deacon
2013-Dec-06 11:14 UTC
[PATCH] 9p/trans_virtio.c: Fix broken zero-copy on vmalloc() buffers
On Wed, Dec 04, 2013 at 08:43:18PM +0000, Richard Yao wrote:> The 9p-virtio transport does zero copy on things larger than 1024 bytes > in size. It accomplishes this by returning the physical addresses of > pages to the virtio-pci device. At present, the translation is usually a > bit shift. > > However, that approach produces an invalid page address when we > read/write to vmalloc buffers, such as those used for Linux kernle > modules. This causes QEMU to die printing: > > qemu-system-x86_64: virtio: trying to map MMIO memory > > This patch enables 9p-virtio to correctly handle this case. This not > only enables us to load Linux kernel modules off virtfs, but also > enables ZFS file-based vdevs on virtfs to be used without killing QEMU. > > Also, special thanks to both Avi Kivity and Alexander Graf for their > interpretation of QEMU backtraces. Without their guidence, tracking down > this bug would have taken much longer. > > Signed-off-by: Richard Yao <ryao at gentoo.org> > --- > net/9p/trans_virtio.c | 5 ++++- > 1 file changed, 4 insertions(+), 1 deletion(-) > > diff --git a/net/9p/trans_virtio.c b/net/9p/trans_virtio.c > index 9c5a1aa..5d1d04b 100644 > --- a/net/9p/trans_virtio.c > +++ b/net/9p/trans_virtio.c > @@ -340,7 +340,10 @@ static int p9_get_mapped_pages(struct virtio_chan *chan, > int count = nr_pages; > while (nr_pages) { > s = rest_of_page(data); > - pages[index++] = kmap_to_page(data); > + if (is_vmalloc_or_module_addr(data))Can this really end up being a module address? Will
Richard Yao
2013-Dec-06 14:38 UTC
[PATCH] 9p/trans_virtio.c: Fix broken zero-copy on vmalloc() buffers
On 12/06/2013 06:14 AM, Will Deacon wrote:> On Wed, Dec 04, 2013 at 08:43:18PM +0000, Richard Yao wrote: >> The 9p-virtio transport does zero copy on things larger than 1024 bytes >> in size. It accomplishes this by returning the physical addresses of >> pages to the virtio-pci device. At present, the translation is usually a >> bit shift. >> >> However, that approach produces an invalid page address when we >> read/write to vmalloc buffers, such as those used for Linux kernle >> modules. This causes QEMU to die printing: >> >> qemu-system-x86_64: virtio: trying to map MMIO memory >> >> This patch enables 9p-virtio to correctly handle this case. This not >> only enables us to load Linux kernel modules off virtfs, but also >> enables ZFS file-based vdevs on virtfs to be used without killing QEMU. >> >> Also, special thanks to both Avi Kivity and Alexander Graf for their >> interpretation of QEMU backtraces. Without their guidence, tracking down >> this bug would have taken much longer. >> >> Signed-off-by: Richard Yao <ryao at gentoo.org> >> --- >> net/9p/trans_virtio.c | 5 ++++- >> 1 file changed, 4 insertions(+), 1 deletion(-) >> >> diff --git a/net/9p/trans_virtio.c b/net/9p/trans_virtio.c >> index 9c5a1aa..5d1d04b 100644 >> --- a/net/9p/trans_virtio.c >> +++ b/net/9p/trans_virtio.c >> @@ -340,7 +340,10 @@ static int p9_get_mapped_pages(struct virtio_chan *chan, >> int count = nr_pages; >> while (nr_pages) { >> s = rest_of_page(data); >> - pages[index++] = kmap_to_page(data); >> + if (is_vmalloc_or_module_addr(data)) > > Can this really end up being a module address?Yes. Here is the stacktrace to prove it: [<ffffffff814878ce>] p9_virtio_zc_request+0x45e/0x510 [<ffffffff814814ed>] p9_client_zc_rpc.constprop.16+0xfd/0x4f0 [<ffffffff814839dd>] p9_client_read+0x15d/0x240 [<ffffffff811c8440>] v9fs_fid_readn+0x50/0xa0 [<ffffffff811c84a0>] v9fs_file_readn+0x10/0x20 [<ffffffff811c84e7>] v9fs_file_read+0x37/0x70 [<ffffffff8114e3fb>] vfs_read+0x9b/0x160 [<ffffffff81153571>] kernel_read+0x41/0x60 [<ffffffff810c83ab>] copy_module_from_fd.isra.34+0xfb/0x180 [<ffffffff810cc420>] SyS_finit_module+0x70/0xd0 [<ffffffff814a08fd>] system_call_fastpath+0x1a/0x1f [<ffffffffffffffff>] 0xffffffffffffffff This is easily reproducible by trying to load a module off virtfs. QEMU will print the message that I cited in the commit message and then kill itself. P.S. I omitted the CC list the first time I sent this, so I am resending with the correct CC list. -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 901 bytes Desc: OpenPGP digital signature URL: <http://lists.linuxfoundation.org/pipermail/virtualization/attachments/20131206/5f78f02c/attachment.sig>
Possibly Parallel Threads
- [PATCH] 9p/trans_virtio.c: Fix broken zero-copy on vmalloc() buffers
- [PATCH] 9p/trans_virtio.c: Fix broken zero-copy on vmalloc() buffers
- [PATCH] 9p/trans_virtio.c: Fix broken zero-copy on vmalloc() buffers
- [PATCH] 9p/trans_virtio.c: Fix broken zero-copy on vmalloc() buffers
- [PATCH] virtio: 9p: correctly pass physical address to userspace for high pages