On Wed, 18 Apr 2018, David Miller wrote:> From: Mikulas Patocka <mpatocka at redhat.com> > Date: Wed, 18 Apr 2018 12:44:25 -0400 (EDT) > > > The structure net_device is followed by arbitrary driver-specific data > > (accessible with the function netdev_priv). And for virtio-net, these > > driver-specific data must be in DMA memory. > > And we are saying that this assumption is wrong and needs to be > corrected.So, try to find all the networking drivers that to DMA to the private area. The problem here is that kvzalloc usually returns DMA-able area, but it may return non-DMA area rarely, if the memory is too fragmented. So, we are in a situation, where some networking drivers will randomly fail. Go and find them. Mikulas
Mikulas Patocka
2018-Apr-19 16:12 UTC
[PATCH] kvmalloc: always use vmalloc if CONFIG_DEBUG_VM
On Wed, 18 Apr 2018, Mikulas Patocka wrote:> > > On Wed, 18 Apr 2018, David Miller wrote: > > > From: Mikulas Patocka <mpatocka at redhat.com> > > Date: Wed, 18 Apr 2018 12:44:25 -0400 (EDT) > > > > > The structure net_device is followed by arbitrary driver-specific data > > > (accessible with the function netdev_priv). And for virtio-net, these > > > driver-specific data must be in DMA memory. > > > > And we are saying that this assumption is wrong and needs to be > > corrected. > > So, try to find all the networking drivers that to DMA to the private > area. > > The problem here is that kvzalloc usually returns DMA-able area, but it > may return non-DMA area rarely, if the memory is too fragmented. So, we > are in a situation, where some networking drivers will randomly fail. Go > and find them. > > MikulasHer I submit a patch that makes kvmalloc always use vmalloc if CONFIG_DEBUG_VM is defined. From: Mikulas Patocka <mpatocka at redhat.com> Subject: [PATCH] kvmalloc: always use vmalloc if CONFIG_DEBUG_VM The kvmalloc function tries to use kmalloc and falls back to vmalloc if kmalloc fails. Unfortunatelly, some kernel code has bugs - it uses kvmalloc and then uses DMA-API on the returned memory or frees it with kfree. Such bugs were found in the virtio-net driver, dm-integrity or RHEL7 powerpc-specific code. These bugs are hard to reproduce because vmalloc falls back to kmalloc only if memory is fragmented. In order to detect these bugs reliably I submit this patch that changes kvmalloc to always use vmalloc if CONFIG_DEBUG_VM is turned on. Signed-off-by: Mikulas Patocka <mpatocka at redhat.com> --- mm/util.c | 2 ++ 1 file changed, 2 insertions(+) Index: linux-2.6/mm/util.c ==================================================================--- linux-2.6.orig/mm/util.c 2018-04-18 15:46:23.000000000 +0200 +++ linux-2.6/mm/util.c 2018-04-18 16:00:43.000000000 +0200 @@ -395,6 +395,7 @@ EXPORT_SYMBOL(vm_mmap); */ void *kvmalloc_node(size_t size, gfp_t flags, int node) { +#ifndef CONFIG_DEBUG_VM gfp_t kmalloc_flags = flags; void *ret; @@ -426,6 +427,7 @@ void *kvmalloc_node(size_t size, gfp_t f */ if (ret || size <= PAGE_SIZE) return ret; +#endif return __vmalloc_node_flags_caller(size, node, flags, __builtin_return_address(0));
Eric Dumazet
2018-Apr-19 16:25 UTC
[PATCH] kvmalloc: always use vmalloc if CONFIG_DEBUG_VM
On 04/19/2018 09:12 AM, Mikulas Patocka wrote:> > > These bugs are hard to reproduce because vmalloc falls back to kmalloc > only if memory is fragmented. >This sentence is wrong. .... because kvmalloc() falls back to vmalloc() ...
Michael S. Tsirkin
2018-Apr-19 16:43 UTC
[PATCH] kvmalloc: always use vmalloc if CONFIG_DEBUG_VM
On Thu, Apr 19, 2018 at 12:12:38PM -0400, Mikulas Patocka wrote:> > > On Wed, 18 Apr 2018, Mikulas Patocka wrote: > > > > > > > On Wed, 18 Apr 2018, David Miller wrote: > > > > > From: Mikulas Patocka <mpatocka at redhat.com> > > > Date: Wed, 18 Apr 2018 12:44:25 -0400 (EDT) > > > > > > > The structure net_device is followed by arbitrary driver-specific data > > > > (accessible with the function netdev_priv). And for virtio-net, these > > > > driver-specific data must be in DMA memory. > > > > > > And we are saying that this assumption is wrong and needs to be > > > corrected. > > > > So, try to find all the networking drivers that to DMA to the private > > area. > > > > The problem here is that kvzalloc usually returns DMA-able area, but it > > may return non-DMA area rarely, if the memory is too fragmented. So, we > > are in a situation, where some networking drivers will randomly fail. Go > > and find them. > > > > Mikulas > > Her I submit a patch that makes kvmalloc always use vmalloc if > CONFIG_DEBUG_VM is defined. > > > > > From: Mikulas Patocka <mpatocka at redhat.com> > Subject: [PATCH] kvmalloc: always use vmalloc if CONFIG_DEBUG_VM > > The kvmalloc function tries to use kmalloc and falls back to vmalloc if > kmalloc fails. > > Unfortunatelly, some kernel code has bugs - it uses kvmalloc and then > uses DMA-API on the returned memory or frees it with kfree. Such bugs were > found in the virtio-net driver, dm-integrity or RHEL7 powerpc-specific > code. > > These bugs are hard to reproduce because vmalloc falls back to kmalloc > only if memory is fragmented. > > In order to detect these bugs reliably I submit this patch that changes > kvmalloc to always use vmalloc if CONFIG_DEBUG_VM is turned on. > > Signed-off-by: Mikulas Patocka <mpatocka at redhat.com>Maybe make it conditional on CONFIG_DEBUG_SG too? Otherwise I think you just trigger a hard to debug memory corruption.> --- > mm/util.c | 2 ++ > 1 file changed, 2 insertions(+) > > Index: linux-2.6/mm/util.c > ==================================================================> --- linux-2.6.orig/mm/util.c 2018-04-18 15:46:23.000000000 +0200 > +++ linux-2.6/mm/util.c 2018-04-18 16:00:43.000000000 +0200 > @@ -395,6 +395,7 @@ EXPORT_SYMBOL(vm_mmap); > */ > void *kvmalloc_node(size_t size, gfp_t flags, int node) > { > +#ifndef CONFIG_DEBUG_VM > gfp_t kmalloc_flags = flags; > void *ret; > > @@ -426,6 +427,7 @@ void *kvmalloc_node(size_t size, gfp_t f > */ > if (ret || size <= PAGE_SIZE) > return ret; > +#endif > > return __vmalloc_node_flags_caller(size, node, flags, > __builtin_return_address(0));
Vlastimil Babka
2018-Apr-19 18:28 UTC
[PATCH] kvmalloc: always use vmalloc if CONFIG_DEBUG_VM
On 04/19/2018 06:12 PM, Mikulas Patocka wrote:> From: Mikulas Patocka <mpatocka at redhat.com> > Subject: [PATCH] kvmalloc: always use vmalloc if CONFIG_DEBUG_VM > > The kvmalloc function tries to use kmalloc and falls back to vmalloc if > kmalloc fails. > > Unfortunatelly, some kernel code has bugs - it uses kvmalloc and then > uses DMA-API on the returned memory or frees it with kfree. Such bugs were > found in the virtio-net driver, dm-integrity or RHEL7 powerpc-specific > code. > > These bugs are hard to reproduce because vmalloc falls back to kmalloc > only if memory is fragmented. > > In order to detect these bugs reliably I submit this patch that changes > kvmalloc to always use vmalloc if CONFIG_DEBUG_VM is turned on. > > Signed-off-by: Mikulas Patocka <mpatocka at redhat.com>Hmm AFAIK Fedora uses CONFIG_DEBUG_VM in their kernels. Sure you want to impose this on all users? Seems too much for DEBUG_VM to me. Maybe it should be hidden under some error injection config?> --- > mm/util.c | 2 ++ > 1 file changed, 2 insertions(+) > > Index: linux-2.6/mm/util.c > ==================================================================> --- linux-2.6.orig/mm/util.c 2018-04-18 15:46:23.000000000 +0200 > +++ linux-2.6/mm/util.c 2018-04-18 16:00:43.000000000 +0200 > @@ -395,6 +395,7 @@ EXPORT_SYMBOL(vm_mmap); > */ > void *kvmalloc_node(size_t size, gfp_t flags, int node) > { > +#ifndef CONFIG_DEBUG_VM > gfp_t kmalloc_flags = flags; > void *ret; > > @@ -426,6 +427,7 @@ void *kvmalloc_node(size_t size, gfp_t f > */ > if (ret || size <= PAGE_SIZE) > return ret; > +#endifDid you verify that vmalloc does the right thing for sub-page sizes? Shouldn't those be exempted?> return __vmalloc_node_flags_caller(size, node, flags, > __builtin_return_address(0)); >
Andrew Morton
2018-Apr-19 19:47 UTC
[PATCH] kvmalloc: always use vmalloc if CONFIG_DEBUG_VM
On Thu, 19 Apr 2018 12:12:38 -0400 (EDT) Mikulas Patocka <mpatocka at redhat.com> wrote:> The kvmalloc function tries to use kmalloc and falls back to vmalloc if > kmalloc fails. > > Unfortunatelly, some kernel code has bugs - it uses kvmalloc and then > uses DMA-API on the returned memory or frees it with kfree. Such bugs were > found in the virtio-net driver, dm-integrity or RHEL7 powerpc-specific > code. > > These bugs are hard to reproduce because vmalloc falls back to kmalloc > only if memory is fragmented.Yes, that's nasty.> In order to detect these bugs reliably I submit this patch that changes > kvmalloc to always use vmalloc if CONFIG_DEBUG_VM is turned on. > > ... > > --- linux-2.6.orig/mm/util.c 2018-04-18 15:46:23.000000000 +0200 > +++ linux-2.6/mm/util.c 2018-04-18 16:00:43.000000000 +0200 > @@ -395,6 +395,7 @@ EXPORT_SYMBOL(vm_mmap); > */ > void *kvmalloc_node(size_t size, gfp_t flags, int node) > { > +#ifndef CONFIG_DEBUG_VM > gfp_t kmalloc_flags = flags; > void *ret; > > @@ -426,6 +427,7 @@ void *kvmalloc_node(size_t size, gfp_t f > */ > if (ret || size <= PAGE_SIZE) > return ret; > +#endif > > return __vmalloc_node_flags_caller(size, node, flags, > __builtin_return_address(0));Well, it doesn't have to be done at compile-time, does it? We could add a knob (in debugfs, presumably) which enables this at runtime. That's far more user-friendly.
Matthew Wilcox
2018-Apr-20 11:47 UTC
[PATCH] kvmalloc: always use vmalloc if CONFIG_DEBUG_VM
On Thu, Apr 19, 2018 at 12:12:38PM -0400, Mikulas Patocka wrote:> Unfortunatelly, some kernel code has bugs - it uses kvmalloc and then > uses DMA-API on the returned memory or frees it with kfree. Such bugs were > found in the virtio-net driver, dm-integrity or RHEL7 powerpc-specific > code.Maybe it's time to have the SG code handle vmalloced pages? This is becoming more and more common with vmapped stacks (and some of our workarounds are hideous -- allocate 4 bytes with kmalloc because we can't DMA onto the stack any more?). We already have a few places which do handle sgs of vmalloced addresses, such as the nx crypto driver: if (is_vmalloc_addr(start_addr)) sg_addr = page_to_phys(vmalloc_to_page(start_addr)) + offset_in_page(sg_addr); else sg_addr = __pa(sg_addr); and videobuf: pg = vmalloc_to_page(virt); if (NULL == pg) goto err; BUG_ON(page_to_pfn(pg) >= (1 << (32 - PAGE_SHIFT))); sg_set_page(&sglist[i], pg, PAGE_SIZE, 0); Yes, there's the potential that we have to produce two SG entries for a virtually contiguous region if it crosses a page boundary, and our APIs aren't set up right to make it happen. But this is something we should consider fixing ... otherwise we'll end up with dozens of driver hacks. The videobuf implementation was already copy-and-pasted into the saa7146 driver, for example.
Michal Hocko
2018-Apr-20 13:08 UTC
[PATCH] kvmalloc: always use vmalloc if CONFIG_DEBUG_VM
On Thu 19-04-18 12:12:38, Mikulas Patocka wrote: [...]> From: Mikulas Patocka <mpatocka at redhat.com> > Subject: [PATCH] kvmalloc: always use vmalloc if CONFIG_DEBUG_VM > > The kvmalloc function tries to use kmalloc and falls back to vmalloc if > kmalloc fails. > > Unfortunatelly, some kernel code has bugs - it uses kvmalloc and then > uses DMA-API on the returned memory or frees it with kfree. Such bugs were > found in the virtio-net driver, dm-integrity or RHEL7 powerpc-specific > code. > > These bugs are hard to reproduce because vmalloc falls back to kmalloc > only if memory is fragmented. > > In order to detect these bugs reliably I submit this patch that changes > kvmalloc to always use vmalloc if CONFIG_DEBUG_VM is turned on.No way. This is just wrong! First of all, you will explode most likely on many allocations of small sizes. Second, CONFIG_DEBUG_VM tends to be enabled quite often.> Signed-off-by: Mikulas Patocka <mpatocka at redhat.com>Nacked-by: Michal Hocko <mhocko at suse.com>> --- > mm/util.c | 2 ++ > 1 file changed, 2 insertions(+) > > Index: linux-2.6/mm/util.c > ==================================================================> --- linux-2.6.orig/mm/util.c 2018-04-18 15:46:23.000000000 +0200 > +++ linux-2.6/mm/util.c 2018-04-18 16:00:43.000000000 +0200 > @@ -395,6 +395,7 @@ EXPORT_SYMBOL(vm_mmap); > */ > void *kvmalloc_node(size_t size, gfp_t flags, int node) > { > +#ifndef CONFIG_DEBUG_VM > gfp_t kmalloc_flags = flags; > void *ret; > > @@ -426,6 +427,7 @@ void *kvmalloc_node(size_t size, gfp_t f > */ > if (ret || size <= PAGE_SIZE) > return ret; > +#endif > > return __vmalloc_node_flags_caller(size, node, flags, > __builtin_return_address(0));-- Michal Hocko SUSE Labs
Possibly Parallel Threads
- [PATCH] kvmalloc: always use vmalloc if CONFIG_DEBUG_VM
- [PATCH] kvmalloc: always use vmalloc if CONFIG_DEBUG_VM
- [PATCH] net: don't use kvzalloc for DMA memory
- [PATCH] drm/virtio: Use vmalloc for command buffer allocations.
- [PATCH] kvmalloc: always use vmalloc if CONFIG_DEBUG_VM