Mike Galbraith
2017-Dec-18 18:06 UTC
[Nouveau] nouveau. swiotlb: coherent allocation failed for device 0000:01:00.0 size=2097152
Greetings,

Kernel bound workloads seem to trigger the below for whatever reason. I only see this when beating up NFS. There was a kworker wakeup latency issue, but with a bandaid applied to fix that up, I can still trigger this.

[ 1313.811031] nouveau 0000:01:00.0: swiotlb buffer is full (sz: 2097152 bytes)
[ 1313.811035] swiotlb: coherent allocation failed for device 0000:01:00.0 size=2097152
[ 1313.811038] CPU: 6 PID: 3026 Comm: Xorg Tainted: G E 4.15.0.g1291a0d5-master #355
[ 1313.811040] Hardware name: MEDION MS-7848/MS-7848, BIOS M7848W08.20C 09/23/2013
[ 1313.811041] Call Trace:
[ 1313.811049] dump_stack+0x7c/0xb6
[ 1313.811053] swiotlb_alloc_coherent+0x13f/0x150
[ 1313.811060] ttm_dma_pool_alloc_new_pages+0x106/0x3c0 [ttm]
[ 1313.811066] ttm_dma_pool_get_pages+0x10a/0x1e0 [ttm]
[ 1313.811070] ttm_dma_populate+0x21f/0x2f0 [ttm]
[ 1313.811075] ttm_tt_bind+0x2f/0x60 [ttm]
[ 1313.811079] ttm_bo_handle_move_mem+0x51f/0x580 [ttm]
[ 1313.811084] ? ttm_bo_handle_move_mem+0x5/0x580 [ttm]
[ 1313.811088] ttm_bo_validate+0x10c/0x120 [ttm]
[ 1313.811092] ? ttm_bo_validate+0x5/0x120 [ttm]
[ 1313.811106] ? drm_mode_setcrtc+0x20e/0x540 [drm]
[ 1313.811109] ttm_bo_init_reserved+0x290/0x490 [ttm]
[ 1313.811114] ttm_bo_init+0x52/0xb0 [ttm]
[ 1313.811141] ? nv10_bo_put_tile_region+0x60/0x60 [nouveau]
[ 1313.811163] nouveau_bo_new+0x465/0x5e0 [nouveau]
[ 1313.811184] ? nv10_bo_put_tile_region+0x60/0x60 [nouveau]
[ 1313.811203] nouveau_gem_new+0x66/0x110 [nouveau]
[ 1313.811223] ? nouveau_gem_new+0x110/0x110 [nouveau]
[ 1313.811241] nouveau_gem_ioctl_new+0x48/0xc0 [nouveau]
[ 1313.811249] drm_ioctl_kernel+0x64/0xb0 [drm]
[ 1313.811257] drm_ioctl+0x2a4/0x360 [drm]
[ 1313.811276] ? nouveau_gem_new+0x110/0x110 [nouveau]
[ 1313.811285] ? drm_ioctl+0x5/0x360 [drm]
[ 1313.811304] nouveau_drm_ioctl+0x50/0xb0 [nouveau]
[ 1313.811308] do_vfs_ioctl+0x90/0x690
[ 1313.811311] ? do_vfs_ioctl+0x5/0x690
[ 1313.811313] SyS_ioctl+0x3b/0x70
[ 1313.811316] entry_SYSCALL_64_fastpath+0x1f/0x91
[ 1313.811320] RIP: 0033:0x7f3234746227
[ 1313.811321] RSP: 002b:00007ffc3ace0408 EFLAGS: 00003246 ORIG_RAX: 0000000000000010
[ 1313.811324] RAX: ffffffffffffffda RBX: 00000000025515d0 RCX: 00007f3234746227
[ 1313.811325] RDX: 00007ffc3ace0460 RSI: 00000000c0306480 RDI: 000000000000000b
[ 1313.811326] RBP: 0000000000824120 R08: 0000000002548f80 R09: 00000000025490d0
[ 1313.811328] R10: 0000000000000000 R11: 0000000000003246 R12: 000000000000093d
[ 1313.811329] R13: 0000000002aff74c R14: 0000000000824150 R15: 0000000000000000
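
For anyone collecting these reports from several boxes, here is a minimal sketch of pulling the failing device and size out of a saved dmesg capture. It only matches the "swiotlb buffer is full" line in the form shown in this report; the names FULL_RE and find_swiotlb_failures are made up for illustration and are not part of any existing tool.

```python
import re

# Matches the "swiotlb buffer is full" line in the format seen in this report.
# Other kernel versions may word the message differently (assumption).
FULL_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+(?P<driver>\S+)\s+"
    r"(?P<dev>[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-9a-f]):\s+"
    r"swiotlb buffer is full \(sz: (?P<size>\d+) bytes\)"
)


def find_swiotlb_failures(log_text):
    """Return (timestamp, driver, device, size_in_bytes) for each match."""
    return [
        (float(m["ts"]), m["driver"], m["dev"], int(m["size"]))
        for m in FULL_RE.finditer(log_text)
    ]


if __name__ == "__main__":
    sample = (
        "[ 1313.811031] nouveau 0000:01:00.0: "
        "swiotlb buffer is full (sz: 2097152 bytes)"
    )
    for ts, driver, dev, size in find_swiotlb_failures(sample):
        print(f"{ts:.6f} {driver} {dev} {size} bytes")
```

Because it uses finditer rather than line-by-line matching, it should also cope with captures where the log has been flattened onto one line.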
Tobias Klausmann
2017-Dec-18 19:01 UTC
[Nouveau] nouveau. swiotlb: coherent allocation failed for device 0000:01:00.0 size=2097152
On 12/18/17 7:06 PM, Mike Galbraith wrote:
> Greetings,
>
> Kernel bound workloads seem to trigger the below for whatever reason.
> I only see this when beating up NFS. There was a kworker wakeup
> latency issue, but with a bandaid applied to fix that up, I can still
> trigger this.

Hi,

I have seen this one as well with my system, but I could not find an easy way to trigger it for bisecting purposes. If you can trigger it conveniently, a bisect would be nice!

Greetings,
Tobias

> [ 1313.811031] nouveau 0000:01:00.0: swiotlb buffer is full (sz: 2097152 bytes)
> [ 1313.811035] swiotlb: coherent allocation failed for device 0000:01:00.0 size=2097152
> [ 1313.811038] CPU: 6 PID: 3026 Comm: Xorg Tainted: G E 4.15.0.g1291a0d5-master #355
> [ 1313.811040] Hardware name: MEDION MS-7848/MS-7848, BIOS M7848W08.20C 09/23/2013
> [ 1313.811041] Call Trace:
> [ 1313.811049] dump_stack+0x7c/0xb6
> [ 1313.811053] swiotlb_alloc_coherent+0x13f/0x150
> [ 1313.811060] ttm_dma_pool_alloc_new_pages+0x106/0x3c0 [ttm]
> [ 1313.811066] ttm_dma_pool_get_pages+0x10a/0x1e0 [ttm]
> [ 1313.811070] ttm_dma_populate+0x21f/0x2f0 [ttm]
> [ 1313.811075] ttm_tt_bind+0x2f/0x60 [ttm]
> [ 1313.811079] ttm_bo_handle_move_mem+0x51f/0x580 [ttm]
> [ 1313.811084] ? ttm_bo_handle_move_mem+0x5/0x580 [ttm]
> [ 1313.811088] ttm_bo_validate+0x10c/0x120 [ttm]
> [ 1313.811092] ? ttm_bo_validate+0x5/0x120 [ttm]
> [ 1313.811106] ? drm_mode_setcrtc+0x20e/0x540 [drm]
> [ 1313.811109] ttm_bo_init_reserved+0x290/0x490 [ttm]
> [ 1313.811114] ttm_bo_init+0x52/0xb0 [ttm]
> [ 1313.811141] ? nv10_bo_put_tile_region+0x60/0x60 [nouveau]
> [ 1313.811163] nouveau_bo_new+0x465/0x5e0 [nouveau]
> [ 1313.811184] ? nv10_bo_put_tile_region+0x60/0x60 [nouveau]
> [ 1313.811203] nouveau_gem_new+0x66/0x110 [nouveau]
> [ 1313.811223] ? nouveau_gem_new+0x110/0x110 [nouveau]
> [ 1313.811241] nouveau_gem_ioctl_new+0x48/0xc0 [nouveau]
> [ 1313.811249] drm_ioctl_kernel+0x64/0xb0 [drm]
> [ 1313.811257] drm_ioctl+0x2a4/0x360 [drm]
> [ 1313.811276] ? nouveau_gem_new+0x110/0x110 [nouveau]
> [ 1313.811285] ? drm_ioctl+0x5/0x360 [drm]
> [ 1313.811304] nouveau_drm_ioctl+0x50/0xb0 [nouveau]
> [ 1313.811308] do_vfs_ioctl+0x90/0x690
> [ 1313.811311] ? do_vfs_ioctl+0x5/0x690
> [ 1313.811313] SyS_ioctl+0x3b/0x70
> [ 1313.811316] entry_SYSCALL_64_fastpath+0x1f/0x91
> [ 1313.811320] RIP: 0033:0x7f3234746227
> [ 1313.811321] RSP: 002b:00007ffc3ace0408 EFLAGS: 00003246 ORIG_RAX: 0000000000000010
> [ 1313.811324] RAX: ffffffffffffffda RBX: 00000000025515d0 RCX: 00007f3234746227
> [ 1313.811325] RDX: 00007ffc3ace0460 RSI: 00000000c0306480 RDI: 000000000000000b
> [ 1313.811326] RBP: 0000000000824120 R08: 0000000002548f80 R09: 00000000025490d0
> [ 1313.811328] R10: 0000000000000000 R11: 0000000000003246 R12: 000000000000093d
> [ 1313.811329] R13: 0000000002aff74c R14: 0000000000824150 R15: 0000000000000000
Mike Galbraith
2017-Dec-18 19:12 UTC
[Nouveau] nouveau. swiotlb: coherent allocation failed for device 0000:01:00.0 size=2097152
On Mon, 2017-12-18 at 20:01 +0100, Tobias Klausmann wrote:
> On 12/18/17 7:06 PM, Mike Galbraith wrote:
> > Greetings,
> >
> > Kernel bound workloads seem to trigger the below for whatever reason.
> > I only see this when beating up NFS. There was a kworker wakeup
> > latency issue, but with a bandaid applied to fix that up, I can still
> > trigger this.
>
> Hi,
>
> I have seen this one as well with my system, but I could not find an
> easy way to trigger it for bisecting purposes. If you can trigger it
> conveniently, a bisect would be nice!

Workload permitting. To reproduce, mount your box via NFS, cd to somewhere on the NFS mount, and just do bonnie -s <memory size>.

There, maybe you'll beat me to it. I hope so, I have multiple kernels doing the annoying "baby birds in a nest" thing at me literally endlessly :)

-Mike
Michel Dänzer
2017-Dec-19 10:37 UTC
[Nouveau] nouveau. swiotlb: coherent allocation failed for device 0000:01:00.0 size=2097152
On 2017-12-18 08:01 PM, Tobias Klausmann wrote:
> On 12/18/17 7:06 PM, Mike Galbraith wrote:
>> Greetings,
>>
>> Kernel bound workloads seem to trigger the below for whatever reason.
>> I only see this when beating up NFS. There was a kworker wakeup
>> latency issue, but with a bandaid applied to fix that up, I can still
>> trigger this.
>
> Hi,
>
> I have seen this one as well with my system, but I could not find an
> easy way to trigger it for bisecting purposes. If you can trigger it
> conveniently, a bisect would be nice!

I'm seeing this (with the amdgpu and radeon drivers) when restic takes a backup, creating memory pressure. I happen to have just finished bisecting, the result is:

648bc3574716400acc06f99915815f80d9563783 is the first bad commit
commit 648bc3574716400acc06f99915815f80d9563783
Author: Christian König <christian.koenig at amd.com>
Date: Thu Jul 6 09:59:43 2017 +0200

    drm/ttm: add transparent huge page support for DMA allocations v2

    Try to allocate huge pages when it makes sense.

    v2: fix comment and use ifdef

--
Earthling Michel Dänzer | http://www.amd.com
Libre software enthusiast | Mesa and X developer
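
The failing size lines up with the bisect result. 2097152 bytes is 2 MiB, which is the transparent huge page size on x86-64 with 4 KiB base pages, so each failure looks like one huge-page-sized coherent allocation. A small sketch of the arithmetic follows; the 4 KiB page size is an assumption about the reporters' systems, and allocation_order is a hypothetical helper, not a kernel function.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB base pages, the usual x86-64 setup


def allocation_order(size_bytes, page_size=PAGE_SIZE):
    """Smallest order n such that page_size * 2**n covers size_bytes."""
    pages = -(-size_bytes // page_size)  # ceiling division
    return (pages - 1).bit_length()


if __name__ == "__main__":
    size = 2097152  # the size= value from the swiotlb message
    print(size // 2**20, "MiB")            # size in MiB
    print(size // PAGE_SIZE, "pages")      # number of base pages
    print("order", allocation_order(size))
```

With these assumptions the request is 512 contiguous base pages, an order-9 allocation, which is much harder to satisfy under memory pressure than single pages.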
Apparently Analogous Threads
- nouveau. swiotlb: coherent allocation failed for device 0000:01:00.0 size=2097152
- nouveau: swiotlb buffer is full (sz: 2097152 bytes)/swiotlb: coherent allocation failed, size=2097152 spam
- nouveau. swiotlb: coherent allocation failed for device 0000:01:00.0 size=2097152
- nouveau. swiotlb: coherent allocation failed for device 0000:01:00.0 size=2097152
- nouveau. swiotlb: coherent allocation failed for device 0000:01:00.0 size=2097152