Michel Dänzer
2017-Dec-19 10:37 UTC
[Nouveau] nouveau. swiotlb: coherent allocation failed for device 0000:01:00.0 size=2097152
On 2017-12-18 08:01 PM, Tobias Klausmann wrote:
> On 12/18/17 7:06 PM, Mike Galbraith wrote:
>> Greetings,
>>
>> Kernel bound workloads seem to trigger the below for whatever reason.
>> I only see this when beating up NFS. There was a kworker wakeup
>> latency issue, but with a bandaid applied to fix that up, I can still
>> trigger this.
>
> Hi,
>
> i have seen this one as well with my system, but i could not find an
> easy way to trigger it for bisecting purpose. If you can trigger it
> conveniently, a bisect would be nice!

I'm seeing this (with the amdgpu and radeon drivers) when restic takes a
backup, creating memory pressure. I happen to have just finished
bisecting, the result is:

648bc3574716400acc06f99915815f80d9563783 is the first bad commit
commit 648bc3574716400acc06f99915815f80d9563783
Author: Christian König <christian.koenig at amd.com>
Date: Thu Jul 6 09:59:43 2017 +0200

    drm/ttm: add transparent huge page support for DMA allocations v2

    Try to allocate huge pages when it makes sense.

    v2: fix comment and use ifdef

--
Earthling Michel Dänzer | http://www.amd.com
Libre software enthusiast | Mesa and X developer
Michel Dänzer
2017-Dec-19 10:39 UTC
[Nouveau] nouveau. swiotlb: coherent allocation failed for device 0000:01:00.0 size=2097152
On 2017-12-19 11:37 AM, Michel Dänzer wrote:
> On 2017-12-18 08:01 PM, Tobias Klausmann wrote:
>> On 12/18/17 7:06 PM, Mike Galbraith wrote:
>>> Greetings,
>>>
>>> Kernel bound workloads seem to trigger the below for whatever reason.
>>> I only see this when beating up NFS. There was a kworker wakeup
>>> latency issue, but with a bandaid applied to fix that up, I can still
>>> trigger this.
>>
>> Hi,
>>
>> i have seen this one as well with my system, but i could not find an
>> easy way to trigger it for bisecting purpose. If you can trigger it
>> conveniently, a bisect would be nice!
>
> I'm seeing this (with the amdgpu and radeon drivers) when restic takes a
> backup, creating memory pressure. I happen to have just finished
> bisecting, the result is:
>
> 648bc3574716400acc06f99915815f80d9563783 is the first bad commit
> commit 648bc3574716400acc06f99915815f80d9563783
> Author: Christian König <christian.koenig at amd.com>
> Date: Thu Jul 6 09:59:43 2017 +0200
>
>     drm/ttm: add transparent huge page support for DMA allocations v2
>
>     Try to allocate huge pages when it makes sense.
>
>     v2: fix comment and use ifdef

BTW, I haven't noticed any bad effects other than the dmesg splats, so
maybe it's just noise about transient failures for which there is a
proper fallback in place.

--
Earthling Michel Dänzer | http://www.amd.com
Libre software enthusiast | Mesa and X developer
Christian König
2017-Dec-19 13:45 UTC
[Nouveau] nouveau. swiotlb: coherent allocation failed for device 0000:01:00.0 size=2097152
On 19.12.2017 at 11:39, Michel Dänzer wrote:
> On 2017-12-19 11:37 AM, Michel Dänzer wrote:
>> On 2017-12-18 08:01 PM, Tobias Klausmann wrote:
>>> On 12/18/17 7:06 PM, Mike Galbraith wrote:
>>>> Greetings,
>>>>
>>>> Kernel bound workloads seem to trigger the below for whatever reason.
>>>> I only see this when beating up NFS. There was a kworker wakeup
>>>> latency issue, but with a bandaid applied to fix that up, I can still
>>>> trigger this.
>>>
>>> Hi,
>>>
>>> i have seen this one as well with my system, but i could not find an
>>> easy way to trigger it for bisecting purpose. If you can trigger it
>>> conveniently, a bisect would be nice!
>>
>> I'm seeing this (with the amdgpu and radeon drivers) when restic takes a
>> backup, creating memory pressure. I happen to have just finished
>> bisecting, the result is:
>>
>> 648bc3574716400acc06f99915815f80d9563783 is the first bad commit
>> commit 648bc3574716400acc06f99915815f80d9563783
>> Author: Christian König <christian.koenig at amd.com>
>> Date: Thu Jul 6 09:59:43 2017 +0200
>>
>>     drm/ttm: add transparent huge page support for DMA allocations v2
>>
>>     Try to allocate huge pages when it makes sense.
>>
>>     v2: fix comment and use ifdef
>
> BTW, I haven't noticed any bad effects other than the dmesg splats, so
> maybe it's just noise about transient failures for which there is a
> proper fallback in place.

Yeah, I think that is exactly what happens here.

We try to allocate a huge page, but fail and so fall back to using
multiple 4k pages instead.

Going to send out a patch to suppress the warning.

Thanks for bisecting this,
Christian.
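
For readers following along, the fallback Christian describes can be sketched in userspace C. This is only an illustration of the pattern, not the TTM code: `try_alloc_huge`, `buffer_alloc`, `buffer_free`, `struct buffer` and the `simulate_pressure` switch are all hypothetical names invented for this sketch, and `aligned_alloc` stands in for the kernel's page allocator. The 2097152-byte size in the swiotlb message is one 2 MiB huge page, which equals 512 pages of 4k. The kernel does have a `__GFP_NOWARN` allocation flag for opportunistic attempts like this; the thread does not say whether the announced patch uses it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define SMALL_PAGE_SIZE 4096u     /* one 4k page */
#define HUGE_PAGE_SIZE  2097152u  /* 2 MiB, the size in the swiotlb message */
#define PAGES_PER_HUGE  (HUGE_PAGE_SIZE / SMALL_PAGE_SIZE)  /* 512 */

/* Hypothetical buffer: either one huge chunk or 512 small ones. */
struct buffer {
    void  *chunks[PAGES_PER_HUGE];
    size_t nchunks;
    bool   huge;
};

/*
 * Opportunistic huge allocation. It fails silently by returning NULL;
 * simulate_pressure (an assumption of this sketch) forces that failure
 * to mimic memory pressure.
 */
static void *try_alloc_huge(bool simulate_pressure)
{
    if (simulate_pressure)
        return NULL;
    return aligned_alloc(HUGE_PAGE_SIZE, HUGE_PAGE_SIZE);
}

/* Release every chunk and reset the bookkeeping. */
static void buffer_free(struct buffer *b)
{
    for (size_t i = 0; i < b->nchunks; i++)
        free(b->chunks[i]);
    b->nchunks = 0;
    b->huge = false;
}

/* Try one huge page first, then fall back to 512 separate 4k pages. */
static bool buffer_alloc(struct buffer *b, bool simulate_pressure)
{
    b->nchunks = 0;
    b->huge = false;

    void *p = try_alloc_huge(simulate_pressure);
    if (p) {
        b->chunks[0] = p;
        b->nchunks = 1;
        b->huge = true;
        return true;
    }

    /* Fallback path: the huge attempt failed, which is not an error. */
    for (size_t i = 0; i < PAGES_PER_HUGE; i++) {
        void *q = aligned_alloc(SMALL_PAGE_SIZE, SMALL_PAGE_SIZE);
        if (!q) {
            buffer_free(b);  /* only now has the allocation really failed */
            return false;
        }
        assert(b->nchunks < PAGES_PER_HUGE);
        b->chunks[b->nchunks++] = q;
    }
    return true;
}
```

Calling `buffer_alloc(&b, true)` takes the fallback path and leaves `b.nchunks` at 512 with `b.huge` false. The point of the sketch is that the caller only sees an error if the small-page path fails too, so a warning printed by the first attempt is noise.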