Maarten Maathuis
2010-Jan-09 23:41 UTC
[Nouveau] [TTM] general protection fault in ttm_tt_swapout, to_virtual looks screwed up
I've been noticing for a while that I've been getting general protection faults in ttm_tt_swapout; this time I was printk'ing the virtual addresses.

In case it's not obvious, the result of kmap_atomic() is wrong (see the last to_virtual value).

This is nouveau/linux-2.6, which is somewhere after 2.6.32. I was wondering if anyone has ever seen anything like this?

from_virtual ffff88003088b000 to_virtual ffff88003100f000
from_virtual ffff88003088c000 to_virtual ffff880031018000
from_virtual ffff88003088d000 to_virtual ffff880031019000
from_virtual ffff88003088e000 to_virtual ffff88003101a000
from_virtual ffff88003088f000 to_virtual ffff88003101b000
from_virtual ffff880030890000 to_virtual ffff880031016000
from_virtual ffff880030891000 to_virtual ffff880031017000
from_virtual ffff880030892000 to_virtual ffff88003101c000
from_virtual ffff880030893000 to_virtual ffff88003101d000
from_virtual ffff880030894000 to_virtual ffff88003101e000
from_virtual ffff880030895000 to_virtual ffff88003101f000
from_virtual ffff880030896000 to_virtual ffff88001c626000
from_virtual ffff880030897000 to_virtual ffff88001c627000
from_virtual ffff880030898000 to_virtual ffff88001c752000
from_virtual ffff880030899000 to_virtual ffff88001c66d000
from_virtual ffff88003089a000 to_virtual ffff88001c642000
from_virtual ffff8800308db000 to_virtual 0005d12492492000
general protection fault: 0000 [#1] PREEMPT
last sysfs file: /sys/devices/pci0000:00/0000:00:08.0/host3/uevent
CPU 0
Modules linked in: nouveau snd_ice1724 snd_rawmidi wm8775 tuner ttm drm_kms_helper usbhid snd_ice17xx_ak4xxx snd_ac97_codec ac97_bus snd_ak4xxx_adda snd_ak4114 cx25840 snd_pcm ivtv drm snd_timer snd_page_alloc snd_pt2258 snd_i2c cx2341x agpgart ehci_hcd i2c_algo_bit i2c_nforce2 tveeprom snd soundcore ohci_hcd
Pid: 1501, comm: ttm_swap Not tainted 2.6.32-00848-gedc4439-dirty #172 System Product Name
RIP: 0010:[<ffffffffa017c3fb>] [<ffffffffa017c3fb>] ttm_tt_swapout+0x135/0x1db [ttm]
RSP: 0000:ffff88003749fce0 EFLAGS: 00010282
RAX: 0000000000000040 RBX: 0005d12492492000 RCX: 0000000000000400
RDX: 00000000ffffffff RSI: ffff8800308db000 RDI: 0005d12492492000
RBP: ffff88003749fd30 R08: 0000000000be8bfa R09: ffff88003749fb30
R10: 000000000000000f R11: 0000000000000020 R12: ffff8800308db000
R13: ffff880030a7a4e0 R14: fffffffffffffff4 R15: ffff880037795a00
FS: 00007fa003a78700(0000) GS:ffffffff814d8000(0000) knlGS:00000000f5a256d0
CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 00007fcac3873f80 CR3: 000000003ecfd000 CR4: 00000000000006f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process ttm_swap (pid: 1501, threadinfo ffff88003749e000, task ffff88003d2ae360)
Stack:
 ffff88003e00b910 ffff88003749ffd8 ffff880037795a00 ffff8800377918b8
<0> 000000001f4be800 ffff88001f4be870 ffff88001f4be800 0000000000000000
<0> ffff88001f4be844 ffff88003edb95d8 ffff88003749fdb0 ffffffffa017d550
Call Trace:
 [<ffffffffa017d550>] ttm_bo_swapout+0x1df/0x222 [ttm]
 [<ffffffffa017b38d>] ttm_shrink+0x9b/0xc0 [ttm]
 [<ffffffffa017b3c6>] ttm_shrink_work+0x14/0x16 [ttm]
 [<ffffffff810489c7>] worker_thread+0x1b7/0x25e
 [<ffffffffa017b3b2>] ? ttm_shrink_work+0x0/0x16 [ttm]
 [<ffffffff8104bd8a>] ? autoremove_wake_function+0x0/0x38
 [<ffffffff81048810>] ? worker_thread+0x0/0x25e
 [<ffffffff8104ba75>] kthread+0x7c/0x84
 [<ffffffff8100bd6a>] child_rip+0xa/0x20
 [<ffffffffa01d776e>] ? nouveau_mem_init_heap+0x5d/0xfd [nouveau]
 [<ffffffff8104b9f9>] ? kthread+0x0/0x84
 [<ffffffff8100bd60>] ? child_rip+0x0/0x20
Code: 04 00 00 00 e8 97 f7 ff ff 4c 89 e6 48 89 c2 48 89 c3 48 c7 c7 92 1f 18 a0 31 c0 e8 ca ad 19 e1 48 89 df 4c 89 e6 b9 00 04 00 00 <f3> a5 e8 ba f7 ff ff e8 b5 f7 ff ff bf 01 00 00 00 e8 c4 f0 19
RIP [<ffffffffa017c3fb>] ttm_tt_swapout+0x135/0x1db [ttm]
RSP <ffff88003749fce0>
---[ end trace 8724ec5cfdbfe4ce ]---
note: ttm_swap[1501] exited with preempt_count 3
Thomas Hellstrom
2010-Jan-10 19:10 UTC
[Nouveau] [TTM] general protection fault in ttm_tt_swapout, to_virtual looks screwed up
I've seen something similar with openchrome, but I think I traced that down to the DMA engines causing memory corruption. Note that, IIRC, kmap_atomic may return page_address(page) for a lowmem page.

Any idea what may cause kmap_atomic to behave in this way?

/Thomas

Maarten Maathuis wrote:
> I've been noticing for a while that I've been getting general
> protection faults in ttm_tt_swapout; this time I was printk'ing the
> virtual addresses.
>
> In case it's not obvious, the result of kmap_atomic() is wrong.
>
> This is nouveau/linux-2.6, which is somewhere after 2.6.32. I was
> wondering if anyone has ever seen anything like this?
Xavier
2010-Jan-11 09:13 UTC
[Nouveau] [TTM] general protection fault in ttm_tt_swapout, to_virtual looks screwed up
On Sun, Jan 10, 2010 at 12:41 AM, Maarten Maathuis <madman2003 at gmail.com> wrote:
> I've been noticing for a while that I've been getting general
> protection faults in ttm_tt_swapout; this time I was printk'ing the
> virtual addresses.
>
> In case it's not obvious, the result of kmap_atomic() is wrong.
>
> This is nouveau/linux-2.6, which is somewhere after 2.6.32. I was
> wondering if anyone has ever seen anything like this?
>

I am not sure this is related (so I don't want to bother dri-devel too), but using a specific version of nv25 dri and exiting the etracer game, I managed to trigger a nice combination of problems :)

validate: -12,
then a bug in _mmx_memcpy, called by ttm_tt_swapout,
then many of the "reserve_ram_pages_type failed" messages.

I suppose firefox is quite good at putting pressure and triggering bugs, but not quite as efficient as gallium :)

And if it's not related at all and completely off-topic, then sorry.

[ 250.615413] [drm] nouveau 0000:03:00.0: Allocating FIFO number 2
[ 250.617551] [drm] nouveau 0000:03:00.0: nouveau_channel_alloc: initialised FIFO 2
[ 267.092436] [drm] nouveau 0000:03:00.0: validate: -12
[ 267.944165] BUG: unable to handle kernel paging request at 0aaff000
[ 267.944178] IP: [<c11a0b71>] _mmx_memcpy+0x79/0x134
[ 267.944194] *pde = 00000000
[ 267.944199] Oops: 0002 [#1]
[ 267.944203] last sysfs file: /sys/devices/pci0000:00/0000:00:02.2/usb1/1-0:1.0/uevent
[ 267.944210] Modules linked in: snd_emu10k1 nouveau snd_rawmidi ttm emu10k1_gp snd_util_mem ath5k firewire_ohci drm_kms_helper snd_hwdep gameport ath skge snd_intel8x0 firewire_core crc_itu_t drm snd_ac97_codec ac97_bus nvidia_agp forcedeth i2c_nforce2 agpgart i2c_algo_bit
[ 267.944238]
[ 267.944244] Pid: 3046, comm: ttm_swap Not tainted (2.6.32 #10) A7N8X-E
[ 267.944249] EIP: 0060:[<c11a0b71>] EFLAGS: 00010212 CPU: 0
[ 267.944254] EIP is at _mmx_memcpy+0x79/0x134
[ 267.944258] EAX: cd6c3000 EBX: 00000040 ECX: 00000040 EDX: 0aaff000
[ 267.944262] ESI: cd6c3000 EDI: d741b600 EBP: dcd15ee4 ESP: dcd15ed0
[ 267.944267] DS: 007b ES: 007b FS: 0000 GS: 0000 SS: 0068
[ 267.944272] Process ttm_swap (pid: 3046, ti=dcd14000 task=ddc0e800 task.ti=dcd14000)
[ 267.944276] Stack:
[ 267.944278] 0aaff000 00001000 d6a95080 fffffff4 d741b600 dcd15eec e0a21bf0 dcd15f10
[ 267.944287] <0> e0a21f80 d6989758 c1857860 d741b600 000003b6 d74a3a00 00000000 00000000
[ 267.944296] <0> dcd15f44 e0a22d49 dfbcab8c d74a3a28 00000000 00400000 00000400 00000000
[ 267.944306] Call Trace:
[ 267.944324] [<e0a21bf0>] ? T.418+0xd/0xf [ttm]
[ 267.944335] [<e0a21f80>] ? ttm_tt_swapout+0x106/0x17b [ttm]
[ 267.944346] [<e0a22d49>] ? ttm_bo_swapout+0x12f/0x161 [ttm]
[ 267.944356] [<e0a2128a>] ? ttm_shrink+0x7d/0x89 [ttm]
[ 267.944365] [<e0a212aa>] ? ttm_shrink_work+0x14/0x18 [ttm]
[ 267.944374] [<c1032244>] ? worker_thread+0x11f/0x188
[ 267.944383] [<c101ff94>] ? T.1319+0x26/0x5e
[ 267.944392] [<e0a21296>] ? ttm_shrink_work+0x0/0x18 [ttm]
[ 267.944402] [<c1034e51>] ? autoremove_wake_function+0x0/0x2f
[ 267.944407] [<c1032125>] ? worker_thread+0x0/0x188
[ 267.944413] [<c1034b50>] ? kthread+0x5e/0x63
[ 267.944419] [<c1034af2>] ? kthread+0x0/0x63
[ 267.944427] [<c10030d3>] ? kernel_thread_helper+0x7/0x10
[ 267.944430] Code: 00 00 00 0f 0d 86 c0 00 00 00 0f 0d 86 00 01 00 00 31 c9 eb 46 0f 0d 80 40 01 00 00 0f 6f 00 0f 6f 48 08 0f 6f 50 10 0f 6f 58 18 <0f> 7f 02 0f 7f 4a 08 0f 7f 52 10 0f 7f 5a 18 0f 6f 40 20 0f 6f
[ 267.944476] EIP: [<c11a0b71>] _mmx_memcpy+0x79/0x134 SS:ESP 0068:dcd15ed0
[ 267.944484] CR2: 000000000aaff000
[ 267.944489] ---[ end trace 74c69ad3330bb04a ]---
[ 267.944495] note: ttm_swap[3046] exited with preempt_count 2
[ 268.666662] reserve_ram_pages_type failed 0x9cba000-0x9cbb000, track 0x8, req 0x10
[ 269.001031] reserve_ram_pages_type failed 0x6ff0000-0x6ff1000, track 0x8, req 0x10
[ 269.474134] reserve_ram_pages_type failed 0x68f2000-0x68f3000, track 0x8, req 0x10
[ 269.767270] reserve_ram_pages_type failed 0x14c98000-0x14c99000, track 0x8, req 0x10
[ 269.783213] reserve_ram_pages_type failed 0x23eb000-0x23ec000, track 0x8, req 0x10
[ 270.161672] reserve_ram_pages_type failed 0x10784000-0x10785000, track 0x8, req 0x10
[ 270.376144] reserve_ram_pages_type failed 0x9e01000-0x9e02000, track 0x8, req 0x10
[ 271.169852] reserve_ram_pages_type failed 0xe1b7000-0xe1b8000, track 0x8, req 0x10
[ 271.607980] reserve_ram_pages_type failed 0x3ae2000-0x3ae3000, track 0x8, req 0x10
[ 272.027030] reserve_ram_pages_type failed 0x195e8000-0x195e9000, track 0x8, req 0x10
[ 284.930170] reserve_ram_pages_type failed 0xdcd4000-0xdcd5000, track 0x8, req 0x10
[ 285.043614] reserve_ram_pages_type failed 0x36c5000-0x36c6000, track 0x8, req 0x10
[ 285.941944] reserve_ram_pages_type failed 0x30d2000-0x30d3000, track 0x8, req 0x10
[ 341.055291] reserve_ram_pages_type failed 0x1b90b000-0x1b90c000, track 0x8, req 0x10
[ 355.005349] [drm] nouveau 0000:03:00.0: nouveau_channel_free: freeing fifo 2
[ 355.074031] etracer used greatest stack depth: 5632 bytes left