Mike Galbraith
2017-Jul-12 11:25 UTC
[Nouveau] [regression drm/noveau] suspend to ram -> BOOM: exception RIP: drm_calc_vbltimestamp_from_scanoutpos+335
On Wed, 2017-07-12 at 11:55 +0200, Mike Galbraith wrote:
> On Tue, 2017-07-11 at 14:22 -0400, Ilia Mirkin wrote:
> >
> > Some display stuff did change for 4.13 for GM20x+ boards. If it's not
> > too much trouble, a bisect would be pretty useful.
>
> Bisection seemingly went fine, but the result is odd.
>
> e98c58e55f68f8785aebfab1f8c9a03d8de0afe1 is the first bad commit

But it really really is bad. Looking at gitk fork in the road leading
to it...

52d9d38c183b drm/sti:fix spelling mistake: "compoment" -> "component" - good
e4e818cc2d7c drm: make drm_panel.h self-contained - good
9cf8f5802f39 drm: add missing declaration to drm_blend.h - good

Before the git highway splits, all is well. The lane with commits
works fine at both ends, but e98c58e55f68 is busted. Merge artifact?

-Mike
Mike Galbraith
2017-Jul-12 17:19 UTC
[Nouveau] [regression drm/noveau] suspend to ram -> BOOM: exception RIP: drm_calc_vbltimestamp_from_scanoutpos+335
On Wed, 2017-07-12 at 07:37 -0400, Ilia Mirkin wrote:
> On Wed, Jul 12, 2017 at 7:25 AM, Mike Galbraith <efault at gmx.de> wrote:
> > On Wed, 2017-07-12 at 11:55 +0200, Mike Galbraith wrote:
> >> On Tue, 2017-07-11 at 14:22 -0400, Ilia Mirkin wrote:
> >> >
> >> > Some display stuff did change for 4.13 for GM20x+ boards. If it's not
> >> > too much trouble, a bisect would be pretty useful.
> >>
> >> Bisection seemingly went fine, but the result is odd.
> >>
> >> e98c58e55f68f8785aebfab1f8c9a03d8de0afe1 is the first bad commit
> >
> > But it really really is bad. Looking at gitk fork in the road leading
> > to it...
> >
> > 52d9d38c183b drm/sti:fix spelling mistake: "compoment" -> "component" - good
> > e4e818cc2d7c drm: make drm_panel.h self-contained - good
> > 9cf8f5802f39 drm: add missing declaration to drm_blend.h - good
> >
> > Before the git highway splits, all is well. The lane with commits
> > works fine at both ends, but e98c58e55f68 is busted. Merge artifact?
>
> Hmmm... that tree does not appear to have gotten a v4.12 backmerge at
> any point. The last backmerge from Linus as far as I can tell was
> v4.11-rc7. Could be an interaction with some out-of-tree change.

FWIW, checking out the fingered commit then..

git log --oneline 52d9d38c183b..e98c58e55f68|grep nouveau

..and reverting the lot helped not at all. Checking out 6b7781b42dc9 and
reverting the fingered commit did.

Given the nouveau bits reverted are mostly the vblank changes, CC to
Daniel, maybe he'll know why both GTX 980 and GeForce 8600 GT get all
upset.

Either I'm damn lucky, both of my nvidia equipped boxen going boom 100%
repeatably, or there are a lot of folks out there who haven't yet tried
suspend with our latest/greatest kernel. I suspect the latter.

-Mike
Mike Galbraith
2017-Jul-14 13:36 UTC
[Nouveau] [regression drm/noveau] suspend to ram -> BOOM: exception RIP: drm_calc_vbltimestamp_from_scanoutpos+335
On Wed, 2017-07-12 at 07:37 -0400, Ilia Mirkin wrote:
> On Wed, Jul 12, 2017 at 7:25 AM, Mike Galbraith <efault at gmx.de> wrote:
> > On Wed, 2017-07-12 at 11:55 +0200, Mike Galbraith wrote:
> >> On Tue, 2017-07-11 at 14:22 -0400, Ilia Mirkin wrote:
> >> >
> >> > Some display stuff did change for 4.13 for GM20x+ boards. If it's not
> >> > too much trouble, a bisect would be pretty useful.
> >>
> >> Bisection seemingly went fine, but the result is odd.
> >>
> >> e98c58e55f68f8785aebfab1f8c9a03d8de0afe1 is the first bad commit
> >
> > But it really really is bad. Looking at gitk fork in the road leading
> > to it...
> >
> > 52d9d38c183b drm/sti:fix spelling mistake: "compoment" -> "component" - good
> > e4e818cc2d7c drm: make drm_panel.h self-contained - good
> > 9cf8f5802f39 drm: add missing declaration to drm_blend.h - good
> >
> > Before the git highway splits, all is well. The lane with commits
> > works fine at both ends, but e98c58e55f68 is busted. Merge artifact?
>
> Hmmm... that tree does not appear to have gotten a v4.12 backmerge at
> any point. The last backmerge from Linus as far as I can tell was
> v4.11-rc7. Could be an interaction with some out-of-tree change.

Ok, a network outage gave me time to go hunting. Indeed it is a bad
interaction with the tree DRM merged into. All DRM did was to slip a
WARN_ON_ONCE() that nouveau triggers into a kernel module where such
things no longer warn, they blow the box out of the water. I made a
dinky testcase module (attached), and bisected to the real root....

19d436268dde95389c616bb3819da73f0a8b28a8 is the first bad commit
commit 19d436268dde95389c616bb3819da73f0a8b28a8
Author: Peter Zijlstra <peterz at infradead.org>
Date:   Sat Feb 25 08:56:53 2017 +0100

    debug: Add _ONCE() logic to report_bug()

    Josh suggested moving the _ONCE logic inside the trap handler, using a
    bit in the bug_entry::flags field, avoiding the need for the extra
    variable.

    Sadly this only works for WARN_ON_ONCE(), since the others have
    printk() statements prior to triggering the trap.

    Still, this saves a fair amount of text and some data:

      text         data     filename
      10682460     4530992  defconfig-build/vmlinux.orig
      10665111     4530096  defconfig-build/vmlinux.patched

    Suggested-by: Josh Poimboeuf <jpoimboe at redhat.com>
    Signed-off-by: Peter Zijlstra (Intel) <peterz at infradead.org>
    Cc: Andy Lutomirski <luto at kernel.org>
    Cc: Arnd Bergmann <arnd at arndb.de>
    Cc: Borislav Petkov <bp at alien8.de>
    Cc: Brian Gerst <brgerst at gmail.com>
    Cc: Denys Vlasenko <dvlasenk at redhat.com>
    Cc: H. Peter Anvin <hpa at zytor.com>
    Cc: Linus Torvalds <torvalds at linux-foundation.org>
    Cc: Peter Zijlstra <peterz at infradead.org>
    Cc: Thomas Gleixner <tglx at linutronix.de>
    Signed-off-by: Ingo Molnar <mingo at kernel.org>

:040000 040000 9f47f66ec4c234f6ee8e2a09e991c95fe47cf2c1 3e92aa9e77b39ed075ae2c3bdf041d92ef898f62 M arch
:040000 040000 34f70b73d40c82533dd7df9b289106be69e2fa8d dd5d7248694a36b3e170f2dca5d9c4121535a990 M include
:040000 040000 f6e627b0d378f0a00d2987fdd0c7b215306e6e3c b360d4ee2579744cce530184d7dab13493f73ee0 M lib

-------------- next part --------------
A non-text attachment was scrubbed...
Name: warn_on_once.patch
Type: text/x-patch
Size: 675 bytes
Desc: not available
URL: <https://lists.freedesktop.org/archives/nouveau/attachments/20170714/a0402ff4/attachment-0001.bin>
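
Editorial sketch (not part of the original mail, and not the scrubbed attachment): the commit message quoted above describes moving the "once" bookkeeping from a per-call-site variable into a bit of bug_entry::flags that the trap handler checks. A rough user-space model of that idea in C might look like the following. The names bug_entry_model, report_bug_model, warn_once_old_style, print_warning and warnings_printed are hypothetical, the flag names and values are assumed to mirror the kernel's, and the real report_bug() does considerably more than this.

```c
#include <stdbool.h>
#include <stdio.h>

/* Flag bits assumed to mirror the kernel's bug_entry flags (simplified). */
#define BUGFLAG_WARNING (1u << 0)
#define BUGFLAG_ONCE    (1u << 1)
#define BUGFLAG_DONE    (1u << 2)

/* Hypothetical stand-in for one entry of the kernel's bug table. */
struct bug_entry_model {
    const char *file;
    unsigned int line;
    unsigned int flags;
};

/* Counts emitted warnings so the behavior can be checked. */
int warnings_printed;

void print_warning(const char *file, unsigned int line)
{
    warnings_printed++;
    fprintf(stderr, "WARNING: at %s:%u\n", file, line);
}

/* Old style: the call site carries an extra static "already warned" variable. */
bool warn_once_old_style(bool condition)
{
    static bool warned;

    if (condition && !warned) {
        warned = true;
        print_warning(__FILE__, __LINE__);
    }
    return condition;
}

/*
 * New style: the handler keeps the "once" state in the entry's own flags.
 * Setting BUGFLAG_DONE is a store into the entry, so the entry has to
 * live in memory the handler is allowed to write.
 */
void report_bug_model(struct bug_entry_model *bug)
{
    if (bug->flags & BUGFLAG_ONCE) {
        if (bug->flags & BUGFLAG_DONE)
            return;                  /* already reported once */
        bug->flags |= BUGFLAG_DONE;  /* remember that we reported */
    }
    print_warning(bug->file, bug->line);
}
```

Calling report_bug_model() twice on an entry initialised with BUGFLAG_WARNING | BUGFLAG_ONCE prints a single warning and leaves BUGFLAG_DONE set; the old-style helper gives the same "once" behavior at the cost of one static variable per call site, which is the text and data saving the commit message refers to.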