Rafał Miłecki
2014-Feb-10 20:05 UTC
[Nouveau] GeForce 6100 (NV4E) & nouveau regression in 3.12
2014-02-10 20:06 GMT+01:00 Ilia Mirkin <imirkin at alum.mit.edu>:
> On Mon, Feb 10, 2014 at 10:12 AM, Rafał Miłecki <zajec5 at gmail.com> wrote:
>> 2014-02-09 23:12 GMT+01:00 Ilia Mirkin <imirkin at alum.mit.edu>:
>>> On Sun, Feb 9, 2014 at 5:08 PM, Rafał Miłecki <zajec5 at gmail.com> wrote:
>>>> Last week I've switched from my old & good 3.4.63 to 3.14-rc1 and
>>>> noticed nasty display corruptions when using nouveau. It seems that
>>>> changing parts of the screen are appearing for a fraction of second in
>>>> random places. I've recorded this behavior:
>>>> http://www.youtube.com/watch?v=IEq7JzGVzj0
>>>>
>>>> My hardware is some old motherboard with
>>>> 00:05.0 VGA compatible controller [0300]: NVIDIA Corporation C51G
>>>> [GeForce 6100] [10de:0242] (rev a2)
>>>> integrated. Since my CPU is ancient AMD Sempron(tm) Processor 2800+ it
>>>> took me few days to track this issue.
>>>>
>>>> There goes some summary of various kernels:
>>>>
>>>> 1) 3.4.63
>>>> No display problems. Works great.
>>>>
>>>> 2) commit 928c2f0c006bf7f381f58af2b2786d2a858ae311
>>>> drm/fb-helper: don't sleep for screen unblank when an oops is in progress
>>>> Scrollbars have a pink line. I didn't track which commit introduced
>>>> this pink corruption. No other problems.
>>>>
>>>> 3) commit c21eb21cb50d58e7cbdcb8b9e7ff68b85cfa5095
>>>> Revert "drm: mark context support as a legacy subsystem"
>>>> This fixes pink lines on scrollbars and introduces this nasty display
>>>> corruption. It's one commit after previous one.
>>>> It means it's the first bad commit for these nasty corruptions recorded
>>>> and uploaded to YouTube.
>>>>
>>>> 4) 3.14-rc1
>>>> No changes since c21eb21cb50d58e7cbdcb8b9e7ff68b85cfa5095. No pink
>>>> lines, but display corruptions happening.
>>>
>>> Can you boot with nouveau.config=NvMSI=0 ? If that helps, there are
>>> some patches on the nouveau/dri-devel lists (search for "nv4c") that
>>> may help you.
>>
>> Unfortunately this config parameter doesn't help :(
>
> Too bad. It may still be worthwhile applying the patches and seeing
> what happens... it seems like some registers got switched around on
> the nv4x IGP's:
>
> http://lists.freedesktop.org/archives/nouveau/2014-February/016032.html
> http://lists.freedesktop.org/archives/nouveau/2014-February/016033.html
> http://lists.freedesktop.org/archives/nouveau/2014-February/016034.html

I've applied all 3 patches, compiled, tried... didn't help. I've also
tried nouveau.config=NvMSI=0 on top of your patches, didn't help.

> BTW, youtube says "this video is unavailable".

Ohh, Google/YouTube really doesn't like ppl removing G+ account...
http://files.zajec.net/20140208-nouveau.mp4

> Is there anything in dmesg when the display corruptions happen?

No.

> There was also an issue with libdrm_nouveau for pre-nv50 chips, when
> compiled with gcc-4.8 some time back... fixed in... 2.4.48 or so?

I use openSUSE 12.2 (x86_64) which provides gcc 4.7.1 and
libdrm_nouveau1-2.4.33-2.3.2.x86_64. I assume libdrm_nouveau was
compiled using that 4.7.1.

> Lastly, it may be worth trying 3.11.x and 3.12.x to get a better
> handle on when problems happened. The commits you cite are in the
> middle of releases, and may have various badness associated with them
> (e.g. 3.12-rc had a later-disabled MSI implementation, back in 3.13...
> probably some other stuff).

I'll provide results tomorrow.

--
Rafał
Ilia Mirkin
2014-Feb-11 10:41 UTC
[Nouveau] GeForce 6100 (NV4E) & nouveau regression in 3.12
On Mon, Feb 10, 2014 at 3:05 PM, Rafał Miłecki <zajec5 at gmail.com> wrote:
> 2014-02-10 20:06 GMT+01:00 Ilia Mirkin <imirkin at alum.mit.edu>:
>> There was also an issue with libdrm_nouveau for pre-nv50 chips, when
>> compiled with gcc-4.8 some time back... fixed in... 2.4.48 or so?
>
> I use openSUSE 12.2 (x86_64) which provides gcc 4.7.1 and
> libdrm_nouveau1-2.4.33-2.3.2.x86_64. I assume libdrm_nouveau was
> compiled using that 4.7.1.

Hmmm... the nouveau drm rewrite went into 2.4.34... I guess you're
using pretty old userspace in general, since everything depends on the
post-rewrite libdrm_nouveau. Of course it definitely sounds like a
kernel issue, but I can't help but wonder if this is a non-issue with
later userspace.

So there are basically 2 things left to do, in order of time-consuming-ness:

(a) try a live{cd,usb} (e.g. arch, or something else that has recent
software), and see if the issue is still present there.

(b) bisect. you can (almost) definitely restrict the bisect to
drivers/gpu/drm/nouveau. if you have additional computational power, i
would recommend looking into distcc for speeding up the compiles. it
may be interesting to also try 3.6.x since 3.7 received a pretty big
rewrite. but a git bisect is a lot more direct in figuring these
things out :)

After I watched your video, it definitely brought back memories of
another bug or perhaps email on this list a while back (definitely
within the past year), but unfortunately I can't quite place it :(

  -ilia
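As a rough illustration of why a bisect is "a lot more direct" than trying individual releases, the search it performs can be sketched in a few lines of Python. This is only a model, assuming a linear history with a known-good first commit and a known-bad last commit; the commit names (`c0000` ... `c0999`) and the `is_bad` callback are hypothetical stand-ins for "build, boot, and look for corruption". In git itself, the path restriction suggested above is passed after `--`, as in `git bisect start <bad> <good> -- drivers/gpu/drm/nouveau`.

```python
from typing import Callable, Sequence, Tuple


def first_bad(commits: Sequence[str],
              is_bad: Callable[[str], bool]) -> Tuple[str, int]:
    """Return (first bad commit, number of commits tested).

    Assumes commits[0] is known good, commits[-1] is known bad, and that
    every commit after the first bad one is also bad (linear history).
    """
    lo, hi = 0, len(commits) - 1  # lo: last known good, hi: first known bad
    tested = 0
    while hi - lo > 1:
        mid = (lo + hi) // 2
        tested += 1  # one build-and-boot cycle in real life
        if is_bad(commits[mid]):
            hi = mid
        else:
            lo = mid
    return commits[hi], tested


# Hypothetical history of 1000 commits; the corruption starts at c0700.
history = [f"c{i:04d}" for i in range(1000)]
culprit, builds = first_bad(history, lambda c: c >= "c0700")
print(culprit, builds)
```

Each test halves the remaining range, so even a thousand candidate commits need only about ten build-and-boot cycles. Restricting the bisect to one directory shrinks the candidate list further.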
Rafał Miłecki
2014-Feb-11 11:09 UTC
[Nouveau] GeForce 6100 (NV4E) & nouveau regression in 3.12
2014-02-11 11:41 GMT+01:00 Ilia Mirkin <imirkin at alum.mit.edu>:
> (b) bisect. you can (almost) definitely restrict the bisect to
> drivers/gpu/drm/nouveau. if you have additional computational power, i
> would recommend looking into distcc for speeding up the compiles. it
> may be interesting to also try 3.6.x since 3.7 received a pretty big
> rewrite. but a git bisect is a lot more direct in figuring these
> things out :)

I've already bisected commit that changed this pink line issue into a
general screen corruption. Just to remind it was:

commit c21eb21cb50d58e7cbdcb8b9e7ff68b85cfa5095
Author: Dave Airlie <airlied at redhat.com>
Date:   Fri Sep 20 08:32:59 2013 +1000

    Revert "drm: mark context support as a legacy subsystem"

Would you like me to bisect commit that introduced this pink line issue?

--
Rafał
Rafał Miłecki
2014-Feb-16 15:11 UTC
[Nouveau] GeForce 6100 (NV4E) & nouveau regression in 3.12
2014-02-11 11:41 GMT+01:00 Ilia Mirkin <imirkin at alum.mit.edu>:
> On Mon, Feb 10, 2014 at 3:05 PM, Rafał Miłecki <zajec5 at gmail.com> wrote:
>> 2014-02-10 20:06 GMT+01:00 Ilia Mirkin <imirkin at alum.mit.edu>:
>>> There was also an issue with libdrm_nouveau for pre-nv50 chips, when
>>> compiled with gcc-4.8 some time back... fixed in... 2.4.48 or so?
>>
>> I use openSUSE 12.2 (x86_64) which provides gcc 4.7.1 and
>> libdrm_nouveau1-2.4.33-2.3.2.x86_64. I assume libdrm_nouveau was
>> compiled using that 4.7.1.
>
> Hmmm... the nouveau drm rewrite went into 2.4.34... I guess you're
> using pretty old userspace in general, since everything depends on the
> post-rewrite libdrm_nouveau. Of course it definitely sounds like a
> kernel issue, but I can't help but wonder if this is a non-issue with
> later userspace.
>
> So there are basically 2 things left to do, in order of time-consuming-ness:
>
> (a) try a live{cd,usb} (e.g. arch, or something else that has recent
> software), and see if the issue is still present there.

I've tried Fedora 20 booted from USB. It suffers from the same issue.
It's based on kernel 3.11.10, but I'm sure it has more up to date
userspace.

> (b) bisect. you can (almost) definitely restrict the bisect to
> drivers/gpu/drm/nouveau. if you have additional computational power, i
> would recommend looking into distcc for speeding up the compiles. it
> may be interesting to also try 3.6.x since 3.7 received a pretty big
> rewrite. but a git bisect is a lot more direct in figuring these
> things out :)

Bisecting nouveau between 3.10 and 3.11 is a real pain. Ben introduced
booting regression with commit:

commit dceef5d87cc01358cc1434416f3272e2ddc3d97a
Author: Ben Skeggs <bskeggs at redhat.com>
Date:   Mon Mar 4 13:01:21 2013 +1000

    drm/nouveau/fb: initialise vram controller as pfb sub-object

I had to first bisect fix for that regression which appeared to be:

commit 6284bf41b97fb36ed96b664a3c23b6dc3661f5f9
Author: Ilia Mirkin <imirkin at alum.mit.edu>
Date:   Fri Aug 9 17:25:54 2013 -0400

    drm/nouveau/fb: fix null derefs in nv49 and nv4e init

Unfortunately meanwhile another init regression was introduced with:

commit 0108bc808107b97e101b15af9705729626be6447
Author: Maarten Lankhorst <maarten.lankhorst at canonical.com>
Date:   Sun Jul 7 10:40:19 2013 +0200

    drm/nouveau: do not allow negative sizes for now

And I had to find fix for that which was:

commit 35095f7529bb6abdfc956e7a41ca6957520b70a7
Author: Maarten Lankhorst <maarten.lankhorst at canonical.com>
Date:   Sat Jul 27 10:17:12 2013 +0200

    drm/nouveau: fix size check for cards without vm

Then I finally was able to test every commit between 3.10 and 3.11
without skipping 90% of them.

--
Rafał
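The bookkeeping described in this message (two unrelated boot regressions, each with a later fix, overlapping the range being bisected) can be modelled as follows. This is a minimal sketch, not the procedure actually used in the thread: the message does not say how the fixes were applied, and the commit names and their ordering below are placeholders rather than the real hashes or the real history. The idea is that a commit under test needs a fix applied on top (for example with `git cherry-pick`) whenever it already contains the breakage but not yet its fix.

```python
from typing import Dict, List


def fixes_needed(order: List[str], windows: Dict[str, str],
                 target: str) -> List[str]:
    """List the fix commits to apply on top of `target` before testing it.

    `order` is the history, oldest first. `windows` maps each commit that
    introduced an unrelated breakage to the commit that later fixed it.
    A fix is needed when `target` contains the breakage but not the fix.
    """
    pos = {commit: i for i, commit in enumerate(order)}
    t = pos[target]
    return [fix for broke, fix in windows.items()
            if pos[broke] <= t < pos[fix]]


# Hypothetical linear history; names are placeholders, not real commits.
order = ["good", "break_a", "c1", "break_b", "c2",
         "fix_b", "c3", "fix_a", "bad"]
windows = {"break_a": "fix_a", "break_b": "fix_b"}

for commit in ("c1", "c2", "c3", "bad"):
    print(commit, fixes_needed(order, windows, commit))
```

With the fixes applied temporarily at each step, commits inside the broken windows become testable instead of having to be skipped. The temporary changes have to be discarded again before the commit is marked good or bad.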
Rafał Miłecki
2014-Feb-16 15:17 UTC
[Nouveau] GeForce 6100 (NV4E) & nouveau regression in 3.12
2014-02-11 11:41 GMT+01:00 Ilia Mirkin <imirkin at alum.mit.edu>:
> (b) bisect. you can (almost) definitely restrict the bisect to
> drivers/gpu/drm/nouveau. if you have additional computational power, i
> would recommend looking into distcc for speeding up the compiles. it
> may be interesting to also try 3.6.x since 3.7 received a pretty big
> rewrite. but a git bisect is a lot more direct in figuring these
> things out :)
>
> After I watched your video, it definitely brought back memories of
> another bug or perhaps email on this list a while back (definitely
> within the past year), but unfortunately I can't quite place it :(

I've finally bisected between 3.10 and 3.11:

78ae0ad403daf11cf63da86923d2b5dbeda3af8f is the first bad commit
commit 78ae0ad403daf11cf63da86923d2b5dbeda3af8f
Author: Ben Skeggs <bskeggs at redhat.com>
Date:   Wed Aug 21 11:30:36 2013 +1000

    drm/nv04/disp: fix framebuffer pin refcounting

I've booted that commit and one commit older few times. Every time I
booted 78ae0ad I got corruption. Every time I booted 6ff8c76 (it's the
earlier commit), it was OK.

Ben: any idea why this commit caused regression for my hardware? From
the commit message I assume it was supposed to affect some ancient
nv04 hardware only. Did it accidentally touch my nv4e path code maybe?

--
Rafał