On Wed, Dec 11, 2013 at 6:43 AM, Bruno Prémont <bonbons at linux-vserver.org> wrote:
> Hi Ilia,
>
>> On Thu, 5 Dec 2013 10:55:06 -0500 Ilia Mirkin wrote:
>> > On Thu, Dec 5, 2013 at 10:17 AM, Bruno Prémont wrote:
>> > > With drm-nouveau-next branch on top of 3.13-rc2 and a few sound commits
>> > > I get following errors in kernel log (repeating much more often).
>> > > (once there was a corrupted fbcon with matrix-like strings of
>> > > unrecognizable glyph-strings hanging down from screen top border)
>> > >
>> > > These errors do stop once I switch to Xorg.
>> > >
>> > > [ 13.825627] nouveau E[ PFB][0000:02:00.0] trapped write at 0x0000408280 on channel 0x0000fee0 [unknown] BAR/PFIFO_WRITE/FB reason: PAGE_NOT_PRESENT
>> > > [ 13.826506] nouveau E[ PFB][0000:02:00.0] trapped write at 0x0000408280 on channel 0x0000fee0 [unknown] BAR/PFIFO_WRITE/FB reason: PAGE_NOT_PRESENT
>> > > [ 13.827319] nouveau E[ PFB][0000:02:00.0] trapped write at 0x0000408280 on channel 0x0000fee0 [unknown] BAR/PFIFO_WRITE/FB reason: PAGE_NOT_PRESENT
>> > > [ 13.828139] nouveau E[ PFB][0000:02:00.0] trapped write at 0x0000408280 on channel 0x0000fee0 [unknown] BAR/PFIFO_WRITE/FB reason: PAGE_NOT_PRESENT
>> > > [ 13.828957] nouveau E[ PFB][0000:02:00.0] trapped write at 0x0000400300 on channel 0x0000fee0 [unknown] BAR/PFIFO_WRITE/FB reason: PAGE_NOT_PRESENT
>> > >
>> > > System MBA2,1 with:
>> > > 02:00.0 VGA compatible controller [0300]: NVIDIA Corporation C79 [GeForce 9400M] [10de:0870] (rev b1)
>> >
>> > Do you also see these in 3.13-rc2 +
>> > http://cgit.freedesktop.org/nouveau/linux-2.6/commit/?h=drm-nouveau-next&id=a7e4201f0f7d47e03b851f06f8987856e8d33083
>> > ? If not, you can reorder the patches (git rebase -i) and bisect the
>> > rest. Basically the GPU is trying to write to some address that is
>> > either completely bogus or is stale, and its internal MMU won't let
>> > it.
>>
>> If I remember correctly these did not happen with Linus' rc2 but only
>> do happen with drm-nouveau-next on top of it.
>>
>> I will check and bisect though probably will not get time to complete
>> bisect before late Saturday.
>>
>>
>> A display glitch during kernel boot with fbcon might be related to this.
>> (very wild guess based on when the trapped writes happen)
>> The penguin logo shown by fbcon looks like memcopy'ed to a tiled
>> framebuffer without respecting the fact that framebuffer is tiled.
>
> It was introduced somewhere between 3.12 and 3.13-rc2 though does not happen
> on every boot and when it happens it might be only a couple of trapped writes
> or a whole lot flooding complete log buffer.
>
> Visually it manifests itself as a scrambled framebuffer
> (looking like tiled versus untiled writing to framebuffer).
> When it happens, it seems to be correlated to call to dialog to ask for
> luks password and screen scrambling happens when boot messages
> (kernel/userspace) show/scroll by [they wipe out some scrambled area
> when scrolling by].
>
> Unfortunately it will be rather hard to bisect because sometimes it does not
> happen.
> Adding drm-nouveau-next on top of rc2/rc3 makes no apparent difference.

A shot in the dark -- try booting with nouveau.config=NvMSI=0. We
already disable MSI for NVAA, perhaps we should do it for NVAC as
well. Although the blob drivers have it enabled... but we could be
doing something wrong.

-ilia
Michele Baldessari
2014-Jan-01 20:29 UTC
[Nouveau] drm-nouveau-next - write trapped by fbcon
On Wed, Dec 11, 2013 at 07:21:05AM -0500, Ilia Mirkin wrote:
> A shot in the dark -- try booting with nouveau.config=NvMSI=0. We
> already disable MSI for NVAA, perhaps we should do it for NVAC as
> well. Although the blob drivers have it enabled... but we could be
> doing something wrong.

Hi Ilia,

FWIW a Fedora user got those messages on stock 3.13-rc6 with and without the
"nouveau.config=nvMSI=0" boot option:
[ 10.812029] nouveau E[ PFB][0000:01:00.0] trapped write at 0x0000525500 on channel 0x0001fed0 [unknown] BAR/PFIFO_WRITE/FB reason: PAGE_NOT_PRESENT

This is via BZ https://bugzilla.redhat.com/show_bug.cgi?id=927451

hth,
Michele
--
Michele Baldessari <michele at acksyn.org>
C2A5 9DA3 9961 4FFB E01B D0BC DDD4 DCCB 7515 5C6D
On Wed, Jan 1, 2014 at 3:29 PM, Michele Baldessari <michele at acksyn.org> wrote:
> Hi Ilia,
>
> FWIW a Fedora user got those messages on stock 3.13-rc6 with and without the
> "nouveau.config=nvMSI=0" boot option:
> [ 10.812029] nouveau E[ PFB][0000:01:00.0] trapped write at 0x0000525500 on channel 0x0001fed0 [unknown] BAR/PFIFO_WRITE/FB reason: PAGE_NOT_PRESENT
>
> This is via BZ https://bugzilla.redhat.com/show_bug.cgi?id=927451

That bug is about someone with an NV84, so my comment about NVAA/NVAC
similarities doesn't apply. But it's good to know that MSI isn't at
fault (probably). I guess it's bisect time.

-ilia
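
Since the traps do not show up on every boot, a bisect needs a consistent way to decide whether a given boot was "bad". The following is a minimal sketch, not part of nouveau or any existing tool, that scans saved dmesg output for PFB trap lines shaped like the ones quoted in this thread and counts them per address. The regular expression is inferred only from those quoted lines; the function names and the assumption that a "read" variant of the message exists are hypothetical.

```python
import re
from collections import Counter

# Pattern inferred from the PFB lines quoted in this thread (an assumption,
# not taken from the driver source). "read" is included speculatively.
TRAP_RE = re.compile(
    r"nouveau E\[\s*PFB\]\[(?P<dev>[0-9a-f:.]+)\] "
    r"trapped (?P<op>read|write) at 0x(?P<addr>[0-9a-f]+) "
    r"on channel 0x(?P<chan>[0-9a-f]+) .*reason: (?P<reason>\w+)"
)


def summarize_traps(log_text):
    """Count trap lines keyed by (device, operation, address, reason)."""
    counts = Counter()
    for line in log_text.splitlines():
        m = TRAP_RE.search(line)
        if m:
            key = (m["dev"], m["op"], int(m["addr"], 16), m["reason"])
            counts[key] += 1
    return counts


def boot_is_bad(log_text):
    """Hypothetical helper: a boot counts as bad if any trap line appears."""
    return bool(summarize_traps(log_text))
```

Usage would be to save `dmesg` output from each test boot and pass the text to `boot_is_bad()`. Because the problem is intermittent, a commit should probably only be marked good after several clean boots, while a single bad boot is enough to mark it bad.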