Vincent Vanackere
2017-May-08 11:50 UTC
[Nouveau] GT 730 freeze : how do diagnose / debug ?
On 07/05/2017 23:50, Ilia Mirkin wrote:
> You have two issues:
>
> (a) nouveau's GL driver messed something up, causing a read fault error
> (b) nouveau's kernel driver tried to recover. It failed.
>
> Solution to #1: None, really. You can try updating mesa, and hope it
> helps. Not sure what version you're on.

Here are my package versions:

ii  libegl1-mesa:amd64  17.0.3-1ubuntu1  amd64  free implementation of the EGL API -- runtime
ii  libegl1-mesa-dev:amd64  17.0.3-1ubuntu1  amd64  free implementation of the EGL API -- development files
ii  libgl1-mesa-dev:amd64  17.0.3-1ubuntu1  amd64  free implementation of the OpenGL API -- GLX development files
ii  libgl1-mesa-dri:amd64  17.0.3-1ubuntu1  amd64  free implementation of the OpenGL API -- DRI modules
ii  libgl1-mesa-glx:amd64  17.0.3-1ubuntu1  amd64  free implementation of the OpenGL API -- GLX runtime
ii  libglapi-mesa:amd64  17.0.3-1ubuntu1  amd64  free implementation of the GL API -- shared library
ii  libgles2-mesa:amd64  17.0.3-1ubuntu1  amd64  free implementation of the OpenGL|ES 2.x API -- runtime
ii  libglu1-mesa:amd64  9.0.0-2.1build1  amd64  Mesa OpenGL utility library (GLU)
ii  libglu1-mesa-dev:amd64  9.0.0-2.1build1  amd64  Mesa OpenGL utility library -- development files
ii  libwayland-egl1-mesa:amd64  17.0.3-1ubuntu1  amd64  implementation of the Wayland EGL platform -- runtime
ii  mesa-common-dev:amd64  17.0.3-1ubuntu1  amd64  Developer documentation for Mesa
ii  mesa-utils  8.3.0-4  amd64  Miscellaneous Mesa GL utilities
ii  mesa-vdpau-drivers:amd64  17.0.3-1ubuntu1  amd64  Mesa VDPAU video acceleration drivers

I'll try compiling a newer version from git to see if it helps...

> Solution to #2: Ben Skeggs will hopefully have something clever to
> say. The recovery logic was recently beefed up considerably, so the
> fact that you even got that far is already a good start.
>
> If you're looking for a stable experience with Xorg, I recommend using
> xf86-video-nouveau -- it's been extensively battle-tested, and is
> quite simple logic; I also recommend against anything that uses GL on
> an ongoing basis (which, sadly, everyone thinks is the coolest thing
> to do these days). If you're looking for a stable experience with a
> GL-based Wayland compositor, you'll have to wait until either the
> nouveau GL driver is perfect or nouveau kernel module can properly
> recover from any screwups the GL driver makes.

I'm not expecting the GL driver to be perfect ;-)
However, it would be nice if the kernel module could recover at least a bit better from bad commands from the GL driver (indeed, I've also had some hard lockups where I could not even connect over ssh).

> You can also remove nouveau_dri.so entirely, which is a big hammer
> against these types of issues (removes all GL-based acceleration), or
> you can run certain key pieces of software with
> LIBGL_ALWAYS_SOFTWARE=1, which will force a CPU-based GL
> implementation.

Thanks for the hint, I'll try this workaround too!

Please let me know if I can do anything to help improve the driver's stability (like dumping the card's registers or enabling some traces?).
Alternatively if you know of a fanless graphic card model that would be able to drive 2 monitors at 2560x1440 with proper linux support, I'm interested ;-)

Regards

> Cheers,
>
> -ilia
>
>
> 2017-05-07 16:03 GMT-04:00 Vincent Vanackere <vincent.vanackere at gmail.com>:
>> Hi,
>>
>> I own an Asus GT730-SL-2GD3-BRK, trying to drive two monitors at 2560x1440
>> resolution. Using gnome-shell with either Xorg or wayland I get screen
>> freezes very frequently. Those freezes usually require a reboot to get
>> working graphics (below a sample trace that I got yesterday).
>> I am running Ubuntu 17.04 with the latest kernels available, I also tested
>> various more recent kernels including the latest drm tree at
>> https://cgit.freedesktop.org/~airlied/linux/log/?h=drm-next but the problem
>> always occurs.
>> When a freeze occurs, the computer is still reachable through ssh but the
>> only action I found so far to get graphics back is to restart the computer.
>> I am willing to run diagnostic programs or test any patch if it would
>> help. I'm also not excluding the possibility that I may have some faulty
>> hardware so any hardware-health-test advice would be welcome...
>>
>> Regards,
>>
>> Vincent Vanackère
>>
>> [ 1.199135] nouveau 0000:01:00.0: NVIDIA GK208B (b06070b1)
>> [ 1.319930] nouveau 0000:01:00.0: bios: version 80.28.92.00.10
>> [ 1.322095] nouveau 0000:01:00.0: fb: 2048 MiB DDR3
>> [ 2.620362] nouveau 0000:01:00.0: DRM: VRAM: 2048 MiB
>> [ 2.620362] nouveau 0000:01:00.0: DRM: GART: 1048576 MiB
>> [ 2.620364] nouveau 0000:01:00.0: DRM: TMDS table version 2.0
>> [ 2.620378] nouveau 0000:01:00.0: DRM: DCB version 4.0
>> [ 2.620379] nouveau 0000:01:00.0: DRM: DCB outp 00: 01000f02 00020030
>> [ 2.620380] nouveau 0000:01:00.0: DRM: DCB outp 01: 02011f62 00020010
>> [ 2.620380] nouveau 0000:01:00.0: DRM: DCB outp 02: 02022f10 00000000
>> [ 2.620381] nouveau 0000:01:00.0: DRM: DCB conn 00: 00001031
>> [ 2.620381] nouveau 0000:01:00.0: DRM: DCB conn 01: 00002161
>> [ 2.620382] nouveau 0000:01:00.0: DRM: DCB conn 02: 00000200
>> [ 2.666199] nouveau 0000:01:00.0: hwmon_device_register() is deprecated. Please convert the driver to use hwmon_device_register_with_info().
>> [ 2.717519] nouveau 0000:01:00.0: DRM: MM: using COPY for buffer copies
>> [ 2.992994] nouveau 0000:01:00.0: DRM: allocated 2560x1440 fb: 0x60000, bo ffff8cd1499f8000
>> [ 3.025200] fbcon: nouveaufb (fb0) is primary device
>> [ 3.253561] nouveau 0000:01:00.0: fb0: nouveaufb frame buffer device
>> [ 3.268163] [drm] Initialized nouveau 1.3.1 20120801 for 0000:01:00.0 on minor 0
>> [ 2150.225651] nouveau 0000:01:00.0: fifo: read fault at 0006710000 engine 00 [GR] client 02 [GPC0/PE_0] reason 02 [PTE] on channel 31 [007e8cb000 Xwayland[3019]]
>> [ 2150.225662] nouveau 0000:01:00.0: fifo: channel 31: killed
>> [ 2150.225663] nouveau 0000:01:00.0: fifo: runlist 0: scheduled for recovery
>> [ 2150.225666] nouveau 0000:01:00.0: fifo: engine 0: scheduled for recovery
>> [ 2150.225669] nouveau 0000:01:00.0: Xwayland[3019]: channel 31 killed!
>> [ 2296.863975] Workqueue: events_unbound nv50_disp_atomic_commit_work [nouveau]
>> [ 2296.863990] ? nvkm_ioctl_ntfy_get+0x69/0xb0 [nouveau]
>> [ 2296.864032] nv50_disp_atomic_commit_tail+0x55/0x3a00 [nouveau]
>> [ 2296.864047] nv50_disp_atomic_commit_work+0x12/0x20 [nouveau]
>> [ 2296.864118] Workqueue: events_unbound nv50_disp_atomic_commit_work [nouveau]
>> [ 2296.864138] ? nouveau_bo_rd32+0x2a/0x30 [nouveau]
>> [ 2296.864153] ? nv84_fence_read+0x2e/0x30 [nouveau]
>> [ 2296.864175] nv50_disp_atomic_commit_tail+0x55/0x3a00 [nouveau]
>> [ 2296.864189] nv50_disp_atomic_commit_work+0x12/0x20 [nouveau]
>> [ 2417.699641] Workqueue: events_unbound nv50_disp_atomic_commit_work [nouveau]
>> [ 2417.699656] ? nvkm_ioctl_ntfy_get+0x69/0xb0 [nouveau]
>> [ 2417.699688] nv50_disp_atomic_commit_tail+0x55/0x3a00 [nouveau]
>> [ 2417.699705] nv50_disp_atomic_commit_work+0x12/0x20 [nouveau]
>> [ 2417.699785] Workqueue: events_unbound nv50_disp_atomic_commit_work [nouveau]
>> [ 2417.699808] ? nouveau_bo_rd32+0x2a/0x30 [nouveau]
>> [ 2417.699825] ? nv84_fence_read+0x2e/0x30 [nouveau]
>> [ 2417.699851] nv50_disp_atomic_commit_tail+0x55/0x3a00 [nouveau]
>> [ 2417.699867] nv50_disp_atomic_commit_work+0x12/0x20 [nouveau]
>> [ 2538.535424] Workqueue: events_unbound nv50_disp_atomic_commit_work [nouveau]
>> [ 2538.535439] ? nvkm_ioctl_ntfy_get+0x69/0xb0 [nouveau]
>> [ 2538.535469] nv50_disp_atomic_commit_tail+0x55/0x3a00 [nouveau]
>> [ 2538.535485] nv50_disp_atomic_commit_work+0x12/0x20 [nouveau]
>> [ 2538.535555] Workqueue: events_unbound nv50_disp_atomic_commit_work [nouveau]
>> [ 2538.535576] ? nouveau_bo_rd32+0x2a/0x30 [nouveau]
>> [ 2538.535591] ? nv84_fence_read+0x2e/0x30 [nouveau]
>> [ 2538.535614] nv50_disp_atomic_commit_tail+0x55/0x3a00 [nouveau]
>> [ 2538.535628] nv50_disp_atomic_commit_work+0x12/0x20 [nouveau]
>>
>>
>> _______________________________________________
>> Nouveau mailing list
>> Nouveau at lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/nouveau
>>
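The "fifo: read fault" line in the trace above packs several fields into one message (engine, client, reason, channel, and the owning process). When comparing freezes across reboots it can help to pull those fields out mechanically. The following is only a rough sketch: the regular expression is inferred from the single sample line in this thread, not from the nouveau source, and the names `FAULT_RE` and `parse_fault` are made up for illustration. Lines that the mail client wrapped need to be joined back into one line first.

```python
import re

# Field layout inferred from the one sample line in this thread (an assumption,
# not taken from the nouveau source). Other kernels may format it differently.
FAULT_RE = re.compile(
    r"fifo: (?P<access>\w+) fault at (?P<addr>[0-9a-f]+) "
    r"engine (?P<engine_id>[0-9a-f]+) \[(?P<engine>[^\]]*)\] "
    r"client (?P<client_id>[0-9a-f]+) \[(?P<client>[^\]]*)\] "
    r"reason (?P<reason_id>[0-9a-f]+) \[(?P<reason>[^\]]*)\] "
    r"on channel (?P<channel>\d+) "
    r"\[(?P<inst>[0-9a-f]+) (?P<process>[^\[\]]+)\[(?P<pid>\d+)\]\]"
)


def parse_fault(line):
    """Return the fault fields as a dict, or None if the line is not a fault line."""
    match = FAULT_RE.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    # Channel and PID are decimal in the sample, so convert them to integers.
    fields["channel"] = int(fields["channel"])
    fields["pid"] = int(fields["pid"])
    return fields
```

Feeding it the joined fault line from the trace should yield engine `GR`, reason `PTE`, channel 31 and process `Xwayland`; any line that does not match simply returns `None`.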
On Mon, May 8, 2017 at 7:50 AM, Vincent Vanackere <vincent.vanackere at gmail.com> wrote:
> Alternatively if you know of a fanless graphic card model that would be able to drive 2 monitors at 2560x1440 with proper linux support, I'm interested ;-)

Anything made by AMD will likely work well on linux (and if it doesn't, you should get something that more closely resembles support). I believe that there are some older model fanless HDMI + dual-link DVI boards, but I'm not intimately familiar with the full line of AMD offerings.

Separately, my recommendation of using xf86-video-nouveau stands (and is, I might point out, entirely contrary to Ben's, who is the Red Hat "nouveau" guy and kernel module maintainer). However, that won't fix in any way any GL applications that "do bad things" (or cause nouveau to do bad things).

FWIW in my regular (non-nouveau-development-targeted) use of Xorg with a variety of nvidia boards, currently the same GK208B as you, and without any of the "modern" hotness, I've almost never encountered hard hangs, despite using applications like chrome, etc. which do tend to make use of some GL capabilities.

Note that some GK208's, like mine, and some other boards as well, have a very sad hang-due-to-fan situation. The bug is https://bugs.freedesktop.org/show_bug.cgi?id=91413#c6, the hackpatch is at https://lists.freedesktop.org/archives/nouveau/2015-March/020421.html . It's been going on for literally years without being addressed.

Cheers,

  -ilia
Vincent Vanackere
2017-May-09 07:50 UTC
[Nouveau] GT 730 freeze : how do diagnose / debug ?
Some additional data:
- putting LIBGL_ALWAYS_SOFTWARE=1 in /etc/environment does indeed make the system work (for my current usage, the slowness is acceptable in exchange for stability)
- I still get lock-ups using mesa from git (17.2~git1705081930.25d2 from this repository https://launchpad.net/~oibaf/+archive/ubuntu/graphics-drivers)

I have another question (probably Ben Skeggs could also give some advice?): I see there are a lot more mesa variables that can be set (https://www.mesa3d.org/envvars.html). Are there some other variables that I could set in order to either partially enable hardware acceleration or (better) to get a diagnostic of what the driver is doing that is causing the graphic card to hang?

Thanks for your help!

Vincent

2017-05-08 13:50 GMT+02:00 Vincent Vanackere <vincent.vanackere at gmail.com>:
> On 07/05/2017 23:50, Ilia Mirkin wrote:
> > You have two issues:
> >
> > (a) nouveau's GL driver messed something up, causing a read fault error
> > (b) nouveau's kernel driver tried to recover. It failed.
> >
> > Solution to #1: None, really. You can try updating mesa, and hope it
> > helps. Not sure what version you're on.
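As a narrower alternative to setting LIBGL_ALWAYS_SOFTWARE=1 system-wide in /etc/environment, the per-program variant suggested earlier in the thread can be sketched as a small launcher. This is only an illustration under assumptions: the helper names `software_gl_env` and `run_with_software_gl` are invented here, and which programs are worth wrapping is left to the reader.

```python
import os
import subprocess


def software_gl_env(base_env=None):
    """Return a copy of the environment with LIBGL_ALWAYS_SOFTWARE=1 added."""
    # Copy so that neither the caller's dict nor os.environ is modified.
    env = dict(os.environ if base_env is None else base_env)
    env["LIBGL_ALWAYS_SOFTWARE"] = "1"
    return env


def run_with_software_gl(argv):
    """Run one program with software GL; other processes keep their settings."""
    return subprocess.run(argv, env=software_gl_env(), check=False)


# Example (glxgears is just a stand-in for "a key piece of software"):
#   run_with_software_gl(["glxgears"])
```

Only the launched process and its children see the variable, so the rest of the session keeps hardware acceleration. That also means a compositor started before the variable was set is not affected.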
Vincent Vanackere
2017-May-11 09:15 UTC
[Nouveau] GT 730 freeze : how do diagnose / debug ?
2017-05-09 9:50 GMT+02:00 Vincent Vanackere <vincent.vanackere at gmail.com>:
> Some additional data:
> - putting LIBGL_ALWAYS_SOFTWARE=1 in /etc/environment does indeed make the
> system work (for my current usage, the slowness is acceptable in exchange
> for stability)
>

Unfortunately I just got a freeze (using wayland with LIBGL_ALWAYS_SOFTWARE=1):

[179221.647861] nouveau 0000:01:00.0: Xwayland[27856]: nv50cal_space: -16
[179245.768920] traps: gnome-shell[3175] trap int3 ip:7f14cd988de1 sp:7ffe10e66110 error:0 in libglib-2.0.so.0.5200.0[7f14cd939000+111000]
[179256.854109] [drm:drm_atomic_helper_swap_state [drm_kms_helper]] *ERROR* [CRTC:41:head-1] hw_done timed out
[179267.094392] [drm:drm_atomic_helper_wait_for_dependencies [drm_kms_helper]] *ERROR* [CRTC:41:head-1] hw_done timed out
[179277.334749] [drm:drm_atomic_helper_wait_for_dependencies [drm_kms_helper]] *ERROR* [CRTC:41:head-1] flip_done timed out
[179279.385856] nouveau 0000:01:00.0: DRM: base-1: timeout
[179289.623162] [drm:drm_atomic_helper_swap_state [drm_kms_helper]] *ERROR* [CRTC:41:head-1] hw_done timed out
[179299.863479] [drm:drm_atomic_helper_wait_for_dependencies [drm_kms_helper]] *ERROR* [CRTC:41:head-1] hw_done timed out
[179310.103838] [drm:drm_atomic_helper_wait_for_dependencies [drm_kms_helper]] *ERROR* [CRTC:41:head-1] flip_done timed out
[179319.064210] INFO: task kworker/u8:1:30061 blocked for more than 120 seconds.
[179319.064211] Not tainted 4.11.0-999-generic #201705062201
[179319.064211] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[179319.064212] kworker/u8:1 D 0 30061 2 0x00000000
[179319.064238] Workqueue: events_unbound nv50_disp_atomic_commit_work [nouveau]
[179319.064239] Call Trace:
[179319.064242] __schedule+0x3c3/0x840
[179319.064261] ? nouveau_display_scanoutpos+0xe9/0x180 [nouveau]
[179319.064262] schedule+0x36/0x80
[179319.064264] schedule_timeout+0x23e/0x310
[179319.064265] ? __slab_free+0xa9/0x300
[179319.064283] ? nouveau_bo_rd32+0x2a/0x30 [nouveau]
[179319.064300] ? nv84_fence_read+0x2e/0x30 [nouveau]
[179319.064301] dma_fence_default_wait+0x1af/0x250
[179319.064302] ? dma_fence_default_wait+0x1af/0x250
[179319.064304] ? dma_fence_free+0x20/0x20
[179319.064305] dma_fence_wait_timeout+0x39/0xe0
[179319.064310] drm_atomic_helper_wait_for_fences+0x4c/0xf0 [drm_kms_helper]
[179319.064326] nv50_disp_atomic_commit_tail+0x55/0x3a00 [nouveau]
[179319.064342] nv50_disp_atomic_commit_work+0x12/0x20 [nouveau]
[179319.064343] process_one_work+0x1e9/0x410
[179319.064344] worker_thread+0x4b/0x410
[179319.064345] kthread+0x109/0x140
[179319.064346] ? process_one_work+0x410/0x410
[179319.064347] ? kthread_create_on_node+0x70/0x70
[179319.064348] ret_from_fork+0x2c/0x40
[179320.344194] [drm:drm_atomic_helper_swap_state [drm_kms_helper]] *ERROR* [CRTC:41:head-1] hw_done timed out
[179330.584461] [drm:drm_atomic_helper_wait_for_dependencies [drm_kms_helper]] *ERROR* [CRTC:41:head-1] hw_done timed out
[179340.824777] [drm:drm_atomic_helper_wait_for_dependencies [drm_kms_helper]] *ERROR* [CRTC:41:head-1] flip_done timed out

My current kernel version is 4.11.0-999.201705062201 from http://kernel.ubuntu.com/~kernel-ppa/mainline/daily/2017-05-07/

To Ben Skeggs: is there anything I could do to help fix this? If there is no hope of stability improvements I will have to switch to another graphic card, so please let me know!

Best regards,

Vincent

> - I still get lock-ups using mesa from git (17.2~git1705081930.25d2 from
> this repository https://launchpad.net/~oibaf/+archive/ubuntu/graphics-drivers)
>
> I have another question (probably Ben Skeggs could also give some advice?):
> I see there are a lot more mesa variables that can be set (
> https://www.mesa3d.org/envvars.html).
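The log from the LIBGL_ALWAYS_SOFTWARE=1 freeze repeats the same "hw_done timed out" and "flip_done timed out" errors roughly every ten seconds. To see at a glance which head and which event keep timing out in a longer capture, a tally can be built from saved dmesg output. This is a sketch under assumptions: the pattern is derived only from the error lines quoted in this thread, and `TIMEOUT_RE` and `count_timeouts` are illustrative names.

```python
import re
from collections import Counter

# Matches lines such as:
#   [drm:...] *ERROR* [CRTC:41:head-1] hw_done timed out
# The format is taken from the lines in this thread and may vary between kernels.
TIMEOUT_RE = re.compile(
    r"\*ERROR\* \[CRTC:(?P<crtc>\d+):(?P<name>[^\]]+)\] (?P<event>\w+) timed out"
)


def count_timeouts(log_text):
    """Count timeout errors per (head name, event) pair in a block of log text."""
    counts = Counter()
    for match in TIMEOUT_RE.finditer(log_text):
        counts[(match.group("name"), match.group("event"))] += 1
    return counts
```

The input would typically be the saved output of `dmesg` read from a file; the returned `Counter` can then be printed or compared between runs.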