João Paulo Rechi Vita
2017-Feb-17 15:54 UTC
[Nouveau] nouveau preventing shutdown after suspend-resume
Hello,

I'm working on an Asus X756UQK laptop with nvidia + intel graphics cards. After a suspend-resume cycle, the machine hangs on shutdown, requiring a forced power-off. After resuming, I sometimes see the following messages in the kernel log:

[ 186.117539] nouveau 0000:01:00.0: DRM: evicting buffers...
[ 186.118105] nouveau 0000:01:00.0: DRM: waiting for kernel channels to go idle...
[ 201.139049] nouveau 0000:01:00.0: DRM: failed to idle channel 0 [DRM]
[ 201.139688] ------------[ cut here ]------------
[ 201.140297] WARNING: CPU: 0 PID: 1230 at /usr/src/packages/BUILD/linux-4.8.0/drivers/pci/pci.c:1616 pci_disable_device+0x99/0xb0
[ 201.140970] nouveau 0000:01:00.0: disabling already-disabled device
[ 201.140984] Modules linked in:
[ 201.141608] ccm arc4 rfcomm joydev cmac bnep intel_rapl x86_pkg_temp_thermal coretemp i2c_designware_platform i2c_designware_core kvm_intel asus_nb_wmi asus_wmi sparse_keymap snd_hda_codec_hdmi snd_hda_codec_conexant snd_soc_skl snd_hda_codec_generic snd_soc_skl_ipc snd_soc_sst_ipc kvm ath10k_pci snd_soc_sst_dsp snd_hda_ext_core snd_soc_sst_match ath10k_core snd_soc_core irqbypass crct10dif_pclmul snd_compress crc32_pclmul ac97_bus ghash_clmulni_intel snd_pcm_dmaengine ath mac80211 snd_hda_intel aesni_intel snd_hda_codec aes_x86_64 snd_hda_core lrw glue_helper uvcvideo snd_hwdep ablk_helper videobuf2_vmalloc cryptd videobuf2_memops snd_pcm videobuf2_v4l2 cfg80211 videobuf2_core videodev snd_timer media input_leds snd r8169 soundcore mii btusb btrtl shpchp processor_thermal_device mei_me idma64 mei
[ 201.143087] intel_pch_thermal
[ 201.143087] virt_dma
[ 201.143087] intel_lpss_pci
[ 201.143088] intel_soc_dts_iosf
[ 201.143088] hci_uart
[ 201.143089] elan_i2c
[ 201.143089] btbcm
[ 201.143089] btqca
[ 201.143090] btintel
[ 201.143090] bluetooth
[ 201.143090] int3403_thermal
[ 201.143091] int340x_thermal_zone
[ 201.143091] acpi_als
[ 201.143091] kfifo_buf
[ 201.143092] int3400_thermal
[ 201.143092] acpi_thermal_rel
[ 201.143093] industrialio
[ 201.143093] intel_lpss_acpi
[ 201.143093] acpi_pad
[ 201.143094] tpm_crb
[ 201.143094] intel_lpss
[ 201.143094] fjes
[ 201.143095] mac_hid
[ 201.143095] asus_wireless
[ 201.143095] nouveau
[ 201.143096] i915
[ 201.143096] mxm_wmi
[ 201.143096] i2c_algo_bit
[ 201.143097] drm_kms_helper
[ 201.143097] syscopyarea
[ 201.143098] ttm
[ 201.143098] sysfillrect
[ 201.143098] serio_raw
[ 201.143099] sysimgblt
[ 201.143099] fb_sys_fops
[ 201.143100] drm
[ 201.143100] ahci
[ 201.143100] libahci
[ 201.143101] i2c_hid
[ 201.143101] hid
[ 201.143101] video
[ 201.143102] wmi
[ 201.143104] CPU: 0 PID: 1230 Comm: kworker/0:6 Not tainted 4.8.0-32-generic #34+dev155.82734c4beos3.1.2-Endless
[ 201.143104] Hardware name: ASUSTeK COMPUTER INC. X756UQK/X756UQK, BIOS X756UQK.201 07/01/2016
[ 201.143107] Workqueue: pm pm_runtime_work
[ 201.143110] 0000000000000286 000000006307316f ffff953a9d933c08 ffffffff9e031233
[ 201.143111] ffff953a9d933c58 0000000000000000 ffff953a9d933c48 ffffffff9dc832f1
[ 201.143112] 0000065000000000 ffff953a9ff44000 ffff953a9feeeca0 ffff953a997b1800
[ 201.143113] Call Trace:
[ 201.143116] [<ffffffff9e031233>] dump_stack+0x63/0x90
[ 201.143118] [<ffffffff9dc832f1>] __warn+0xd1/0xf0
[ 201.143120] [<ffffffff9dc8336f>] warn_slowpath_fmt+0x5f/0x80
[ 201.143122] [<ffffffff9e0924b4>] ? pci_save_vc_state+0x34/0xe0
[ 201.143124] [<ffffffff9e087b99>] pci_disable_device+0x99/0xb0
[ 201.143152] [<ffffffffc06d63d9>] nouveau_pmops_runtime_suspend+0x69/0xe0 [nouveau]
[ 201.143153] [<ffffffff9e08a03b>] pci_pm_runtime_suspend+0x5b/0x180
[ 201.143154] [<ffffffff9e1abf63>] __rpm_callback+0x33/0x70
[ 201.143155] [<ffffffff9e1abfc4>] rpm_callback+0x24/0x80
[ 201.143156] [<ffffffff9e089fe0>] ? pci_pm_runtime_resume+0xa0/0xa0
[ 201.143157] [<ffffffff9e1ac2dd>] rpm_suspend+0x12d/0x650
[ 201.143158] [<ffffffff9e1adc48>] pm_runtime_work+0x78/0xa0
[ 201.143160] [<ffffffff9dc9db16>] process_one_work+0x156/0x420
[ 201.143161] [<ffffffff9dc9e62e>] worker_thread+0x4e/0x4a0
[ 201.143162] [<ffffffff9dc9e5e0>] ? rescuer_thread+0x380/0x380
[ 201.143163] [<ffffffff9dc9e5e0>] ? rescuer_thread+0x380/0x380
[ 201.143165] [<ffffffff9dca3b38>] kthread+0xd8/0xf0
[ 201.143167] [<ffffffff9e49f3df>] ret_from_fork+0x1f/0x40
[ 201.143168] [<ffffffff9dca3a60>] ? kthread_park+0x60/0x60
[ 201.143169] ---[ end trace db73394a87e603e4 ]---

With runtime PM disabled (nouveau.runpm=0), the machine is able to shut down, but with a delay of ~30s and the following messages in the log:

nouveau 0000:01:00.0: Xorg[691]: failed to idle channel 2 [Xorg[691]]
nouveau 0000:01:00.0: Xorg[691]: failed to idle channel 2 [Xorg[691]]

lspci shows the card as:

01:00.0 3D controller: NVIDIA Corporation Device 179c (rev a2)

And according to the nouveau logs, this card supports the Optimus technology:

[ 0.863470] pci 0000:01:00.0: optimus capabilities: enabled, status dynamic power, hda bios codec supported
[ 0.863472] VGA switcheroo: detected Optimus DSM method \_SB_.PCI0.RP01.PEGP handle
[ 0.863473] nouveau: detected PR support, will not use DSM
[ 0.863494] nouveau 0000:01:00.0: enabling device (0006 -> 0007)
[ 0.863691] nouveau 0000:01:00.0: NVIDIA GM107 (1171c0a2)

Is this a known problem? I couldn't find any similar reports. Right now we are shipping a DMI-based quirk to disable runtime PM as a work-around, but I would like to help find a real solution. I'm happy to file a bugzilla entry and provide any other needed information or help with testing. Are nouveau bugs tracked on bugs.kernel.org or the fdo bugzilla?

Thanks and regards,

--
João Paulo Rechi Vita
http://about.me/jprvita
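The timestamps in the kernel log above show how long the driver waited before giving up on idling the channel. The following is a minimal Python sketch of that arithmetic, assuming dmesg-style lines with a bracketed timestamp. The regular expression, the function name idle_wait_seconds, and the embedded SAMPLE excerpt are illustrative choices, not part of nouveau or of any existing tool.

```python
import re

# Matches lines such as:
#   [ 186.118105] nouveau 0000:01:00.0: DRM: waiting for kernel channels ...
# Assumption: the timestamp, PCI address, and client name follow this layout.
LINE_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+nouveau (?P<dev>\S+): (?P<src>[^:]+): (?P<msg>.*)"
)

# Three lines copied from the log in the report above.
SAMPLE = """\
[ 186.117539] nouveau 0000:01:00.0: DRM: evicting buffers...
[ 186.118105] nouveau 0000:01:00.0: DRM: waiting for kernel channels to go idle...
[ 201.139049] nouveau 0000:01:00.0: DRM: failed to idle channel 0 [DRM]
"""


def idle_wait_seconds(log_text):
    """Return the seconds between the 'waiting for kernel channels to go
    idle' message and the next 'failed to idle channel' message, or None
    if that pair does not appear in log_text."""
    start = None
    for line in log_text.splitlines():
        match = LINE_RE.search(line)
        if match is None:
            continue  # not a nouveau client message in the assumed layout
        timestamp = float(match.group("ts"))
        message = match.group("msg")
        if message.startswith("waiting for kernel channels to go idle"):
            start = timestamp
        elif message.startswith("failed to idle channel") and start is not None:
            return timestamp - start
    return None


if __name__ == "__main__":
    print(idle_wait_seconds(SAMPLE))
```

On the excerpt above this gives a wait of roughly 15 seconds before the "failed to idle channel 0" message. Running the same function over a full dmesg capture from other suspend-resume cycles would show whether that delay is consistent.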
Ilia Mirkin
2017-Feb-17 16:14 UTC
[Nouveau] nouveau preventing shutdown after suspend-resume
On Fri, Feb 17, 2017 at 10:54 AM, João Paulo Rechi Vita <jprvita at gmail.com> wrote:
> I'm happy to file
> a bugzilla entry and provide any other needed information or help with
> testing. Are nouveau bugs tracked on bugs.kernel.org or the fdo
> bugzilla?

Nouveau bugs are tracked on the fdo bugzilla. It would appear that you're using a 4.8 kernel. Do issues persist with a 4.10 kernel? I'm thinking of upstream commit cae9ff036ee, but it's likely that there have also been others I'm not thinking of.

Cheers,

-ilia
João Paulo Rechi Vita
2017-Feb-17 16:22 UTC
[Nouveau] nouveau preventing shutdown after suspend-resume
Hello Ilia,

On 17 February 2017 at 11:14, Ilia Mirkin <imirkin at alum.mit.edu> wrote:
> On Fri, Feb 17, 2017 at 10:54 AM, João Paulo Rechi Vita
> <jprvita at gmail.com> wrote:
>> I'm happy to file
>> a bugzilla entry and provide any other needed information or help with
>> testing. Are nouveau bugs tracked on bugs.kernel.org or the fdo
>> bugzilla?
>
> Nouveau bugs are tracked on the fdo bugzilla. It would appear that
> you're using a 4.8 kernel. Do issues persist with a 4.10 kernel? I'm
> thinking of upstream commit cae9ff036ee, but it's likely that there
> have also been others I'm not thinking of.
>

Yes, although the logs I have pasted were indeed collected using a 4.8 kernel, the problem persists with a recent Linus' tree (v4.10-rc8 + a couple of commits) which contains cae9ff036ee.

Thanks,

--
João Paulo Rechi Vita
http://about.me/jprvita