bugzilla-daemon at freedesktop.org
2014-May-10 15:22 UTC
[Nouveau] [Bug 78530] New: Memory corruption on Lenovo t440p with runpm
https://bugs.freedesktop.org/show_bug.cgi?id=78530

Priority: medium
Bug ID: 78530
Assignee: nouveau at lists.freedesktop.org
Summary: Memory corruption on Lenovo t440p with runpm
QA Contact: xorg-team at lists.x.org
Severity: normal
Classification: Unclassified
OS: All
Reporter: nikoamia at gmail.com
Hardware: Other
Status: NEW
Version: unspecified
Component: Driver/nouveau
Product: xorg

On recent kernels with runpm, the system crashes (with extensive memory corruption) when the nvidia card is disabled and then re-enabled on Lenovo T440p laptops with recent BIOSes (1.16+).

My investigations into this:

1. The crash occurs even with just acpi_call, so it looks like those BIOSes require some new kind of procedure for enabling the nvidia card.
2. The ACPI calls made by Windows and Linux do not differ much (and trying Windows' calling sequence does not help). Also, the DSDTs from the 1.14 and 1.16 BIOSes are basically the same.
3. The bug can be worked around by disabling all memory above 4GB.
4. This bug does not really affect memory (there is no corruption of regular memory), but rather devices that use memory regions. For example, I can boot the system from a ramdisk with all such devices disabled, (1) perform the ACPI nvidia disable-enable and (2) try to load a module for such a device (the order of (1) and (2) does not matter) -- errors pile up, even if I unload and reload the module.

My own hypothesis is that something on the PCI bus gets broken -- maybe some reinitialization needs to be performed?

Links:
1. https://github.com/Bumblebee-Project/bbswitch/issues/78 (main discussion place)
2. https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1268669 (ubuntu bug report)
3. https://bbs.archlinux.org/viewtopic.php?pid=1414109 (one of the forum threads about this)

--
You are receiving this mail because:
You are the assignee for the bug.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.freedesktop.org/archives/nouveau/attachments/20140510/1db08138/attachment.html>
bugzilla-daemon at freedesktop.org
2014-May-10 18:23 UTC
[Nouveau] [Bug 78530] Memory corruption on Lenovo t440p with runpm
https://bugs.freedesktop.org/show_bug.cgi?id=78530

--- Comment #1 from Nikolay Amiantov <nikoamia at gmail.com> ---
The Dell XPS 15z with recent BIOSes is also affected (reported by mattdistro in the github thread).
bugzilla-daemon at freedesktop.org
2014-May-10 18:24 UTC
[Nouveau] [Bug 78530] Memory corruption on Lenovo t440p with runpm
https://bugs.freedesktop.org/show_bug.cgi?id=78530

--- Comment #2 from Ilia Mirkin <imirkin at alum.mit.edu> ---
Just to confirm -- this happens without bumblebee as well, right? Does it happen with the blob driver?
bugzilla-daemon at freedesktop.org
2014-May-10 18:25 UTC
[Nouveau] [Bug 78530] Memory corruption on Lenovo t440p with runpm
https://bugs.freedesktop.org/show_bug.cgi?id=78530

--- Comment #3 from Nikolay Amiantov <nikoamia at gmail.com> ---
This happens even without any drivers at all: just using acpi_call to make the _PS3 and _PS0 calls is enough.
bugzilla-daemon at freedesktop.org
2014-May-10 18:27 UTC
[Nouveau] [Bug 78530] Memory corruption on Lenovo t440p with runpm
https://bugs.freedesktop.org/show_bug.cgi?id=78530

--- Comment #4 from Ilia Mirkin <imirkin at alum.mit.edu> ---
Manually making acpi calls isn't the most prudent thing to do. Please confirm that this happens

(a) With just nouveau loaded. No bumblebee anywhere at all.
(b) With the blob driver.
bugzilla-daemon at freedesktop.org
2014-May-10 18:29 UTC
[Nouveau] [Bug 78530] Memory corruption on Lenovo t440p with runpm
https://bugs.freedesktop.org/show_bug.cgi?id=78530

--- Comment #5 from Nikolay Amiantov <nikoamia at gmail.com> ---
Okay -- I should try blob with bumblebee, right?
bugzilla-daemon at freedesktop.org
2014-May-10 18:30 UTC
[Nouveau] [Bug 78530] Memory corruption on Lenovo t440p with runpm
https://bugs.freedesktop.org/show_bug.cgi?id=78530

--- Comment #6 from Ilia Mirkin <imirkin at alum.mit.edu> ---
(In reply to comment #5)
> Okay -- I should try blob with bumblebee, right?

I'm not familiar with the blob situation wrt runtime pm. If they have any runtime pm-style support, please use that instead of bumblebee. If they have no support for that, then I guess it's fine to try it with bumblebee.
bugzilla-daemon at freedesktop.org
2014-May-10 18:32 UTC
[Nouveau] [Bug 78530] Memory corruption on Lenovo t440p with runpm
https://bugs.freedesktop.org/show_bug.cgi?id=78530

--- Comment #7 from Nikolay Amiantov <nikoamia at gmail.com> ---
(In reply to comment #6)
> I'm not familiar with the blob situation wrt runtime pm. If they have any
> runtime pm-style support, please use that instead of bumblebee. If they have
> no support for that, then I guess it's fine to try it with bumblebee.

I'm not too familiar with it either, but I thought they hadn't added a feature like this yet -- I just asked to confirm. I'll try bumblebee then.
bugzilla-daemon at freedesktop.org
2014-May-10 19:08 UTC
[Nouveau] [Bug 78530] Memory corruption on Lenovo t440p with runpm
https://bugs.freedesktop.org/show_bug.cgi?id=78530

--- Comment #8 from Nikolay Amiantov <nikoamia at gmail.com> ---
I've tested two configurations on kernel 3.14.3, bbswitch 0.8 and nvidia 337.12:

(1) disabled acpi_call and my custom script, disabled bbswitch and bumblebeed (all bumblebee components), modprobe'd nouveau, tried to start X
(2) started bumblebeed, loaded bbswitch, started X without nouveau, ran "primusrun glxgears"

In both cases, I got fs corruption issues, iwlwifi errors and other distinctive errors pointing at memory corruption.
bugzilla-daemon at freedesktop.org
2014-Jul-10 14:00 UTC
[Nouveau] [Bug 78530] Memory corruption on Lenovo t440p with runpm
https://bugs.freedesktop.org/show_bug.cgi?id=78530

--- Comment #9 from Alexander Monakov <amonakov at gmail.com> ---
This problem appears to be fixed in recent kernels after "Windows 2013" was added to the kernel's built-in ACPI OSI list. The bisection to the resolving kernel commit is shown here:
https://github.com/Bumblebee-Project/bbswitch/issues/78#issuecomment-48600484

Nikolay, would you mind closing the bug after verifying it's resolved for you?
bugzilla-daemon at freedesktop.org
2014-Jul-10 18:09 UTC
[Nouveau] [Bug 78530] Memory corruption on Lenovo t440p with runpm
https://bugs.freedesktop.org/show_bug.cgi?id=78530

Nikolay Amiantov <nikoamia at gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED

--- Comment #10 from Nikolay Amiantov <nikoamia at gmail.com> ---
The OSI fix indeed solves the issue. Closing the bug.
bugzilla-daemon at freedesktop.org
2014-Jul-18 21:56 UTC
[Nouveau] [Bug 78530] Memory corruption on Lenovo t440p with runpm
https://bugs.freedesktop.org/show_bug.cgi?id=78530

Nikolay Amiantov <nikoamia at gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |REOPENED
         Resolution|FIXED                       |---

--- Comment #11 from Nikolay Amiantov <nikoamia at gmail.com> ---
Unfortunately, it wasn't a fix -- we've got another ACPI problem which prevented the nvidia card from being disabled at all, so everything "started to work". You can find more about the new problem at https://github.com/Bumblebee-Project/bbswitch/issues/78#issuecomment-48768044. I don't think we need another bug for this, do we?
bugzilla-daemon at freedesktop.org
2014-Sep-02 11:34 UTC
[Nouveau] [Bug 78530] Memory corruption on Lenovo t440p with runpm
https://bugs.freedesktop.org/show_bug.cgi?id=78530

--- Comment #12 from Dmitry Nezhevenko <dion at dion.org.ua> ---
Hi,

I also have an affected T440p machine that corrupts everything once runtime PM is enabled or after calling the ACPI method to resume the card.

It was stated in the bumblebee github thread that adding "memmap=99G$0x100000000" to the kernel command line fixes the issue on affected systems. My case is a bit interesting because I have only 4GB of RAM right now, so disabling everything above 4GB should not change behavior. But it does! Adding the memmap= magic fixes the issue for me.

I've compared /proc/iomem with and without the boot option and found one difference. When booted with memmap=99G$0x100000000 I get one large reserved region:

bceff000-18ffffffff : reserved
  bda00000-bf9fffff : Graphics Stolen Memory
  bfa00000-febfffff : PCI Bus 0000:00
    c0000000-d1ffffff : PCI Bus 0000:02
      c0000000-cfffffff : 0000:02:00.0
      d0000000-d1ffffff : 0000:02:00.0
    e0000000-efffffff : 0000:00:02.0
    ...

All PCI devices are inside this one large region. But if I boot with default options, iomem is different:

bceff000-bf9fffff : reserved
  bda00000-bf9fffff : Graphics Stolen Memory
bfa00000-febfffff : PCI Bus 0000:00
  c0000000-d1ffffff : PCI Bus 0000:02
    c0000000-cfffffff : 0000:02:00.0
    d0000000-d1ffffff : 0000:02:00.0
  e0000000-efffffff : 0000:00:02.0
  f0000000-f0ffffff : PCI Bus 0000:02
    f0000000-f0ffffff : 0000:02:00.0
  f1000000-f13fffff : 0000:00:02.0
  f1400000-f14fffff : PCI Bus 0000:04

So with the memmap option, the reserved region starting at bceff000 covers all PCI devices, while with default options it ends at bf9fffff and covers none of them.

[ I'm attaching both iomem files ]

To check this, I tried to explicitly reserve the whole region by booting with the memmap=1100M$0xbfa00000 parameter, and got an iomem pretty similar to the "memmap=99G" one. But the system still crashes after runtime PM.

I was also able to capture the PCI configuration space of the NVIDIA card from Win8 (where everything works). So I can confirm that after acpi_call, Windows also shows just 0xFF bytes. But once resumed, it's a bit different from Linux. Both files attached.

Any ideas? Maybe the card is somehow misconfigured?

Thanks
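The comparison of the two /proc/iomem listings in comment #12 comes down to a containment check: which resource ranges fall inside the reserved region that starts at bceff000. A minimal Python sketch of that check follows, using a few of the ranges quoted in the comment as sample data. The helper names (`parse_iomem`, `covered_by`) are hypothetical and not part of any existing tool, and the sample strings are abbreviated copies of the quoted listings rather than complete dumps.

```python
def parse_iomem(text):
    """Parse /proc/iomem-style lines into (start, end, name) tuples.

    Indentation is ignored; nesting is decided from the addresses alone.
    """
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if " : " not in line:
            continue  # skip blank lines and "..." placeholders
        addr, name = line.split(" : ", 1)
        start, end = (int(part, 16) for part in addr.split("-"))
        entries.append((start, end, name.strip()))
    return entries


def covered_by(entries, region_start):
    """Return the entries lying fully inside the reserved region at region_start."""
    region = next(
        (e for e in entries if e[0] == region_start and e[2] == "reserved"),
        None,
    )
    if region is None:
        return []
    return [
        e for e in entries
        if e is not region and region[0] <= e[0] and e[1] <= region[1]
    ]


# Abbreviated sample data taken from the listings quoted in comment #12.
WITH_MEMMAP = """
bceff000-18ffffffff : reserved
bda00000-bf9fffff : Graphics Stolen Memory
bfa00000-febfffff : PCI Bus 0000:00
c0000000-d1ffffff : PCI Bus 0000:02
"""

DEFAULT = """
bceff000-bf9fffff : reserved
bda00000-bf9fffff : Graphics Stolen Memory
bfa00000-febfffff : PCI Bus 0000:00
c0000000-d1ffffff : PCI Bus 0000:02
"""

for label, text in (("memmap", WITH_MEMMAP), ("default", DEFAULT)):
    names = [e[2] for e in covered_by(parse_iomem(text), 0xBCEFF000)]
    print(label, names)
```

With the memmap sample, the PCI bus ranges are reported as lying inside the reserved region; with the default sample, only the Graphics Stolen Memory range is.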
bugzilla-daemon at freedesktop.org
2014-Sep-02 11:36 UTC
[Nouveau] [Bug 78530] Memory corruption on Lenovo t440p with runpm
https://bugs.freedesktop.org/show_bug.cgi?id=78530

--- Comment #13 from Dmitry Nezhevenko <dion at dion.org.ua> ---
Created attachment 105598
  --> https://bugs.freedesktop.org/attachment.cgi?id=105598&action=edit
iomem when booted with memmap=99G$0x40000000
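The kernel's memmap=nn[KMG]$ss[KMG] option marks nn bytes starting at address ss as reserved. The sketch below decodes the parameter values that appear in this thread into start and end addresses. It is a simplified illustration in Python, not the kernel's own parser: `parse_size` and `parse_memmap_reserve` are hypothetical helper names, and only the K, M and G suffixes are handled.

```python
SUFFIXES = {"K": 1 << 10, "M": 1 << 20, "G": 1 << 30}


def parse_size(token):
    """Convert '99G', '1100M' or '0x40000000' into a number of bytes."""
    token = token.strip()
    factor = 1
    if token and token[-1].upper() in SUFFIXES:
        factor = SUFFIXES[token[-1].upper()]
        token = token[:-1]
    # base 0 lets int() accept both decimal and 0x-prefixed hex
    return int(token, 0) * factor


def parse_memmap_reserve(arg):
    """Return (start, end) of the region reserved by 'size$start'."""
    size, start = arg.split("$", 1)
    start_addr = parse_size(start)
    return start_addr, start_addr + parse_size(size) - 1


for arg in ("99G$0x40000000", "99G$0x100000000", "1100M$0xbfa00000"):
    start, end = parse_memmap_reserve(arg)
    print(f"memmap={arg}: {start:#x}-{end:#x}")
```

Under these assumptions, 99G starting at 0x40000000 ends at 0x18ffffffff, which is the end address of the large reserved region in the iomem listing quoted in comment #12, while 0x100000000 is the 4GB boundary.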
bugzilla-daemon at freedesktop.org
2014-Sep-02 11:36 UTC
[Nouveau] [Bug 78530] Memory corruption on Lenovo t440p with runpm
https://bugs.freedesktop.org/show_bug.cgi?id=78530

--- Comment #14 from Dmitry Nezhevenko <dion at dion.org.ua> ---
Created attachment 105599
  --> https://bugs.freedesktop.org/attachment.cgi?id=105599&action=edit
dmesg when booted with memmap=99G$0x40000000
bugzilla-daemon at freedesktop.org
2014-Sep-02 11:37 UTC
[Nouveau] [Bug 78530] Memory corruption on Lenovo t440p with runpm
https://bugs.freedesktop.org/show_bug.cgi?id=78530

--- Comment #15 from Dmitry Nezhevenko <dion at dion.org.ua> ---
Created attachment 105600
  --> https://bugs.freedesktop.org/attachment.cgi?id=105600&action=edit
iomem default
bugzilla-daemon at freedesktop.org
2014-Sep-02 11:37 UTC
[Nouveau] [Bug 78530] Memory corruption on Lenovo t440p with runpm
https://bugs.freedesktop.org/show_bug.cgi?id=78530

--- Comment #16 from Dmitry Nezhevenko <dion at dion.org.ua> ---
Created attachment 105601
  --> https://bugs.freedesktop.org/attachment.cgi?id=105601&action=edit
dmesg default
bugzilla-daemon at freedesktop.org
2014-Sep-02 11:38 UTC
[Nouveau] [Bug 78530] Memory corruption on Lenovo t440p with runpm
https://bugs.freedesktop.org/show_bug.cgi?id=78530

--- Comment #17 from Dmitry Nezhevenko <dion at dion.org.ua> ---
Created attachment 105602
  --> https://bugs.freedesktop.org/attachment.cgi?id=105602&action=edit
pci config space linux
bugzilla-daemon at freedesktop.org
2014-Sep-02 11:38 UTC
[Nouveau] [Bug 78530] Memory corruption on Lenovo t440p with runpm
https://bugs.freedesktop.org/show_bug.cgi?id=78530

--- Comment #18 from Dmitry Nezhevenko <dion at dion.org.ua> ---
Created attachment 105603
  --> https://bugs.freedesktop.org/attachment.cgi?id=105603&action=edit
pci config space windows
bugzilla-daemon at freedesktop.org
2014-Sep-02 11:39 UTC
[Nouveau] [Bug 78530] Memory corruption on Lenovo t440p with runpm
https://bugs.freedesktop.org/show_bug.cgi?id=78530

--- Comment #19 from Dmitry Nezhevenko <dion at dion.org.ua> ---
Created attachment 105604
  --> https://bugs.freedesktop.org/attachment.cgi?id=105604&action=edit
pci space diff (linux vs win)
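The PCI configuration space dumps attached in comments #17 to #19 were used for two checks described in comment #12: whether the card reads back as all 0xFF bytes after being switched off, and where the Linux and Windows dumps differ after resume. A minimal Python sketch of both checks is given below. The function names are hypothetical, and the default device address 0000:02:00.0 is an assumption based on the iomem listing in comment #12; the thread does not state the NVIDIA card's address explicitly.

```python
from pathlib import Path


def looks_powered_off(config: bytes) -> bool:
    """True if the dump is non-empty and every byte is 0xFF."""
    return len(config) > 0 and all(b == 0xFF for b in config)


def diff_offsets(a: bytes, b: bytes):
    """Offsets at which two dumps differ, over their common length."""
    return [i for i in range(min(len(a), len(b))) if a[i] != b[i]]


def read_config(bdf: str = "0000:02:00.0") -> bytes:
    """Read config space through sysfs on Linux.

    The default address is an assumption (see the note above); reading
    beyond the first 64 bytes usually requires root.
    """
    return Path(f"/sys/bus/pci/devices/{bdf}/config").read_bytes()


if __name__ == "__main__":
    # Synthetic data only: a real comparison would load the two attached dumps.
    off = bytes([0xFF] * 64)
    on = bytes(range(64))
    print(looks_powered_off(off), looks_powered_off(on))
    print(diff_offsets(bytes([0, 1, 2, 3]), bytes([0, 9, 2, 7])))
```

On a live system, `looks_powered_off(read_config())` could be called before and after the ACPI calls to see the same effect that the dumps show.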
bugzilla-daemon at freedesktop.org
2014-Nov-04 08:39 UTC
[Nouveau] [Bug 78530] Memory corruption on Lenovo t440p with runpm
https://bugs.freedesktop.org/show_bug.cgi?id=78530

--- Comment #20 from Dmitry Nezhevenko <dion at dion.org.ua> ---
Any ideas on this? Has anybody tried the new BIOS 1.27-1.28?

WARN: Once updated, there will be no way to revert back to pre-1.26.
bugzilla-daemon at freedesktop.org
2015-Feb-07 13:15 UTC
[Nouveau] [Bug 78530] Memory corruption on Lenovo t440p with runpm
https://bugs.freedesktop.org/show_bug.cgi?id=78530

--- Comment #21 from Nikolay Amiantov <nikoamia at gmail.com> ---
@doudou on Github managed to solve this problem[1][2] -- Nouveau can port the same fix, I think.

[1]: https://github.com/Bumblebee-Project/bbswitch/issues/78#issuecomment-67741841
[2]: https://github.com/Bumblebee-Project/bbswitch/pull/102
bugzilla-daemon at freedesktop.org
2016-Aug-24 14:05 UTC
[Nouveau] [Bug 78530] Memory corruption on Lenovo t440p with runpm
https://bugs.freedesktop.org/show_bug.cgi?id=78530

Peter Wu <peter at lekensteyn.nl> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
                 CC|                            |peter at lekensteyn.nl
             Status|REOPENED                    |RESOLVED

--- Comment #22 from Peter Wu <peter at lekensteyn.nl> ---
Fixed in v4.8-rc1

commit 692a17dcc2922a91c6bcf11b3321503a3377b1b1
Author: Peter Wu <peter at lekensteyn.nl>
Date:   Fri Jul 15 15:12:18 2016 +0200

    drm/nouveau/acpi: fix lockup with PCIe runtime PM

It was confirmed to fix the memory corruption; if it still happens, please re-open.