bugzilla-daemon at freedesktop.org
2013-Jun-09 00:20 UTC
[Nouveau] [Bug 65554] New: CPU lock with nouveau_fan_update
https://bugs.freedesktop.org/show_bug.cgi?id=65554
Priority: medium
Bug ID: 65554
Assignee: nouveau at lists.freedesktop.org
Summary: CPU lock with nouveau_fan_update
QA Contact: xorg-team at lists.x.org
Severity: major
Classification: Unclassified
OS: Linux (All)
Reporter: ddamienn at gmail.com
Hardware: Other
Status: NEW
Version: unspecified
Component: Driver/nouveau
Product: xorg
Created attachment 80540
--> https://bugs.freedesktop.org/attachment.cgi?id=80540&action=edit
kernel log
Kernel logs report "Watchdog detected hard LOCKUP on cpu 1". It has occurred
once per day for the last three days. Each time, it happens seemingly at
random: the monitor is turned off and I'm connected remotely via SSH, but the
system is idle.
This is on ArchLinux with xf86-video-nouveau 1.0.7. The machine is ~2.5 years
old and has been very solid until now. My video card is: 01:00.0 VGA compatible
controller: NVIDIA Corporation GT216 [GeForce GT 220] (rev a2).
I have attached the kernel log. I also don't know why the 'PTHERM' events
would be occurring, since the connected monitor was off and I was only
connected remotely via SSH.
--
You are receiving this mail because:
You are the assignee for the bug.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.freedesktop.org/archives/nouveau/attachments/20130609/d7c4ff7b/attachment.html>
bugzilla-daemon at freedesktop.org
2013-Jun-09 00:40 UTC
[Nouveau] [Bug 65554] CPU lock with nouveau_fan_update
https://bugs.freedesktop.org/show_bug.cgi?id=65554
--- Comment #1 from Ilia Mirkin <imirkin at alum.mit.edu> ---
nouveau_fan_update:
        spin_lock_irqsave(&fan->lock, flags);
        /* schedule next fan update, if not at target speed already */
        if (list_empty(&fan->alarm.head) && target != duty) {
                u16 bump_period = fan->bios.bump_period;
                u16 slow_down_period = fan->bios.slow_down_period;
                ...
                ptimer->alarm(ptimer, delay * 1000 * 1000, &fan->alarm);
If delay is somehow 0, the ->alarm will cause nouveau_fan_update to get called
immediately. Can you add a printk to that function that shows the values? (This
may end up totally flooding your dmesg too... but I think the values may be the
same across prints.)
e.g.
diff --git a/drivers/gpu/drm/nouveau/core/subdev/therm/fan.c b/drivers/gpu/drm/nouveau/core/subdev/therm/fan.c
index c728380..9453afd 100644
--- a/drivers/gpu/drm/nouveau/core/subdev/therm/fan.c
+++ b/drivers/gpu/drm/nouveau/core/subdev/therm/fan.c
@@ -88,7 +88,7 @@ nouveau_fan_update(struct nouveau_fan *fan, bool immediate, int target)
 			delay = min(bump_period, slow_down_period) ;
 		else
 			delay = bump_period;
-
+		nv_info(therm, "Scheduling fan update in %d (slow: %d, bump: %d)\n", delay, slow_down_period, bump_period);
 		ptimer->alarm(ptimer, delay * 1000 * 1000, &fan->alarm);
 	}
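To illustrate why a zero delay would matter, here is a minimal user-space
sketch, not driver code. The helper names fan_delay_to_ns and count_updates
are hypothetical, and the sketch assumes the periods are in milliseconds and
the alarm takes nanoseconds, which the "delay * 1000 * 1000" scaling suggests
but the thread does not confirm.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of nouveau): convert a fan-update delay,
 * assumed to be in milliseconds, to the nanosecond value handed to the
 * alarm. Returns false for a zero delay instead of producing a value.
 */
static bool fan_delay_to_ns(uint16_t delay_ms, uint64_t *delay_ns)
{
    if (delay_ms == 0)
        return false; /* a zero delay would re-arm the alarm immediately */

    *delay_ns = (uint64_t)delay_ms * 1000 * 1000;
    return true;
}

/*
 * Toy model of the rescheduling: count how many update calls fit into
 * window_ns of simulated time when each call re-arms itself delay_ms later.
 * max_calls caps the loop so that the zero-delay case still terminates.
 */
static unsigned count_updates(uint16_t delay_ms, uint64_t window_ns,
                              unsigned max_calls)
{
    uint64_t now = 0;
    unsigned calls = 0;

    while (now <= window_ns && calls < max_calls) {
        calls++;                                  /* one update call */
        now += (uint64_t)delay_ms * 1000 * 1000;  /* next alarm time */
    }
    return calls;
}
```

With a 100 ms delay the model makes 11 calls in one simulated second. With a
delay of 0 the simulated clock never advances, so only the cap stops the
loop, which is the kind of back-to-back rescheduling the printk above is
meant to reveal.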
bugzilla-daemon at freedesktop.org
2013-Jun-09 01:01 UTC
[Nouveau] [Bug 65554] CPU lock with nouveau_fan_update
https://bugs.freedesktop.org/show_bug.cgi?id=65554
--- Comment #2 from ddamienn at gmail.com ---
(In reply to comment #1)
> If delay is somehow 0, the ->alarm will cause nouveau_fan_update to get
> called immediately. Can you add a printk to that function that shows the
> values? (This may end up totally flooding your dmesg too... but I think the
> values may be the same across prints.)
Thanks for your help. I will try to add the printk. The lock has made the
machine inaccessible remotely, so it'll be a couple of days until I can post
the results.
bugzilla-daemon at freedesktop.org
2013-Jun-14 20:24 UTC
[Nouveau] [Bug 65554] CPU lock with nouveau_fan_update
https://bugs.freedesktop.org/show_bug.cgi?id=65554
ddamienn at gmail.com changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|NEW |RESOLVED
Resolution|--- |INVALID
--- Comment #3 from ddamienn at gmail.com ---
I've had the debug statement active for the last few days, but the error
hasn't recurred. I discovered that my video card's fan was only working
intermittently, which cleaning it seems to have fixed. Since the inactive fan
likely caused the error I was seeing, I am closing this bug report as invalid.
Apparently Analogous Threads
- [PATCH 1/4] pm/fan: drop the fan lock in fan_update() before rescheduling
- nouveau_fan_update: possible circular locking dependency detected
- linux-4.13/drivers/gpu/drm/nouveau/nvkm/subdev/therm/fan.c:86: possible faulty logic ?
- [PATCH 0/5] Thermal management fixes
- [PATCH] drm/nouveau/therm: remove redundant duty == target check