bugzilla-daemon at freedesktop.org
2013-Jun-09 00:20 UTC
[Nouveau] [Bug 65554] New: CPU lock with nouveau_fan_update
https://bugs.freedesktop.org/show_bug.cgi?id=65554

          Priority: medium
            Bug ID: 65554
          Assignee: nouveau at lists.freedesktop.org
           Summary: CPU lock with nouveau_fan_update
        QA Contact: xorg-team at lists.x.org
          Severity: major
    Classification: Unclassified
                OS: Linux (All)
          Reporter: ddamienn at gmail.com
          Hardware: Other
            Status: NEW
           Version: unspecified
         Component: Driver/nouveau
           Product: xorg

Created attachment 80540
  --> https://bugs.freedesktop.org/attachment.cgi?id=80540&action=edit
kernel log

Kernel logs report "Watchdog detected hard LOCKUP on cpu 1". It has occurred
once per day for the last three days. Each time, it happens seemingly randomly -
monitor is turned off, I'm connected remotely via SSH but system is idle.

This is on ArchLinux with xf86-video-nouveau 1.0.7. The machine is ~2.5 years
old and has been very solid until now.

My video card is:
01:00.0 VGA compatible controller: NVIDIA Corporation GT216 [GeForce GT 220]
(rev a2).

I have attached the kernel log. I also note that I don't know why the 'PTHERM'
events would be occurring - this was while the connected monitor was off and I
was only connected remotely via SSH.

-- 
You are receiving this mail because:
You are the assignee for the bug.
bugzilla-daemon at freedesktop.org
2013-Jun-09 00:40 UTC
[Nouveau] [Bug 65554] CPU lock with nouveau_fan_update
https://bugs.freedesktop.org/show_bug.cgi?id=65554

--- Comment #1 from Ilia Mirkin <imirkin at alum.mit.edu> ---
nouveau_fan_update:

        spin_lock_irqsave(&fan->lock, flags);
        /* schedule next fan update, if not at target speed already */
        if (list_empty(&fan->alarm.head) && target != duty) {
                u16 bump_period = fan->bios.bump_period;
                u16 slow_down_period = fan->bios.slow_down_period;
...
                ptimer->alarm(ptimer, delay * 1000 * 1000, &fan->alarm);

If delay is somehow 0, the ->alarm will cause nouveau_fan_update to get called
immediately. Can you add a printk to that function that shows the values? (This
may end up totally flooding your dmesg too... but I think the values may be the
same across prints.)

e.g.

diff --git a/drivers/gpu/drm/nouveau/core/subdev/therm/fan.c b/drivers/gpu/drm/nouveau/core/subdev/therm/fan.c
index c728380..9453afd 100644
--- a/drivers/gpu/drm/nouveau/core/subdev/therm/fan.c
+++ b/drivers/gpu/drm/nouveau/core/subdev/therm/fan.c
@@ -88,7 +88,7 @@ nouveau_fan_update(struct nouveau_fan *fan, bool immediate, int target)
                        delay = min(bump_period, slow_down_period) ;
                else
                        delay = bump_period;
-
+               nv_info(therm, "Scheduling fan update in %d (slow: %d, bump: %d)\n", delay, slow_down_period, bump_period);
                ptimer->alarm(ptimer, delay * 1000 * 1000, &fan->alarm);
        }
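Editorial sketch of the hypothesis in comment #1, not driver code. It assumes two things the thread only suggests: that an alarm with a delay of 0 runs its callback synchronously, and that the alarm is armed while fan->lock is still held. Every name below (toy_fan, toy_alarm, toy_fan_update, toy_run_once) is hypothetical and exists only for this illustration. Where a real spinlock would spin forever, the toy version just counts the re-entry so that the program terminates.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-in for the fan state; not the real struct nouveau_fan. */
struct toy_fan {
    bool locked;          /* models fan->lock being held */
    int duty;             /* current duty, percent */
    int target;           /* requested duty, percent */
    uint16_t bump_period; /* delay between steps in ms, as read from the VBIOS */
    int reentered;        /* updates entered while the lock was already held */
    int queued;           /* alarms deferred to a later time */
};

static void toy_fan_update(struct toy_fan *fan);

/* Assumption: an alarm that is already due fires in the caller's context. */
static void toy_alarm(struct toy_fan *fan, uint64_t delay_ns)
{
    if (delay_ns == 0)
        toy_fan_update(fan); /* runs immediately, inside the caller */
    else
        fan->queued++;       /* would fire later, after the caller unlocked */
}

static void toy_fan_update(struct toy_fan *fan)
{
    if (fan->locked) {
        /* A real spin_lock_irqsave() would never return from here. */
        fan->reentered++;
        return;
    }
    fan->locked = true;

    /* Step the duty one percent toward the target. */
    if (fan->duty < fan->target)
        fan->duty++;
    else if (fan->duty > fan->target)
        fan->duty--;

    /* Schedule the next update if not at the target yet (ms to ns). */
    if (fan->duty != fan->target)
        toy_alarm(fan, (uint64_t)fan->bump_period * 1000 * 1000);

    fan->locked = false;
}

/* Runs one update and returns how often it re-entered itself under the lock. */
int toy_run_once(uint16_t bump_period_ms)
{
    struct toy_fan fan = { .duty = 30, .target = 40,
                           .bump_period = bump_period_ms };

    toy_fan_update(&fan);
    assert(!fan.locked);
    return fan.reentered;
}
```

With a bump period of 0, toy_run_once(0) returns 1: the update re-enters itself while the lock is held, which is the point where real hardware would hang. With a non-zero period such as toy_run_once(500), the alarm is only queued and the function returns 0. That is why the printk in the diff above prints delay, slow_down_period and bump_period.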
bugzilla-daemon at freedesktop.org
2013-Jun-09 01:01 UTC
[Nouveau] [Bug 65554] CPU lock with nouveau_fan_update
https://bugs.freedesktop.org/show_bug.cgi?id=65554

--- Comment #2 from ddamienn at gmail.com ---
(In reply to comment #1)
> If delay is somehow 0, the ->alarm will cause nouveau_fan_update to get
> called immediately. Can you add a printk to that function that shows the
> values? (This may end up totally flooding your dmesg too... but I think the
> values may be the same across prints.)

Thanks for your help. I will try to add the printk. The lock has made the
machine inaccessible remotely, so it'll be a couple of days until I can post
the results.
bugzilla-daemon at freedesktop.org
2013-Jun-14 20:24 UTC
[Nouveau] [Bug 65554] CPU lock with nouveau_fan_update
https://bugs.freedesktop.org/show_bug.cgi?id=65554

ddamienn at gmail.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |INVALID

--- Comment #3 from ddamienn at gmail.com ---
I've had the debug statement active for the last few days, but the error hasn't
re-occurred. I discovered that my video card's fan was only working
intermittently, which a clean seems to have fixed.

Since the inactive fan likely caused the error I was seeing, I am closing this
bug report as invalid.