Konrad Rzeszutek Wilk
2013-Jan-15 22:23 UTC
[PATCH] fixes to ACPI subsystem which assumes cpuidle is always enabled.
Attached are two patches to the ACPI subsystem and the cpuidle drivers.

The fixes are to deal with the case when cpuidle_disabled returns true
and we try to hotplug CPUs on/off.

 drivers/acpi/processor_idle.c | 3 +++
 drivers/idle/intel_idle.c     | 3 +--
 2 files changed, 4 insertions(+), 2 deletions(-)

--
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Konrad Rzeszutek Wilk
2013-Jan-15 22:23 UTC
[PATCH 1/2] ACPI: intel-idle: Don't register CPU notifier if we are not running.
The 'intel_idle_probe' probes the CPU and sets the CPU notifier.
But if later on during the module initialization we fail (say
in cpuidle_register_driver) we stop loading but we neglected
to unregister the CPU notifier. This means that during CPU
hotplug events the system will fail:

calling intel_idle_init+0x0/0x326 @ 1
intel_idle: MWAIT substates: 0x1120
intel_idle: v0.4 model 0x2A
intel_idle: lapic_timer_reliable_states 0xffffffff
intel_idle: intel_idle yielding to none
initcall intel_idle_init+0x0/0x326 returned -19 after 14 usecs

... some time later, offlining and onlining a CPU:

cpu 3 spinlock event irq 62
BUG: unable to ] __cpuidle_register_device+0x1c/0x120
PGD 99b8b067 PUD 99b95067 PMD 0
Oops: 0000 [#1] SMP
Modules linked in: xen_evtchn nouveau mxm_wmi wmi radeon ttm i915 fbcon tileblit font atl1c bitblit softcursor drm_kms_helper video xen_blkfront xen_netfront fb_sys_fops sysimgblt sysfillrect syscopyarea xenfs xen_privcmd mperf
CPU 0
Pid: 2302, comm: udevd Not tainted 3.8.0-rc3upstream-00249-g09ad159 #1 MSI MS-7680/H61M-P23 (MS-7680)
RIP: e030:[<ffffffff814d956c>] [<ffffffff814d956c>] __cpuidle_register_device+0x1c/0x120
RSP: e02b:ffff88009dacfcb8 EFLAGS: 00010286
RAX: 0000000000000000 RBX: ffff880105380000 RCX: 000000000000001c
RDX: 0000000000000000 RSI: 0000000000000055 RDI: ffff880105380000
RBP: ffff88009dacfce8 R08: ffffffff81a4f048 R09: 0000000000000008
R10: 0000000000000008 R11: 0000000000000000 R12: ffff880105380000
R13: 00000000ffffffdd R14: 0000000000000000 R15: ffffffff81a523d0
FS: 00007f37bd83b7a0(0000) GS:ffff880105200000(0000) knlGS:0000000000000000
CS: e033 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000008 CR3: 00000000a09ea000 CR4: 0000000000042660
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process udevd (pid: 2302, threadinfo ffff88009dace000, task ffff88009afb47f0)
Stack:
 ffffffff8107f2d0 ffffffff810c2fb7 ffff88009dacfce8 00000000ffffffea
 ffff880105380000 00000000ffffffdd ffff88009dacfd08 ffffffff814d9882
 0000000000000003 ffff880105380000 ffff88009dacfd28 ffffffff81340afd
Call Trace:
 [<ffffffff8107f2d0>] ? collect_cpu_info_local+0x30/0x30
 [<ffffffff810c2fb7>] ? __might_sleep+0xe7/0x100
 [<ffffffff814d9882>] cpuidle_register_device+0x32/0x70
 [<ffffffff81340afd>] intel_idle_cpu_init+0xad/0x110
 [<ffffffff81340bc8>] cpu_hotplug_notify+0x68/0x80
 [<ffffffff8166023d>] notifier_call_chain+0x4d/0x70
 [<ffffffff810bc369>] __raw_notifier_call_chain+0x9/0x10
 [<ffffffff81094a4b>] __cpu_notify+0x1b/0x30
 [<ffffffff81652cf7>] _cpu_up+0x103/0x14b
 [<ffffffff81652e18>] cpu_up+0xd9/0xec
 [<ffffffff8164a254>] store_online+0x94/0xd0
 [<ffffffff814122fb>] dev_attr_store+0x1b/0x20
 [<ffffffff81216404>] sysfs_write_file+0xf4/0x170
 [<ffffffff811a1024>] vfs_write+0xb4/0x130
 [<ffffffff811a17ea>] sys_write+0x5a/0xa0
 [<ffffffff816643a9>] system_call_fastpath+0x16/0x1b
Code: 03 18 00 c9 c3 66 2e 0f 1f 84 00 00 00 00 00 55 48 89 e5 48 83 ec 30 48 89 5d e8 4c 89 65 f0 48 89 fb 4c 89 6d f8 e8 84 08 00 00 <48> 8b 78 08 49 89 c4 e8 f8 7f c1 ff 89 c2 b8 ea ff ff ff 84 d2
RIP [<ffffffff814d956c>] __cpuidle_register_device+0x1c/0x120
 RSP <ffff88009dacfcb8>

This patch fixes it by moving the CPU notifier registration
as the last item to be done by the module.

Cc: stable@vger.kernel.org # for 3.6 and above
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Cc: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
 drivers/idle/intel_idle.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/idle/intel_idle.c b/drivers/idle/intel_idle.c
index 4ba384f..2df9414 100644
--- a/drivers/idle/intel_idle.c
+++ b/drivers/idle/intel_idle.c
@@ -448,8 +448,6 @@ static int intel_idle_probe(void)
 	else
 		on_each_cpu(__setup_broadcast_timer, (void *)true, 1);
 
-	register_cpu_notifier(&cpu_hotplug_notifier);
-
 	pr_debug(PREFIX "v" INTEL_IDLE_VERSION
 		" model 0x%X\n", boot_cpu_data.x86_model);
 
@@ -612,6 +610,7 @@ static int __init intel_idle_init(void)
 			return retval;
 		}
 	}
+	register_cpu_notifier(&cpu_hotplug_notifier);
 
 	return 0;
 }
--
1.8.0.2
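
The ordering problem described in this patch can be illustrated with a small userspace model. This is only a sketch, not kernel code: every model_* name below is hypothetical, the driver registration is reduced to a flag check, and the "notifier" is a boolean rather than a real notifier block. It assumes, as the log above suggests, that registration fails with -ENODEV (-19) when cpuidle is disabled.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a cpuidle driver; not the kernel structure. */
struct model_driver {
	const char *name;
};

static struct model_driver *model_registered_driver; /* NULL until registration succeeds */
static bool model_notifier_registered;               /* models the CPU hotplug notifier */
static bool model_cpuidle_off;                       /* models cpuidle being disabled */

/* Put the model back into its boot-time state. */
void model_reset(bool cpuidle_off)
{
	model_registered_driver = NULL;
	model_notifier_registered = false;
	model_cpuidle_off = cpuidle_off;
}

/* Fails with -ENODEV when cpuidle is disabled, otherwise records the driver. */
int model_register_driver(struct model_driver *drv)
{
	if (model_cpuidle_off)
		return -ENODEV;
	model_registered_driver = drv;
	return 0;
}

/* Pre-patch ordering: the notifier is hooked up before anything can fail. */
int model_init_old(struct model_driver *drv)
{
	model_notifier_registered = true;
	return model_register_driver(drv); /* on failure the notifier stays registered */
}

/* Patched ordering: the notifier is registered as the very last step. */
int model_init_patched(struct model_driver *drv)
{
	int ret = model_register_driver(drv);

	if (ret)
		return ret;
	model_notifier_registered = true;
	return 0;
}

/*
 * A hotplug callback would use the registered driver. It is only safe
 * if either no callback is installed or a driver is actually present.
 */
bool model_hotplug_is_safe(void)
{
	return !model_notifier_registered || model_registered_driver != NULL;
}
```

With cpuidle disabled, model_init_old() returns an error but leaves the callback installed, so model_hotplug_is_safe() reports false; model_init_patched() returns the same error with no callback left behind.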
Konrad Rzeszutek Wilk
2013-Jan-15 22:23 UTC
[PATCH 2/2] ACPI / cpuidle: Fix NULL pointer issues when cpuidle is disabled
If cpuidle is disabled, that means the:

	per_cpu(acpi_cpuidle_device, pr->id)

is set to NULL as the acpi_processor_power_init ends up failing at

	retval = cpuidle_register_driver(&acpi_idle_driver)

(in acpi_processor_power_init) and never sets the per_cpu idle
device. So when acpi_processor_hotplug on CPU online notification tries
to reference said device it crashes:

cpu 3 spinlock event irq 62
BUG: unable to handle kernel NULL pointer dereference at 0000000000000004
IP: [<ffffffff81381013>] acpi_processor_setup_cpuidle_cx+0x3f/0x105
PGD a259b067 PUD ab38b067 PMD 0
Oops: 0002 [#1] SMP
Modules linked in: dm_multipath dm_mod xen_evtchn iscsi_boot_sysfs iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi libcrc32c crc32c nouveau mxm_wmi wmi radeon ttm sg sr_mod sd_mod cdrom ata_generic ata_piix libata crc32c_intel scsi_mod atl1c i915 fbcon tileblit font bitblit softcursor drm_kms_helper video xen_blkfront xen_netfront fb_sys_fops sysimgblt sysfillrect syscopyarea xenfs xen_privcmd mperf
CPU 1
Pid: 3047, comm: bash Not tainted 3.8.0-rc3upstream-00250-g165c029 #1 MSI MS-7680/H61M-P23 (MS-7680)
RIP: e030:[<ffffffff81381013>] [<ffffffff81381013>] acpi_processor_setup_cpuidle_cx+0x3f/0x105
RSP: e02b:ffff88001742dca8 EFLAGS: 00010202
RAX: 0000000000010be9 RBX: ffff8800a0a61800 RCX: ffff880105380000
RDX: 0000000000000003 RSI: 0000000000000200 RDI: ffff8800a0a61800
RBP: ffff88001742dce8 R08: ffffffff81812360 R09: 0000000000000200
R10: aaaaaaaaaaaaaaaa R11: 0000000000000001 R12: ffff8800a0a61800
R13: 00000000ffffff01 R14: 0000000000000000 R15: ffffffff81a907a0
FS: 00007fd6942f7700(0000) GS:ffff880105280000(0000) knlGS:0000000000000000
CS: e033 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000004 CR3: 00000000a6773000 CR4: 0000000000042660
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process bash (pid: 3047, threadinfo ffff88001742c000, task ffff880017944000)
Stack:
 0000000000000150 ffff880100f59e00 ffff88001742dcd8 ffff8800a0a61800
 0000000000000000 00000000ffffff01 0000000000000000 ffffffff81a907a0
 ffff88001742dd18 ffffffff813815b1 ffff88001742dd08 ffffffff810ae336
Call Trace:
 [<ffffffff813815b1>] acpi_processor_hotplug+0x7c/0x9f
 [<ffffffff810ae336>] ? schedule_delayed_work_on+0x16/0x20
 [<ffffffff8137ee8f>] acpi_cpu_soft_notify+0x90/0xca
 [<ffffffff8166023d>] notifier_call_chain+0x4d/0x70
 [<ffffffff810bc369>] __raw_notifier_call_chain+0x9/0x10
 [<ffffffff81094a4b>] __cpu_notify+0x1b/0x30
 [<ffffffff81652cf7>] _cpu_up+0x103/0x14b
 [<ffffffff81652e18>] cpu_up+0xd9/0xec
 [<ffffffff8164a254>] store_online+0x94/0xd0
 [<ffffffff814122fb>] dev_attr_store+0x1b/0x20
 [<ffffffff81216404>] sysfs_write_file+0xf4/0x170

This patch fixes it.

Cc: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
 drivers/acpi/processor_idle.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/acpi/processor_idle.c b/drivers/acpi/processor_idle.c
index f1a5da4..fea6f8d 100644
--- a/drivers/acpi/processor_idle.c
+++ b/drivers/acpi/processor_idle.c
@@ -958,6 +958,9 @@ static int acpi_processor_setup_cpuidle_cx(struct acpi_processor *pr)
 		return -EINVAL;
 	}
 
+	if (!dev)
+		return -EINVAL;
+
 	dev->cpu = pr->id;
 
 	if (max_cstate == 0)
--
1.8.0.2
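
The guard added by this patch can be sketched outside the kernel as follows. This is a simplified userspace model under stated assumptions: the per-CPU variable is replaced by a plain array, MODEL_NR_CPUS and all model_* names are hypothetical, and only the NULL check and the first store through the pointer are reproduced.

```c
#include <errno.h>
#include <stddef.h>

#define MODEL_NR_CPUS 4 /* hypothetical CPU count for the model */

/* Hypothetical stand-in for the per-CPU idle device. */
struct model_idle_device {
	int cpu;
};

/* One slot per CPU; a slot stays NULL if driver registration failed at init. */
static struct model_idle_device *model_devices[MODEL_NR_CPUS];

/* Install (or clear) the device for a CPU; out-of-range CPUs are rejected. */
int model_set_device(int cpu, struct model_idle_device *dev)
{
	if (cpu < 0 || cpu >= MODEL_NR_CPUS)
		return -EINVAL;
	model_devices[cpu] = dev;
	return 0;
}

/* Models the setup step run on CPU online: bail out if no device was set. */
int model_setup_cx(int cpu)
{
	struct model_idle_device *dev;

	if (cpu < 0 || cpu >= MODEL_NR_CPUS)
		return -EINVAL;

	dev = model_devices[cpu];
	if (!dev)            /* the check the patch adds */
		return -EINVAL;

	dev->cpu = cpu;      /* without the check, this store faults when dev is NULL */
	return 0;
}
```

Calling model_setup_cx() for a CPU whose slot was never populated returns -EINVAL instead of writing through a NULL pointer, which mirrors the early return in the diff above.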
Konrad Rzeszutek Wilk
2013-Jan-15 22:30 UTC
Re: [PATCH 1/2] ACPI: intel-idle: Don't register CPU notifier if we are not running.
On Tue, Jan 15, 2013 at 11:33:28PM +0100, Rafael J. Wysocki wrote:
> On Tuesday, January 15, 2013 05:23:23 PM Konrad Rzeszutek Wilk wrote:
> > The 'intel_idle_probe' probes the CPU and sets the CPU notifier.
> > But if later on during the module initialization we fail (say
> > in cpuidle_register_driver) we stop loading but we neglected
> > to unregister the CPU notifier. This means that during CPU
> > hotplug events the system will fail:
>
> This really is for Len, but it appears to be obviously correct, so I'll
> take it as a v3.8 fix if Len doesn't object.
>
> I suppose we need that in -stable too?

Yes please. 3.6 and onwards. Thank you!

> Rafael
>
> > [...]
Rafael J. Wysocki
2013-Jan-15 22:33 UTC
Re: [PATCH 1/2] ACPI: intel-idle: Don't register CPU notifier if we are not running.
On Tuesday, January 15, 2013 05:23:23 PM Konrad Rzeszutek Wilk wrote:
> The 'intel_idle_probe' probes the CPU and sets the CPU notifier.
> But if later on during the module initialization we fail (say
> in cpuidle_register_driver) we stop loading but we neglected
> to unregister the CPU notifier. This means that during CPU
> hotplug events the system will fail:

This really is for Len, but it appears to be obviously correct, so I'll
take it as a v3.8 fix if Len doesn't object.

I suppose we need that in -stable too?

Rafael

> [...]

--
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.
Rafael J. Wysocki
2013-Jan-15 22:45 UTC
Re: [PATCH 2/2] ACPI / cpuidle: Fix NULL pointer issues when cpuidle is disabled
On Tuesday, January 15, 2013 05:23:24 PM Konrad Rzeszutek Wilk wrote:
> If cpuidle is disabled, that means the:
>
> 	per_cpu(acpi_cpuidle_device, pr->id)
>
> is set to NULL as the acpi_processor_power_init ends up failing at
>
> 	retval = cpuidle_register_driver(&acpi_idle_driver)
>
> (in acpi_processor_power_init) and never sets the per_cpu idle
> device. So when acpi_processor_hotplug on CPU online notification tries
> to reference said device it crashes:
>
> [...]
>
> This patch fixes it.

This appears to be -stable material too. Which -stable kernels should it
be applied to?

Rafael

> [...]

--
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.
Konrad Rzeszutek Wilk
2013-Jan-16 04:42 UTC
Re: [PATCH 2/2] ACPI / cpuidle: Fix NULL pointer issues when cpuidle is disabled
On Tue, Jan 15, 2013 at 11:45:08PM +0100, Rafael J. Wysocki wrote:
> On Tuesday, January 15, 2013 05:23:24 PM Konrad Rzeszutek Wilk wrote:
> > If cpuidle is disabled, that means the:
> >
> > 	per_cpu(acpi_cpuidle_device, pr->id)
> >
> > is set to NULL as the acpi_processor_power_init ends up failing at
> >
> > 	retval = cpuidle_register_driver(&acpi_idle_driver)
> >
> > (in acpi_processor_power_init) and never sets the per_cpu idle
> > device. So when acpi_processor_hotplug on CPU online notification tries
> > to reference said device it crashes:
> >
> > [...]
> >
> > This patch fixes it.
>
> This appears to be -stable material too. Which -stable kernels should it
> be applied to?

Oh, 3.1 and onward. I am basing that on 62027aea since that allowed
subsystem to disable the cpuidle API.

Thanks!

> Rafael
>
> > [...]
Srivatsa S. Bhat
2013-Jan-16 08:10 UTC
Re: [PATCH 1/2] ACPI: intel-idle: Don't register CPU notifier if we are not running.
On 01/16/2013 03:53 AM, Konrad Rzeszutek Wilk wrote:
> The 'intel_idle_probe' probes the CPU and sets the CPU notifier.
> But if later on during the module initialization we fail (say
> in cpuidle_register_driver) we stop loading but we neglected
> to unregister the CPU notifier. This means that during CPU
> hotplug events the system will fail:
>
> [...]
>
> This patch fixes it by moving the CPU notifier registration
> as the last item to be done by the module.
>
> Cc: stable@vger.kernel.org # for 3.6 and above
> Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
> Cc: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
> Cc: Rafael J. Wysocki <rjw@sisk.pl>
> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> ---

Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>

Regards,
Srivatsa S. Bhat

> [...]
Rafael J. Wysocki
2013-Jan-16 22:49 UTC
Re: [PATCH] fixes to ACPI subsystem which assumes cpuidle is always enabled.
On Tuesday, January 15, 2013 05:23:22 PM Konrad Rzeszutek Wilk wrote:
> Attached are two patches to the ACPI subsystem and the cpuidle drivers.
>
> The fixes are to deal with the case when cpuidle_disabled returns true
> and we try to hotplug CPUs on/off.
>
>  drivers/acpi/processor_idle.c | 3 +++
>  drivers/idle/intel_idle.c     | 3 +--
>  2 files changed, 4 insertions(+), 2 deletions(-)

Both patches applied.

Thanks,
Rafael

--
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.
Konrad Rzeszutek Wilk
2013-Jan-18 15:12 UTC
Re: [PATCH] fixes to ACPI subsystem which assumes cpuidle is always enabled.
On Wed, Jan 16, 2013 at 11:49:00PM +0100, Rafael J. Wysocki wrote:
> On Tuesday, January 15, 2013 05:23:22 PM Konrad Rzeszutek Wilk wrote:
> > Attached are two patches to the ACPI subsystem and the cpuidle drivers.
> >
> > The fixes are to deal with the case when cpuidle_disabled returns true
> > and we try to hotplug CPUs on/off.
> >
> >  drivers/acpi/processor_idle.c | 3 +++
> >  drivers/idle/intel_idle.c     | 3 +--
> >  2 files changed, 4 insertions(+), 2 deletions(-)
>
> Both patches applied.

Thanks!