Will Deacon
2022-May-04 09:45 UTC
[PATCH v2] arm64: paravirt: Use RCU read locks to guard stolen_time
On Thu, Apr 28, 2022 at 11:35:36AM -0700, Elliot Berman wrote:
> From: Prakruthi Deepak Heragu <quic_pheragu at quicinc.com>
>
> During hotplug, the stolen time data structure is unmapped and memset.
> There is a possibility of the timer IRQ being triggered before memset
> and stolen time is getting updated as part of this timer IRQ handler. This
> causes the below crash in timer handler -
>
> [ 3457.473139][ C5] Unable to handle kernel paging request at virtual address ffffffc03df05148
> ...
> [ 3458.154398][ C5] Call trace:
> [ 3458.157648][ C5] para_steal_clock+0x30/0x50
> [ 3458.162319][ C5] irqtime_account_process_tick+0x30/0x194
> [ 3458.168148][ C5] account_process_tick+0x3c/0x280
> [ 3458.173274][ C5] update_process_times+0x5c/0xf4
> [ 3458.178311][ C5] tick_sched_timer+0x180/0x384
> [ 3458.183164][ C5] __run_hrtimer+0x160/0x57c
> [ 3458.187744][ C5] hrtimer_interrupt+0x258/0x684
> [ 3458.192698][ C5] arch_timer_handler_virt+0x5c/0xa0
> [ 3458.198002][ C5] handle_percpu_devid_irq+0xdc/0x414
> [ 3458.203385][ C5] handle_domain_irq+0xa8/0x168
> [ 3458.208241][ C5] gic_handle_irq.34493+0x54/0x244
> [ 3458.213359][ C5] call_on_irq_stack+0x40/0x70
> [ 3458.218125][ C5] do_interrupt_handler+0x60/0x9c
> [ 3458.223156][ C5] el1_interrupt+0x34/0x64
> [ 3458.227560][ C5] el1h_64_irq_handler+0x1c/0x2c
> [ 3458.232503][ C5] el1h_64_irq+0x7c/0x80
> [ 3458.236736][ C5] free_vmap_area_noflush+0x108/0x39c
> [ 3458.242126][ C5] remove_vm_area+0xbc/0x118
> [ 3458.246714][ C5] vm_remove_mappings+0x48/0x2a4
> [ 3458.251656][ C5] __vunmap+0x154/0x278
> [ 3458.255796][ C5] stolen_time_cpu_down_prepare+0xc0/0xd8
> [ 3458.261542][ C5] cpuhp_invoke_callback+0x248/0xc34
> [ 3458.266842][ C5] cpuhp_thread_fun+0x1c4/0x248
> [ 3458.271696][ C5] smpboot_thread_fn+0x1b0/0x400
> [ 3458.276638][ C5] kthread+0x17c/0x1e0
> [ 3458.280691][ C5] ret_from_fork+0x10/0x20
>
> As a fix, introduce rcu lock to update stolen time structure.
>
> Fixes: 75df529bec91 ("arm64: paravirt: Initialize steal time when cpu is online")
> Cc: stable at vger.kernel.org
> Signed-off-by: Prakruthi Deepak Heragu <quic_pheragu at quicinc.com>
> Signed-off-by: Elliot Berman <quic_eberman at quicinc.com>
> ---
> Changes since v1: https://lore.kernel.org/all/20220420204417.155194-1-quic_eberman at quicinc.com/
> - Use RCU instead of disabling interrupts
>
> arch/arm64/kernel/paravirt.c | 24 +++++++++++++++++++-----
> 1 file changed, 19 insertions(+), 5 deletions(-)
>
> diff --git a/arch/arm64/kernel/paravirt.c b/arch/arm64/kernel/paravirt.c
> index 75fed4460407..e724ea3d86f0 100644
> --- a/arch/arm64/kernel/paravirt.c
> +++ b/arch/arm64/kernel/paravirt.c
> @@ -52,7 +52,9 @@ early_param("no-steal-acc", parse_no_stealacc);
>  /* return stolen time in ns by asking the hypervisor */
>  static u64 para_steal_clock(int cpu)
>  {
> +	struct pvclock_vcpu_stolen_time *kaddr = NULL;
>  	struct pv_time_stolen_time_region *reg;
> +	u64 ret = 0;
>
>  	reg = per_cpu_ptr(&stolen_time_region, cpu);
>
> @@ -61,28 +63,38 @@ static u64 para_steal_clock(int cpu)
>  	 * online notification callback runs. Until the callback
>  	 * has run we just return zero.
>  	 */
> -	if (!reg->kaddr)
> +	rcu_read_lock();
> +	kaddr = rcu_dereference(reg->kaddr);
> +	if (!kaddr) {
> +		rcu_read_unlock();
>  		return 0;
> +	}
>
> -	return le64_to_cpu(READ_ONCE(reg->kaddr->stolen_time));
> +	ret = le64_to_cpu(READ_ONCE(kaddr->stolen_time));

Is this READ_ONCE() still required now?

Will
Juergen Gross
2022-May-04 13:38 UTC
[PATCH v2] arm64: paravirt: Use RCU read locks to guard stolen_time
On 04.05.22 11:45, Will Deacon wrote:
> On Thu, Apr 28, 2022 at 11:35:36AM -0700, Elliot Berman wrote:
>> From: Prakruthi Deepak Heragu <quic_pheragu at quicinc.com>
>>
>> During hotplug, the stolen time data structure is unmapped and memset.
>> There is a possibility of the timer IRQ being triggered before memset
>> and stolen time is getting updated as part of this timer IRQ handler. This
>> causes the below crash in timer handler -
>>
>> [ 3457.473139][ C5] Unable to handle kernel paging request at virtual address ffffffc03df05148
>> ...
>> [ 3458.154398][ C5] Call trace:
>> [ 3458.157648][ C5] para_steal_clock+0x30/0x50
>> [ 3458.162319][ C5] irqtime_account_process_tick+0x30/0x194
>> [ 3458.168148][ C5] account_process_tick+0x3c/0x280
>> [ 3458.173274][ C5] update_process_times+0x5c/0xf4
>> [ 3458.178311][ C5] tick_sched_timer+0x180/0x384
>> [ 3458.183164][ C5] __run_hrtimer+0x160/0x57c
>> [ 3458.187744][ C5] hrtimer_interrupt+0x258/0x684
>> [ 3458.192698][ C5] arch_timer_handler_virt+0x5c/0xa0
>> [ 3458.198002][ C5] handle_percpu_devid_irq+0xdc/0x414
>> [ 3458.203385][ C5] handle_domain_irq+0xa8/0x168
>> [ 3458.208241][ C5] gic_handle_irq.34493+0x54/0x244
>> [ 3458.213359][ C5] call_on_irq_stack+0x40/0x70
>> [ 3458.218125][ C5] do_interrupt_handler+0x60/0x9c
>> [ 3458.223156][ C5] el1_interrupt+0x34/0x64
>> [ 3458.227560][ C5] el1h_64_irq_handler+0x1c/0x2c
>> [ 3458.232503][ C5] el1h_64_irq+0x7c/0x80
>> [ 3458.236736][ C5] free_vmap_area_noflush+0x108/0x39c
>> [ 3458.242126][ C5] remove_vm_area+0xbc/0x118
>> [ 3458.246714][ C5] vm_remove_mappings+0x48/0x2a4
>> [ 3458.251656][ C5] __vunmap+0x154/0x278
>> [ 3458.255796][ C5] stolen_time_cpu_down_prepare+0xc0/0xd8
>> [ 3458.261542][ C5] cpuhp_invoke_callback+0x248/0xc34
>> [ 3458.266842][ C5] cpuhp_thread_fun+0x1c4/0x248
>> [ 3458.271696][ C5] smpboot_thread_fn+0x1b0/0x400
>> [ 3458.276638][ C5] kthread+0x17c/0x1e0
>> [ 3458.280691][ C5] ret_from_fork+0x10/0x20
>>
>> As a fix, introduce rcu lock to update stolen time structure.
>>
>> Fixes: 75df529bec91 ("arm64: paravirt: Initialize steal time when cpu is online")
>> Cc: stable at vger.kernel.org
>> Signed-off-by: Prakruthi Deepak Heragu <quic_pheragu at quicinc.com>
>> Signed-off-by: Elliot Berman <quic_eberman at quicinc.com>
>> ---
>> Changes since v1: https://lore.kernel.org/all/20220420204417.155194-1-quic_eberman at quicinc.com/
>> - Use RCU instead of disabling interrupts
>>
>> arch/arm64/kernel/paravirt.c | 24 +++++++++++++++++++-----
>> 1 file changed, 19 insertions(+), 5 deletions(-)
>>
>> diff --git a/arch/arm64/kernel/paravirt.c b/arch/arm64/kernel/paravirt.c
>> index 75fed4460407..e724ea3d86f0 100644
>> --- a/arch/arm64/kernel/paravirt.c
>> +++ b/arch/arm64/kernel/paravirt.c
>> @@ -52,7 +52,9 @@ early_param("no-steal-acc", parse_no_stealacc);
>>  /* return stolen time in ns by asking the hypervisor */
>>  static u64 para_steal_clock(int cpu)
>>  {
>> +	struct pvclock_vcpu_stolen_time *kaddr = NULL;
>>  	struct pv_time_stolen_time_region *reg;
>> +	u64 ret = 0;
>>
>>  	reg = per_cpu_ptr(&stolen_time_region, cpu);
>>
>> @@ -61,28 +63,38 @@ static u64 para_steal_clock(int cpu)
>>  	 * online notification callback runs. Until the callback
>>  	 * has run we just return zero.
>>  	 */
>> -	if (!reg->kaddr)
>> +	rcu_read_lock();
>> +	kaddr = rcu_dereference(reg->kaddr);
>> +	if (!kaddr) {
>> +		rcu_read_unlock();
>>  		return 0;
>> +	}
>>
>> -	return le64_to_cpu(READ_ONCE(reg->kaddr->stolen_time));
>> +	ret = le64_to_cpu(READ_ONCE(kaddr->stolen_time));
>
> Is this READ_ONCE() still required now?

Yes, as it might be called for another cpu than the current one.
stolen_time might just be updated, so you want to avoid load tearing.

Juergen
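
To illustrate the point about load tearing, here is a rough userspace
analogue of the reader path, not the kernel code itself. It assumes C11
atomics as stand-ins: an acquire load of the pointer plays the role of
rcu_dereference(), and a relaxed atomic load of the 64-bit field plays the
role of READ_ONCE(). The names stolen_time_rec, region, publish and
read_stolen_time are hypothetical, and the sketch does not model RCU grace
periods, so it says nothing about when the old mapping may be freed.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the shared stolen-time record. */
struct stolen_time_rec {
	_Atomic uint64_t stolen_time;	/* may be updated concurrently */
};

/* Hypothetical stand-in for the per-CPU region; kaddr may be NULL. */
struct region {
	struct stolen_time_rec *_Atomic kaddr;
};

/* Writer side: publish (or clear, with NULL) the record pointer. */
static void publish(struct region *reg, struct stolen_time_rec *rec)
{
	atomic_store_explicit(&reg->kaddr, rec, memory_order_release);
}

/*
 * Reader side: take one snapshot of the pointer, check it, then read the
 * counter with a single atomic load so the value cannot be torn even if
 * another thread is updating it at the same time.
 */
static uint64_t read_stolen_time(struct region *reg)
{
	struct stolen_time_rec *rec =
		atomic_load_explicit(&reg->kaddr, memory_order_acquire);

	if (!rec)
		return 0;	/* not published yet: report zero */

	return atomic_load_explicit(&rec->stolen_time, memory_order_relaxed);
}
```

The pointer snapshot and the value load address two separate problems: the
first guards against the pointer changing between the NULL check and the
dereference, the second against reading a half-updated 64-bit value. That
is why, in this analogue, adding the former does not make the latter
redundant.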