On Tue, Jan 17, 2023 at 02:21:40PM +0000, Sudeep Holla wrote:
> On Tue, Jan 17, 2023 at 01:16:21PM +0000, Mark Rutland wrote:
> > On Tue, Jan 17, 2023 at 11:26:29AM +0100, Peter Zijlstra wrote:
> > > On Mon, Jan 16, 2023 at 04:59:04PM +0000, Mark Rutland wrote:
> > >
> > > > I'm sorry to have to bear some bad news on that front. :(
> > >
> > > Moo, something had to give..
> > >
> > >
> > > > IIUC what's happening here is the PSCI cpuidle driver has entered idle and RCU
> > > > is no longer watching when arm64's cpu_suspend() manipulates DAIF. Our
> > > > local_daif_*() helpers poke lockdep and tracing, hence the call to
> > > > trace_hardirqs_off() and the RCU usage.
> > >
> > > Right, strictly speaking not needed at this point, IRQs should have been
> > > traced off a long time ago.
> >
> > True, but there are some other calls around here that *might* end up invoking
> > RCU stuff (e.g. the MTE code).
> >
> > That all needs a noinstr cleanup too, which I'll sort out as a follow-up.
> >
> > > > I think we need RCU to be watching all the way down to cpu_suspend(), and it's
> > > > cpu_suspend() that should actually enter/exit idle context. That and we need to
> > > > make cpu_suspend() and the low-level PSCI invocation noinstr.
> > > >
> > > > I'm not sure whether 32-bit will have a similar issue or not.
> > >
> > > I'm not seeing 32bit or Risc-V have similar issues here, but who knows,
> > > maybe I missed something.
> >
> > I reckon if they do, the core changes here give us the infrastructure to fix
> > them if/when we get reports.
> >
> > > In any case, the below ought to cure the ARM64 case and remove that last
> > > known RCU_NONIDLE() user as a bonus.
> >
> > The below works for me testing on a Juno R1 board with PSCI, using defconfig +
> > CONFIG_PROVE_LOCKING=y + CONFIG_DEBUG_LOCKDEP=y + CONFIG_DEBUG_ATOMIC_SLEEP=y.
> > I'm not sure how to test the LPI / FFH part, but it looks good to me.
> >
> > FWIW:
> >
> > Reviewed-by: Mark Rutland <mark.rutland at arm.com>
> > Tested-by: Mark Rutland <mark.rutland at arm.com>
> >
> > Sudeep, would you be able to give the LPI/FFH side a spin with the kconfig
> > options above?
> >
>
> Not sure if I have messed up something in my mail setup, but I did reply
> earlier.

Sorry, that was my bad; I had been drafting my reply for a while and forgot to
re-check prior to sending.

> I did test both DT/cpuidle-psci driver and ACPI/LPI+FFH driver
> with the fix Peter sent. I was seeing the same splat as you in both DT and
> ACPI boot, which the patch fixed. I used the same config as described by
> you above.

Perfect; thanks!

Mark.