peterz at infradead.org
2020-Jul-23 18:47 UTC
[PATCH v3 5/6] powerpc/pseries: implement paravirt qspinlocks for SPLPAR
On Thu, Jul 23, 2020 at 02:32:36PM -0400, Waiman Long wrote:
> BTW, do you have any comment on my v2 lock holder cpu info qspinlock patch?
> I will have to update the patch to fix the reported 0-day test problem, but
> I want to collect other feedback before sending out v3.

I want to say I hate it all, it adds instructions to a path we spend an
awful lot of time optimizing without really getting anything back for
it.

Will, how do you feel about it?
Waiman Long
2020-Jul-23 19:04 UTC
[PATCH v3 5/6] powerpc/pseries: implement paravirt qspinlocks for SPLPAR
On 7/23/20 2:47 PM, peterz at infradead.org wrote:
> On Thu, Jul 23, 2020 at 02:32:36PM -0400, Waiman Long wrote:
>> BTW, do you have any comment on my v2 lock holder cpu info qspinlock patch?
>> I will have to update the patch to fix the reported 0-day test problem, but
>> I want to collect other feedback before sending out v3.
> I want to say I hate it all, it adds instructions to a path we spend an
> awful lot of time optimizing without really getting anything back for
> it.

It does add some extra instructions that may slow it down slightly, but I
don't agree that it gives nothing back. The cpu lock holder information can
be useful in analyzing crash dumps and in some debugging situations. I think
it can be useful in RHEL for this reason. How about an x86 config option to
allow distros to decide if they want to have it enabled? I will make sure
that it will have no performance degradation if the option is not enabled.

Cheers,
Longman
peterz at infradead.org
2020-Jul-23 19:58 UTC
[PATCH v3 5/6] powerpc/pseries: implement paravirt qspinlocks for SPLPAR
On Thu, Jul 23, 2020 at 03:04:13PM -0400, Waiman Long wrote:
> On 7/23/20 2:47 PM, peterz at infradead.org wrote:
> > On Thu, Jul 23, 2020 at 02:32:36PM -0400, Waiman Long wrote:
> > > BTW, do you have any comment on my v2 lock holder cpu info qspinlock patch?
> > > I will have to update the patch to fix the reported 0-day test problem, but
> > > I want to collect other feedback before sending out v3.
> > I want to say I hate it all, it adds instructions to a path we spend an
> > awful lot of time optimizing without really getting anything back for
> > it.
>
> It does add some extra instructions that may slow it down slightly, but I
> don't agree that it gives nothing back. The cpu lock holder information can
> be useful in analyzing crash dumps and in some debugging situations. I think
> it can be useful in RHEL for this reason. How about an x86 config option to
> allow distros to decide if they want to have it enabled? I will make sure
> that it will have no performance degradation if the option is not enabled.

Config knobs suck too; they create a maintenance burden (we get to make
sure all the permutations work/build/etc.) and effectively nobody uses
them, since world+dog uses what distros pick.

Anyway, instead of adding a second per-cpu variable, can you see how
horrible something like this is:

    unsigned char adds(unsigned char var, unsigned char val)
    {
            unsigned short sat = 0xff, tmp = var;

            asm ("addb %[val], %b[var];"
                 "cmovc %[sat], %[var];"
                 : [var] "+r" (tmp)
                 : [val] "ir" (val), [sat] "r" (sat)
                 );

            return tmp;
    }

Another thing to try is, instead of threading that lockval throughout
the thing, simply:

    #define _Q_LOCKED_VAL this_cpu_read_stable(cpu_sat)

or combined with the above

    #define _Q_LOCKED_VAL adds(this_cpu_read_stable(cpu_number), 2)

and see if the compiler really makes a mess of things.
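
For readers who want to check what the adds() helper above computes without reading x86 asm, here is a minimal portable sketch of the same saturating 8-bit add. The name adds_portable is hypothetical and not from the thread, and the sketch makes no claim about the code a compiler would generate for it, which is the question the message actually raises.

```c
#include <assert.h>
#include <limits.h>

/* The sketch assumes an 8-bit unsigned char, as on x86. */
static_assert(UCHAR_MAX == 0xff, "sketch assumes 8-bit unsigned char");

/*
 * Hypothetical portable counterpart of the addb/cmovc pair:
 * add two bytes and clamp the result to 0xff when the byte add
 * would carry out, instead of wrapping around.
 */
static inline unsigned char adds_portable(unsigned char var, unsigned char val)
{
	unsigned int sum = (unsigned int)var + (unsigned int)val;

	return sum > 0xff ? 0xff : (unsigned char)sum;
}
```

With this definition, adds_portable(5, 2) yields 7, while adds_portable(254, 2) is clamped to 0xff rather than wrapping to 0.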
Will Deacon
2020-Jul-24 08:16 UTC
[PATCH v3 5/6] powerpc/pseries: implement paravirt qspinlocks for SPLPAR
On Thu, Jul 23, 2020 at 08:47:59PM +0200, peterz at infradead.org wrote:
> On Thu, Jul 23, 2020 at 02:32:36PM -0400, Waiman Long wrote:
> > BTW, do you have any comment on my v2 lock holder cpu info qspinlock patch?
> > I will have to update the patch to fix the reported 0-day test problem, but
> > I want to collect other feedback before sending out v3.
>
> I want to say I hate it all, it adds instructions to a path we spend an
> awful lot of time optimizing without really getting anything back for
> it.
>
> Will, how do you feel about it?

I can see it potentially being useful for debugging, but I hate the
limitation to 256 CPUs. Even arm64 is hitting that now.

Also, you're talking ~1% gains here. I think our collective time would
be better spent off reviewing the CNA series and trying to make it more
deterministic.

Will
Waiman Long
2020-Jul-24 19:10 UTC
[PATCH v3 5/6] powerpc/pseries: implement paravirt qspinlocks for SPLPAR
On 7/24/20 4:16 AM, Will Deacon wrote:
> On Thu, Jul 23, 2020 at 08:47:59PM +0200, peterz at infradead.org wrote:
>> On Thu, Jul 23, 2020 at 02:32:36PM -0400, Waiman Long wrote:
>>> BTW, do you have any comment on my v2 lock holder cpu info qspinlock patch?
>>> I will have to update the patch to fix the reported 0-day test problem, but
>>> I want to collect other feedback before sending out v3.
>> I want to say I hate it all, it adds instructions to a path we spend an
>> awful lot of time optimizing without really getting anything back for
>> it.
>>
>> Will, how do you feel about it?
> I can see it potentially being useful for debugging, but I hate the
> limitation to 256 CPUs. Even arm64 is hitting that now.

After thinking more about that, I think we can use all the remaining bits
in the 16-bit locked_pending. Reserving 1 bit for locked and 1 bit for
pending, there are 14 bits left. So as long as NR_CPUS < 16k (requirement
for 16-bit locked_pending), we can put all possible cpu numbers into the
lock. We can also just use smp_processor_id() without additional percpu
data.

>
> Also, you're talking ~1% gains here. I think our collective time would
> be better spent off reviewing the CNA series and trying to make it more
> deterministic.

I thought you guys are not interested in CNA. I do want to get CNA merged,
if possible. Let me review the current version again and see if there are
ways we can further improve it.

Cheers,
Longman
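
The bit budget described above (1 locked bit, 1 pending bit, 14 bits for the CPU number) can be sketched as follows. This is only an illustration: the message does not say where each field would sit, so the positions chosen here (locked in bit 0, pending in bit 1, CPU number in bits 2-15) and all LP_* and lp_* names are assumptions, not the layout of the existing qspinlock or of any posted patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical field positions inside a 16-bit locked_pending value. */
#define LP_LOCKED	(1u << 0)	/* assumed: lock-held bit */
#define LP_PENDING	(1u << 1)	/* assumed: pending-waiter bit */
#define LP_CPU_SHIFT	2
#define LP_CPU_BITS	14
#define LP_CPU_MAX	((1u << LP_CPU_BITS) - 1)	/* 16383 */

/* Value a holder on @cpu would store: CPU number plus the locked bit. */
static inline uint16_t lp_encode_locked(unsigned int cpu)
{
	assert(cpu <= LP_CPU_MAX);	/* mirrors the NR_CPUS < 16k limit */
	return (uint16_t)((cpu << LP_CPU_SHIFT) | LP_LOCKED);
}

/* Recover the holder's CPU number, e.g. when reading a crash dump. */
static inline unsigned int lp_holder_cpu(uint16_t lp)
{
	return lp >> LP_CPU_SHIFT;
}

static inline int lp_is_locked(uint16_t lp)
{
	return (lp & LP_LOCKED) != 0;
}
```

Because the locked bit is kept separate from the CPU field in this sketch, CPU 0 needs no offset: a held lock is never all zeroes, and the 14-bit field covers CPU numbers 0 through 16383.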