On 7/8/20 4:41 AM, Peter Zijlstra wrote:
> On Tue, Jul 07, 2020 at 03:57:06PM +1000, Nicholas Piggin wrote:
>> Yes, powerpc could certainly get more performance out of the slow
>> paths, and then there are a few parameters to tune.
> Can you clarify? The slow path is already in use on ARM64 which is weak,
> so I doubt there's superfluous serialization present. And Will spent a
> fair amount of time on making that thing guarantee forward progress, so
> there just isn't too much room to play.
>
>> We don't have a good alternate patching for function calls yet, but
>> that would be something to do for native vs pv.
> Going by your jump_label implementation, support for static_call should
> be fairly straightforward too, no?
>
> https://lkml.kernel.org/r/20200624153024.794671356 at infradead.org
>

Speaking of static_call, I am also looking forward to it. Do you have an idea when that will be merged?

Cheers,
Longman

On Wed, Jul 08, 2020 at 07:54:34PM -0400, Waiman Long wrote:
> On 7/8/20 4:41 AM, Peter Zijlstra wrote:
> > On Tue, Jul 07, 2020 at 03:57:06PM +1000, Nicholas Piggin wrote:
> > > Yes, powerpc could certainly get more performance out of the slow
> > > paths, and then there are a few parameters to tune.
> > Can you clarify? The slow path is already in use on ARM64 which is weak,
> > so I doubt there's superfluous serialization present. And Will spent a
> > fair amount of time on making that thing guarantee forward progress, so
> > there just isn't too much room to play.
> >
> > > We don't have a good alternate patching for function calls yet, but
> > > that would be something to do for native vs pv.
> > Going by your jump_label implementation, support for static_call should
> > be fairly straightforward too, no?
> >
> > https://lkml.kernel.org/r/20200624153024.794671356 at infradead.org
> >
> Speaking of static_call, I am also looking forward to it. Do you have an
> idea when that will be merged?

0day had one crash on the last round, I think Steve sent a fix for that last night and I'll go look at it.

That said, the last posting got 0 feedback, so either everybody is really happy with it, or not interested. So let us know in the thread, with some review feedback.

Once I get through enough of the inbox to actually find the fix and test it, I'll also update the thread, and maybe threaten to merge it if everybody stays silent :-)

Nicholas Piggin
2020-Jul-21 11:20 UTC
[PATCH v3 0/6] powerpc: queued spinlocks and rwlocks
Excerpts from Peter Zijlstra's message of July 9, 2020 6:31 pm:
> On Wed, Jul 08, 2020 at 07:54:34PM -0400, Waiman Long wrote:
>> On 7/8/20 4:41 AM, Peter Zijlstra wrote:
>> > On Tue, Jul 07, 2020 at 03:57:06PM +1000, Nicholas Piggin wrote:
>> > > Yes, powerpc could certainly get more performance out of the slow
>> > > paths, and then there are a few parameters to tune.
>> > Can you clarify? The slow path is already in use on ARM64 which is weak,
>> > so I doubt there's superfluous serialization present. And Will spent a
>> > fair amount of time on making that thing guarantee forward progress, so
>> > there just isn't too much room to play.
>> >
>> > > We don't have a good alternate patching for function calls yet, but
>> > > that would be something to do for native vs pv.
>> > Going by your jump_label implementation, support for static_call should
>> > be fairly straightforward too, no?
>> >
>> > https://lkml.kernel.org/r/20200624153024.794671356 at infradead.org
>> >
>> Speaking of static_call, I am also looking forward to it. Do you have an
>> idea when that will be merged?
>
> 0day had one crash on the last round, I think Steve sent a fix for that
> last night and I'll go look at it.
>
> That said, the last posting got 0 feedback, so either everybody is
> really happy with it, or not interested. So let us know in the thread,
> with some review feedback.
>
> Once I get through enough of the inbox to actually find the fix and test
> it, I'll also update the thread, and maybe threaten to merge it if
> everybody stays silent :-)

I'd like to use it in powerpc. We have code now, for example, that patches a branch immediately at the top of memcpy which branches to a different version of the function. There's pv queued spinlock selection, obviously, and there's a bunch of platform ops struct things that get filled in at boot time, etc. So +1 here if you can get them through.

I'm not 100% sure we can do it with the existing toolchain and no ugly hacks, but there's no way to structure things that can get around that AFAIKS. We'd eventually use it though, I'd say.

Thanks,
Nick
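
For readers following along, the pattern described above (pick one implementation at boot, then always call through it) can be modelled in user space roughly as below. This is only a sketch under stated assumptions: every name in it (copy_generic, copy_fast, select_copy_impl, do_copy, cpu_has_fast_copy) is hypothetical, and it uses an ordinary function pointer rather than any kernel code patching. static_call aims to replace this kind of indirect call with a direct call that is patched in place.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical user-space model; none of these names come from the kernel. */
typedef void *(*copy_fn)(void *dst, const void *src, size_t n);

/* Baseline implementation: plain byte-by-byte copy. */
static void *copy_generic(void *dst, const void *src, size_t n)
{
	unsigned char *d = dst;
	const unsigned char *s = src;

	while (n--)
		*d++ = *s++;
	return dst;
}

/* Stand-in for a CPU-specific variant; here it simply defers to memcpy(). */
static void *copy_fast(void *dst, const void *src, size_t n)
{
	return memcpy(dst, src, n);
}

/* The single call target, filled in once at "boot". */
static copy_fn copy_impl = copy_generic;

/* Hypothetical boot-time selection, e.g. driven by a CPU feature check. */
static void select_copy_impl(int cpu_has_fast_copy)
{
	copy_impl = cpu_has_fast_copy ? copy_fast : copy_generic;
}

/* Callers always go through the indirection chosen at boot. */
static void *do_copy(void *dst, const void *src, size_t n)
{
	assert(copy_impl != NULL);
	return copy_impl(dst, src, n);
}
```

A caller would run select_copy_impl() once during start-up and use do_copy() everywhere afterwards. The cost this model leaves in place is the indirect branch on every call, which is the part that patching the call site is meant to remove.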