thr3ads.net - llvm dev - [LLVMdev] llvm register reload/spilling around calls [Oct 2010]

Roland Scheidegger

2010-Oct-19 18:40 UTC

[LLVMdev] llvm register reload/spilling around calls

Hi,

I was investigating some performance issues with llvm JIT-generated code
(x86_64), and looking at the assembly it indeed seemed quite suboptimal.
In particular, the code is basically implementing some kind of caching.
If there's a cache hit, the code just takes the value from the cache, if
not it will do whatever is necessary to update the cache - this is
expensive but happens only in about 1% of all cases and just calls a
function to do it.
So I saw that the code is doing lots of register spilling/reloading. Now
I understand that due to calling conventions, there's not really a way
to avoid this - I tried using coldcc but apparently the backend doesn't
implement it and hence this is ignored.
But what is really bad about this is that the spilling/reloading ALWAYS
happens, regardless of whether the branch containing the call is taken.
Since the branch is almost never taken, that is obviously quite bad (and
even if the branch were taken more often, which the compiler can't
know, I can't see why the reloading should always happen).
I'm not quite sure what performance impact this has, but it looks to me
like it definitely would make a difference, as the code not taking the
branch is quite simple.
I tried with both llvm 2.7 and 2.8, no difference.
So is there any optimization option I'm missing which could improve
this? Or is this simply the way things are (would that be considered a
bug?). If this is a known limitation, any ideas if it's possible to work
around that (by changing the affected jit code)?
I'm attaching the IR I've hack-extracted from the jit code (it might be
bogus but it compiles just fine). I think the assembly shows what I'm
talking about quite well (even has the comments about the
restore/spills). I used llc -O3 to compile.

Roland

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: sillysaverestore
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20101019/d4578f8d/attachment.ksh>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: sillysaverestore.s
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20101019/d4578f8d/attachment-0001.ksh>
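
For readers without access to the scrubbed attachments, here is a rough C sketch of the pattern described in the message above: a cheap cache lookup on the hot path and a rarely taken branch that calls an expensive update function. It is not the original IR. The names `cache_t`, `cache_update`, `sum_lookups` and the size `CACHE_SIZE` are all hypothetical and chosen only for illustration. The point is that `acc` and `scale` are live across the call site. Under a calling convention where the floating-point registers are call-clobbered (as in the x86-64 System V ABI, where all XMM registers are caller-saved), the allocator has to preserve them around the call, and the complaint in this thread is that the save/restore code ends up on the hit path too.

```c
#include <stdint.h>

#define CACHE_SIZE 16  /* hypothetical size, not taken from the original post */

typedef struct {
    uint32_t key[CACHE_SIZE];
    float    value[CACHE_SIZE];
    int      valid[CACHE_SIZE];
    unsigned misses;            /* counts slow-path calls, for illustration only */
} cache_t;

/* Slow path: stands in for the expensive, rarely called update function.
 * In the real JIT code this would be an external call the compiler cannot
 * inline; here it is an ordinary function so the sketch is self-contained. */
float cache_update(cache_t *c, uint32_t key)
{
    unsigned slot = key % CACHE_SIZE;
    c->key[slot] = key;
    c->value[slot] = (float)key * 0.5f;  /* placeholder for the real work */
    c->valid[slot] = 1;
    c->misses++;
    return c->value[slot];
}

/* Hot path: acc and scale are live across the call to cache_update, so they
 * must survive it even though the call happens only on a cache miss. */
float sum_lookups(cache_t *c, const uint32_t *keys, int n, float scale)
{
    float acc = 0.0f;
    for (int i = 0; i < n; i++) {
        uint32_t key = keys[i];
        unsigned slot = key % CACHE_SIZE;
        float v;
        if (c->valid[slot] && c->key[slot] == key)
            v = c->value[slot];          /* cache hit: the common case */
        else
            v = cache_update(c, key);    /* cache miss: the rare call */
        acc += v * scale;
    }
    return acc;
}
```

Ideally the spill and reload of `acc` and `scale` would sit only inside the `else` branch. The assembly described in the post instead performs them on every iteration.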

Jakob Stoklund Olesen

2010-Oct-19 21:21 UTC

[LLVMdev] llvm register reload/spilling around calls

On Oct 19, 2010, at 11:40 AM, Roland Scheidegger wrote:
> So I saw that the code is doing lots of register spilling/reloading. Now
> I understand that due to calling conventions, there's not really a way
> to avoid this - I tried using coldcc but apparently the backend doesn't
> implement it and hence this is ignored.
Yes, unfortunately the list of call-clobbered registers is fixed at the moment,
so coldcc is mostly ignored by the backend.

Patches welcome.
> So is there any optimization option I'm missing which could improve
> this? Or is this simply the way things are (would that be considered a
> bug?). If this is a known limitation, any ideas if it's possible to work
> around that (by changing the affected jit code)?
The -pre-alloc-split option should handle stuff like this when calls clobber an
entire register class. That probably only applies to XMM registers.

Work on proper live range splitting is in progress. You can try it out with
-spiller=inline, but it is highly experimental and volatile at the moment.

I don't know any short term solutions.

/jakob
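
To put a rough number on why placing the spill code on the slow path matters, here is a back-of-the-envelope model in C. It is only a sketch under stated assumptions: the function name `expected_spill_ops` and the idea of counting one store plus one load per live value are mine, not something from LLVM or from the attached assembly. The roughly 1% miss rate is the figure given in the original post. The number of live registers is a hypothetical input.

```c
/* Expected number of spill + reload memory operations per lookup.
 *   live_regs: values live across the call (hypothetical count)
 *   miss_rate: fraction of lookups that take the slow path
 *   split:     nonzero if the spill/reload code runs only on the slow path,
 *              zero if it runs on every lookup as described in the post */
double expected_spill_ops(int live_regs, double miss_rate, int split)
{
    double per_call = 2.0 * live_regs;   /* one store plus one load per value */
    return split ? per_call * miss_rate : per_call;
}
```

For example, with 8 live values and a miss rate of 0.01, the model gives 16 memory operations per lookup when the save/restore code always runs, and about 0.16 when it runs only around the call. Real costs depend on the microarchitecture and on how the allocator actually places the code, so treat this as an illustration of the ratio rather than a measurement.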

Roland Scheidegger

2010-Oct-20 01:37 UTC

[LLVMdev] llvm register reload/spilling around calls

Thanks for giving it a look!

On 19.10.2010 23:21, Jakob Stoklund Olesen wrote:
> On Oct 19, 2010, at 11:40 AM, Roland Scheidegger wrote:
> 
>> So I saw that the code is doing lots of register
>> spilling/reloading. Now I understand that due to calling
>> conventions, there's not really a way to avoid this - I tried using
>> coldcc but apparently the backend doesn't implement it and hence
>> this is ignored.
> 
> Yes, unfortunately the list of call-clobbered registers is fixed at
> the moment, so coldcc is mostly ignored by the backend.
> 
> Patches welcome.

What would be needed there? I actually tried a quick hack and simply
changed the registers included in the list in
X86RegisterInfo::getCalleeSavedRegs, so some xmm regs were included
(similar to what was done for win64). But the result wasn't what I
expected - the callee now indeed saved/restored all the xmm regs I
added, however the calling code did not change at all...

> 
>> So is there any optimization option I'm missing which could improve
>>  this? Or is this simply the way things are (would that be
>> considered a bug?). If this is a known limitation, any ideas if
>> it's possible to work around that (by changing the affected jit
>> code)?
> 
> The -pre-alloc-split option should handle stuff like this when calls
> clobber an entire register class. That probably only applies to XMM
> registers.

I tried that and the generated code did not change at all.

> 
> Work on proper live range splitting is in progress. You can try it
> out with -spiller=inline, but it is highly experimental and volatile
> at the moment.

Tried that too but the code mostly remained the same (there were 2
additional spills right at the beginning and some of the register
numbers changed but that was all).
There's also a -spiller=splitting option, I don't know what it should do
but it just crashed...

Roland
> 
> I don't know any short term solutions.
> 
> /jakob
>
