thr3ads.net - llvm dev - [LLVMdev] x86-64 large stack offsets [Sep 2011]

Cameron McInally

2011-Sep-26 19:02 UTC

[LLVMdev] x86-64 large stack offsets

Hey guys,

I'm working on a bug for x86-64 in LLVM 2.9. Well, it's actually two
issues. The assembly generated for large stack offsets has an overflow, and
once the overflow is fixed, the displacement is too large for GNU ld to
handle.

#include <stdio.h>

void fool( long n )
{
  double w[268435600];
  double z[268435600];
  unsigned long i;
  for ( i = 0; i < n; i++ ) {
    w[i] = 1.0;
    z[i] = 2.0;
  }
  /* n is a long, so the matching conversion is %ld */
  printf(" n: %ld, W %g Z %g\n", n, w[1], z[1] );
}

Here's one of the offending instructions produced by 2.9:

movsd   -2147482472(%rsp), %xmm0
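
For context on why a displacement like this is suspect, the arithmetic can be checked with a small sketch. It assumes 8-byte doubles, and the helper names (`array_bytes`, `fits_disp32`, `truncate_to_u32`, `report_frame_sizes`) are made up for illustration. It does not try to reconstruct the exact frame layout LLVM 2.9 chose, only the sizes involved.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdio.h>

/* Element count of each array in the test case above. */
#define ELEMS UINT64_C(268435600)

/* Bytes occupied by one array of ELEMS doubles. */
static uint64_t array_bytes(void)
{
    assert(sizeof(double) == 8);   /* assumption: holds on x86-64 */
    return ELEMS * sizeof(double);
}

/* 1 if `off` is encodable as a sign-extended 32-bit displacement, else 0. */
static int fits_disp32(int64_t off)
{
    return off >= INT32_MIN && off <= INT32_MAX;
}

/* Value left over when a 64-bit size is stored in a 32-bit unsigned variable. */
static uint32_t truncate_to_u32(uint64_t size)
{
    return (uint32_t)size;
}

/* Prints the sizes involved; intended to be called from main(). */
static void report_frame_sizes(void)
{
    uint64_t one = array_bytes();
    uint64_t both = 2 * one;

    printf("one array:   %" PRIu64 " bytes, fits disp32: %d\n",
           one, fits_disp32(-(int64_t)one));
    printf("both arrays: %" PRIu64 " bytes, fits disp32: %d\n",
           both, fits_disp32(-(int64_t)both));
    printf("both as u32: %" PRIu32 "\n", truncate_to_u32(both));
}
```

One array alone is 2147484800 bytes, which is 1153 bytes past INT32_MAX, so an offset that has to reach across it cannot be encoded as a 32-bit displacement. The two arrays together need 4294969600 bytes, which wraps to 2304 if the size is kept in a 32-bit unsigned variable.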

Fixing the displacement overflow is pretty easy. It's just a matter of
changing a few variable types in LLVM from unsigned to uint64_t in the
functions that calculate the stack offsets. The real trouble I'm having
is finding a good place to break up the displacements during lowering. I
would like the offset to be calculated similar to gcc:

movabsq $-4294969640, %rdx
movsd     0(%rbp,%rdx), %xmm0

Any suggestions on the correct lowering pass to do a transformation like
this? I'm an LLVM noob, so I'm not sure where it should go.
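
The gcc-style split described above amounts to a range check on the offset. The following is only a sketch of that decision, not LLVM or GCC code: `emit_stack_load` is a hypothetical helper that formats AT&T-syntax text, and it assumes the caller already has a free scratch register to hand in.

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper (not an LLVM or GCC API). Writes an AT&T-syntax load of
 * a double at base+off into %xmm0.
 *
 * Returns 0 if the offset fit in a 32-bit displacement,
 *         1 if it had to be materialized in `scratch` with movabsq,
 *        -1 if `buf` was too small.
 */
static int emit_stack_load(char *buf, size_t len, const char *base,
                           const char *scratch, int64_t off)
{
    int n;
    int used_scratch;

    assert(buf != NULL && base != NULL && scratch != NULL);

    if (off >= INT32_MIN && off <= INT32_MAX) {
        /* Offset is encodable directly as a disp32. */
        n = snprintf(buf, len, "movsd %" PRId64 "(%%%s), %%xmm0\n", off, base);
        used_scratch = 0;
    } else {
        /* Load the 64-bit offset into a register, then index off the base. */
        n = snprintf(buf, len,
                     "movabsq $%" PRId64 ", %%%s\n"
                     "movsd 0(%%%s,%%%s), %%xmm0\n",
                     off, scratch, base, scratch);
        used_scratch = 1;
    }
    if (n < 0 || (size_t)n >= len)
        return -1;
    return used_scratch;
}
```

Called as `emit_stack_load(buf, sizeof buf, "rbp", "rdx", -4294969640LL)`, this takes the second branch and produces the same two-instruction shape as the gcc output quoted above. The sketch takes the scratch register as a given; it says nothing about how a compiler would find a free one at that point in the pipeline.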

Tx,
Cameron

Cameron McInally

2011-Sep-26 19:24 UTC

[LLVMdev] x86-64 large stack offsets

To be pedantic... use of the frame pointer isn't necessary. The stack
pointer would be fine. That's just how GCC calculates the offset for this
test case.

Jakob Stoklund Olesen

2011-Sep-26 20:34 UTC

[LLVMdev] x86-64 large stack offsets

On Sep 26, 2011, at 12:02 PM, Cameron McInally wrote:
>
> Here's one of the offending instructions produced by 2.9:
>
> movsd   -2147482472(%rsp), %xmm0
>
> Fixing the displacement overflow is pretty easy. It's just a matter of
> changing a few variable types in LLVM from unsigned to uint64_t in the
> functions that calculate the stack offsets. The real trouble I'm having
> is finding a good place to break up the displacements during lowering. I
> would like the offset to be calculated similar to gcc:
>
> movabsq $-4294969640, %rdx
> movsd     0(%rbp,%rdx), %xmm0
>
> Any suggestions on the correct lowering pass to do a transformation like
> this? I'm an LLVM noob, so I'm not sure where it should go.

Hi Cameron,

As you have noticed, the x86 backend only supports stack frames up to 2GB.

Fixing that would require the x86 backend to use the register scavenger during
prolog epilog insertion like the ARM backend does.  That particular code was
very difficult to get right, and no one has thought it was worth the trouble to
get it working for x86.

Your life will be a whole lot easier if you just use malloc().

/jakob
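
The malloc() suggestion can be sketched against the original test case roughly as follows. This is an illustration, not code from the thread: `fool_heap` and its `count` parameter are made up here (the original used a fixed 268435600), and calloc() is used so that the elements read by printf() are defined even when `n` is small.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Heap-backed variant of the test case. `count` is the number of doubles per
 * array (a parameter added for illustration). Returns 0 on success, -1 if
 * either allocation fails.
 */
static int fool_heap(long n, size_t count)
{
    double *w;
    double *z;
    long i;

    assert(count >= 2);                      /* w[1] and z[1] are read below */
    assert(n >= 0 && (size_t)n <= count);    /* loop must stay in bounds */

    w = calloc(count, sizeof *w);
    z = calloc(count, sizeof *z);
    if (w == NULL || z == NULL) {
        free(w);                             /* free(NULL) is a no-op */
        free(z);
        return -1;
    }

    for (i = 0; i < n; i++) {
        w[i] = 1.0;
        z[i] = 2.0;
    }
    printf(" n: %ld, W %g Z %g\n", n, w[1], z[1]);

    free(w);
    free(z);
    return 0;
}
```

With the arrays on the heap, the stack frame holds only two pointers and a few scalars, so every stack offset stays far inside the 32-bit displacement range. The price is an explicit failure path for allocation, which the stack version did not have.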

David A. Greene

2011-Sep-26 21:46 UTC

[LLVMdev] x86-64 large stack offsets

Jakob Stoklund Olesen <stoklund at 2pi.dk> writes:

Hi Jakob,

Thanks for the responses.

> As you have noticed, the x86 backend only supports stack frames up to 2GB.

That's unfortunate.  :(

> Fixing that would require the x86 backend to use the register
> scavenger during prolog epilog insertion like the ARM backend does.

Makes sense.

> That particular code was very difficult to get right, and no one has
> thought it was worth the trouble to get it working for x86.

I wouldn't imagine so, since these kinds of large stack objects are
rather rare in the C world.  They are somewhat more common in the
Fortran world.  :)

> Your life will be a whole lot easier if you just use malloc().

Perhaps.  This is customer-written code and they will (probably) not be
willing to change it.  We could replace the allocas with malloc/free
under the hood but we haven't needed to do that on past platforms.  It's
certainly a mildly large change in our compiler in the sense of how
resources get allocated.  It is certainly doable but for various reasons
may be undesirable.

Do you have a feel for the complexity involved with the ARM code?  What
were the troublesome parts and corner cases, etc.?

                             -Dave
