Hi,

I'd like to use LLVM to compile and optimise code when I don't know
whether the target CPU is big- or little-endian. This would allow me
to create a single optimised LLVM bitcode binary of an application,
and then run it through a JIT compiler on systems of differing
endianness.

I realise that in general the LLVM IR depends on various
characteristics of the target; I'd just like to be able to remove
this dependency for the specific case of unknown target endianness.

Here's a sketch of how it would work:

1. Extend TargetData::isBigEndian() and LLVM bitcode's "target data
layout string" so that endianness is represented as either big,
little or unknown. (I see there's already support for something like
this in Module::getEndianness().)

2. For optimisations (like parts of SRA) that depend on knowing the
target endianness, restrict or disable them as necessary if the
target endianness is unknown. I think this will only affect a small
handful of optimisations.

3. In llvm-gcc, if the LLVM backend reports unknown endianness, make
sure that the conversion from GCC trees to LLVM IR doesn't depend on
endianness. This seems to be fairly straightforward, *except* for
access to bitfields, which is a bit convoluted.

4. In llvm-gcc, if the LLVM backend reports unknown endianness, make
sure that GCC's optimisations on trees don't depend on endianness.

5. Have the linker refuse to link a big-endian module with a
little-endian one, but allow linking a module of unknown endianness
with a module of any endianness at all. (I think this might work
already.)

I'm already working on this myself. Would you be interested in having
this work contributed back to LLVM?

Thanks,
Jay.
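A minimal C++ sketch of what steps 1, 2 and 5 above might look like.
All names here (Endianness, parseEndianness, canReorderBytes,
canLink) are hypothetical, not LLVM's actual API; the data layout
letters 'E' (big-endian) and 'e' (little-endian) do match LLVM's
target data layout string.

    #include <string>

    // Three-state endianness, extending the two-state
    // TargetData::isBigEndian().
    enum Endianness { BigEndian, LittleEndian, UnknownEndian };

    // Step 1: read the endianness letter of a target data layout
    // string: 'E' = big, 'e' = little, anything else = unknown.
    Endianness parseEndianness(const std::string &Layout) {
      if (!Layout.empty() && Layout[0] == 'E')
        return BigEndian;
      if (!Layout.empty() && Layout[0] == 'e')
        return LittleEndian;
      return UnknownEndian;
    }

    // Step 2: an optimisation that reasons about byte order (e.g.
    // parts of SRA) bails out conservatively on unknown endianness.
    bool canReorderBytes(Endianness E) {
      return E != UnknownEndian;
    }

    // Step 5: the linker rejects big-vs-little mixes, but a module
    // of unknown endianness links with anything.
    bool canLink(Endianness A, Endianness B) {
      return A == B || A == UnknownEndian || B == UnknownEndian;
    }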
I would find this functionality useful if it made it back into trunk.

scott

On Tue, Oct 21, 2008 at 2:27 AM, Jay Foad <jay.foad at gmail.com> wrote:
> I'd like to use LLVM to compile and optimise code when I don't know
> whether the target CPU is big- or little-endian. This would allow me
> to create a single optimised LLVM bitcode binary of an application,
> and then run it through a JIT compiler on systems of differing
> endianness.
> [...]
On Oct 21, 2008, at 2:27 AM, Jay Foad wrote:
> I'd like to use LLVM to compile and optimise code when I don't know
> whether the target CPU is big- or little-endian. This would allow me
> to create a single optimised LLVM bitcode binary of an application,
> and then run it through a JIT compiler on systems of differing
> endianness.

Ok.

> I realise that in general the LLVM IR depends on various
> characteristics of the target; I'd just like to be able to remove
> this dependency for the specific case of unknown target endianness.

Sure. In practice, it should be possible to produce
target-independent LLVM IR if you have a target-independent input
language. The trick is making it so that the optimizers preserve this
property. Endianness is only one piece of this puzzle.

> 3. In llvm-gcc, if the LLVM backend reports unknown endianness, make
> sure that the conversion from GCC trees to LLVM IR doesn't depend on
> endianness. This seems to be fairly straightforward, *except* for
> access to bitfields, which is a bit convoluted.

This will never work for llvm-gcc. Too much target-specific stuff is
already folded before the llvm backend is even involved.

> I'm already working on this myself. Would you be interested in having
> this work contributed back to LLVM?

If this were to better support target independent languages, it would
be very useful. If you're just trying to *reduce* the endianness
assumptions that leak through, I don't think it's a good approach.
There is just no way to solve this problem with C. By the time the
preprocessor has run, your C code has already had #ifdef
__LITTLE_ENDIAN__ etc evaluated, for example. How do you propose to
handle things like:

struct foo {
#ifdef __LITTLE_ENDIAN__
  int x, y;
#else
  int y, x;
#endif
};

-Chris
>> I'm already working on this myself. Would you be interested in having
>> this work contributed back to LLVM?
>
> If this were to better support target independent languages, it would
> be very useful. If you're just trying to *reduce* the endianness
> assumptions that leak through, I don't think it's a good approach.
> There is just no way to solve this problem with C.

Yes, I can see that the llvm part of this is more straightforward and
less controversial than the llvm-gcc part. Maybe I should submit the
llvm part (since it applies to all source languages) and keep the
llvm-gcc part as a local hack.

> How do you propose to handle things like:
>
> struct foo {
> #ifdef __LITTLE_ENDIAN__
>   int x, y;
> #else
>   int y, x;
> #endif
> };

I can't make all C programs work regardless of target endianness.
This one will only work on little-endian:

    int x = 1;
    assert(*(char *)&x == 1);

You've just highlighted another restriction that I'll have to impose:
you shouldn't expect to be able to detect target endianness at
compile time. All I want is that, if you write your source code so
that it doesn't make assumptions about endianness, then the compiler
and its optimisations won't introduce any new assumptions about
endianness.

Thanks,
Jay.
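A short sketch of the distinction Jay is drawing, using nothing
beyond standard C/C++ (the helper names are illustrative). The first
accessor is endianness-neutral because it assembles the value from
bytes by arithmetic; the second depends on the target's byte order,
just like the assert example above, so a compiler honouring "unknown
endianness" could not fold one into the other.

    #include <cstdint>
    #include <cstring>

    // Endianness-neutral: the result is defined purely by shifts
    // and ORs, so it means the same thing on any target.
    uint32_t load_le32(const unsigned char *p) {
      return (uint32_t)p[0]
           | (uint32_t)p[1] << 8
           | (uint32_t)p[2] << 16
           | (uint32_t)p[3] << 24;
    }

    // NOT endianness-neutral: the result depends on how the target
    // lays out the bytes of a uint32_t in memory.
    uint32_t load_punned32(const unsigned char *p) {
      uint32_t v;
      std::memcpy(&v, p, sizeof v);
      return v;
    }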
Hi,

Chris Lattner wrote:
> How do you propose to handle things like:
>
> struct foo {
> #ifdef __LITTLE_ENDIAN__
>   int x, y;
> #else
>   int y, x;
> #endif
> };

Define a fixed endianness as in-memory representation and let the
optimizer at JIT time optimize the shuffling access patterns away, if
possible. E.g. either all big-endian as it's the network order, or
all little-endian because more "not so well written" software would
continue to just work.

--
René Rebe - ExactCODE GmbH - Europe, Germany, Berlin
http://exactcode.de | http://t2-project.org | http://rene.rebe.name
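A sketch of what René is suggesting, with hypothetical helper names:
the bitcode always touches memory through fixed-endian accessors
(big-endian / network order here). On a big-endian host an optimizer
can collapse the shifts into a single 32-bit load or store; on a
little-endian host they become a load or store plus a byte swap.

    #include <cstdint>

    // Fixed big-endian in-memory representation, regardless of the
    // host's native byte order.
    uint32_t load_be32(const unsigned char *p) {
      return (uint32_t)p[0] << 24
           | (uint32_t)p[1] << 16
           | (uint32_t)p[2] << 8
           | (uint32_t)p[3];
    }

    void store_be32(unsigned char *p, uint32_t v) {
      p[0] = (unsigned char)(v >> 24);
      p[1] = (unsigned char)(v >> 16);
      p[2] = (unsigned char)(v >> 8);
      p[3] = (unsigned char)v;
    }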