thr3ads.net - llvm dev - [LLVMdev] Architecture Dependency of LLVM bitcode (was Re: compile linux kernel) [Sep 2008]

Christian Plessl

2008-Sep-29 13:18 UTC

[LLVMdev] Architecture Dependency of LLVM bitcode (was Re: compile linux kernel)

On 29.09.2008, at 11:53, Jonathan S. Shapiro wrote:
> Watching this thread, it occurs to me that the "V" in "LLVM" is creating
> confusion. So far as I know, LLVM is the first project to use "virtual"
> to refer to the instruction set of the intermediate form. I understand
> why this labeling made sense (sort of), but it was unfortunate. The
> machine is abstract, not virtual, and the use of "virtual" here is so
> out of keeping with any other use of the term that it really does
> generate confusion.

The question of whether LLVM bitcode is independent of the target platform
has been raised several times on the mailing list, but it has never been
discussed in detail. I would appreciate learning more about the
following questions:

- Is the architecture dependence of LLVM IR only an artifact of
  llvm-gcc producing architecture dependent results?
- What architecture-specific features are present in the IR that
  prevent running the same LLVM bitcode on different architectures?

> Is this worth a FAQ entry?

I would definitely appreciate such a FAQ entry.

Best regards,
  Christian

Andrew Lenharth

2008-Sep-29 13:46 UTC

[LLVMdev] Architecture Dependency of LLVM bitcode (was Re: compile linux kernel)

On Mon, Sep 29, 2008 at 8:18 AM, Christian Plessl
<christian.plessl at uni-paderborn.de> wrote:
> - Is the architecture dependence of LLVM IR only an artifact of
> llvm-gcc producing architecture dependent results?
No.
It is also an artifact of code that compiles in architecture- and
OS-dependent features based on what it detects at configure time.
It is an artifact of compiling non-type safe languages.
It is an artifact of system headers including inline asm.
It is an artifact of the endianness of the system.
> - What architecture-specific features are present in the IR that
> prevent running the same LLVM bitcode on different architectures?
A better question is: what architecture-abstracting features would
make writing target independent LLVM bitcode easier? There is 1 that
I think is critical, and 3 more that would make life much easier
(though technically redundant).

hton and ntoh intrinsics. These are needed to allow target code to
deal with endianness in a target independent way. (Ok, you could
potentially write code that detected endianness at runtime and chose
multiversioned code based on that, but that is ugly and
optimization-prohibiting.)

Redundant, but greatly simplifying:

iptr aliased type. There are legitimate cases where you want to
perform arithmetic and comparisons on pointers that the semantics of
GEP make illegal, so the only way to do so in a target independent way
is to either cast to an int you hope is at least as wide as any pointer,
or violate the GEP semantics (which generally works).

GBP instruction (GetBasePointer). The inverse of a GEP. A GEP
selects an offset into an object in a target independent way based on
the type. What GBP would do is get a pointer to the base of an object
given a pointer to a field, a type, and the same specifier that the GEP
would use to get the field.
x == GBP((GEP x, 0, 1, 1), typeof(x), 0, 1, 1)
This would mean that upcasts, or any conversion from an embedded object
to a parent object, would not need arch dependent offsets and raw pointer
manipulation. (Yes, you could figure out the offset with the GEP off null
trick and use raw pointer manipulation and casts.)
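
For comparison, this is roughly what the field-to-parent conversion looks like at the C level today. It is only a sketch: the struct names and outer_from_inner are made up for illustration, and the proposed GBP instruction does not exist. The point is that offsetof has the compiler fill in a target-specific constant, which is exactly what ends up baked into the bitcode.

```c
#include <assert.h>
#include <stddef.h>  /* offsetof, NULL */

/* Hypothetical types, used only for illustration. */
struct inner { int a; int b; };
struct outer { char tag; struct inner in; };

/* Recover the enclosing struct from a pointer to its embedded field.
   offsetof makes the compiler supply the target-specific offset, and the
   subtraction is the raw pointer manipulation a GBP would hide. */
static struct outer *outer_from_inner(struct inner *field)
{
    assert(field != NULL);
    return (struct outer *)((char *)field - offsetof(struct outer, in));
}
```

Given a struct outer o, outer_from_inner(&o.in) yields &o again, but the byte offset it subtracts depends on the padding rules of the target.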

sizeof instruction. Again, you can use the GEP off null trick; this
isn't very obvious, but at least it doesn't involve raw pointer
manipulation.

Andrew

Sherief N. Farouk

2008-Sep-29 19:03 UTC

[LLVMdev] Architecture Dependency of LLVM bitcode (was Re: compile linux kernel)

> hton and ntoh intrinsics.  These are needed to allow target code to
> deal with endianness in a target independent way.  (Ok, you could
> potentially write code that detected endianness at runtime and chose
> multiversioned code based on that, but that is ugly and
> optimization-prohibiting.)
> 
Why not add types with explicit endianness? A trick I use for reading binary
files across platforms is to define the types int32, int32_le and int32_be:
int32 is platform-native, _le and _be are little and big endian,
respectively. I use an #ifdef in my types.hpp to determine which of _le and
_be is a typedef for the standard uint32, and the other is implemented as a
class with operator int32(). "add i32_be %X, 8" looks elegant to me, and
quite easy (for someone writing the IR output to a text file, like me :) to
bolt on to existing code.
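
A minimal C sketch of the explicit-endianness idea, for readers who want something concrete. The names u32_be, u32_be_get, u32_be_set and u32_be_add are hypothetical and not taken from the types.hpp mentioned above. Unlike the #ifdef approach, this version always goes through byte accessors, so it needs no knowledge of the host byte order.

```c
#include <assert.h>
#include <stddef.h>  /* NULL */
#include <stdint.h>

/* Hypothetical type: a 32-bit value whose in-memory byte order is fixed
   as big-endian, whatever the host's native order is. */
typedef struct { uint8_t b[4]; } u32_be;

/* Read the value, most significant byte first. */
static uint32_t u32_be_get(const u32_be *v)
{
    assert(v != NULL);
    return ((uint32_t)v->b[0] << 24) | ((uint32_t)v->b[1] << 16) |
           ((uint32_t)v->b[2] << 8)  |  (uint32_t)v->b[3];
}

/* Store the value, most significant byte first. */
static void u32_be_set(u32_be *v, uint32_t x)
{
    assert(v != NULL);
    v->b[0] = (uint8_t)(x >> 24);
    v->b[1] = (uint8_t)(x >> 16);
    v->b[2] = (uint8_t)(x >> 8);
    v->b[3] = (uint8_t)x;
}

/* Rough analogue of the "add i32_be %X, 8" idea: decode, add, re-encode. */
static u32_be u32_be_add(u32_be v, uint32_t k)
{
    u32_be r;
    u32_be_set(&r, u32_be_get(&v) + k);
    return r;
}
```

Because the byte order is part of the type, the stored bytes are the same on every host; only the cost of the conversion differs.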

- Sherief

Eli Friedman

2008-Sep-29 19:28 UTC

[LLVMdev] Architecture Dependency of LLVM bitcode (was Re: compile linux kernel)

On Mon, Sep 29, 2008 at 6:46 AM, Andrew Lenharth <andrewl at lenharth.org> wrote:
> hton and ntoh intrinsics.
You can write these portably already; just store to an i32, cast the
pointer to i8*, read out the bytes, then reconstruct the i32.  If I
recall correctly, scalarrepl+instcombine should be able to eliminate
the abstraction if they have target information.
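
The same recipe written in C rather than IR, as a sketch. The function names are made up for illustration, and whether a given optimizer folds this down to a no-op or a single byte swap is not something this example shows.

```c
#include <assert.h>
#include <stdint.h>

/* Store the value, view it as bytes, and rebuild it reading the byte at
   the lowest address as the most significant. On a big-endian host this
   is the identity; on a little-endian host it is a byte swap. Either way
   the result, once stored, is laid out in network (big-endian) order. */
static uint32_t portable_hton32(uint32_t host)
{
    const unsigned char *p = (const unsigned char *)&host;
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Usage check: the in-memory bytes of the result are 01 02 03 04 on any
   host byte order. */
static void portable_hton32_check(void)
{
    uint32_t net = portable_hton32(0x01020304u);
    const unsigned char *q = (const unsigned char *)&net;
    assert(q[0] == 0x01 && q[1] == 0x02 && q[2] == 0x03 && q[3] == 0x04);
}
```

Applying the function twice returns the original value, so the same code also serves as ntoh for 32-bit values.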

-Eli

Chris Lattner

2008-Sep-29 21:19 UTC

[LLVMdev] Architecture Dependency of LLVM bitcode

On Sep 29, 2008, at 6:18 AM, Christian Plessl wrote:
> On 29.09.2008, at 11:53, Jonathan S. Shapiro wrote:
>> Watching this thread, it occurs to me that the "V" in "LLVM" is creating
>> confusion. So far as I know, LLVM is the first project to use "virtual"
>> to refer to the instruction set of the intermediate form. I understand
>> why this labeling made sense (sort of), but it was unfortunate. The
>> machine is abstract, not virtual, and the use of "virtual" here is so
>> out of keeping with any other use of the term that it really does
>> generate confusion.
>
> The topic whether LLVM bitcode is independent of the target platform
> was raised several times on the mailing list,

Wow, there is a lot of FUD and misinformation on this thread.

> but it was never
> discussed in detail. I would appreciate learning more about the
> following questions:
>
> - Is the architecture dependence of LLVM IR only an artifact of
> llvm-gcc producing architecture dependent results?

No, it is inherent to any C compiler.  The preprocessor introduces
target-specific details and things just go downhill from there:
http://llvm.org/docs/tutorial/LangImpl8.html#targetindep

If you start from a target-independent *language*, you can generate  
target independent LLVM IR.
> - What architecture-specific features are present in the IR that
> prevent running the same LLVM bitcode on different architectures?
Many things are target independent, but the most significant is that  
LLVM allows unrestricted pointer casting.  An example that allows the  
programmer to "see" the underlying endianness of the target is C code
like this:

int X = ...
char C = *(char*)&X

A language that is target independent (Java, Perl, ...) would not
allow the programmer to express such things.

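
Spelled out as a complete C sketch, the pointer-cast example above looks like this. The function names are hypothetical, and the check assumes a 32-bit int with either plain little-endian or plain big-endian layout.

```c
#include <assert.h>

/* Read the byte stored at the lowest address of an int. The pointer cast
   is legal C, and it exposes how the target lays the int out in memory. */
static unsigned char lowest_address_byte(int x)
{
    return *(unsigned char *)&x;
}

/* The same source gives a different answer depending on the target. */
static int host_is_little_endian(void)
{
    return lowest_address_byte(1) == 1;
}

/* Assuming a 32-bit int: 0x04 on little-endian hosts, 0x01 on big-endian
   hosts. Both answers are correct for their target. */
static void endian_dependence_check(void)
{
    unsigned char c = lowest_address_byte(0x01020304);
    assert(c == 0x04 || c == 0x01);
}
```

Once a front end has lowered code like this to IR, the load of a single byte through the cast pointer is fixed in the bitcode, and its result still depends on the machine that finally runs it.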
>> Is this worth a FAQ entry?
>
> I would definitely appreciate such a FAQ entry.
Patches welcome :)

-Chris

Christian Plessl

2008-Sep-30 19:02 UTC

[LLVMdev] Architecture Dependency of LLVM bitcode

Thanks to everyone for these helpful answers. At least to me, the causes
of architecture dependencies in LLVM IR are much clearer now.

On 29.09.2008, at 23:19, Chris Lattner wrote:
>>> Is this worth a FAQ entry?
>> I would definitely appreciate such a FAQ entry.
> Patches welcome :)
>
> Many things are target independent, but the most significant is that
> LLVM allows unrestricted pointer casting.  An example that allows the
> programmer to "see" the underlying endianness of the target is C code
> like this:
>
> int X = ...
> char C = *(char*)&X

I don't feel sufficiently confident with the matter to write such a  
FAQ entry myself.

But wouldn't it make sense to move the notes on target independence
from the Kaleidoscope tutorial
(http://llvm.org/docs/tutorial/LangImpl8.html#targetindep) to the FAQ
page? In my opinion, these explanations and the additional endianness
example you gave explain the issues with target dependencies quite well.

Best regards,
  Christian
