thr3ads.net - llvm dev - [LLVMdev] LLVM IR is a compiler IR [Oct 2011]


Duncan Sands

2011-Oct-05 08:53 UTC

[LLVMdev] LLVM IR is a compiler IR

Hi Óscar,
> There are places where compatibility with the native C ABI is taken too
> far. For instance, time ago I noted that what the user sets through
> Module::setDataLayout is simply ignored.
it's not ignored, it's used by the IR level optimizers.  That way these
optimizers can know stuff about the target without having to be linked
to a target backend.

> LLVM uses the data layout
> required by the native C ABI, which is hardcoded into LLVM's source
> code. So I asked: pass the value set by Module::setDataLayout to the
> layers that are interested in it, as any user would expect.
There are two classes of information in datalayout: things which correspond
to stuff hard-wired into the target processor (for example that x86 is little
endian), and stuff which is not hard-wired in (for example the alignment of
x86 long double, which is 4 or 8 bytes on x86-32 depending on whether you are
on linux, darwin or windows).  Hoping to have code generators override the
hard-wired stuff if they see something different in the data layout is just
too much to ask for - eg the x86 code generators are never going to produce big
endian code just because you set big-endianness in the datalayout.  Even the
second class of "soft" parameters is not completely flexible: for example most
processors enforce a minimum alignment for types, and trying to reduce it by
giving types a lesser alignment in the datalayout just isn't going to work.
So given that the ways in which codegen could adapt to various datalayout
settings are quite limited and constrained by the target, does it really make
sense to try to parametrize the codegenerators by the datalayout at all?
In any case, it might be good if the code generators produced a warning if they
see that the datalayout string doesn't correspond to what codegen thinks it
should be (I thought someone added that already?).

Ciao, Duncan.
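
The distinction above between hard-wired and "soft" datalayout fields, and the suggested codegen warning, can be sketched roughly as follows. This is only an illustration, not LLVM's parser or any existing LLVM check: the function names `parse_datalayout` and `check_against_target` are hypothetical, only a small subset of the datalayout syntax is handled (`e`/`E` for endianness and `i`/`f`/`v` entries with an ABI alignment in bits), and the two example strings are made up rather than taken from a real target.

```python
# Sketch only: a tiny parser for a subset of the datalayout string syntax.
# parse_datalayout and check_against_target are hypothetical helper names.

def parse_datalayout(layout):
    """Return {'endian': 'little' | 'big' | None, 'abi_align': {type_key: bits}}."""
    info = {"endian": None, "abi_align": {}}
    for spec in layout.split("-"):
        if spec == "e":
            info["endian"] = "little"
        elif spec == "E":
            info["endian"] = "big"
        elif spec and spec[0] in "ifv" and ":" in spec:
            fields = spec.split(":")
            # fields[0] is the type key (e.g. "f80"), fields[1] the ABI alignment in bits.
            info["abi_align"][fields[0]] = int(fields[1])
    return info


def check_against_target(module_layout, target_layout):
    """List every field where the module's datalayout disagrees with the target's."""
    mod = parse_datalayout(module_layout)
    tgt = parse_datalayout(target_layout)
    warnings = []
    # Endianness is hard-wired into the processor: a mismatch cannot be honoured.
    if mod["endian"] != tgt["endian"]:
        warnings.append("endianness: module says %s, target is %s"
                        % (mod["endian"], tgt["endian"]))
    # Alignments are "soft", but a mismatch still means the layers disagree.
    for key in sorted(tgt["abi_align"]):
        tgt_bits = tgt["abi_align"][key]
        mod_bits = mod["abi_align"].get(key, tgt_bits)
        if mod_bits != tgt_bits:
            warnings.append("%s ABI alignment: module says %d bits, target uses %d"
                            % (key, mod_bits, tgt_bits))
    return warnings


if __name__ == "__main__":
    # Illustrative strings only, not verified against any real target.
    target = "e-p:32:32:32-i64:32:64-f80:32:32"
    module = "E-p:32:32:32-i64:32:64-f80:64:64"
    for message in check_against_target(module, target):
        print("warning:", message)
```

A check of this kind only reports the disagreement; whether codegen should then adapt to the module's string is exactly the design question raised in the message.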

> The response I got was, in essence, "As you are not working on C/C++, I
> couldn't care less about your language's requirements." So I have a two-line
> patch on my LLVM local copy, which has the effect of making the IR code
> generated by my compiler portable across Linux/x86 and Windows/x86 (although
> that was not the reason I wanted the change.)
>
> So it is true that LLVM IR has portability limitations, but not all of
> them are intrinsic to the LLVM IR nature.
>
> [snip]

Óscar Fuentes

2011-Oct-05 11:32 UTC


[LLVMdev] LLVM IR is a compiler IR

Hello Dan.

Duncan Sands <baldrick at free.fr> writes:
>> There are places where compatibility with the native C ABI is taken too
>> far. For instance, time ago I noted that what the user sets through
>> Module::setDataLayout is simply ignored.
>
> it's not ignored, it's used by the IR level optimizers.  That way these
> optimizers can know stuff about the target without having to be linked
> to a target backend.
Well, it is used by one layer, ignored by another. Anyways LLVM is not
doing what the user expects.
>> LLVM uses the data layout
>> required by the native C ABI, which is hardcoded into LLVM's source
>> code. So I asked: pass the value set by Module::setDataLayout to the
>> layers that are interested in it, as any user would expect.
>
> There are two classes of information in datalayout: things which correspond
> to stuff hard-wired into the target processor (for example that x86 is little
> endian), and stuff which is not hard-wired in (for example the alignment of
> x86 long double, which is 4 or 8 bytes on x86-32 depending on whether you are
> on linux, darwin or windows).  Hoping to have code generators override the
> hard-wired stuff if they see something different in the data layout is just
> too much to ask for - eg the x86 code generators are never going to produce big
> endian code just because you set big-endianness in the datalayout.  Even the
> second class of "soft" parameters is not completely flexible: for example most
> processors enforce a minimum alignment for types, and trying to reduce it by
> giving types a lesser alignment in the datalayout just isn't going to work.
> So given that the ways in which codegen could adapt to various datalayout
> settings are quite limited and constrained by the target, does it really make
> sense to try to parametrize the codegenerators by the datalayout at all?
> In any case, it might be good if the code generators produced a warning if they
> see that the datalayout string doesn't correspond to what codegen thinks it
> should be (I thought someone added that already?).
You focus your reasoning on possible wrong uses of the data layout
setting (endianness) when, as you say, there are other uses which are
perfectly legit (using a specific alignment within the limits allowed by
the processor.)  So if I need to align my data in a different way from
what the C ABI requires or generate code for a platform that LLVM still
does not know about, my only solution is to patch LLVM because the value
set through one of its APIs is ignored in key places, as LLVM assumes
that everybody wants full interoperability with C. This is the kind of
logic that tells me that LLVM is a C-obsessed project: any requirement
that falls outside the needs of a C compiler writer is seen as
superfluous even if it does not conflict with the rest of LLVM.
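
To make concrete why an alignment disagreement between layers matters, here is a minimal sketch of how struct field offsets follow from ABI alignments. It is not LLVM code: `align_up` and `struct_layout` are hypothetical helpers, the example struct is invented, and the 4-byte and 8-byte alignments for the 10-byte x86 long double are simply the two values mentioned earlier in the thread. The sketch assumes each field's allocated size is its store size rounded up to its ABI alignment.

```python
# Sketch only: compute field offsets for a non-packed struct from ABI alignments.
# align_up and struct_layout are hypothetical helper names.

def align_up(offset, alignment):
    """Round offset up to the next multiple of alignment (both in bytes)."""
    return (offset + alignment - 1) // alignment * alignment


def struct_layout(fields, abi_align):
    """fields: list of (name, type_key, store_size_bytes).
    abi_align: dict mapping type_key to ABI alignment in bytes.
    Returns (offsets, total_size)."""
    offsets = {}
    offset = 0
    max_align = 1
    for name, type_key, store_size in fields:
        alignment = abi_align[type_key]
        offset = align_up(offset, alignment)   # pad so the field is aligned
        offsets[name] = offset
        # Assumption: allocated size is the store size rounded up to the alignment.
        offset += align_up(store_size, alignment)
        max_align = max(max_align, alignment)
    # The whole struct is padded to its strictest member alignment.
    return offsets, align_up(offset, max_align)


if __name__ == "__main__":
    # An invented struct: a one-byte tag followed by an x86 long double.
    fields = [("tag", "i8", 1), ("value", "f80", 10)]
    for f80_align in (4, 8):
        offsets, size = struct_layout(fields, {"i8": 1, "f80": f80_align})
        print("f80 align", f80_align, "->", offsets, "size", size)
```

If one layer computes offsets with one alignment table and another layer uses a different one, they disagree about where `value` lives, which is the kind of mismatch under discussion here.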

Duncan Sands

2011-Oct-05 12:01 UTC


[LLVMdev] LLVM IR is a compiler IR

Hi Oscar,
>>> There are places where compatibility with the native C ABI is taken too
>>> far. For instance, time ago I noted that what the user sets through
>>> Module::setDataLayout is simply ignored.
>>
>> it's not ignored, it's used by the IR level optimizers.  That way these
>> optimizers can know stuff about the target without having to be linked
>> to a target backend.
>
> Well, it is used by one layer, ignored by another. Anyways LLVM is not
> doing what the user expects.
it's not doing what *you* expect: it doesn't match your mental model of what
it is for (or should be for).  The question is whether LLVM should be changed
or your expectations should be changed.  Just observing the mismatch between
your expectations and current reality is not in itself an argument that LLVM
should be changed.
>>> LLVM uses the data layout
>>> required by the native C ABI, which is hardcoded into LLVM's source
>>> code. So I asked: pass the value set by Module::setDataLayout to the
>>> layers that are interested in it, as any user would expect.
>>
>> There are two classes of information in datalayout: things which correspond
>> to stuff hard-wired into the target processor (for example that x86 is little
>> endian), and stuff which is not hard-wired in (for example the alignment of
>> x86 long double, which is 4 or 8 bytes on x86-32 depending on whether you are
>> on linux, darwin or windows).  Hoping to have code generators override the
>> hard-wired stuff if they see something different in the data layout is just
>> too much to ask for - eg the x86 code generators are never going to produce big
>> endian code just because you set big-endianness in the datalayout.  Even the
>> second class of "soft" parameters is not completely flexible: for example most
>> processors enforce a minimum alignment for types, and trying to reduce it by
>> giving types a lesser alignment in the datalayout just isn't going to work.
>> So given that the ways in which codegen could adapt to various datalayout
>> settings are quite limited and constrained by the target, does it really make
>> sense to try to parametrize the codegenerators by the datalayout at all?
>> In any case, it might be good if the code generators produced a warning if they
>> see that the datalayout string doesn't correspond to what codegen thinks it
>> should be (I thought someone added that already?).
>
> You focus your reasoning on possible wrong uses of the data layout
> setting (endianness) when, as you say, there are other uses which are
> perfectly legit (using a specific alignment within the limits allowed by
> the processor.)  So if I need to align my data in a different way from
> what the C ABI requires or generate code for a platform that LLVM still
> does not know about, my only solution is to patch LLVM because the value
> set through one of its APIs is ignored in key places, as LLVM assumes
> that everybody wants full interoperability with C. This is the kind of
> logic that tells me that LLVM is a C-obsessed project: any requirement
> that falls outside the needs of a C compiler writer is seen as
> superfluous even if it does not conflict with the rest of LLVM.
You are talking to the wrong person: I pretty much only use Ada not C, so I
don't think I'm C obsessed.  Yet I never had any problems using LLVM with Ada.
LLVM gives you several mechanisms for aligning things the way you like.  Are
they inadequate?  Do you have a specific example of something you find
problematic?

Ciao, Duncan.

Duncan Sands

2011-Oct-05 12:04 UTC


[LLVMdev] LLVM IR is a compiler IR

>> So given that the ways in which codegen could adapt to various datalayout
>> settings are quite limited and constrained by the target, does it really make
>> sense to try to parametrize the codegenerators by the datalayout at all?
PS: This wasn't a rhetorical question, i.e. I wasn't saying that what you are
looking for is wrong.  It was a real question about the design of LLVM.
