thr3ads.net - llvm dev - [LLVMdev] First-class aggregate semantics [Jan 2010]


Dustin Laurence

2010-Jan-07 21:56 UTC

[LLVMdev] First-class aggregate semantics

On 01/07/2010 01:38 PM, David Greene wrote:
> The way this works on many targets is that the caller allocates stack
> space in its frame for the returned struct and passes a pointer to it
> as a first "hidden" argument to the callee.  The callee then copies
> that data into the space pointed to by the address.
<nod>
> Long-term, first-class status means that returns of structs should
> "just work" and you don't need to worry about getting a pointer to
> invalid memory.
OK, so my thought of constructing the object on the stack was correct?
What I originally wanted to do was roughly

    %Token = type {%c_int, %i8*}

    define %Token @foo()
    {
        ...

        ret %Token {%c_int %token, %i8* %value}
    }

but the compiler complains about the invalid usage of a local name.  So
I decided the problem was that I was thinking in terms of languages that
would create a temporary implicitly, and in IR I need to do it
explicitly.  So it occurred to me to create the struct on the stack, as
I mentioned.

What bothers me about that is the explicit specification with alloca
that the space is reserved in the callee's frame.  Do I just trust the
optimizer to eliminate that and turn the reference to alloca'd memory
into a reference to the space reserved by the caller?  Or is that going
to create an unnecessary copy from the alloca'd memory to that reserved
by the caller?  From what you said my guess is the former (optimizer
eliminates the pointless temporary), but us premature optimizers like to
be reassured we haven't given up an all-important microsecond. :-)
> ...I believe right now, however, only structs up to a
> certain size are supported, perhaps because under some ABIs, small
> structs can be returned in registers and one doesn't need to worry
> about generating the hidden argument.
In the case that prompted the question the struct isn't going to be
bigger than two of whatever the architecture regards as a word, which
surely should be fine, but in principle shouldn't LLVM and not the
front-end programmer be making the decision about whether the struct is
big enough to spill into memory?

Dustin

Alastair Lynn

2010-Jan-08 02:03 UTC

Hi Dustin-

You'll probably need to use insertvalue to construct your return value.

Alastair

On 7 Jan 2010, at 21:56, Dustin Laurence wrote:
>    define %Token @foo()
>    {
>        ...
> 
>        ret %Token {%c_int %token, %i8* %value}
>    }
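
A minimal sketch of the insertvalue approach, written in the syntax of recent LLVM releases (opaque `ptr` type) rather than the 2010 syntax quoted above. The field types are assumptions: `%c_int` and `%i8*` from the original are replaced by `i32` and `ptr`, and the two values arrive as function arguments because the original body is elided.

```llvm
; Assumed stand-in for the %Token type in the thread.
%Token = type { i32, ptr }

define %Token @foo(i32 %token, ptr %value) {
entry:
  ; A constant such as {i32 %token, ...} cannot mention local names,
  ; so the struct value is built one field at a time instead.
  ; Each insertvalue yields a new SSA value; nothing is stored to memory here.
  %t0 = insertvalue %Token undef, i32 %token, 0
  %t1 = insertvalue %Token %t0, ptr %value, 1
  ret %Token %t1
}
```

Starting from `undef` is safe here only because both fields are overwritten before the value is returned.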

Jon Harrop

2010-Jan-08 02:52 UTC

On Thursday 07 January 2010 21:56:11 Dustin Laurence wrote:
> On 01/07/2010 01:38 PM, David Greene wrote:
> > The way this works on many targets is that the caller allocates stack
> > space in its frame for the returned struct and passes a pointer to it
> > as a first "hidden" argument to the callee.  The callee then copies
> > that data into the space pointed to by the address.
>
> <nod>
>
> > Long-term, first-class status means that returns of structs should
> > "just work" and you don't need to worry about getting a pointer to
> > invalid memory.
>
> OK, so my thought of constructing the object on the stack was correct?
No. The idea is that you pass the structs around as values and not that you 
alloca them and pass by reference/pointer.
> What bothers me about that is the explicit specification with alloca
> that the space is reserved in the callee's frame.
Yes. Don't do that.
> Do I just trust the 
> optimizer to eliminate that and turn the reference to alloca'd memory
> into a reference to the space reserved by the caller?
No. LLVM is trusting you not to return pointers to locals.
> Or is that going 
> to create an unnecessary copy from the alloca'd memory to that reserved
> by the caller?  From what you said my guess is the former (optimizer
> eliminates the pointless temporary), but us premature optimizers like to
> be reassured we haven't given up an all-important microsecond. :-)
I have had great success with my HLVM project by passing around large numbers 
of large structs by hand. LLVM has not only survived but actually generated 
decent code that beats most languages according to my benchmarks. In 
particular, HLVM uses "fat" quadword references (where word = sizeof(void*))
that are passed everywhere by value except when a struct is returned and HLVM 
gets the caller to alloca and passes that space by pointer to the callee for 
it to fill in.
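
A minimal sketch of the pattern described here, in which the caller reserves the space and the callee fills it in through a pointer. This is not HLVM's actual code: `%Pair`, `@make_pair` and `@caller` are hypothetical names, and the syntax is that of recent LLVM releases (opaque `ptr`). LLVM also has an `sret` parameter attribute for marking such a pointer; it is left out to keep the sketch small.

```llvm
; Hypothetical two-word struct used only for illustration.
%Pair = type { i64, i64 }

; The callee writes its result through the pointer supplied by the caller.
define void @make_pair(ptr %out, i64 %a, i64 %b) {
entry:
  %f0 = getelementptr inbounds %Pair, ptr %out, i32 0, i32 0
  store i64 %a, ptr %f0
  %f1 = getelementptr inbounds %Pair, ptr %out, i32 0, i32 1
  store i64 %b, ptr %f1
  ret void
}

define i64 @caller() {
entry:
  ; The space lives in the caller's frame, so it stays valid after the call.
  %slot = alloca %Pair
  call void @make_pair(ptr %slot, i64 1, i64 2)
  %f1 = getelementptr inbounds %Pair, ptr %slot, i32 0, i32 1
  %second = load i64, ptr %f1
  ret i64 %second
}
```

Because the alloca is in the caller, no pointer to a callee local ever escapes, which is the hazard discussed earlier in this message.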
> > ...I believe right now, however, only structs up to a
> > certain size are supported, perhaps because under some ABIs, small
> > structs can be returned in registers and one doesn't need to worry
> > about generating the hidden argument.
>
> In the case that prompted the question the struct isn't going to be
> bigger than two of whatever the architecture regards as a word, which
> surely should be fine, but in principle shouldn't LLVM and not the
> front-end programmer be making the decision about whether the struct is
> big enough to spill into memory?
Good question. There was a very interesting discussion about this here a while 
ago and everyone coming to LLVM says the same thing: why doesn't LLVM just 
handle this for me automatically? The answer is that LLVM cannot make that 
decision because it depends upon the ABI. C99 apparently returns user-defined 
structs of two doubles by reference but complex numbers in registers. So the 
ABI requires knowledge of the front-end and, therefore, LLVM cannot fully 
automate this.

Something LLVM could do is spill safely when it knows you don't care about the
foreign ABI (e.g. with fastcc) and that work is underway.

-- 
Dr Jon Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/?e

Dustin Laurence

2010-Jan-08 03:48 UTC

On 01/07/2010 06:03 PM, Alastair Lynn wrote:
>
> You'll probably need to use insertvalue to construct your return value.
Ah ha!

The fact is I didn't really understand the significance of this part
when I read it, and so didn't remember it when I needed it.  OK, so I
have tested it and I can now build up a struct like this

    %s1 = insertvalue {i32, i32} {i32 0, i32 0}, i32 1, 0 ; %s1 == {1,0}
    %s2 = insertvalue {i32, i32} %s1, i32 2, 1            ; %s2 == {1,2}

which reminds me of another thing I never understood.  I can't make my
code (slightly) more readable by changing that to something like

    %s0 = {i32 0, i32 0}
    %s1 = insertvalue {i32, i32} %s0, i32 1, 0     ; %s1 == {1,0}
    %s2 = insertvalue {i32, i32} %s1, i32 2, 1     ; %s2 == {1,2}

because LLVM will complain that it "expected instruction opcode" at the
assignment to %s0.  If there is a general way to give names to constants
in that way I didn't find it.  In fact, I think I tended not to use
temporaries like I would variables precisely because of this: I tried the
second alternative as the natural way to hand-code it, it didn't work,
and I didn't think of how to phrase it so that only the results of
operations get named.
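
A minimal sketch of two ways to get a named starting value, under the assumption of recent LLVM syntax (opaque `ptr`, explicit result type on `load`). `@zero_pair`, `@build` and `@build_from_undef` are hypothetical names. A local name can only be bound to the result of an instruction, so the first variant makes the constant a global and loads it; the second avoids the zero constant altogether.

```llvm
; A global gives the constant a name; locals can only name instruction results.
@zero_pair = private constant { i32, i32 } zeroinitializer

define { i32, i32 } @build() {
entry:
  ; %s0 is the result of a load instruction, so naming it is legal.
  %s0 = load { i32, i32 }, ptr @zero_pair
  %s1 = insertvalue { i32, i32 } %s0, i32 1, 0    ; %s1 == {1,0}
  %s2 = insertvalue { i32, i32 } %s1, i32 2, 1    ; %s2 == {1,2}
  ret { i32, i32 } %s2
}

define { i32, i32 } @build_from_undef() {
entry:
  ; Every field is overwritten, so the initial value can be undef.
  %s1 = insertvalue { i32, i32 } undef, i32 1, 0
  %s2 = insertvalue { i32, i32 } %s1, i32 2, 1    ; %s2 == {1,2}
  ret { i32, i32 } %s2
}
```

The inline form already tested above, with the constant written directly as the first operand of insertvalue, remains the shortest option when the name is not needed.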

Help me understand the underlying logic--why can one only name the
results of operations?  I realize that the local temporaries are
notionally register variables for a machine with an infinite number of
registers, but my very dim memory of real assembly was that I not only
could load constants into registers but had to do so.  What part of the
picture am I missing here?

You need an IR tutorial.  Or, to speak correctly, *I* need a tutorial.
:-)  But I'm learning....

Dustin

Dustin Laurence

2010-Jan-08 03:57 UTC

On 01/07/2010 06:52 PM, Jon Harrop wrote:
> No. The idea is that you pass the structs around as values and not that you
> alloca them and pass by reference/pointer.
OK, then I need to learn more syntax (which Alastair Lynn got me started
on, it appears :-).
> No. LLVM is trusting you not to return pointers to locals.
How naive. :-)
> I have had great success with my HLVM project by passing around large
> numbers of large structs by hand. LLVM has not only survived but actually
> generated decent code that beats most languages according to my benchmarks.
That's good to know, because I prefer the style of returning structs
rather than passing around pointers or using static data.  I'll be happy
to convert my lexer over to returning structs instead of pulling
lex-style tricks.
> ...LLVM cannot make that 
> decision because it depends upon the ABI. C99 apparently returns
> user-defined structs of two doubles by reference but complex numbers in
> registers. So the ABI requires knowledge of the front-end and, therefore,
> LLVM cannot fully automate this.
Huh.  I'd never have guessed (and would have been quite annoyed if I
had, since numerical code is often at the edge of whatever the computing
budget is, meaning the problem was the largest one the researcher could
afford to solve, not the one he wished he was solving).
> Something LLVM could do is spill safely when it knows you don't care
> about the foreign ABI (e.g. with fastcc) and that work is underway.
<nod>

Dustin

Duncan Sands

2010-Jan-08 06:55 UTC

Hi Jon,
>> In the case that prompted the question the struct isn't going to be
>> bigger than two of whatever the architecture regards as a word, which
>> surely should be fine, but in principle shouldn't LLVM and not the
>> front-end programmer be making the decision about whether the struct is
>> big enough to spill into memory?
> 
> Good question. There was a very interesting discussion about this here a
> while ago and everyone coming to LLVM says the same thing: why doesn't LLVM
> just handle this for me automatically? The answer is that LLVM cannot make
> that decision because it depends upon the ABI.
actually LLVM does now handle this for you automatically: if there aren't
enough registers to return the first class aggregate in registers, then it
is automagically returned on the stack.  If this is not ABI conformant then
it is up to the front-end to not generate IR that returns such large structs.
In practice front-ends only generate functions returning first class aggregates
when the ABI says the aggregate should be entirely returned in registers.  Thus
by definition it is sure not to require more registers than the machine has!
This is why the enhancement to automagically use the stack if there aren't
enough registers has no impact on ABI conformance.
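
A minimal sketch of what this looks like from the front-end side, assuming a struct large enough that most targets cannot return it entirely in registers. `%Big`, `@make_big` and `@use_big` are hypothetical names. The IR simply returns the aggregate by value; whether it travels in registers or in memory is left to the code generator for the chosen target.

```llvm
; Hypothetical six-word struct; likely too large for return registers on many targets.
%Big = type { i64, i64, i64, i64, i64, i64 }

define %Big @make_big(i64 %x) {
entry:
  ; Start from an all-zero value and set the first and last fields.
  %b0 = insertvalue %Big zeroinitializer, i64 %x, 0
  %b1 = insertvalue %Big %b0, i64 %x, 5
  ret %Big %b1
}

define i64 @use_big() {
entry:
  ; The call site also treats the struct as an ordinary value.
  %b = call %Big @make_big(i64 7)
  %last = extractvalue %Big %b, 5
  ret i64 %last
}
```

As the message notes, a front-end that must match a foreign ABI would not emit such a signature unless that ABI returns the struct entirely in registers.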

Ciao,

Duncan.

David Greene

2010-Jan-09 00:02 UTC

On Thursday 07 January 2010 20:52, Jon Harrop wrote:
> Good question. There was a very interesting discussion about this here a
> while ago and everyone coming to LLVM says the same thing: why doesn't LLVM
> just handle this for me automatically? The answer is that LLVM cannot make
> that decision because it depends upon the ABI. C99 apparently returns
> user-defined structs of two doubles by reference but complex numbers in
> registers. So the ABI requires knowledge of the front-end and, therefore,
> LLVM cannot fully automate this.
It's not a C99 thing, but an ABI thing.

And for the x86-64 ABI, complex double and a struct of two doubles are
returned in exactly the same way.  That may not be true for other ABIs.
I'm not as familiar with them.

On some targets it certainly should be possible to do the right thing.

                               -Dave
