On 01/07/2010 01:38 PM, David Greene wrote:
> The way this works on many targets is that the caller allocates stack
> space in its frame for the returned struct and passes a pointer to it
> as a first "hidden" argument to the callee. The callee then copies
> that data into the space pointed to by the address.

<nod>

> Long-term, first-class status means that returns of structs should
> "just work" and you don't need to worry about getting a pointer to
> invalid memory.

OK, so my thought of constructing the object on the stack was correct?
What I originally wanted to do was roughly

    %Token = type {%c_int, %i8*}

    define %Token @foo()
    {
        ...

        ret %Token {%c_int %token, %i8* %value}
    }

but the compiler complains about the invalid usage of a local name. So
I decided the problem was that I was thinking in terms of languages
that would create a temporary implicitly, and in IR I need to do it
explicitly. So it occurred to me to create the struct on the stack, as
I mentioned.

What bothers me about that is the explicit specification with alloca
that the space is reserved in the callee's frame. Do I just trust the
optimizer to eliminate that and turn the reference to alloca'd memory
into a reference to the space reserved by the caller? Or is that going
to create an unnecessary copy from the alloca'd memory to that reserved
by the caller? From what you said my guess is the former (optimizer
eliminates the pointless temporary), but us premature optimizers like to
be reassured we haven't given up an all-important microsecond. :-)

> ...I believe right now, however, only structs up to a
> certain size are supported, perhaps because under some ABIs, small
> structs can be returned in registers and one doesn't need to worry
> about generating the hidden argument.

In the case that prompted the question the struct isn't going to be
bigger than two of whatever the architecture regards as a word, which
surely should be fine, but in principle shouldn't LLVM and not the
front-end programmer be making the decision about whether the struct is
big enough to spill into memory?

Dustin
Hi Dustin-

You'll probably need to use insertvalue to construct your return value.

Alastair

On 7 Jan 2010, at 21:56, Dustin Laurence wrote:
> define %Token @foo()
> {
>     ...
>
>     ret %Token {%c_int %token, %i8* %value}
> }
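A minimal sketch of what the insertvalue suggestion might look like for the %Token function from the original question. It makes several assumptions that are not in the thread: `i32` stands in for the `%c_int` alias, the two fields arrive as hypothetical parameters (the original elides how `%token` and `%value` are computed), and the pointer type uses the `ptr` spelling of recent LLVM releases where the 2010-era syntax wrote `i8*`.

```llvm
; Assumption: i32 stands in for %c_int, and ptr for the thread's i8*.
%Token = type { i32, ptr }

; Hypothetical signature: the fields are passed in only for illustration.
define %Token @foo(i32 %token, ptr %value) {
entry:
  ; Start from an undefined aggregate and fill in one field at a time.
  %t0 = insertvalue %Token undef, i32 %token, 0
  %t1 = insertvalue %Token %t0, ptr %value, 1
  ; Return the struct by value; the callee does not alloca anything.
  ret %Token %t1
}
```

The literal `{%c_int %token, %i8* %value}` in the original attempt is rejected because a constant aggregate may only contain constants, while `%token` and `%value` are run-time values; insertvalue builds the same struct from run-time operands.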
On Thursday 07 January 2010 21:56:11 Dustin Laurence wrote:
> On 01/07/2010 01:38 PM, David Greene wrote:
> > The way this works on many targets is that the caller allocates stack
> > space in its frame for the returned struct and passes a pointer to it
> > as a first "hidden" argument to the callee. The callee then copies
> > that data into the space pointed to by the address.
>
> <nod>
>
> > Long-term, first-class status means that returns of structs should
> > "just work" and you don't need to worry about getting a pointer to
> > invalid memory.
>
> OK, so my thought of constructing the object on the stack was correct?

No. The idea is that you pass the structs around as values and not that
you alloca them and pass by reference/pointer.

> What bothers me about that is the explicit specification with alloca
> that the space is reserved in the callee's frame.

Yes. Don't do that.

> Do I just trust the
> optimizer to eliminate that and turn the reference to alloca'd memory
> into a reference to the space reserved by the caller?

No. LLVM is trusting you not to return pointers to locals.

> Or is that going
> to create an unnecessary copy from the alloca'd memory to that reserved
> by the caller? From what you said my guess is the former (optimizer
> eliminates the pointless temporary), but us premature optimizers like to
> be reassured we haven't given up an all-important microsecond. :-)

I have had great success with my HLVM project by passing around large
numbers of large structs by hand. LLVM has not only survived but
actually generated decent code that beats most languages according to my
benchmarks.

In particular, HLVM uses "fat" quadword references (where word =
sizeof(void*)) that are passed everywhere by value except when a struct
is returned and HLVM gets the caller to alloca and passes that space by
pointer to the callee for it to fill in.

> > ...I believe right now, however, only structs up to a
> > certain size are supported, perhaps because under some ABIs, small
> > structs can be returned in registers and one doesn't need to worry
> > about generating the hidden argument.
>
> In the case that prompted the question the struct isn't going to be
> bigger than two of whatever the architecture regards as a word, which
> surely should be fine, but in principle shouldn't LLVM and not the
> front-end programmer be making the decision about whether the struct is
> big enough to spill into memory?

Good question. There was a very interesting discussion about this here a
while ago and everyone coming to LLVM says the same thing: why doesn't
LLVM just handle this for me automatically? The answer is that LLVM
cannot make that decision because it depends upon the ABI. C99
apparently returns user-defined structs of two doubles by reference but
complex numbers in registers. So the ABI requires knowledge of the
front-end and, therefore, LLVM cannot fully automate this.

Something LLVM could do is spill safely when it knows you don't care
about the foreign ABI (e.g. with fastcc) and that work is underway.

--
Dr Jon Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/?e
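The caller-allocates convention described in this message (the caller reserves the space, the callee fills it in through a pointer) could be written by hand roughly as follows. This is an illustrative sketch under stated assumptions, not HLVM's actual output: `%Pair`, `@fill` and `@caller` are made-up names, and the syntax is that of recent LLVM releases with the opaque `ptr` type rather than the typed pointers of 2010.

```llvm
; Hypothetical two-word struct used only for this sketch.
%Pair = type { i32, i32 }

; Callee: writes its result through the pointer supplied by the caller.
define void @fill(ptr %out) {
entry:
  %f0 = getelementptr inbounds %Pair, ptr %out, i32 0, i32 0
  store i32 1, ptr %f0
  %f1 = getelementptr inbounds %Pair, ptr %out, i32 0, i32 1
  store i32 2, ptr %f1
  ret void
}

; Caller: owns the storage, so no pointer to a dead frame ever escapes.
define i32 @caller() {
entry:
  %slot = alloca %Pair
  call void @fill(ptr %slot)
  ; Read back the second field that @fill stored.
  %f1 = getelementptr inbounds %Pair, ptr %slot, i32 0, i32 1
  %v = load i32, ptr %f1
  ret i32 %v
}
```

Because the alloca lives in the caller's frame, the memory is still valid after `@fill` returns. That is the difference from allocating in the callee and returning the address.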
On 01/07/2010 06:03 PM, Alastair Lynn wrote:
>
> You'll probably need to use insertvalue to construct your return value.

Ah ha! The fact is I didn't really understand the significance of this
part when I read it, and so didn't remember it when I needed it.

OK, so I have tested it and I can now build up a struct like this

    %s1 = insertvalue {i32, i32} {i32 0, i32 0}, i32 1, 0 ; s1 = {1,0}
    %s2 = insertvalue {i32, i32} %s1, i32 2, 1 ; %s2 == {1,2}

which reminds me of another thing I never understood. I can't make my
code (slightly) more readable by changing that to something like

    %s0 = {i32 0, i32 0}
    %s1 = insertvalue {i32, i32} %s0, i32 1, 0 ; s1 = {1,0}
    %s2 = insertvalue {i32, i32} %s1, i32 2, 1 ; %s2 == {1,2}

because LLVM will complain that it "expected instruction opcode" at the
assignment to %s0. If there is a general way to give names to constants
in that way I didn't find it. In fact, I think I tended not to use
temporaries like I would variables precisely because when I tried the
second alternative as the natural way to hand-code it and it didn't
work, I didn't think how to phrase it so only the results of operations
get named.

Help me understand the underlying logic--why can one only name the
results of operations? I realize that the local temporaries are
notionally register variables for a machine with an infinite number of
registers, but my very dim memory of real assembly was that I not only
could load constants into registers but had to do so. What part of the
picture am I missing here?

You need an IR tutorial. Or, to speak correctly, *I* need a tutorial.
:-) But I'm learning....

Dustin
On 01/07/2010 06:52 PM, Jon Harrop wrote:
> No. The idea is that you pass the structs around as values and not that you
> alloca them and pass by reference/pointer.

OK, then I need to learn more syntax (which Alastair Lynn got me started
on, it appears :-).

> No. LLVM is trusting you not to return pointers to locals.

How naive. :-)

> I have had great success with my HLVM project by passing around large numbers
> of large structs by hand. LLVM has not only survived but actually generated
> decent code that beats most languages according to my benchmarks.

That's good to know, because I prefer the style of returning structs
rather than passing around pointers or using static data. I'll be happy
to convert my lexer over to returning structs instead of pulling
lex-style tricks.

> ...LLVM cannot make that
> decision because it depends upon the ABI. C99 apparently returns user-defined
> structs of two doubles by reference but complex numbers in registers. So the
> ABI requires knowledge of the front-end and, therefore, LLVM cannot fully
> automate this.

Huh. I'd never have guessed (and would have been quite annoyed if I had,
since numerical code is often at the edge of whatever the computing
budget is, meaning the problem was the largest one the researcher could
afford to solve, not the one he wished he was solving).

> Something LLVM could do is spill safely when it knows you don't care about the
> foreign ABI (e.g. with fastcc) and that work is underway.

<nod>

Dustin
Hi Jon,

>> In the case that prompted the question the struct isn't going to be
>> bigger than two of whatever the architecture regards as a word, which
>> surely should be fine, but in principle shouldn't LLVM and not the
>> front-end programmer be making the decision about whether the struct is
>> big enough to spill into memory?
>
> Good question. There was a very interesting discussion about this here a while
> ago and everyone coming to LLVM says the same thing: why doesn't LLVM just
> handle this for me automatically? The answer is that LLVM cannot make that
> decision because it depends upon the ABI.

actually LLVM does now handle this for you automatically: if there
aren't enough registers to return the first class aggregate in
registers, then it is automagically returned on the stack. If this is
not ABI conformant then it is up to the front-end to not generate IR
that returns such large structs.

In practice front-ends only generate functions returning first class
aggregates when the ABI says the aggregate should be entirely returned
in registers. Thus by definition it is sure not to require more
registers than the machine has! This is why the enhancement to
automagically use the stack if there aren't enough registers has no
impact on ABI conformance.

Ciao,

Duncan.
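To make the point above concrete, here is a small sketch of a function that returns a fairly large first-class aggregate by value and a caller that reads one field back. The names `%Big`, `@make_big` and `@use_big` are hypothetical, and the choice of six `i64` fields only assumes that this is more than a typical target returns in registers; whether it really is depends on the target and calling convention.

```llvm
; Hypothetical aggregate, assumed too large for the return registers.
%Big = type { i64, i64, i64, i64, i64, i64 }

; Returns the aggregate by value; how it travels back is left to the backend.
define %Big @make_big(i64 %x) {
entry:
  ; Begin with all fields zero, then set the first and last fields.
  %b0 = insertvalue %Big zeroinitializer, i64 %x, 0
  %b1 = insertvalue %Big %b0, i64 %x, 5
  ret %Big %b1
}

; The caller treats the result as an ordinary value.
define i64 @use_big(i64 %x) {
entry:
  %b = call %Big @make_big(i64 %x)
  %last = extractvalue %Big %b, 5
  ret i64 %last
}
```

The IR contains no alloca and no explicit pointer argument. Per the message above, any fallback to stack memory is expected to happen during code generation, and the result is not guaranteed to match a foreign C ABI.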
On Thursday 07 January 2010 20:52, Jon Harrop wrote:
> Good question. There was a very interesting discussion about this here a
> while ago and everyone coming to LLVM says the same thing: why doesn't LLVM
> just handle this for me automatically? The answer is that LLVM cannot make
> that decision because it depends upon the ABI. C99 apparently returns
> user-defined structs of two doubles by reference but complex numbers in
> registers. So the ABI requires knowledge of the front-end and, therefore,
> LLVM cannot fully automate this.

It's not a C99 thing, but an ABI thing. And for the x86-64 ABI, complex
double and a struct of two doubles are returned in exactly the same way.
That may not be true for other ABIs. I'm not as familiar with them.

On some targets it certainly should be possible to do the right thing.

-Dave