On Sat, Jun 13, 2009 at 9:44 PM, John McCall<rjmccall at apple.com> wrote:
> On Jun 13, 2009, at 3:54 AM, Wesley W. Terpstra wrote:
> Currently I just represent %c as i8*. I assume that this can have
> consequences in terms of aliasing. I tried opaque*, but llvm-as didn't
> like that. Is there any way to better represent the type %c to LLVM?
>
> I assume this is for tagged sums.

Yes.

> Logically, what you want is a distinct LLVM type for every ML union type
> and each of its constructors. Unfortunately, LLVM does structural
> uniquing of types, so that won't work.

Is there absolutely no way to generate a new type? Not even an 'opaque' one?

> What you can do is abuse address
> spaces, giving every distinct type its own address space and casting
> back and forth between address spaces as necessary.

The manual indicates that only addresses in space 0 can have GC intrinsics used on them. Also I get the impression that this would be a pretty unsafe idea. ;)

> Is there any way to express that a pointer is actually a pointer to an
> interior element of a type? Something like %opt_33_in_heap
> %opt_33_with_header:1 ?
>
> Something like an ungetelementptr? No, sorry. That would be a
> pretty nice extension, though obviously unsound, of course.

Well, ungetelementptr could be nice, but I was hoping for something even better: a way to refer to the whole object type (including the header) even though my pointer doesn't point to the start of the object. Ie: this is a pointer to 8 bytes past type X.

That way for normal access I punch down to the object part of the type and do my business. For access to the header, I just punch into that part of the type (which happens to involve a negative offset from the address). However, it seems that LLVM pointers always have to point to the start of an object.

> Personally, I would create a struct type (hereafter "HeaderType") for the
> entire GC header; when you want to access a header field, just cast the
> base pointer to i8*, subtract the allocation size of HeaderType, cast the
> result to HeaderType*, and getelementptr from there.

That's what I'm doing right now; the HeaderType happens to be i32. ;) I am assuming that casting in and out of i8* will cost me in terms of the optimizations LLVM can apply..?

Also, I couldn't find a no-op instruction in LLVM. In some places it would be convenient to say: '%x = %y'. For the moment I'm doing a bitcast from the type back to itself, which is rather awkward.
Wesley W. Terpstra wrote:
> On Sat, Jun 13, 2009 at 9:44 PM, John McCall<rjmccall at apple.com> wrote:
>> On Jun 13, 2009, at 3:54 AM, Wesley W. Terpstra wrote:
>> Currently I just represent %c as i8*. I assume that this can have
>> consequences in terms of aliasing. I tried opaque*, but llvm-as didn't
>> like that. Is there any way to better represent the type %c to LLVM?
>>
>> I assume this is for tagged sums.
>
> Yes.
>
>> Logically, what you want is a distinct LLVM type for every ML union type
>> and each of its constructors. Unfortunately, LLVM does structural
>> uniquing of types, so that won't work.
>
> Is there absolutely no way to generate a new type? Not even an 'opaque' one?

Each time you say "opaque" in a .ll (or call OpaqueType::get in the C++ API) you get yourself a new distinct opaque type. It's not clear to me at all why opaque didn't work for you in the first place.

One thing you'll have to remember is that because of the above, if you want to take an opaque* and pass it to another function that takes an opaque*, you'll get a type mismatch since you said opaque twice. Use "%c = type opaque" in the global space, then %c* to get the same opaque in multiple places.

The other reason it might not have worked for you is that you might've tried to dereference your opaque* thereby producing just 'opaque' which isn't allowed.

>> What you can do is abuse address
>> spaces, giving every distinct type its own address space and casting
>> back and forth between address spaces as necessary.
>
> The manual indicates that only addresses in space 0 can have GC
> intrinsics used on them. Also I get the impression that this would be
> a pretty unsafe idea. ;)
>
>> Is there any way to express that a pointer is actually a pointer to an
>> interior element of a type? Something like %opt_33_in_heap
>> %opt_33_with_header:1 ?
>>
>> Something like an ungetelementptr? No, sorry. That would be a
>> pretty nice extension, though obviously unsound, of course.
>
> Well, ungetelementptr could be nice, but I was hoping for something
> even better: a way to refer to the whole object type (including the
> header) even though my pointer doesn't point to the start of the
> object. Ie: this is a pointer to 8 bytes past type X.
>
> That way for normal access I punch down to the object part of the type
> and do my business. For access to the header, I just punch into that
> part of the type (which happens to involve a negative offset from the
> address). However, it seems that LLVM pointers always have to point to
> the start of an object.
>
>> Personally, I would create a struct type (hereafter "HeaderType") for the
>> entire GC header; when you want to access a header field, just cast the
>> base pointer to i8*, subtract the allocation size of HeaderType, cast the
>> result to HeaderType*, and getelementptr from there.
>
> That's what I'm doing right now; the HeaderType happens to be i32. ;)
> I am assuming that casting in and out of i8* will cost me in terms of
> the optimizations LLVM can apply..?
>
> Also, I couldn't find a no-op instruction in LLVM. In some places it
> would be convenient to say: '%x = %y'. For the moment I'm doing a
> bitcast from the type back to itself, which is rather awkward.

There is none, using a bitcast is the workaround. LLVM's optimizers will fix it up.

Nick
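The advice above about naming the opaque type once and reusing it has a rough analogue in C, sketched here purely for illustration. An incomplete struct type declared once plays the role of "%c = type opaque": every use of the pointer type refers to the same type, and the pointer cannot be dereferenced until it is cast to a concrete layout. The names ml_value, boxed_int, make_boxed_int, boxed_int_payload and free_value are hypothetical and do not come from the thread or from MLton.

```c
#include <assert.h>
#include <stdlib.h>

/* Declared once and never defined, so it stays opaque everywhere,
   loosely like a single "%c = type opaque" shared by all uses of %c*. */
typedef struct ml_value ml_value;

/* Hypothetical concrete layout for one constructor of a sum type. */
typedef struct {
    int tag;
    int payload;
} boxed_int;

/* Build a value and hand it out only through the opaque pointer type. */
static ml_value *make_boxed_int(int payload) {
    boxed_int *b = malloc(sizeof *b);
    if (b == NULL) return NULL;
    b->tag = 0;
    b->payload = payload;
    return (ml_value *)b;
}

/* An ml_value* cannot be dereferenced directly; cast to the
   concrete constructor type first. */
static int boxed_int_payload(const ml_value *v) {
    assert(v != NULL);
    return ((const boxed_int *)v)->payload;
}

/* Release a value created by make_boxed_int. */
static void free_value(ml_value *v) {
    free(v);
}
```

Writing `*v` on an `ml_value *` is rejected by a C compiler because the type is incomplete, which loosely mirrors the point that dereferencing an opaque* to produce a bare 'opaque' is not allowed.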
* Wesley W. Terpstra:
>> Logically, what you want is a distinct LLVM type for every ML union type
>> and each of its constructors. Unfortunately, LLVM does structural
>> uniquing of types, so that won't work.
>
> Is there absolutely no way to generate a new type? Not even an
> 'opaque' one?

Is this really a problem for MLton? I think you only get less precise alias analysis, and that's it.
On Sun, Jun 14, 2009 at 4:32 AM, Nick Lewycky<nicholas at mxc.ca> wrote:
> Wesley W. Terpstra wrote:
>> Is there absolutely no way to generate a new type? Not even an 'opaque' one?
> Each time you say "opaque" in a .ll (or call OpaqueType::get in the C++
> API) you get yourself a new distinct opaque type.

Ok. That's what I thought it did which is why I tried it in the first place. I must have done something wrong. Thank you!

>> Also, I couldn't find a no-op instruction in LLVM. In some places it
>> would be convenient to say: '%x = %y'. For the moment I'm doing a
>> bitcast from the type back to itself, which is rather awkward.
>
> There is none, using a bitcast is the workaround. LLVM's optimizers will
> fix it up.

I'll keep doing what I'm doing then.
On Sun, Jun 14, 2009 at 10:50 AM, Florian Weimer<fw at deneb.enyo.de> wrote:
> Is this really a problem for MLton? I think you only get less precise
> alias analysis, and that's it.

Correct. However, I want a fair comparison between LLVM performance and the native x86 codegen. If I don't give LLVM the same information the x86 codegen has, it's an unfair comparison.
On Jun 13, 2009, at 5:14 PM, Wesley W. Terpstra wrote:
> On Sat, Jun 13, 2009 at 9:44 PM, John McCall<rjmccall at apple.com> wrote:
>> Logically, what you want is a distinct LLVM type for every ML union type
>> and each of its constructors. Unfortunately, LLVM does structural
>> uniquing of types, so that won't work.
>
> Is there absolutely no way to generate a new type? Not even an
> 'opaque' one?

As mentioned, you can generate new opaque types, but obviously that won't work for, say, distinguishing between separate constructors that are structured identically. If you're not planning to write any LLVM-level language-specific optimizations, that probably doesn't matter at all. On the other hand, you were talking about alias analysis, which generally involves writing a pass to inject language-specific information.

>> What you can do is abuse address
>> spaces, giving every distinct type its own address space and casting
>> back and forth between address spaces as necessary.
>
> The manual indicates that only addresses in space 0 can have GC
> intrinsics used on them.

More casts! Although I'm curious why this limitation is in effect at all; probably a consequence of some other overloaded use of address spaces.

> Also I get the impression that this would be a pretty unsafe idea. ;)

Not particularly less safe than all the other unsafe casts you're planning to use.

>> Is there any way to express that a pointer is actually a pointer to an
>> interior element of a type? Something like %opt_33_in_heap
>> %opt_33_with_header:1 ?
>>
>> Something like an ungetelementptr? No, sorry. That would be a
>> pretty nice extension, though obviously unsound, of course.
>
> Well, ungetelementptr could be nice, but I was hoping for something
> even better: a way to refer to the whole object type (including the
> header) even though my pointer doesn't point to the start of the
> object. Ie: this is a pointer to 8 bytes past type X.

Okay. You are right, there is no way to express this in the type system, and that is very unlikely to change.

>> Personally, I would create a struct type (hereafter "HeaderType") for the
>> entire GC header; when you want to access a header field, just cast the
>> base pointer to i8*, subtract the allocation size of HeaderType, cast the
>> result to HeaderType*, and getelementptr from there.
>
> That's what I'm doing right now; the HeaderType happens to be i32. ;)
> I am assuming that casting in and out of i8* will cost me in terms of
> the optimizations LLVM can apply..?

It would only really affect a type-based alias analysis, and there's no cookie-cutter version of that; you would need to write your own AA pass, which could then easily recognize the pattern of accessing the header.

> Also, I couldn't find a no-op instruction in LLVM. In some places it
> would be convenient to say: '%x = %y'. For the moment I'm doing a
> bitcast from the type back to itself, which is rather awkward.

The bitcast is a decent workaround, but the real question is why you need a no-op at all; if you're doing it to provide a hook for optimizer information, a call is probably a better idea.

John.
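The header-access recipe discussed above (cast the object pointer to a byte pointer, subtract the size of the header type, cast the result to a header pointer) can be sketched in C as an analogue of the LLVM IR sequence. This is only an illustration under stated assumptions: HeaderType here is a hypothetical one-field struct standing in for the i32 header mentioned in the thread, and alloc_object, header_of and free_object are made-up helper names, not part of MLton or LLVM.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical header; in the thread it is a single 32-bit field. */
typedef struct {
    uint32_t tag;
} HeaderType;

/* Allocate header plus payload and return a pointer to the payload,
   i.e. a pointer to the first byte past the header. */
static void *alloc_object(uint32_t tag, size_t payload_size) {
    unsigned char *raw = malloc(sizeof(HeaderType) + payload_size);
    if (raw == NULL) return NULL;
    ((HeaderType *)raw)->tag = tag;
    return raw + sizeof(HeaderType);
}

/* Cast to a byte pointer, step back by the header size,
   then cast to HeaderType*. */
static HeaderType *header_of(void *object) {
    assert(object != NULL);
    unsigned char *bytes = (unsigned char *)object;
    return (HeaderType *)(bytes - sizeof(HeaderType));
}

/* The allocation starts at the header, so that is the pointer to free. */
static void free_object(void *object) {
    free(header_of(object));
}
```

With a 4-byte header the payload is only guaranteed 4-byte alignment in this sketch; a real runtime would pad the header or over-align the allocation if the payload needs more.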