thr3ads.net - llvm dev - [LLVMdev] ML types in LLVM [Jun 2009]

Wesley W. Terpstra

2009-Jun-14 00:14 UTC

[LLVMdev] ML types in LLVM

On Sat, Jun 13, 2009 at 9:44 PM, John McCall <rjmccall at apple.com> wrote:
> On Jun 13, 2009, at 3:54 AM, Wesley W. Terpstra wrote:
>> Currently I just represent %c as i8*. I assume that this can have
>> consequences in terms of aliasing. I tried opaque*, but llvm-as didn't
>> like that. Is there any way to better represent the type %c to LLVM?
>
> I assume this is for tagged sums.

Yes.

> Logically, what you want is a distinct LLVM type for every ML union type
> and each of its constructors.  Unfortunately, LLVM does structural
> uniquing of types, so that won't work.

Is there absolutely no way to generate a new type? Not even an 'opaque'
one?

> What you can do is abuse address
> spaces, giving every distinct type its own address space and casting
> back and forth between address spaces as necessary.

The manual indicates that only addresses in space 0 can have GC
intrinsics used on them. Also I get the impression that this would be
a pretty unsafe idea. ;)

>> Is there any way to express that a pointer is actually a pointer to an
>> interior element of a type? Something like %opt_33_in_heap
>> %opt_33_with_header:1 ?
>
> Something like an ungetelementptr?  No, sorry.  That would be a
> pretty nice extension, though obviously unsound, of course.

Well, ungetelementptr could be nice, but I was hoping for something
even better: a way to refer to the whole object type (including the
header) even though my pointer doesn't point to the start of the
object. Ie: this is a pointer to 8 bytes past type X.

That way for normal access I punch down to the object part of the type
and do my business. For access to the header, I just punch into that
part of the type (which happens to involve a negative offset from the
address). However, it seems that LLVM pointers always have to point to
the start of an object.

> Personally, I would create a struct type (hereafter "HeaderType") for the
> entire GC header;  when you want to access a header field, just cast the
> base pointer to i8*, subtract the allocation size of HeaderType, cast the
> result to HeaderType*, and getelementptr from there.

That's what I'm doing right now; the HeaderType happens to be i32. ;)
I am assuming that casting in and out of i8* will cost me in terms of
the optimizations LLVM can apply..?

Also, I couldn't find a no-op instruction in LLVM. In some places it
would be convenient to say: '%x = %y'. For the moment I'm doing a
bitcast from the type back to itself, which is rather awkward.
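
For readers following the header discussion, here is a minimal C sketch of
the same address arithmetic: the object pointer sits just past a header, and
the header is reached by stepping back by the header's size through a byte
pointer. C stands in for LLVM IR here. The names header_t, alloc_object,
header_of and free_object are hypothetical and belong to neither MLton nor
LLVM, and the 32-bit header only mirrors the remark that HeaderType happens
to be i32.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical GC header; the thread only says it happens to be an i32. */
typedef uint32_t header_t;

/* Allocate header + payload and return a pointer to the payload,
   i.e. a pointer that sits sizeof(header_t) bytes past the block start. */
void *alloc_object(header_t header, size_t payload_size) {
    unsigned char *base = malloc(sizeof(header_t) + payload_size);
    if (base == NULL) {
        return NULL;
    }
    *(header_t *)base = header;      /* header lives at the block start */
    return base + sizeof(header_t);  /* object pointer skips the header */
}

/* Cast to a byte pointer, step back by the header size, then cast the
   result to a header pointer. */
header_t *header_of(void *object) {
    unsigned char *bytes = (unsigned char *)object;
    return (header_t *)(bytes - sizeof(header_t));
}

/* Release the whole block, starting from the header. */
void free_object(void *object) {
    if (object != NULL) {
        free(header_of(object));
    }
}
```

The byte-pointer step corresponds to the cast to i8* and the subtraction;
the final cast corresponds to the cast to HeaderType*. A real runtime would
also have to pad the header so the payload keeps its required alignment,
which this sketch ignores.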

Nick Lewycky

2009-Jun-14 02:32 UTC

Wesley W. Terpstra wrote:
> On Sat, Jun 13, 2009 at 9:44 PM, John McCall <rjmccall at apple.com> wrote:
>> On Jun 13, 2009, at 3:54 AM, Wesley W. Terpstra wrote:
>>> Currently I just represent %c as i8*. I assume that this can have
>>> consequences in terms of aliasing. I tried opaque*, but llvm-as didn't
>>> like that. Is there any way to better represent the type %c to LLVM?
>>
>> I assume this is for tagged sums.
>
> Yes.
>
>> Logically, what you want is a distinct LLVM type for every ML union type
>> and each of its constructors.  Unfortunately, LLVM does structural
>> uniquing of types, so that won't work.
>
> Is there absolutely no way to generate a new type? Not even an 'opaque' one?

Each time you say "opaque" in a .ll (or call OpaqueType::get in the C++
API) you get yourself a new distinct opaque type.

It's not clear to me at all why opaque didn't work for you in the first 
place. One thing you'll have to remember is that because of the above, 
if you want to take an opaque* and pass it to another function that 
takes an opaque*, you'll get a type mismatch since you said opaque 
twice. Use "%c = type opaque" in the global space, then %c* to get the
same opaque in multiple places. The other reason it might not have 
worked for you is that you might've tried to dereference your opaque* 
thereby producing just 'opaque' which isn't allowed.
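
A loose analogy in C, assuming nothing beyond the standard language: an
incomplete struct type declared once at file scope plays the role of
"%c = type opaque", and every "struct c *" then names the same pointer type.
The tag c and the functions pass_through and is_null are made up for
illustration; this is not LLVM code and only mirrors the naming point made
above.

```c
#include <stddef.h>

/* Declared once at file scope, in the spirit of "%c = type opaque". */
struct c;

/* Both functions refer to the same incomplete type, so a pointer
   returned by one can be passed to the other without a cast. */
struct c *pass_through(struct c *p) { return p; }

int is_null(const struct c *p) { return p == NULL; }

/* Not allowed: "struct c copy = *p;" fails to compile because
   struct c is incomplete, much like dereferencing an opaque*. */
```

Pointers to two differently tagged incomplete structs are incompatible types
in C, which loosely matches what happens when "opaque" is written twice in a
.ll file.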

>> What you can do is abuse address
>> spaces, giving every distinct type its own address space and casting
>> back and forth between address spaces as necessary.
>
> The manual indicates that only addresses in space 0 can have GC
> intrinsics used on them. Also I get the impression that this would be
> a pretty unsafe idea. ;)
>
>>> Is there any way to express that a pointer is actually a pointer to an
>>> interior element of a type? Something like %opt_33_in_heap
>>> %opt_33_with_header:1 ?
>>
>> Something like an ungetelementptr?  No, sorry.  That would be a
>> pretty nice extension, though obviously unsound, of course.
>
> Well, ungetelementptr could be nice, but I was hoping for something
> even better: a way to refer to the whole object type (including the
> header) even though my pointer doesn't point to the start of the
> object. Ie: this is a pointer to 8 bytes past type X.
>
> That way for normal access I punch down to the object part of the type
> and do my business. For access to the header, I just punch into that
> part of the type (which happens to involve a negative offset from the
> address). However, it seems that LLVM pointers always have to point to
> the start of an object.
>
>> Personally, I would create a struct type (hereafter "HeaderType") for the
>> entire GC header;  when you want to access a header field, just cast the
>> base pointer to i8*, subtract the allocation size of HeaderType, cast the
>> result to HeaderType*, and getelementptr from there.
>
> That's what I'm doing right now; the HeaderType happens to be i32. ;)
> I am assuming that casting in and out of i8* will cost me in terms of
> the optimizations LLVM can apply..?
>
> Also, I couldn't find a no-op instruction in LLVM. In some places it
> would be convenient to say: '%x = %y'. For the moment I'm doing a
> bitcast from the type back to itself, which is rather awkward.

There is none, using a bitcast is the workaround. LLVM's optimizers will
fix it up.

Nick

Florian Weimer

2009-Jun-14 08:50 UTC

* Wesley W. Terpstra:
>> Logically, what you want is a distinct LLVM type for every ML union type
>> and each of its constructors.  Unfortunately, LLVM does structural
>> uniquing of types, so that won't work.
>
> Is there absolutely no way to generate a new type? Not even an
> 'opaque' one?

Is this really a problem for MLton?  I think you only get less precise
alias analysis, and that's it.

Wesley W. Terpstra

2009-Jun-14 12:59 UTC

On Sun, Jun 14, 2009 at 4:32 AM, Nick Lewycky <nicholas at mxc.ca> wrote:
> Wesley W. Terpstra wrote:
>> Is there absolutely no way to generate a new type? Not even an 'opaque' one?
>
> Each time you say "opaque" in a .ll (or call OpaqueType::get in the C++
> API) you get yourself a new distinct opaque type.

Ok. That's what I thought it did which is why I tried it in the first
place. I must have done something wrong. Thank you!

>> Also, I couldn't find a no-op instruction in LLVM. In some places it
>> would be convenient to say: '%x = %y'. For the moment I'm doing a
>> bitcast from the type back to itself, which is rather awkward.
>
> There is none, using a bitcast is the workaround. LLVM's optimizers will
> fix it up.

I'll keep doing what I'm doing then.

Wesley W. Terpstra

2009-Jun-14 13:09 UTC

On Sun, Jun 14, 2009 at 10:50 AM, Florian Weimer <fw at deneb.enyo.de> wrote:
> Is this really a problem for MLton?  I think you only get less precise
> alias analysis, and that's it.

Correct. However, I want a fair comparison between LLVM performance
and the native x86 codegen. If I don't give LLVM the same information
the x86 codegen has, it's an unfair comparison.

John McCall

2009-Jun-14 19:33 UTC

On Jun 13, 2009, at 5:14 PM, Wesley W. Terpstra wrote:
> On Sat, Jun 13, 2009 at 9:44 PM, John McCall <rjmccall at apple.com> wrote:
>> Logically, what you want is a distinct LLVM type for every ML union type
>> and each of its constructors.  Unfortunately, LLVM does structural
>> uniquing of types, so that won't work.
>
> Is there absolutely no way to generate a new type? Not even an
> 'opaque' one?

As mentioned, you can generate new opaque types, but obviously that
won't work for, say, distinguishing between separate constructors that are
structured identically.  If you're not planning to write any LLVM-level
language-specific optimizations, that probably doesn't matter at all.
On the other hand, you were talking about alias analysis, which generally
involves writing a pass to inject language-specific information.

>> What you can do is abuse address
>> spaces, giving every distinct type its own address space and casting
>> back and forth between address spaces as necessary.
>
> The manual indicates that only addresses in space 0 can have GC
> intrinsics used on them.

More casts!  Although I'm curious why this limitation is in effect at all;
probably a consequence of some other overloaded use of address
spaces.

> Also I get the impression that this would be a pretty unsafe idea. ;)

Not particularly less safe than all the other unsafe casts you're planning
to use.

>>> Is there any way to express that a pointer is actually a pointer to an
>>> interior element of a type? Something like %opt_33_in_heap
>>> %opt_33_with_header:1 ?
>>
>> Something like an ungetelementptr?  No, sorry.  That would be a
>> pretty nice extension, though obviously unsound, of course.
>
> Well, ungetelementptr could be nice, but I was hoping for something
> even better: a way to refer to the whole object type (including the
> header) even though my pointer doesn't point to the start of the
> object. Ie: this is a pointer to 8 bytes past type X.

Okay.  You are right, there is no way to express this in the type system,
and that is very unlikely to change.

>> Personally, I would create a struct type (hereafter "HeaderType") for the
>> entire GC header;  when you want to access a header field, just cast the
>> base pointer to i8*, subtract the allocation size of HeaderType, cast the
>> result to HeaderType*, and getelementptr from there.
>
> That's what I'm doing right now; the HeaderType happens to be i32. ;)
> I am assuming that casting in and out of i8* will cost me in terms of
> the optimizations LLVM can apply..?

It would only really affect a type-based alias analysis, and there's no
cookie-cutter version of that;  you would need to write your own AA
pass, which could then easily recognize the pattern of accessing the
header.

> Also, I couldn't find a no-op instruction in LLVM. In some places it
> would be convenient to say: '%x = %y'. For the moment I'm doing a
> bitcast from the type back to itself, which is rather awkward.

The bitcast is a decent workaround, but the real question is why you need
a no-op at all;  if you're doing it to provide a hook for optimizer
information, a call is probably a better idea.

John.
