thr3ads.net - llvm dev - [LLVMdev] Suggestion: Support union types in IR [May 2009]

Talin

2009-May-06 03:09 UTC

[LLVMdev] Suggestion: Support union types in IR

I wanted to mention, by the way, that my need/desire for this hasn't 
gone away :)

And my wish list still includes support for something like uintptr_t - a 
primitive integer type that is defined to always be the same size as a 
pointer, however large or small that may be on different platforms. (So 
that the frontend doesn't need to know how big a pointer is and can 
generate the same IR that works on both 32-bit and 64-bit platforms.)

-- Talin

Chris Lattner wrote:
> On Dec 30, 2008, at 12:41 PM, Talin wrote:
>
>> I've been thinking about how to represent unions or "disjoint types"
>> in LLVM IR. At the moment, the only way I know to achieve this right  
>> now is to create a struct that is as large as the largest type in  
>> the union and then bitcast it to access the fields contained within.  
>> However, that requires that the frontend know the sizes of all of  
>> the various low-level types (the "size_t" problem, which has been
>> discussed before), otherwise you get problems trying to mix pointer  
>> and non-pointer types.
>>     
>
> That's an interesting point.  As others have pointed out, we've  
> resisted having a union type because it isn't strictly needed for the  
> current set of front-ends.  If a front-end is trying to generate  
> target-independent IR though, I can see the utility.  The "gep trick"
> won't work for type generation.
>
>   
>> It seems to me that adding a union type to the IR would be a logical  
>> extension to the language. The syntax for declaring a union would be  
>> similar to that of declaring a struct. To access a union member, you  
>> would use GetElementPtr (GEP), just as if it were a struct. The only
>> difference is that in this case, the GEP doesn't actually modify the
>> address, it merely returns the input argument as a different type.  
>> In all other ways, unions would be treated like structs, except that  
>> the size of the union would always be the size of the largest  
>> member, and all of the fields within the union would be located
>> at relative offset zero.
>>     
>
> Yes, your proposal makes sense, for syntax, I'd suggest:  u{ i32, float}
>
>   
>> Unions could of course be combined with other types:
>>
>>    {{int|float}, bool} *
>>    n = getelementptr i32 0, i32 0, i32 1
>>
>> So in the above example, the GEP returns a pointer to the float field.
>>     
>
> I don't have a specific problem with adding this.  The cost of doing  
> so is that it adds (a small amount of) complexity to a lot of places  
> that walk the type graphs.  The only pass that I predict will be  
> difficult to update to handle this is the BasicAA pass, which reasons  
> about symbolic (not concrete) offsets and should return mustalias in  
> the appropriate cases.  Also, to validate this, I think llvm-gcc  
> should start generating this for C unions where possible.
>
> If you're interested in implementing this and seeing all the details  
> of the implementation through to the end, I don't see significant  
> problems.  I think adding a simple union type would make more sense  
> than adding first-class support for a *discriminated* union.
>
> -Chris
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>
>
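
[Editorial sketch] The layout rules discussed above (every member at relative offset zero, a size at least as large as the largest member, and member access that changes only the pointer type) are the rules C unions already follow, so they can be illustrated in C. The names IntOrFloat, Tagged, and float_member are hypothetical stand-ins for the proposed u{ i32, float } and {{int|float}, bool} types. This is not LLVM IR and not an implementation of the proposal.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical C analogue of the proposed u{ i32, float } type. */
union IntOrFloat {
    int32_t as_int;
    float   as_float;
};

/* Hypothetical analogue of {{int|float}, bool}: a union followed by a flag. */
struct Tagged {
    union IntOrFloat value;
    bool             flag;
};

/* Selecting a union member leaves the address unchanged; only the
 * pointer type differs, which is what the proposed GEP would do. */
float *float_member(struct Tagged *t) {
    return &t->value.as_float;
}
```

Because the union is the first field of Tagged, the pointer returned by float_member compares equal to the address of the enclosing struct, and offsetof reports 0 for both union members.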

Chris Lattner

2009-May-06 03:56 UTC

[LLVMdev] Suggestion: Support union types in IR

On May 5, 2009, at 8:09 PM, Talin wrote:
> I wanted to mention, by the way, that my need/desire for this hasn't
> gone away :)
>
> And my wish list still includes support for something like uintptr_t - a
> primitive integer type that is defined to always be the same size as a
> pointer, however large or small that may be on different platforms. (So
> that the frontend doesn't need to know how big a pointer is and can
> generate the same IR that works on both 32-bit and 64-bit platforms.)

Why not just use a pointer, such as i8*?

-Chris

Talin

2009-May-06 04:55 UTC

[LLVMdev] Suggestion: Support union types in IR

Chris Lattner wrote:
> On May 5, 2009, at 8:09 PM, Talin wrote:
>
>> I wanted to mention, by the way, that my need/desire for this hasn't
>> gone away :)
>>
>> And my wish list still includes support for something like uintptr_t - a
>> primitive integer type that is defined to always be the same size as a
>> pointer, however large or small that may be on different platforms. (So
>> that the frontend doesn't need to know how big a pointer is and can
>> generate the same IR that works on both 32-bit and 64-bit platforms.)
>
> Why not just use a pointer, such as i8*?

Suppose I have an STL-like container that has a 'begin' and 'end'
pointer. Now I want to find the size() of the container. Since you 
cannot subtract pointers in LLVM IR, you have to cast them to an integer 
type first. But what integer type do you cast them to? I suppose you 
could simply always cast them to i64, and hope that the backend will 
generate efficient code for the subtraction, but I have no way of 
knowing this.

Now, I'm going to anticipate what I think will be your next argument, 
which is that at some point I must know the size of the result since I 
am assigning the result of size() to some integer variable eventually. 
Which is true, however, if the size of that eventual variable is smaller 
than a pointer, then I want to check it for overflow before I do the 
assignment. I don't want to just do a blind bitcast and have the top 
bits be lopped off.

The problem of checking for overflow when assigning from an integer of 
unknown size to an integer of known size is left as an exercise for the 
reader.
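
[Editorial sketch] The scenario in this message, computing a container's size from its 'begin' and 'end' pointers in a pointer-sized integer and then checking for overflow before storing the result in a narrower variable, can be written in C, where uintptr_t already exists. The names IntVec, intvec_size, and narrow_to_u32 are hypothetical, and the sketch assumes a flat address space in which converting pointers to uintptr_t and subtracting yields a byte distance (implementation-defined in C, though true on common platforms). It illustrates the intent only, not how an LLVM frontend would emit it.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical container with 'begin' and 'end' pointers. */
struct IntVec {
    const int *begin;
    const int *end;
};

/* Element count computed in a pointer-sized integer, so the same
 * source works unchanged on 32-bit and 64-bit targets.
 * Assumes end >= begin and a flat address space. */
uintptr_t intvec_size(const struct IntVec *v) {
    uintptr_t bytes = (uintptr_t)v->end - (uintptr_t)v->begin;
    return bytes / sizeof(int);
}

/* Narrow a pointer-sized value to 32 bits. Returns false instead of
 * silently dropping the high bits when the value does not fit. */
bool narrow_to_u32(uintptr_t wide, uint32_t *out) {
    if (wide > UINT32_MAX) {
        return false;
    }
    *out = (uint32_t)wide;
    return true;
}
```

On a target where uintptr_t is 32 bits wide the range check can never fail and a compiler is free to remove it, which is the behavior the message asks for: the check costs something only where truncation is actually possible.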
