thr3ads.net - llvm dev - [LLVMdev] [PATCH] - Union types, attempt 2 [Jan 2010]


Joachim Durchholz

2010-Jan-15 08:41 UTC

[LLVMdev] [PATCH] - Union types, attempt 2

Talin wrote:
> Well, the fact that union members have to be indexed by number means
> that the ordering has to be part of the type - so even though 
> type-theoretically union { i32, float } is the same as union { float, 
> i32 }, in my implementation they are distinct types. However, from the 
> standpoint of a frontend, this is not a great concern, because the 
> frontend will most likely sort the list of types before constructing the 
> IR type.
Hm... it's placing a burden on the frontend developer.

More importantly, it's something that the frontend developer must not 
forget to do, so you better make sure this is documented in capital 
letters in a place where the frontend developer is likely to look when 
preparing code generation.

Most importantly, however, this will create a lot of hassles when making 
code interoperable between compilers: Compiler writers need to agree on 
a language-independent canonical ordering.
That said, if the ordering is canonical, it could be established at the 
IR level. E.g. by ordering alphabetically.
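
To make the canonical-ordering idea concrete, here is a minimal sketch in Python rather than in LLVM itself. The helper name `canonical_union` and the use of plain type-name strings are assumptions made for illustration only; nothing here is part of the patch or of any LLVM API.

```python
def canonical_union(members):
    """Return union member type names in one canonical order.

    Hypothetical helper: sorts the names alphabetically and drops
    duplicates, so the order used in the source code stops mattering.
    """
    return tuple(sorted(set(members)))


a = canonical_union(["i32", "float"])
b = canonical_union(["float", "i32"])

# Both spellings collapse to the same member list, so a member's
# index is the same whichever frontend built the type.
print(a == b)
print(a.index("i32") == b.index("i32"))
```

With such a rule applied at one level, two frontends that list the members differently would still arrive at the same indexed type.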

When coding, please consider that many languages establish assignment 
compatibility between union types. E.g. a union {i32, float} value could 
be assigned to a name that's typed as a union {i32, i64, float}.
This probably means the need for conversion operators, and it definitely 
means that indexes aren't meaningful by themselves, only in conjunction 
with their union type.
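
A small sketch of why an index only makes sense together with its union type, again in Python and under stated assumptions: a union value is modelled as an `(index, payload)` pair, the member lists are kept in declaration order, and `widen` is a hypothetical name for the kind of conversion operator mentioned above.

```python
def widen(value, src, dst):
    """Convert an (index, payload) value of union type src to union type dst.

    Hypothetical conversion: looks up which member the index names in the
    source union, then finds that member's index in the target union.
    """
    index, payload = value
    member = src[index]
    if member not in dst:
        raise TypeError(f"{member} is not a member of the target union")
    return (dst.index(member), payload)


narrow = ("i32", "float")        # union {i32, float}
wide = ("i32", "i64", "float")   # union {i32, i64, float}

# The float member is index 1 in the narrow union but index 2 in the wide one.
print(widen((1, 1.5), narrow, wide))
```

The payload is untouched; only the index is rewritten, which is exactly the part that has no meaning outside its union type.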

> By always putting the types in a canonical order, regardless of
> the order that they appear in the source code, you can ensure that
> unions of equal types are always compatible. In other words, you can
> treat the members like an ordered set rather than like a list.
Yes, that's closer to the frontend semantics: the variants of a union 
type don't have any natural ordering, so list semantics could cause 
problems.

Regards,
Jo

Talin

2010-Jan-15 19:37 UTC


[LLVMdev] [PATCH] - Union types, attempt 2

On Fri, Jan 15, 2010 at 12:41 AM, Joachim Durchholz <jo at durchholz.org> wrote:
> Talin wrote:
>
>> Well, the fact that union members have to be indexed by number means that
>> the ordering has to be part of the type - so even though type-theoretically
>> union { i32, float } is the same as union { float, i32 }, in my
>> implementation they are distinct types. However, from the standpoint of a
>> frontend, this is not a great concern, because the frontend will most likely
>> sort the list of types before constructing the IR type.
>
> Hm... it's placing a burden on the frontend developer.
>
> More importantly, it's something that the frontend developer must not forget
> to do, so you better make sure this is documented in capital letters in a
> place where the frontend developer is likely to look when preparing code
> generation.
>
> Most importantly, however, this will create a lot of hassles when making
> code interoperable between compilers: Compiler writers need to agree on a
> language-independent canonical ordering.
> That said, if the ordering is canonical, it could be established at the IR
> level. E.g. by ordering alphabetically.
>
> When coding, please consider that many languages establish assignment
> compatibility between union types. E.g. a union {i32, float} value could be
> assigned to a name that's typed as a union {i32, i64, float}.
> This probably means the need for conversion operators, and it definitely
> means that indexes aren't meaningful by themselves, only in conjunction with
> their union type.
>

I really feel that these issues should be addressed on a layer above IR.
LLVM IR always requires that all types match exactly, and any conversions or
promotions must be inserted explicitly by the frontend. Making unions do
automatic conversions would make them dramatically different from every
other IR type.

>> By always putting the types in a canonical order, regardless of
>> the order that they appear in the source code, you can ensure that unions
>> of equal types are always compatible. In other words, you can treat the
>> members like an ordered set rather than like a list.
>
> Yes, that's closer to the frontend semantics: the variants of a union type
> don't have any natural ordering, so list semantics could cause problems.
>
> Regards,
> Jo


-- 
-- Talin

Dustin Laurence

2010-Jan-15 21:51 UTC


[LLVMdev] [PATCH] - Union types, attempt 2

On 01/15/2010 11:37 AM, Talin wrote:
>     Yes, that's closer to the frontend semantics: the variants of a
>     union type don't have any natural ordering, so list semantics could
>     cause problems.
I agree.  I probably shouldn't even comment, as I know so little about 
LLVM.  But I've hand-written a couple kLOC of IR now and am starting to 
get a feel for the syntax, so I'll just say what "feels" right based on 
that and leave it to others to decide if I've absorbed enough to make 
any kind of sense.

Just imagining myself using such a language extension, I really would 
not want an ordering imposed where no natural one exists.  Indices feel 
very wrong.  Isn't a union basically just a convenient alternate 
interface to the various other conversion operators like bitcast, 
inttoptr, trunc, zext, and the rest?  (In fact that's how I manipulate 
my expressions, the three-bit tag in the low-order bits tells me how to 
treat the high-order bits.)  The "index" doesn't (generally) represent 
any kind of offset, but rather an interpretation of the bits, and none 
of the offset arithmetic implied by getelementptr or physical register 
choice implied by extractvalue will occur (except perhaps to satisfy 
alignment constraints, but that would be architecture dependent and I 
assume should therefore be invisible).  Correct?
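
For readers unfamiliar with the low-order tag trick mentioned in the parenthesis, here is a minimal sketch in Python. The names `pack` and `unpack`, the tag value used below, and the restriction to non-negative payloads are all assumptions for illustration; the actual layout of the expressions described above is not given in this message.

```python
TAG_BITS = 3
TAG_MASK = (1 << TAG_BITS) - 1  # 0b111


def pack(tag, payload):
    """Store a small tag in the low three bits and the payload above it."""
    if not 0 <= tag <= TAG_MASK:
        raise ValueError("tag does not fit in three bits")
    if payload < 0:
        raise ValueError("this sketch only handles non-negative payloads")
    return (payload << TAG_BITS) | tag


def unpack(word):
    """Split a packed word back into (tag, payload)."""
    return word & TAG_MASK, word >> TAG_BITS


word = pack(5, 42)
print(unpack(word))
```

The tag selects an interpretation of the remaining bits; it is not an offset into anything, which is the point being made about union "indices".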

If that argument is persuasive, then the following seems a bit more 
consistent with the existing syntax:

     ; Manipulation of a union register variable
     %myUnion = unioncast i32 %myValue to union {i32, float}
     %fieldValue = unioncast union {i32, float} %myUnion to i32
     ; %fieldValue == %myValue

This specialized union cast fits the pattern of having specialized cast 
operations between value and pointer as opposed to two values or two 
pointers.
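
The round trip in the snippet above can be mimicked outside LLVM. The following Python sketch models the union as four bytes of storage and uses the standard `struct` module to write and read it; the function names `unioncast_in` and `unioncast_out` are hypothetical stand-ins for the two directions of the proposed `unioncast`, and reading the storage back as a different member is an extra illustration, not something the proposal specifies.

```python
import struct

I32 = "<i"    # little-endian 32-bit signed integer
FLOAT = "<f"  # little-endian 32-bit IEEE float


def unioncast_in(fmt, value):
    """Place a value into 4 bytes of union storage (value -> union)."""
    return struct.pack(fmt, value)


def unioncast_out(fmt, storage):
    """Read the union storage back as one member type (union -> value)."""
    return struct.unpack(fmt, storage)[0]


my_value = 0x3F800000                     # bit pattern of 1.0 as a float
my_union = unioncast_in(I32, my_value)

# Reading back with the same member type returns the original value.
print(unioncast_out(I32, my_union) == my_value)
# Reading with the other member type reinterprets the same bits.
print(unioncast_out(FLOAT, my_union))
```

No index appears anywhere: the member is selected by naming its type, which is what makes the members unordered.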

That's enough, as you could require that unions be loaded and stored as 
unions and then elements extracted.  But if you want to make it a bit 
less syntactically noisy, and also allow the same flexibility that 
getelementptr would allow in accessing a single member through a 
pointer, you could allow

     ; Load/store of one particular union field
     store i32 %myValue, union {i32, float}* %myUnionPtr
     %fieldValue = load union {i32, float}* %myUnionPtr as i32
     ; %fieldValue == %myValue

Where I've added a preposition 'as' to the load instruction by analogy 
with what the cast operators do with 'to'.

I don't know that I'd argue the point much, but offhand it "feels" 
consistent with the rest of the syntax to have a specialized 'unioncast' 
operator analogous with the other specialized conversions, but overload 
load/store as I illustrated so that pointers to unions are conceptually 
just funny kinds of pointers to their fields (which they are).  So in 
that vein, if you want a pointer to one of the alternatives in the union 
you'd just cast one pointer to another; to avoid alignment adjustments 
on what is supposed to be a no-op that cast probably shouldn't be 
bitcast.  So what about

     %intPtr = unioncast union {i32, float}* %myUnionPtr to i32*
     %newUnionPtr = unioncast i32* %intPtr to union {i32, float}*
     ; %newUnionPtr == %myUnionPtr

I'm not necessarily advocating overloading one keyword ('unioncast')
that way, though I note that it should always be unambiguous based on 
whether the operands are values or pointers (LLVM seems to have a strong 
notion of what is and is not a pointer, so this makes some kind of 
conceptual sense to me).  Whether it's OK to create two new keywords is 
perhaps too fine a detail for me to have a good sense of.  What would 
matter to me is not imposing order on unordered interpretations.

Dustin
