Hi,

I'm working on developing a programming language using LLVM as a backend, and it'd be very handy for me if LLVM had union support. I've been looking into getting the previous union implementation working properly for the last week or so, but I'm entirely new to the LLVM codebase, so I thought I'd ask whether I'm barking up the wrong tree before doing a full-blown implementation. At the moment it seems like the best approach to get unions working is to treat them as a byte array and then have the insertvalue/extractvalue instructions automatically perform conversions to and from other types by bitcasting to/from an equivalent-size i8 vector, where the bytes can be accessed individually.

This approach seems to have a few problems. It gets vector instructions involved without any really good reason (I'm looking at the assembler output). It also seems to violate the ABI: my test function is trying to return the result in memory where I think it should be using registers (I'm JITing a function and calling it from GCC-compiled code). The x86-64 ABI isn't very clear to me, though.

Anyway, please let me know if anyone sees major problems with this approach, or has any thoughts on the ABI issues.

Regards,
James

On Sun, Dec 12, 2010 at 6:15 AM, James Lyon <jameslyon0 at gmail.com> wrote:
> Hi,
>
> I'm working on developing a programming language using LLVM as a backend,
> and it'd be very handy for me if LLVM had union support. I've been looking
> into getting the previous union implementation working properly for the last
> week or so, but I'm entirely new to the LLVM codebase so I thought I'd ask
> whether I'm barking up the wrong tree before doing a full-blown
> implementation. At the moment it seems like the best approach to get unions
> working is to treat them as a byte array and then have the
> insertvalue/extractvalue instructions automatically perform conversions to
> and from other types by bitcasting to/from an equivalent size i8 vector
> where the bytes can be got at individually.
>
> This approach seems to have a few problems. It gets vector instructions
> involved without any really good reason (I'm looking at the assembler
> output). It also seems to violate the ABI - my test function is trying to
> return the result in memory where I think it should be using registers (I'm
> JITing a function and calling it using GCC compiled code). The x86-64 ABI
> isn't very clear to me though.
>
> Anyway, please let me know if anyone sees major problems with this approach,
> or has any thoughts on the ABI issues.

An alternative approach would be not to define a union type as such, but to introduce metadata (using the LLVM metadata support) that marks a memory object as a discriminated union with a particular discriminator value. Optimizers could then be taught to make assumptions such as "the discriminator value has changed if and only if the type of the object's data has changed".

If your language lends itself to having all your discriminated unions live in memory until mem2reg time, that should make it work well without having to rework every layer of LLVM.
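
To make the in-memory layout concrete, here is a minimal C sketch of a discriminated union kept in memory as a discriminator plus raw payload bytes. It does not use LLVM metadata or any LLVM API. All names (value_tag, tagged_value, store_int, load_int and so on) are hypothetical, and the 8-byte payload size is an assumption chosen to fit an int64_t or a double. The only point is the invariant mentioned above: tag and payload are always written together, so the discriminator changes exactly when the type of the stored data changes.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical discriminator values; a real front-end would choose its own. */
enum value_tag { TAG_INT = 0, TAG_FLOAT = 1 };

/* A discriminated union kept in memory: a tag plus raw payload bytes. */
struct tagged_value {
    uint32_t tag;              /* says which member the payload holds */
    unsigned char payload[8];  /* assumed big enough for the largest member */
};

/* Every store updates tag and payload together, so the tag changes
   exactly when the type of the stored data changes. */
void store_int(struct tagged_value *v, int64_t x) {
    assert(sizeof x <= sizeof v->payload);
    v->tag = TAG_INT;
    memcpy(v->payload, &x, sizeof x);
}

void store_float(struct tagged_value *v, double x) {
    assert(sizeof x <= sizeof v->payload);
    v->tag = TAG_FLOAT;
    memcpy(v->payload, &x, sizeof x);
}

/* Loads check the tag first: 1 on success, 0 on a tag mismatch. */
int load_int(const struct tagged_value *v, int64_t *out) {
    if (v->tag != TAG_INT) return 0;
    memcpy(out, v->payload, sizeof *out);
    return 1;
}

int load_float(const struct tagged_value *v, double *out) {
    if (v->tag != TAG_FLOAT) return 0;
    memcpy(out, v->payload, sizeof *out);
    return 1;
}
```

A front-end following this idea would presumably emit the equivalent loads and stores on an alloca or heap object and leave it to mem2reg and later passes to clean up; that part is not shown here.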

Hi James,

> I'm working on developing a programming language using LLVM as a backend, and
> it'd be very handy for me if LLVM had union support. I've been looking into
> getting the previous union implementation working properly for the last week or
> so, but I'm entirely new to the LLVM codebase so I thought I'd ask whether I'm
> barking up the wrong tree before doing a full-blown implementation. At the
> moment it seems like the best approach to get unions working is to treat them as
> a byte array and then have the insertvalue/extractvalue instructions
> automatically perform conversions to and from other types by bitcasting to/from
> an equivalent size i8 vector where the bytes can be got at individually.

You would do better to use arrays rather than vectors, and to access them via memory. As a general rule you shouldn't try to hold aggregate values in registers: it is supported, but only efficient for small aggregates like complex numbers.

The llvm-gcc front-end represents a union as a struct containing one field whose type is that of the largest member of the union. Unions are accessed from memory (rather than placed in registers), and the bitcast instruction is used to turn a pointer to the field into a pointer to one of the other types making up the union.

> This approach seems to have a few problems. It gets vector instructions involved
> without any really good reason (I'm looking at the assembler output).

You used vectors, thus you get vector instructions.

> It also seems to violate the ABI - my test function is trying to return the
> result in memory where I think it should be using registers (I'm JITing a
> function and calling it using GCC compiled code).

Sadly, it is up to front-ends to take care of getting the ABI right. This is because there is not enough information in the LLVM IR for it to handle all ABI details for you automagically.

> The x86-64 ABI isn't very clear to me though.

Yes, it's extremely complicated.

> Anyway, please let me know if anyone sees major problems with this approach, or
> has any thoughts on the ABI issues.

Take a look at http://llvm.org/bugs/show_bug.cgi?id=4246

Ciao,
Duncan.
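
For readers following along, the llvm-gcc representation described in this reply (a struct with a single field of the largest member's type, accessed through memory) can be sketched in C roughly as follows. This is an illustration under assumptions, not llvm-gcc's actual output: the union members (float, uint32_t, double) and all names (union_repr, repr_store_u32 and so on) are hypothetical. Where LLVM IR would bitcast the field pointer and load or store through it, the sketch uses memcpy on the field's address, which is the well-defined C equivalent.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for: union { float f; uint32_t u; double d; }.
   Only the largest member (double) appears as a field. */
struct union_repr {
    double largest;
};

/* Store a uint32_t member into the union's storage. */
void repr_store_u32(struct union_repr *p, uint32_t x) {
    assert(sizeof x <= sizeof p->largest);
    memcpy(&p->largest, &x, sizeof x);
}

/* Load the uint32_t member back out of the union's storage. */
uint32_t repr_load_u32(const struct union_repr *p) {
    uint32_t x;
    memcpy(&x, &p->largest, sizeof x);
    return x;
}

/* Store a float member into the same storage. */
void repr_store_float(struct union_repr *p, float x) {
    assert(sizeof x <= sizeof p->largest);
    memcpy(&p->largest, &x, sizeof x);
}

/* Load the float member back out of the same storage. */
float repr_load_float(const struct union_repr *p) {
    float x;
    memcpy(&x, &p->largest, sizeof x);
    return x;
}
```

Because the object is only ever touched through its address, nothing here forces the aggregate into registers, which matches the advice above about keeping aggregates in memory.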