thr3ads.net - llvm dev - [LLVMdev] C++ type erasure in llvm-g++ [Mar 2009]

Luke Dalessandro

2009-Mar-24 15:40 UTC

[LLVMdev] C++ type erasure in llvm-g++

I'm curious about the type erasure that goes on when llvm-g++ compiles 
C++ code. Is this a consequence of it just being the easiest way to do 
things based on the design of gcc and how LLVM is plugged into it?

Can someone more familiar with the llvm-gcc infrastructure comment on 
the difficulty of generating more strongly typed virtual function tables 
rather than just having them all be variable length arrays of pointers 
of unknown type and casting to the "known" type before the call?
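
As an illustration of the pattern being asked about, the following is a hand-written C++ model of a type-erased table, not actual llvm-g++ output. The names `Object`, `ErasedFn`, `erased_vtable`, `call_slot0`, `call_slot1` and `make_object` are all hypothetical. Every slot has the same opaque pointer type, and each call site casts the loaded slot back to the type it "knows" before calling.

```cpp
// Sketch only: models an array-of-opaque-pointers vtable in plain C++.
// All names here are invented for illustration.
using ErasedFn = void (*)();   // opaque slot type, same for every entry

struct Object {
    const ErasedFn *vptr;      // points at the first slot of the table
    int value;
};

// Two "virtual" functions whose real signatures differ.
static int get_value(Object *self) { return self->value; }
static int add_value(Object *self, int n) { return self->value + n; }

// The erased table: the real signatures are lost in the table's type.
static const ErasedFn erased_vtable[] = {
    reinterpret_cast<ErasedFn>(&get_value),
    reinterpret_cast<ErasedFn>(&add_value),
};

Object make_object(int v) { return Object{erased_vtable, v}; }

// Dispatch: load the slot, cast to the "known" type, then call.
int call_slot0(Object *obj) {
    auto fn = reinterpret_cast<int (*)(Object *)>(obj->vptr[0]);
    return fn(obj);
}

int call_slot1(Object *obj, int n) {
    auto fn = reinterpret_cast<int (*)(Object *, int)>(obj->vptr[1]);
    return fn(obj, n);
}
```

Nothing in the type of `erased_vtable` tells an analysis which signatures can live in which slot; that information exists only in the casts at the call sites.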

I understand that there's some issue with structural type equivalence 
that can merge identical looking table types, but I feel like it would 
help me with alias analysis to determine possible targets for indirect 
calls/invokes, or for debug related purposes. What I'd really like would 
be a vcall/vinvoke instruction, but I get the LL part of LLVM.

I guess my main complaint is that I'd like to use the LLVM 
infrastructure to do some higher-level (semantics-wise) manipulation of 
code without waiting for clang to handle C++. Any other suggestions 
would be appreciated.

Thanks,
Luke

Chris Lattner

2009-Mar-24 15:45 UTC

On Mar 24, 2009, at 8:40 AM, Luke Dalessandro wrote:
> I'm curious about the type erasure that goes on when llvm-g++ compiles
> C++ code. Is this a consequence of it just being the easiest way to do
> things based on the design of gcc and how LLVM is plugged into it?

This is just due to how the G++ front-end happens to lower the C++
types to C types internally.

> Can someone more familiar with the llvm-gcc infrastructure comment on
> the difficulty of generating more strongly typed virtual function tables
> rather than just having them all be variable length arrays of pointers
> of unknown type and casting to the "known" type before the call?

This would require changing the G++ front-end.  I have no idea how
difficult that would be.

-Chris

Mike Stump

2009-Mar-24 17:08 UTC

On Mar 24, 2009, at 8:40 AM, Luke Dalessandro wrote:
> Can someone more familiar with the llvm-gcc infrastructure comment on
> the difficulty of generating more strongly typed virtual function tables
> rather than just having them all be variable length arrays of pointers
> of unknown type and casting to the "known" type before the call?

The easiest way would be to handle this in the gcc/llvm interface  
layer.  The type of each slot can easily be figured out and the type  
of the vtable can be built up as a structure instead of an array.  I'd  
guess it shouldn't be more than 100 lines.  Harder to do would be to  
transform the virtual dispatch code.  It is exposed as just raw
add/subtract, fetch and then indirect call.  Seems like part of the
solution may be to propagate the ALIAS_SET information from the type  
system down for llvm to reason with.  It should be more complete and  
accurate than the information llvm has, though, maybe only marginally  
so.  The saving grace would be the code is heavily stylized and you're  
getting it before the optimizer swizzles it on you.  Since all the  
math is with constants usually, you just need to recognize the style  
and the type during the call and the type at the other end, where the  
pointer arithmetic starts and the transform back into the usual llvm  
accessors for structures.  Annoying to do, but, probably under 200  
lines.
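
To make the "structure instead of an array" idea concrete, here is a minimal C++ sketch of what a strongly typed table could look like. It is an illustration under assumptions, not code from llvm-gcc; the names `Widget`, `TypedVTable`, `widget_vtable`, `make_widget`, `typed_call_get` and `typed_call_add` are hypothetical. Each slot becomes a struct field carrying its real signature, so dispatch is an ordinary field access with no cast.

```cpp
// Sketch only: a vtable modelled as a struct with one typed field per slot.
// All names here are invented for illustration.
struct Widget;

struct TypedVTable {
    int (*get_value)(Widget *);        // slot 0, real signature preserved
    int (*add_value)(Widget *, int);   // slot 1, real signature preserved
};

struct Widget {
    const TypedVTable *vptr;
    int value;
};

static int get_value_impl(Widget *self) { return self->value; }
static int add_value_impl(Widget *self, int n) { return self->value + n; }

static const TypedVTable widget_vtable = {&get_value_impl, &add_value_impl};

Widget make_widget(int v) { return Widget{&widget_vtable, v}; }

// Dispatch is a typed field load followed by an indirect call; no cast needed.
int typed_call_get(Widget *w) { return w->vptr->get_value(w); }
int typed_call_add(Widget *w, int n) { return w->vptr->add_value(w, n); }
```

With this shape, the type of each slot is visible from the table type alone, which is the property the original question was after.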
> Any other suggestions would be appreciated.

Sure, just add code to propagate the types around, add code to handle
constant arithmetic on these things and to handle normal virtual
dispatches; after that, add support for pointer to member functions
and you're done.  You should be able to figure out that these things
don't escape much, that for a given constant (index), the same shape
(type) is always used, that for a given variable (pointer to member
function), the same shape (type) is used and all assignments of
this variable come from things of the same shape or that they come
from literals that have the same shape.
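
The "same shape for a given constant index" observation can be sketched as a small consistency check. This is a toy model over strings, not an LLVM pass, and it assumes the call sites have already been reduced to a slot index plus the type the loaded pointer is cast to. `CallSite` and `recover_slot_types` are hypothetical names.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// One observed virtual call: the constant slot index and the type the
// loaded pointer was cast to before the call (kept as a string here).
struct CallSite {
    std::size_t slot;
    std::string cast_type;
};

// Fills `slots` with one type per slot index. Returns false as soon as two
// call sites disagree about the type of the same slot, in which case the
// casts cannot all be trusted and `slots` should be ignored.
bool recover_slot_types(const std::vector<CallSite> &sites,
                        std::map<std::size_t, std::string> &slots) {
    slots.clear();
    for (const CallSite &site : sites) {
        auto result = slots.emplace(site.slot, site.cast_type);
        bool inserted = result.second;
        if (!inserted && result.first->second != site.cast_type)
            return false;   // same index seen with two different shapes
    }
    return true;
}
```

A real implementation would compare actual function types rather than strings and would also need to track which table each call site refers to; both are left out of this sketch.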

Luke Dalessandro

2009-Mar-24 17:22 UTC

Mike Stump wrote:
> On Mar 24, 2009, at 8:40 AM, Luke Dalessandro wrote:
>> Can someone more familiar with the llvm-gcc infrastructure comment on
>> the difficulty of generating more strongly typed virtual function tables
>> rather than just having them all be variable length arrays of pointers
>> of unknown type and casting to the "known" type before the call?
> 
> The easiest way would be to handle this in the gcc/llvm interface  
> layer.  The type of each slot can easily be figured out and the type  
> of the vtable can be built up as a structure instead of an array.  I'd
> guess it shouldn't be more than 100 lines.  Harder to do would be to  
> transform the virtual dispatch code.  It is exposed as just raw
> add/subtract, fetch and then indirect call.  Seems like part of the
> solution may be to propagate the ALIAS_SET information from the type  
> system down for llvm to reason with.  It should be more complete and  
> accurate than the information llvm has, though, maybe only marginally  
> so.  The saving grace would be the code is heavily stylized and you're
> getting it before the optimizer swizzles it on you.  Since all the  
> math is with constants usually, you just need to recognize the style  
> and the type during the call and the type at the other end, where the  
> pointer arithmetic starts and the transform back into the usual llvm  
> accessors for structures.  Annoying to do, but, probably under 200  
> lines.
OK, so it's mainly a problem of becoming comfortable with the llvm-gcc
internals that are affected and not a fundamental whole-compiler design
problem. That sounds like a multi-month rather than multi-year thing for
me. Thanks.
> 
>> Any other suggestions would be appreciated.
> 
> Sure, just add code to propagate the types around, add code to handle  
> constant arithmetic on these things and to handle normal virtual
> dispatches, after that, add support for pointer to member functions  
> and you're done.  You should be able to figure out that these things  
> don't escape much, that for a given constant (index), the same shape  
> (type) is always used, that for a given variable (pointer to member  
> function), that the same shape (type) is used and all assignments of  
> this variable come from things of the same shape or that they come
> from literals that have the same shape.
So I can essentially rematerialize the vtable types by pushing things 
back through from the indirect calls in the program. Wouldn't existing 
alias analysis _do_ the same thing in a less specific manner? I guess
that alias analysis doesn't always "trust" casts, whereas if I manually
pushed back I would be assuming that the casts are correct?

Luke
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
