K Jelesnianski via llvm-dev
2019-Apr-19 04:12 UTC
[llvm-dev] Question: How to access c++ vtable pointer to use as Value* in LLVM pass
Dear Mailing List,

This might sound unconventional, but I am trying to access a C++ object's vtable to pass as an argument to a call to a library function I created. Creating and inserting a function call at the correct location in LLVM is done. I have learned that C++ objects are represented as struct types, but I'm not quite sure how to get at the vtable pointer within when looking at the interface of the Value class. Clang, more specifically CGClass.cpp, deals with initializing C++ constructors and destructors, and its API is straightforward, while I can't find similar API calls in the LLVM counterpart.

So far I am able to get the class object itself from a LoadInst or CallInst, and I can iterate through the StructType and the element types contained within via element_begin()/element_end() to confirm that what I am looking at is the object, e.g.:

    i32 (...)***   (this is how the vtable is represented, according to online sources, as a generic pointer)
    i32            (class member, in this case an int)

But this doesn't give me a Value* handle I can grab and use later. How can I leverage this Value to get the vtable pointer it contains?

2nd question: What happens if the struct object is from a derived class? Iterating over the struct again, it looks like the vtable ptr is tangled even deeper within the object:

    %class.Base.base = type <{ i32 (...)**, i32 }>
    i32

I looked at the ThreadSanitizer.cpp pass for inspiration, and it seems they also use MD_tbaa as a hint for whether a load/store isVtableAccess(), but that pass doesn't need the Value. Maybe MDNode metadata could be of use here?

TLDR: How can I leverage a Value of StructType generated from a C++ object to get its vtable ptr in LLVM, to use as a Value for a to-be-inserted function call?

Thank you in advance!

Sincerely,
Christopher Jelesnianski
Graduate Research Assistant, Virginia Tech
Tim Northover via llvm-dev
2019-Apr-19 06:06 UTC
[llvm-dev] Question: How to access c++ vtable pointer to use as Value* in LLVM pass
On Fri, 19 Apr 2019 at 05:13, K Jelesnianski via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> But this doesn't give me a Value* handle i can grab to and use later.
> How can I leverage this Value to get that contained ??

You need to get that from an instance rather than by iterating over the Type; almost certainly from a pointer to an instance, since classes are hardly ever loaded as a whole in LLVM, just individual fields when needed. It sounds like you'll have one lying around.

After that, you'd write:

    %vtable.ptr = getelementptr %MyStruct, %MyStruct* %obj, i32 0, i32 0
    %vtable = load i32(...)*, i32(...)** %vtable.ptr

The first index on that GEP is just because your object may be in an array; the second selects the vtable pointer. Loading it gives you a *pointer* to the vtable (so the object instance is a pointer to a pointer to the vtable). It's essentially what you'd get if you'd had a Value * from Clang's @_ZTV8MyStruct directly (via Module::getGlobalVariable).

You have to bitcast it to the correct type, of course, because at the moment it's pretending to be an i32(...)*. But that's probably what you want to pass to your library function if it's expecting a vtable.

> 2nd question: What happens if the struct object is from a derived
> class; iterating over the struct again, it looks like the vtable ptr
> is tangled even deeper within the object:

It can get horribly complicated, with multiple vtables inside the object at different locations and vtables within vtables; sometimes even different vtables at different stages of the object's life. The specification of what goes where is here:
https://itanium-cxx-abi.github.io/cxx-abi/abi.html

This is a good time to point out that all of this is platform dependent. MSVC in particular does things very differently, and Clang on Windows follows it.

Cheers.

Tim.
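The layout described in this reply can be observed from plain C++, without LLVM. The following is only a sketch under a stated assumption: that the compiler stores the vtable pointer in the first pointer-sized word of each polymorphic subobject, which is what the Itanium ABI does for simple classes like these. The class names (Base, Derived, Other, Multi) and the helpers firstWord, vptrOf and otherOffset are made up for illustration; none of them come from the thread or from LLVM.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

struct Base {
    virtual ~Base() = default;
    virtual int id() const { return 1; }
    int x = 0;
};

// Single inheritance: the Base subobject sits at the start of Derived.
struct Derived : Base {
    int id() const override { return 2; }
    int y = 0;
};

struct Other {
    virtual ~Other() = default;
    virtual int tag() const { return 7; }
};

// Multiple inheritance: the object carries a second vtable pointer
// inside its Other subobject.
struct Multi : Base, Other {
    int id() const override { return 3; }
};

// Copy out the pointer-sized word stored at address p.
// ASSUMPTION: for a polymorphic (sub)object this word is its vtable
// pointer. That is platform dependent and not guaranteed by C++.
inline std::uintptr_t firstWord(const void *p) {
    std::uintptr_t w = 0;
    std::memcpy(&w, p, sizeof w);
    return w;
}

// Plain-C++ analogue of "GEP to field 0, then load".
template <class T>
std::uintptr_t vptrOf(const T &obj) {
    return firstWord(&obj);
}

// Byte offset of the Other subobject inside a Multi object.
inline std::ptrdiff_t otherOffset(const Multi &m) {
    const char *whole = reinterpret_cast<const char *>(&m);
    const char *sub =
        reinterpret_cast<const char *>(static_cast<const Other *>(&m));
    return sub - whole;
}
```

Under that assumption, two Derived objects yield the same word, a Base object yields a different one, and a Multi object holds a second, different vtable pointer at a nonzero offset. That second pointer is the kind of case where a single "index 0, index 0" GEP is no longer the whole story.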
K Jelesnianski via llvm-dev
2019-Apr-19 19:07 UTC
[llvm-dev] Question: How to access c++ vtable pointer to use as Value* in LLVM pass
Thanks for the super detailed answer! The ABI link is a great resource! Thanks for showing it.

So my end goal is to have a function pass instrument the program and insert my custom call before any virtual call. Reading your response, I noticed code similar to what you described appearing before each virtual call. It looks like the IR is "fetching" the needed vtable (%vtable) and then extracting the appropriate virtual function (%vfn). Below is sample code from a simple main calling the virtual function after the constructor call. I annotated where the custom call would go.

    1 call void @_ZN7DerivedC2Ev(%class.Derived* %0) #3, !dbg !955   --- constructor call
    2 store %class.Derived* %0, %class.Derived** %derv, align 8, !dbg !953
    3 %3 = load %class.Derived*, %class.Derived** %derv, align 8, !dbg !958
    4 %4 = bitcast %class.Derived* %3 to i32 (%class.Derived*)***, !dbg !959
    5 %vtable = load i32 (%class.Derived*)**, i32 (%class.Derived*)*** %4, align 8, !dbg !959
    6 %vfn = getelementptr inbounds i32 (%class.Derived*)*, i32 (%class.Derived*)** %vtable, i64 0, !dbg !959   ---- too far, I don't need a virtual function call
    7 %5 = load i32 (%class.Derived*)*, i32 (%class.Derived*)** %vfn, align 8, !dbg !959
    8 ~~~~ Insert custom call here ~~~~~~
    9 %call1 = call i32 %5(%class.Derived* %3), !dbg !959   ---- I used this CallInst operand Value to get this far.

In this case, can I "piggyback" off the 5th instruction and use that instruction's Value*? It should be doable to iterate backwards from the CallInst (line 9) and store the Value.

The approach you are suggesting is to write my own 2 instructions which do the same things as lines 4 and 5, correct?

Thanks again!
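The "iterate backwards from the CallInst" idea can be sketched as a walk along operands rather than along instruction order: called value, then the load of the function pointer, then an optional GEP that picks the slot, then the load of the vtable pointer. The code below is a self-contained toy model of that walk, not LLVM code. Op, Instr, stripCasts and findVTableLoad are hypothetical names invented for this sketch. In a real pass the same steps would presumably go through the called operand of the call, LoadInst::getPointerOperand() and GetElementPtrInst::getPointerOperand().

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for an IR instruction; NOT the LLVM API.
enum class Op { Load, GEP, BitCast, Call, Other };

struct Instr {
    Op op;
    std::string name;
    const Instr *operand;  // pointer operand, cast source, or called value
};

// Look through any chain of casts to the underlying value.
inline const Instr *stripCasts(const Instr *i) {
    while (i && i->op == Op::BitCast)
        i = i->operand;
    return i;
}

// Starting at an indirect call, walk the operands backwards and return
// the load that produced the vtable pointer, or nullptr if the shape
// "call <- load <- [gep] <- load" is not found.
inline const Instr *findVTableLoad(const Instr &call) {
    if (call.op != Op::Call)
        return nullptr;

    // %5 = load ... %vfn : the function pointer that is being called.
    const Instr *fnLoad = stripCasts(call.operand);
    if (!fnLoad || fnLoad->op != Op::Load)
        return nullptr;

    // %vfn = getelementptr ... %vtable, N : may be missing for slot 0.
    const Instr *src = stripCasts(fnLoad->operand);
    if (src && src->op == Op::GEP)
        src = stripCasts(src->operand);

    // %vtable = load ... : the value to hand to the inserted call.
    if (!src || src->op != Op::Load)
        return nullptr;
    return src;
}
```

Applied to the sample above, the walk starting at %call1 ends at %vtable, so the existing load can be reused and no new GEP or load is needed. The shape test alone is only a heuristic: a call through a function pointer stored in an ordinary struct could produce the same pattern, so some additional check, such as the TBAA-based one ThreadSanitizer uses, may be needed to confirm that the load really reads a vtable pointer.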