Virgile Bello
2014-Nov-29 17:25 UTC
[LLVMdev] Frontend: How to use Member to Function Pointer as callbacks
Hello,

As part of an MSIL (i.e. C#) to LLVM frontend I am currently working on ( https://github.com/xen2/SharpLang ), I would like some help or hints on how to properly design "PInvoke callbacks".

Through the "PInvoke" mechanism, .NET allows you to call C functions, e.g.:

C#:
    [DllImport("libc.so")] extern void memcpy(void* dest, void* src, int size); // declaration of C function
    memcpy(ptr1, ptr2, 32); // let's use it in C# code

That's quite easy to support. However, the tricky part is that C# function pointers (callbacks, a.k.a. delegates) can be passed to those C functions:

C:
    void MethodWithCallback(void(*callback)(int));

C#:
    delegate void CallbackType(int result); // <-- Points to a member function (needs "this")
    [DllImport("mylib.so")] extern void MethodWithCallback(CallbackType callback);

An extra "this" parameter is needed to call the real method, but of course the calling C code doesn't know about it (MethodWithCallback expects a non-member function pointer). This is similar to C++ not being able to cast a pointer to member function (which needs "this") to a normal function pointer, because the two are not compatible.

Using a JIT, it would be quite easy to deal with (generate thunk/code), but I would like to support a full AOT scenario where executable memory can't be modified (and also not have to embed/use LLVM at runtime, just like a plain C/C++ executable).

One (rather complicated) option I was thinking of is:
- Define a maximum number of those callbacks alive (let's say 4096) -- it's not per function signature, but global
- Define i8* thunkTargets[4096]
- Define i8* ThunkIdToFuncPtr[4096]
- Define 4096 funcs (not even sure how to do that with LLVM, might need to emit assembly?)
    Thunk0: jmp thunkTargets[0];
    Thunk1: jmp thunkTargets[1];
    ...
    Thunk4095: jmp thunkTargets[4095];
- When I call a C function from C# with a callback, what happens is:
  - Find an unused slot in this thunk table (X)
  - Register the C# member function pointer in ThunkIdToFuncPtr[X]
  - Replace thunkTargets[X] with the address of "RedirectMethodFuncWithIntParameter" (one such function per callback signature)
  - This redirect method would receive arguments unmodified from the C function (since the previous call was a simple jmp)
  - It would check in the call stack which slot X is being called (up in the call stack, if the call instruction is "call Thunk3" from address Thunk3, we know X is 3 -- it will need assembly, might be difficult to compute and won't be portable...)
  - ThunkIdToFuncPtr[X] would give us the actual method to forward to
  - RedirectMethodFuncWithIntParameter would call ThunkIdToFuncPtr[X](arg1)

This design allows using the slot number X of MethodX to differentiate the actual C# callback to call (each thunk's code is small, so it is OK to have many, 4096 in this case), while still having only ONE actual dispatcher/redirect method per signature (whose code is much bigger).

Does that seem feasible? I don't like the fact that I would have to step out of LLVM bitcode and generate some non-portable assembly code (I was trying to stick with LLVM bitcode so far). Any other idea, or maybe some LLVM infrastructure/system/subproject or another LLVM frontend that had the same issue and might help me there?

Also, I am not sure whether LLVM trampolines could help me there? (Not sure if they could do that, and if they work in full AOT scenarios, where JIT is not allowed?)

Thanks,
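The slot bookkeeping in the proposal above (find an unused slot, register the method, later give the slot back) can be sketched in plain C. This is only an illustration under assumptions: the names thunk_alloc, thunk_free and GenericFn do not appear in the thread, a NULL entry in ThunkIdToFuncPtr is assumed to mean "slot is free", and a generic function pointer type stands in for the i8* arrays.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_THUNKS 4096 /* global cap from the proposal, not per signature */

/* Stand-in for the i8* entries: any function pointer, cast back before use. */
typedef void (*GenericFn)(void);

static GenericFn thunkTargets[MAX_THUNKS];     /* redirect function for the slot */
static GenericFn ThunkIdToFuncPtr[MAX_THUNKS]; /* method to forward to; NULL = free */

/* Hypothetical helper: claim the first free slot, or return -1 if none is left. */
static int thunk_alloc(GenericFn redirect, GenericFn method)
{
    if (method == NULL)
        return -1; /* NULL is reserved as the "free" marker */
    for (int i = 0; i < MAX_THUNKS; i++) {
        if (ThunkIdToFuncPtr[i] == NULL) {
            thunkTargets[i] = redirect;
            ThunkIdToFuncPtr[i] = method;
            return i;
        }
    }
    return -1;
}

/* Hypothetical helper: release a slot once the callback is no longer alive. */
static void thunk_free(int slot)
{
    assert(slot >= 0 && slot < MAX_THUNKS);
    thunkTargets[slot] = NULL;
    ThunkIdToFuncPtr[slot] = NULL;
}
```

The linear scan keeps the sketch short; a free list would avoid it if callbacks are created often. When a slot may safely be released depends on how long the native side keeps the pointer, which the sketch does not address.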
Sanjoy Das
2014-Nov-30 07:31 UTC
[LLVMdev] Frontend: How to use Member to Function Pointer as callbacks
Hi Virgile,

> - ThunkIdToFuncPtr[X] would give us the actual method to forward to
> - RedirectMethodFuncWithIntParameter would call
> ThunkIdToFuncPtr[X](arg1)

I couldn't quite understand this approach -- at what point do you figure out what the value of 'this' is?

-- Sanjoy
Virgile Bello
2014-Nov-30 07:46 UTC
[LLVMdev] Frontend: How to use Member to Function Pointer as callbacks
My bad: 'this' would be captured as well, during thunk allocation, in a ThunkIdToThis[] array, and the last line should be:
- RedirectMethodFuncWithIntParameter would call ThunkIdToFuncPtr[X](ThunkIdToThis[X], arg1)

On Nov 30, 2014 4:32 PM, "Sanjoy Das" <sanjoy at playingwithpointers.com> wrote:
> I couldn't quite understand this approach -- at what point do you
> figure out what the value of 'this' is?
>
> -- Sanjoy
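The corrected last step can be sketched in C as follows. This is a minimal illustration, not the SharpLang implementation: the slot is passed as an explicit parameter (an assumption; the thread proposes recovering it from the call site), the arrays are typed for the single void(int) signature instead of i8*, and Accumulator / Accumulator_Add are hypothetical stand-ins for a C# object and its instance method.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_THUNKS 4096

/* Signature of the compiled C# method once 'this' is made explicit. */
typedef void (*IntMethod)(void *self, int arg1);

/* Typed for one signature to keep the sketch short; the proposal uses i8*. */
static IntMethod ThunkIdToFuncPtr[MAX_THUNKS];
static void *ThunkIdToThis[MAX_THUNKS];

/* One redirect function per callback signature. The slot is an explicit
   parameter here (an assumption, see the lead-in). */
static void RedirectMethodFuncWithIntParameter(int slot, int arg1)
{
    assert(slot >= 0 && slot < MAX_THUNKS && ThunkIdToFuncPtr[slot] != NULL);
    /* Forward with the 'this' captured at thunk allocation time. */
    ThunkIdToFuncPtr[slot](ThunkIdToThis[slot], arg1);
}

/* Hypothetical stand-in for a C# object and its instance method. */
typedef struct { int total; } Accumulator;

static void Accumulator_Add(void *self, int arg1)
{
    ((Accumulator *)self)->total += arg1;
}
```

Registering Accumulator_Add and an Accumulator instance in the same slot, then calling the redirect function with that slot, updates that instance only; two slots can share one method while pointing at different objects.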
Reid Kleckner
2014-Dec-01 18:47 UTC
[LLVMdev] Frontend: How to use Member to Function Pointer as callbacks
That feature sounds pretty hard to implement efficiently with an AOT compiler. =/

What is the lifetime of the callback object in .NET? And how frequently are they created? That seems like an important constraint.

I think you meant to use a table like this:

    mov thunkThisPtrs[0] -> ecx ; or scratch reg
    jmp thunkTargets[0]

That's probably the best you can do without creating the thunks at runtime.

---

The trampoline support in LLVM exists for GCC nested functions, which llvm-gcc used to support. It will only work if you have some writable and executable memory around at runtime. GCC used stack memory, back when stacks were executable, so the nested function pointer was only live until the parent function frame returned.

You should also take a look at the 'nest' parameter attribute, which was used to thread through an extra parameter without conflicting with the prototyped ones. It might be useful in your AOT scheme.
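A portable approximation of the table suggested above can be written in C, with one caveat: C cannot express "load a scratch register, then jmp", so each thunk below is tied to one callback signature and performs a real call, unlike the signature-agnostic assembly version. The names DEFINE_THUNK, bind_thunk, Receiver and Receiver_OnResult are hypothetical, and the table is cut down to 4 slots instead of 4096.

```c
#include <assert.h>
#include <stddef.h>

#define THUNK_COUNT 4 /* the thread suggests 4096; kept small here */

typedef void (*IntMethod)(void *self, int arg1); /* target with explicit 'this' */
typedef void (*IntCallback)(int arg1);           /* what the C library expects */

static void *thunkThisPtrs[THUNK_COUNT];
static IntMethod thunkTargets[THUNK_COUNT];

/* Each thunk has its slot index baked in at compile time, so no call stack
   inspection is needed to find out which slot was invoked. */
#define DEFINE_THUNK(N) \
    static void Thunk##N(int arg1) { thunkTargets[N](thunkThisPtrs[N], arg1); }

DEFINE_THUNK(0)
DEFINE_THUNK(1)
DEFINE_THUNK(2)
DEFINE_THUNK(3)

static const IntCallback thunks[THUNK_COUNT] = { Thunk0, Thunk1, Thunk2, Thunk3 };

/* Hypothetical helper: bind a slot and return a plain C function pointer. */
static IntCallback bind_thunk(int slot, void *self, IntMethod method)
{
    assert(slot >= 0 && slot < THUNK_COUNT);
    thunkThisPtrs[slot] = self;
    thunkTargets[slot] = method;
    return thunks[slot];
}

/* Stand-in for the native function from the thread. */
static void MethodWithCallback(void (*callback)(int))
{
    callback(42);
}

/* Hypothetical stand-in for a C# object and its instance method. */
typedef struct { int last; } Receiver;

static void Receiver_OnResult(void *self, int result)
{
    ((Receiver *)self)->last = result;
}
```

Passing the pointer returned by bind_thunk to MethodWithCallback reaches Receiver_OnResult with the bound object, and no executable memory is written at runtime. The cost is one set of thunks per signature, which is the code-size trade-off the original proposal was trying to avoid.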