Stéphane Letz
2007-Jun-21 18:57 UTC
[LLVMdev] Runtime optimization of C++ code with virtual functions
> On Wed, 20 Jun 2007, Maurizio Vitale wrote:
>>>> Is there any possible method using LLVM that would help in this
>>>> case?
>>>
>>> LLVM won't help in this case.
>>
>> Is that so or it means that LLVM wouldn't have a prebuilt solution?
>
> It means that LLVM doesn't have any trivial builtin solution.
>
>> I'm asking because (without having ever looked seriously into LLVM) I
>> was thinking to experiment along these lines:
>>
>> class Source {
>>   void send (T data) {
>>     invoke_jit_magic();
>>     transport (data);
>>   }
>> }
>>
>> transport() would be a virtual method like the original posting. In my
>> case send() would be part of the framework, so it is not a problem to
>> add the invoke_jit_magic. In other cases it might be trickier.
>
> Ok.
>
>> On the first call, invoke_jit_magic gains control, traverses the binary
>> converting (a subset of) what it finds to LLVM IR, until it gets to the
>> concrete target. It may have to do a bit of work to understand how
>> parameters are passed to the transport code (it is a virtual function
>> call and might be messy in presence of multiple/virtual inheritance).
>> After that LLVM jit can be used to replace the original binary fragment
>> with something faster.
>
> Ok.
>
>> I agree with the suggestion of using templates when possible.

But this works at compile time only, right?

>> In my case
>> it is not doable because transport would be proprietary and the code
>> containing it distributed only as binary.
>
> Ok.
>
>> I understand that the disassembling portion needs to be rewritten. Is
>> there anything else that would prevent this approach from working?
>> Again, haven't looked into LLVM yet, so I can imagine there might be
>> problems in describing physical registers in the IR and at some point
>> stuff must be exactly where the pre-existing code expects it. I don't
>> want to take your time, but if you could elaborate a bit it might
>> prevent me from going down the wrong path.
>
> This should work, I don't expect you to run into any significant
> problems.
> When you're rewriting the LLVM IR for the indirect call, you can just
> replace it with a direct call to the native code.

Compared to template based specialization this would have the advantage of being dynamic.

Stephane Letz
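
To make the compile-time versus run-time contrast in this message concrete, here is a minimal C++ sketch. Only send() and transport() come from the thread; every other name (Transport, Doubler, Negator, DynamicSource, StaticSource, StaticDoubler, make_transport) is hypothetical, and int stands in for the unspecified data type T. It shows the two dispatch styles being compared and does not attempt the JIT rewriting itself.

```cpp
#include <cassert>
#include <memory>

// Dynamic variant: transport() is virtual, so the concrete target is only
// known at run time. This is the case the thread wants to optimize.
struct Transport {
    virtual ~Transport() = default;
    virtual int transport(int data) = 0;
};

struct Doubler : Transport {
    int transport(int data) override { return data * 2; }
};

struct Negator : Transport {
    int transport(int data) override { return -data; }
};

struct DynamicSource {
    std::unique_ptr<Transport> target;
    int send(int data) { return target->transport(data); }  // indirect call
};

// Hypothetical run-time choice, e.g. driven by configuration or by a
// transport that ships only as a binary.
inline std::unique_ptr<Transport> make_transport(bool negate) {
    if (negate) return std::make_unique<Negator>();
    return std::make_unique<Doubler>();
}

// Static variant: the transport type is a template parameter, so the call
// is direct, but the choice is frozen when the code is compiled.
struct StaticDoubler {
    int transport(int data) { return data * 2; }
};

template <typename TransportT>
struct StaticSource {
    TransportT target;
    int send(int data) { return target.transport(data); }  // direct call
};

inline void usage_example() {
    DynamicSource dyn{make_transport(true)};   // target picked at run time
    assert(dyn.send(21) == -21);

    StaticSource<StaticDoubler> stat{};        // target picked at compile time
    assert(stat.send(21) == 42);
}
```

The template version cannot express make_transport(): the concrete type has to be visible to the compiler, which is exactly what is unavailable when the transport is distributed only as a binary.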
David Greene
2007-Jun-25 22:56 UTC
[LLVMdev] Runtime optimization of C++ code with virtual functions
On Thursday 21 June 2007 13:57, Stéphane Letz wrote:
> >> I understand that the disassembling portion needs to be rewritten. Is
> >> there anything else that would prevent this approach from working?
> >> Again, haven't looked into LLVM yet, so I can imagine there might be
> >> problems in describing physical registers in the IR and at some point
> >> stuff must be exactly where the pre-existing code expects it. I don't
> >> want to take your time, but if you could elaborate a bit it might
> >> prevent me from going down the wrong path.
> >
> > This should work, I don't expect you to run into any significant
> > problems.
> > When you're rewriting the LLVM IR for the indirect call, you can just
> > replace it with a direct call to the native code.
>
> Compared to template based specialization this would have the
> advantage of being dynamic.

But templates have the advantage of being able to be inlined. This is a much more important transformation than simply converting an indirect call to a direct one, especially on modern implementations like Core or Opteron.

Your approach is going to make inlining very difficult, I think. Not that there's a whole lot that can be done about it, given the binary translation going on. For example, how would you inline calls to send() where transport() has been inlined (assuming send() wasn't already inlined)?

Is there some other set of transformations you have in mind to generate more efficient code for transport() at run time? Partial evaluation might be interesting, but that's applicable whether or not transport() is virtual. In fact, virtual call resolution is a form of partial evaluation where the run-time constants are the "this" pointer and its most-derived subclass type.

If you really want to generate fast code, it might be worth your while to implement more general partial evaluation and specialization. If you make it general enough, you'll get run-time virtual call resolution "for free."

You might also have a look at the Self papers. The Self team did a lot of work on runtime optimization of dynamic dispatch. IIRC they also did some partial evaluation work.

-Dave
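
The remark that virtual call resolution is partial evaluation over the dynamic type can be sketched by hand in C++. This is only an illustration of the shape of code a run-time specializer might emit, not anything LLVM produces by itself; FastPath, Other, send_generic, send_specialized, and guard_example are hypothetical names, and int stands in for the data type. The specialized version guards on the most-derived type, in the spirit of the type-guard and inline-cache techniques associated with the Self work, and then makes a qualified call that bypasses the vtable and is therefore a candidate for inlining.

```cpp
#include <cassert>
#include <typeinfo>

struct Transport {
    virtual ~Transport() = default;
    virtual int transport(int data) = 0;
};

struct FastPath final : Transport {
    int transport(int data) override { return data + 1; }
};

struct Other : Transport {
    int transport(int data) override { return data * 10; }
};

// Generic send(): a single indirect call through the vtable.
inline int send_generic(Transport& t, int data) {
    return t.transport(data);
}

// Specialized send(): treats "the dynamic type is FastPath" as the known
// run-time constant. The guard checks that assumption; the qualified call
// FastPath::transport is resolved statically, so the compiler may inline it.
inline int send_specialized(Transport& t, int data) {
    if (typeid(t) == typeid(FastPath)) {
        return static_cast<FastPath&>(t).FastPath::transport(data);
    }
    return t.transport(data);  // fallback keeps other types correct
}

inline void guard_example() {
    FastPath fast;
    Other other;
    // Both paths must agree with ordinary virtual dispatch.
    assert(send_specialized(fast, 1) == send_generic(fast, 1));
    assert(send_specialized(other, 1) == send_generic(other, 1));
}
```

The guard is what makes the specialization safe: if a different subclass shows up later, the fallback still performs the normal virtual call.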