On 22 February 2015 at 22:54, Hal Finkel <hfinkel at anl.gov> wrote:
>> I tried setting the module's DataLayout to the engine's DataLayout.
>> Don't see any improvement.
>> The memcpy() is to perform a struct assign, so I tried replacing that
>> with member by member store.
>> But even then the loads are not being eliminated so I guess the
>> memcpy() isn't the issue.
>
> If you run the IR through opt -O3 do you get the optimization you expect?
>

Hi,

Tried that - no improvement.

Also tried removing the redundant GEP instructions, leaving just the
loads. Here is a dump that shows the output after running the
optimizer passes (this is from the passes in my program, not opt -O3;
this version does not use memcpy() either):

  %L_ci = getelementptr inbounds %ravi.lua_State* %L, i64 0, i32 6
  %0 = load %ravi.CallInfo** %L_ci, align 8
  %base = getelementptr inbounds %ravi.CallInfo* %0, i64 0, i32 4, i32 0
  %1 = bitcast %ravi.CallInfo* %0 to %ravi.LClosure***
  %2 = load %ravi.LClosure*** %1, align 8
  %3 = load %ravi.LClosure** %2, align 8
  %Proto = getelementptr inbounds %ravi.LClosure* %3, i64 0, i32 5
  %4 = load %ravi.Proto** %Proto, align 8
  %k = getelementptr inbounds %ravi.Proto* %4, i64 0, i32 14
  %5 = load %ravi.TValue** %k, align 8
  %6 = load %ravi.TValue** %base, align 8
  %srcvalue = getelementptr inbounds %ravi.TValue* %5, i64 0, i32 0, i32 0
  %destvalue = getelementptr inbounds %ravi.TValue* %6, i64 0, i32 0, i32 0
  %7 = load double* %srcvalue, align 8
  store double %7, double* %destvalue, align 8
  %srctype = getelementptr inbounds %ravi.TValue* %5, i64 0, i32 1
  %desttype = getelementptr inbounds %ravi.TValue* %6, i64 0, i32 1
  %8 = load i32* %srctype, align 4
  store i32 %8, i32* %desttype, align 4
  %9 = load %ravi.TValue** %base, align 8
  %srcvalue1 = getelementptr inbounds %ravi.TValue* %5, i64 1, i32 0, i32 0
  %destvalue2 = getelementptr inbounds %ravi.TValue* %9, i64 1, i32 0, i32 0
  %10 = load double* %srcvalue1, align 8
  store double %10, double* %destvalue2, align 8

Regards
Dibyendu
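For readers following the "struct assign" versus "member by member store" remark above, here is a minimal C sketch of the two forms. The names MiniValue, MiniTValue, copy_struct, copy_members and copies_agree are hypothetical stand-ins, not the actual Ravi or Lua definitions; the layout (a value union followed by an integer type tag) is only assumed from the shape of the IR dump.

```c
#include <assert.h>

/* Hypothetical, simplified stand-in for a Lua-style tagged value:
   a value union followed by an integer type tag. */
typedef union { double n; void *p; } MiniValue;
typedef struct { MiniValue value_; int tt_; } MiniTValue;

/* Whole-struct assignment; a compiler may lower this to a memcpy(). */
void copy_struct(MiniTValue *dst, const MiniTValue *src) {
    *dst = *src;
}

/* Member-by-member copy of a numeric value:
   one double store and one int store. */
void copy_members(MiniTValue *dst, const MiniTValue *src) {
    dst->value_.n = src->value_.n;
    dst->tt_ = src->tt_;
}

/* Returns 1 when both copies leave the same number and tag
   in the destination, 0 otherwise. */
int copies_agree(double n, int tag) {
    MiniTValue src, a, b;
    src.value_.n = n;
    src.tt_ = tag;
    copy_struct(&a, &src);
    copy_members(&b, &src);
    return a.value_.n == b.value_.n && a.tt_ == b.tt_;
}
```

For a numeric value both forms store the same number and tag, which is consistent with the observation that swapping one for the other did not change which loads remain.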
Hi Dibyendu,

It would be very helpful if you could post the original source code or
snippet.
That way, one can investigate deeper to understand the problem.

Regards,
Kamal Sharma

On Sun, Feb 22, 2015 at 4:44 PM, Dibyendu Majumdar <mobile at majumdar.org.uk> wrote:
> On 22 February 2015 at 22:54, Hal Finkel <hfinkel at anl.gov> wrote:
> >> I tried setting the module's DataLayout to the engine's DataLayout.
> >> Don't see any improvement.
> >> The memcpy() is to perform a struct assign, so I tried replacing that
> >> with member by member store.
> >> But even then the loads are not being eliminated so I guess the
> >> memcpy() isn't the issue.
> >
> > If you run the IR through opt -O3 do you get the optimization you expect?
> >
>
> Hi,
> Tried that - no improvement.
>
> Also tried removing the redundant GEP instructions, leaving just the
> loads. Here is a dump that shows the output after running the
> optimizer passes (this is from the passes in my program not opt -O3;
> this version does not use memcpy() either):
>
>   %L_ci = getelementptr inbounds %ravi.lua_State* %L, i64 0, i32 6
>   %0 = load %ravi.CallInfo** %L_ci, align 8
>   %base = getelementptr inbounds %ravi.CallInfo* %0, i64 0, i32 4, i32 0
>   %1 = bitcast %ravi.CallInfo* %0 to %ravi.LClosure***
>   %2 = load %ravi.LClosure*** %1, align 8
>   %3 = load %ravi.LClosure** %2, align 8
>   %Proto = getelementptr inbounds %ravi.LClosure* %3, i64 0, i32 5
>   %4 = load %ravi.Proto** %Proto, align 8
>   %k = getelementptr inbounds %ravi.Proto* %4, i64 0, i32 14
>   %5 = load %ravi.TValue** %k, align 8
>   %6 = load %ravi.TValue** %base, align 8
>   %srcvalue = getelementptr inbounds %ravi.TValue* %5, i64 0, i32 0, i32 0
>   %destvalue = getelementptr inbounds %ravi.TValue* %6, i64 0, i32 0, i32 0
>   %7 = load double* %srcvalue, align 8
>   store double %7, double* %destvalue, align 8
>   %srctype = getelementptr inbounds %ravi.TValue* %5, i64 0, i32 1
>   %desttype = getelementptr inbounds %ravi.TValue* %6, i64 0, i32 1
>   %8 = load i32* %srctype, align 4
>   store i32 %8, i32* %desttype, align 4
>   %9 = load %ravi.TValue** %base, align 8
>   %srcvalue1 = getelementptr inbounds %ravi.TValue* %5, i64 1, i32 0, i32 0
>   %destvalue2 = getelementptr inbounds %ravi.TValue* %9, i64 1, i32 0, i32 0
>   %10 = load double* %srcvalue1, align 8
>   store double %10, double* %destvalue2, align 8
>
> Regards
> Dibyendu
On 23 February 2015 at 01:29, Kamal Sharma <kgs1.rice at gmail.com> wrote:
> Hi Dibyendu,
>
> It would be very helpful if you could post the original source code or
> snippet.
> That way, one can investigate deeper to understand the problem.
>
> Regards,
> Kamal Sharma
>

Hi Kamal,

Sure. I guess I ought to create a test that one can look at in isolation.

I am working on building a JIT compiler for Lua (actually a derivative
of Lua). It is currently a work in progress. The approach is to compile
Lua bytecodes into LLVM IR.

The IR generated from my code is here (after optimization, I should add):

https://github.com/dibyendumajumdar/ravi/blob/master/clang-output/lua_op_loadk_return_ravi.ll

I am using the output from Clang as a guide to generating IR. So I
write small snippets of code in C which are equivalent to Lua
bytecodes - then use Clang to emit the IR. I use this to work out the
IR I need to build.

The C equivalent of the program I am compiling is here:

https://github.com/dibyendumajumdar/ravi/blob/master/clang-output/lua_op_loadk_return.c

The difference between the C version and what I generate is that I put
a load of the "base" pointer at the beginning of every Lua opcode.
This is because some Lua opcodes can reallocate the memory pointed to
by base. I was hoping that the optimizer would get rid of the redundant
loads.

The code generation is all done here:

https://github.com/dibyendumajumdar/ravi/blob/master/src/ravijit.cpp

I don't expect you to wade through all this - but I will be grateful
for any help / guidance you can provide.

Thanks and Regards
Dibyendu
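The reason for reloading "base" at the start of every opcode can be sketched in plain C. This is only an illustration under stated assumptions: MiniValue, MiniState, mini_state_init, grow_stack and run_with_reload are hypothetical names, and realloc() stands in for whatever a real Lua opcode does when it moves the stack. It is not the code from ravijit.cpp.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-ins for a Lua value and the
   interpreter state that owns the stack. */
typedef struct { double n; int tt; } MiniValue;
typedef struct { MiniValue *base; size_t size; } MiniState;

/* Allocates a stack of `size` slots. */
void mini_state_init(MiniState *L, size_t size) {
    L->base = malloc(size * sizeof *L->base);
    assert(L->base != NULL);
    L->size = size;
}

/* Models an opcode that may move the stack: realloc() is allowed
   to return a different address, keeping the old contents. */
void grow_stack(MiniState *L, size_t new_size) {
    MiniValue *p = realloc(L->base, new_size * sizeof *p);
    assert(p != NULL);
    L->base = p;
    L->size = new_size;
}

/* Two LOADK-style copies with a stack-growing opcode in between.
   base is reloaded from L at the start of each opcode, so the
   second copy never uses a pointer from before the reallocation. */
double run_with_reload(MiniState *L, const MiniValue *k) {
    MiniValue *base = L->base;      /* opcode 1: reload base */
    base[0] = k[0];

    grow_stack(L, L->size * 2);     /* opcode 2: may move the stack */

    base = L->base;                 /* opcode 3: reload base */
    base[1] = k[1];

    return L->base[0].n + L->base[1].n;
}
```

When no intervening opcode can move the stack, as with two consecutive constant loads, the second reload is redundant, and that is the load one would hope the optimizer removes.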