Hi,

The attached is the fix to radar://11663049. The optimization can be outlined by the following rules:

  (select (x != c), e, c) -> (select (x != c), e, x),
  (select (x == c), c, e) -> (select (x == c), x, e)
where <c> is an integer constant.

The reason for this change is that on x86, a conditional move from a constant needs two instructions, whereas a conditional move from a register needs only one.

While LowerSELECT() sounds like the most convenient place for this optimization, it turns out to be a bad place. The reason is that replacing the constant <c> with a symbolic value obscures some instruction-combining opportunities which would otherwise be very easy to spot. For that reason, I have to postpone the change to the last instruction-combining phase.

The change passes "make check-all -C <build-root>/test" and "make -C project/test-suite/SingleSource".

Thanks
Shuxin
-------------- next part --------------
Index: lib/Target/X86/X86ISelLowering.cpp
===================================================================
--- lib/Target/X86/X86ISelLowering.cpp	(revision 165638)
+++ lib/Target/X86/X86ISelLowering.cpp	(working copy)
@@ -14418,6 +14418,7 @@
       if (TrueC->getAPIntValue().ult(FalseC->getAPIntValue())) {
         CC = X86::GetOppositeBranchCondition(CC);
         std::swap(TrueC, FalseC);
+        std::swap(TrueOp, FalseOp);
       }
 
       // Optimize C ? 8 : 0 -> zext(setcc(C)) << 3. Likewise for any pow2/0.
@@ -14500,6 +14501,45 @@
       }
     }
   }
+
+  // Handle these cases:
+  //   (select (x != c), e, c) -> select (x != c), e, x),
+  //   (select (x == c), c, e) -> select (x == c), x, e)
+  // where the c is an integer constant, and the "select" is the combination
+  // of CMOV and CMP.
+  //
+  // The rationale for this change is that the conditional-move from a constant
+  // needs two instructions, however, conditional-move from a register needs
+  // only one instruction.
+  //
+  // CAVEAT: By replacing a constant with a symbolic value, it may obscure
+  //  some instruction-combining opportunities. This opt needs to be
+  //  postponed as late as possible.
+  //
+  if (!DCI.isBeforeLegalize() && !DCI.isBeforeLegalizeOps()) {
+    // the DCI.xxxx conditions are provided to postpone the optimization as
+    // late as possible.
+
+    ConstantSDNode *CmpAgainst = 0;
+    if ((Cond.getOpcode() == X86ISD::CMP || Cond.getOpcode() == X86ISD::SUB) &&
+        (CmpAgainst = dyn_cast<ConstantSDNode>(Cond.getOperand(1))) &&
+        dyn_cast<ConstantSDNode>(Cond.getOperand(0)) == 0) {
+
+      if (CC == X86::COND_NE &&
+          CmpAgainst == dyn_cast<ConstantSDNode>(FalseOp)) {
+        CC = X86::GetOppositeBranchCondition(CC);
+        std::swap(TrueOp, FalseOp);
+      }
+
+      if (CC == X86::COND_E &&
+          CmpAgainst == dyn_cast<ConstantSDNode>(TrueOp)) {
+        SDValue Ops[] = { FalseOp, Cond.getOperand(0), N->getOperand(2), Cond };
+        return DAG.getNode(X86ISD::CMOV, DL, N->getVTList (), Ops,
+                           array_lengthof(Ops));
+      }
+    }
+  }
+
 
   return SDValue();
 }
Index: test/CodeGen/X86/select_const.ll
===================================================================
--- test/CodeGen/X86/select_const.ll	(revision 0)
+++ test/CodeGen/X86/select_const.ll	(revision 0)
@@ -0,0 +1,16 @@
+; RUN: llc < %s -mtriple=x86_64-apple-darwin10 -mcpu=corei7 | FileCheck %s
+
+define i64 @test1(i64 %x) nounwind {
+entry:
+  %cmp = icmp eq i64 %x, 2
+  %add = add i64 %x, 1
+  %retval.0 = select i1 %cmp, i64 2, i64 %add
+  ret i64 %retval.0
+
+; CHECK: test1:
+; CHECK: leaq 1(%rdi), %rax
+; CHECK: cmpq $2, %rdi
+; CHECK: cmoveq %rdi, %rax
+; CHECK: ret
+
+}
LGTM. I will commit.

On Oct 10, 2012, at 1:20 PM, Shuxin Yang <shuxin.llvm at gmail.com> wrote:

> Hi,
>
> The attached is the fix to radar://11663049. The optimization can be outlined by the following rules:
>
>   (select (x != c), e, c) -> (select (x != c), e, x),
>   (select (x == c), c, e) -> (select (x == c), x, e)
> where <c> is an integer constant.
>
> The reason for this change is that on x86, a conditional move from a constant needs two instructions,
> whereas a conditional move from a register needs only one.
>
> While LowerSELECT() sounds like the most convenient place for this optimization, it turns out to be a bad place. The reason is that replacing the constant <c> with a symbolic value obscures some instruction-combining opportunities which would otherwise be very easy to spot. For that reason, I have to postpone the change to the last instruction-combining phase.
>
> The change passes "make check-all -C <build-root>/test" and "make -C project/test-suite/SingleSource".
>
> Thanks
> Shuxin
> <diff.patch>
Hi All,

I just started with the LLVM API and have a beginner's question.

I'm trying to JIT compile the attached C program "getinmemory" to an in-memory module with my attached example LLVM program. After linking this in-memory module against libc.so.6, I call the main routine of the C program.

Why do I still have malloc as an unresolved external? Doesn't the module "module_ex" contain a fully executable program?

Here are the printouts (Linux):
------------------------
compile_to_module-return
clang -fdiagnostics-format=clang getinmemory.c -fsyntax-only -I /usr/include -I /usr/lib/gcc/i586-suse-linux/4.6/include -I /usr/src/linux/include -I /usr/src/linux/include/linux -v -fdollars-in-identifiers -fno-operator-names -ffp-contract=on -fobjc-runtime=macosx -triple i386-pc-linux-gnu
clang -cc1 version 3.2 based upon LLVM 3.2svn default target i386-pc-linux-gnu
#include "..." search starts here:
#include <...> search starts here:
 /usr/include
 /usr/lib/gcc/i586-suse-linux/4.6/include
 /usr/src/linux/include
 /usr/src/linux/include/linux
End of search list.
[ clip .. warnings]
Main
malloc
curl_global_init
curl_easy_init
curl_easy_setopt
WriteMemoryCallback
curl_easy_perform
curl_easy_cleanup
printf
free
curl_global_cleanup
main
realloc
exit
llvm.memcpy.p0i8.p0i8.i32
LLVM ERROR: Tried to execute an unknown external function: malloc
-----------------------

My goal is to JIT compile, e.g., the sources of a C/C++ library into executable in-memory code in order to call the individual functions of the library. It's just a different kind of late binding ...

Best Regards

Armin Steinhoff
-------------- next part --------------
A non-text attachment was scrubbed...
Name: compile_to_module-return.cpp
Type: text/x-c++src
Size: 3732 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20121011/b368244e/attachment.cpp>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: getinmemory.c
Type: text/x-c++src
Size: 3402 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20121011/b368244e/attachment.c>
-------------- next part --------------
INCLUDES=-I /usr/local/include/clang -I/usr/local/include/llvm -I/usr/local/include

LIBS= -L/home/CLING/build/Debug+Asserts/lib/clang/3.2/lib/linux -lclang_rt.full-i386 -lclang_rt.profile-i386 -L/home/CLING/build/Debug+Asserts/lib \
 -lclangFrontendTool -lclangFrontend -lclangDriver \
 -lclangSerialization -lclangCodeGen -lclangParse -lclangSema \
 -lclangStaticAnalyzerFrontend -lclangStaticAnalyzerCheckers \
 -lclangStaticAnalyzerCore \
 -lclangAnalysis -lclangARCMigrate -lclangRewrite \
 -lclangEdit -lclangAST -lclangLex -lclangBasic \
 -lLLVMAsmParser -lLLVMTableGen -lLLVMDebugInfo -lLLVMX86AsmParser -lLLVMX86Disassembler -lLLVMX86CodeGen -lLLVMSelectionDAG -lLLVMAsmPrinter \
 -lLLVMJIT -lLLVMMCDisassembler -lLLVMMCParser -lLLVMInstrumentation -lLLVMInterpreter \
 -lLLVMCodeGen -lLLVMipo -lLLVMVectorize -lLLVMScalarOpts -lLLVMInstCombine -lLLVMLinker -lLLVMTransformUtils -lLLVMipa -lLLVMAnalysis -lLLVMArchive \
 -lLLVMBitReader -lLLVMBitWriter -lLLVMMCJIT -lLLVMRuntimeDyld -lLLVMExecutionEngine -lLLVMTarget -lLLVMMC -lLLVMObject -lLLVMCore -lLLVMSupport \
 -lLLVMX86Desc -lLLVMX86Info -lLLVMX86AsmPrinter -lLLVMX86Utils

all : build_executable compile_to_module-return compile_to_module

#build_executable :
#	g++ `llvm-config --cxxflags` $(INCLUDES) -o build_executable build_executable.cpp $(LIBS) -ldl -lpthread
#-Wl,-rpath-link /home/CLING/build/Debug+Asserts/lib

#compile_to_module :
#	g++ `llvm-config --cxxflags` $(INCLUDES) -o compile_to_module compile_to_module.cpp $(LIBS) -ldl -lpthread

compile_to_module-return :
	g++ `llvm-config --cxxflags` $(INCLUDES) -o compile_to_module-return compile_to_module-return.cpp $(LIBS) -ldl -lpthread