Hi,
The attached patch is the fix for radar://11663049. The optimization can be
outlined by the following rules:
(select (x != c), e, c) -> (select (x != c), e, x),
(select (x == c), c, e) -> (select (x == c), x, e)
where <c> is an integer constant.
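Both rewrites preserve the result because the replaced operand is only selected when x equals c. A minimal C++ sketch that checks this over a small range follows; the helpers sel, rule1Holds, rule2Holds and rulesHoldInRange are hypothetical names used for illustration and are not part of the patch:

```cpp
#include <cstdint>

// Hypothetical helper: models a two-way select (like the IR "select").
static int64_t sel(bool cond, int64_t ifTrue, int64_t ifFalse) {
  return cond ? ifTrue : ifFalse;
}

// Rule 1: (select (x != c), e, c) -> (select (x != c), e, x)
// The false arm is taken only when x == c, so c and x are interchangeable.
static bool rule1Holds(int64_t x, int64_t c, int64_t e) {
  return sel(x != c, e, c) == sel(x != c, e, x);
}

// Rule 2: (select (x == c), c, e) -> (select (x == c), x, e)
// The true arm is taken only when x == c, so c and x are interchangeable.
static bool rule2Holds(int64_t x, int64_t c, int64_t e) {
  return sel(x == c, c, e) == sel(x == c, x, e);
}

// Checks both rules for every x and c in [-limit, limit], using e = x + 1.
static bool rulesHoldInRange(int64_t limit) {
  for (int64_t x = -limit; x <= limit; ++x) {
    for (int64_t c = -limit; c <= limit; ++c) {
      const int64_t e = x + 1;
      if (!rule1Holds(x, c, e) || !rule2Holds(x, c, e))
        return false;
    }
  }
  return true;
}
```

Calling rulesHoldInRange(50), for example, exercises both the x == c and x != c cases for each rule.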
The reason for this change is that, on x86, a conditional move from a
constant needs two instructions, whereas a conditional move from a
register needs only one.
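As a source-level illustration, the test case included with the patch corresponds roughly to the C++ below. The function names are hypothetical, and whether a given compiler actually emits a conditional move for either form depends on its version and flags; the sketch only shows that the two forms compute the same value.

```cpp
#include <cstdint>

// Roughly mirrors @test1 from select_const.ll:
// returns the constant 2 when x == 2, otherwise x + 1.
// Unsigned arithmetic is used so that x + 1 wraps, like the IR "add".
uint64_t test1(uint64_t x) {
  return x == 2 ? 2 : x + 1;
}

// The same computation after the rewrite: the constant in the taken arm
// is replaced by x, which is known to equal 2 on that path.
uint64_t test1Rewritten(uint64_t x) {
  return x == 2 ? x : x + 1;
}
```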
While LowerSELECT() looks like the most convenient place for this
optimization, it turns out to be a bad one. Replacing the constant <c>
with a symbolic value obscures some instruction-combining opportunities
that would otherwise be very easy to spot. For that reason, I had to
postpone the change to the last instruction-combining phase.
The change passes "make check-all -C <build-root>/test"
and "make -C project/test-suite/SingleSource".
Thanks
Shuxin
-------------- next part --------------
Index: lib/Target/X86/X86ISelLowering.cpp
===================================================================
--- lib/Target/X86/X86ISelLowering.cpp (revision 165638)
+++ lib/Target/X86/X86ISelLowering.cpp (working copy)
@@ -14418,6 +14418,7 @@
if (TrueC->getAPIntValue().ult(FalseC->getAPIntValue())) {
CC = X86::GetOppositeBranchCondition(CC);
std::swap(TrueC, FalseC);
+ std::swap(TrueOp, FalseOp);
}
// Optimize C ? 8 : 0 -> zext(setcc(C)) << 3. Likewise for any pow2/0.
@@ -14500,6 +14501,45 @@
}
}
}
+
+ // Handle these cases:
+ // (select (x != c), e, c) -> select (x != c), e, x),
+ // (select (x == c), c, e) -> select (x == c), x, e)
+ // where the c is an integer constant, and the "select" is the combination
+ // of CMOV and CMP.
+ //
+ // The rationale for this change is that the conditional-move from a constant
+ // needs two instructions, however, conditional-move from a register needs
+ // only one instruction.
+ //
+ // CAVEAT: By replacing a constant with a symbolic value, it may obscure
+ // some instruction-combining opportunities. This opt needs to be
+ // postponed as late as possible.
+ //
+ if (!DCI.isBeforeLegalize() && !DCI.isBeforeLegalizeOps()) {
+ // the DCI.xxxx conditions are provided to postpone the optimization as
+ // late as possible.
+
+ ConstantSDNode *CmpAgainst = 0;
+ if ((Cond.getOpcode() == X86ISD::CMP || Cond.getOpcode() == X86ISD::SUB) &&
+ (CmpAgainst = dyn_cast<ConstantSDNode>(Cond.getOperand(1))) &&
+ dyn_cast<ConstantSDNode>(Cond.getOperand(0)) == 0) {
+
+ if (CC == X86::COND_NE &&
+ CmpAgainst == dyn_cast<ConstantSDNode>(FalseOp)) {
+ CC = X86::GetOppositeBranchCondition(CC);
+ std::swap(TrueOp, FalseOp);
+ }
+
+ if (CC == X86::COND_E &&
+ CmpAgainst == dyn_cast<ConstantSDNode>(TrueOp)) {
+ SDValue Ops[] = { FalseOp, Cond.getOperand(0), N->getOperand(2), Cond };
+ return DAG.getNode(X86ISD::CMOV, DL, N->getVTList (), Ops,
+ array_lengthof(Ops));
+ }
+ }
+ }
+
return SDValue();
}
Index: test/CodeGen/X86/select_const.ll
===================================================================
--- test/CodeGen/X86/select_const.ll (revision 0)
+++ test/CodeGen/X86/select_const.ll (revision 0)
@@ -0,0 +1,16 @@
+; RUN: llc < %s -mtriple=x86_64-apple-darwin10 -mcpu=corei7 | FileCheck %s
+
+define i64 @test1(i64 %x) nounwind {
+entry:
+ %cmp = icmp eq i64 %x, 2
+ %add = add i64 %x, 1
+ %retval.0 = select i1 %cmp, i64 2, i64 %add
+ ret i64 %retval.0
+
+; CHECK: test1:
+; CHECK: leaq 1(%rdi), %rax
+; CHECK: cmpq $2, %rdi
+; CHECK: cmoveq %rdi, %rax
+; CHECK: ret
+
+}
LGTM. I will commit.

On Oct 10, 2012, at 1:20 PM, Shuxin Yang <shuxin.llvm at gmail.com> wrote:
> The attached is the fix to radar://11663049.
Hi All,

I just started with the LLVM API and have a beginner's question.

I'm trying to JIT compile the attached C program "getinmemory" to an
in-memory module using my attached example LLVM program. After linking
this in-memory module against libc.so.6, I call the main routine of the
C program.

Why is malloc still an unresolved external? Doesn't the module
"module_ex" contain a fully executable program?

Here are the printouts (Linux):

------------------------
compile_to_module-return
clang -fdiagnostics-format=clang getinmemory.c -fsyntax-only -I /usr/include -I /usr/lib/gcc/i586-suse-linux/4.6/include -I /usr/src/linux/include -I /usr/src/linux/include/linux -v -fdollars-in-identifiers -fno-operator-names -ffp-contract=on -fobjc-runtime=macosx -triple i386-pc-linux-gnu
clang -cc1 version 3.2 based upon LLVM 3.2svn default target i386-pc-linux-gnu
#include "..." search starts here:
#include <...> search starts here:
 /usr/include
 /usr/lib/gcc/i586-suse-linux/4.6/include
 /usr/src/linux/include
 /usr/src/linux/include/linux
End of search list.
[ clip .. warnings]
Main
malloc
curl_global_init
curl_easy_init
curl_easy_setopt
WriteMemoryCallback
curl_easy_perform
curl_easy_cleanup
printf
free
curl_global_cleanup
main
realloc
exit
llvm.memcpy.p0i8.p0i8.i32
LLVM ERROR: Tried to execute an unknown external function: malloc
-----------------------

My goal is to JIT compile, e.g., the sources of a C/C++ library into
executable in-memory code in order to call the individual functions of
the library. It's just a different kind of late binding ...

Best Regards

Armin Steinhoff
-------------- next part --------------
A non-text attachment was scrubbed...
Name: compile_to_module-return.cpp
Type: text/x-c++src
Size: 3732 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20121011/b368244e/attachment.cpp>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: getinmemory.c
Type: text/x-c++src
Size: 3402 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20121011/b368244e/attachment.c>
-------------- next part --------------
INCLUDES=-I /usr/local/include/clang -I/usr/local/include/llvm -I/usr/local/include

LIBS= -L/home/CLING/build/Debug+Asserts/lib/clang/3.2/lib/linux -lclang_rt.full-i386 -lclang_rt.profile-i386 -L/home/CLING/build/Debug+Asserts/lib \
 -lclangFrontendTool -lclangFrontend -lclangDriver \
 -lclangSerialization -lclangCodeGen -lclangParse -lclangSema \
 -lclangStaticAnalyzerFrontend -lclangStaticAnalyzerCheckers \
 -lclangStaticAnalyzerCore \
 -lclangAnalysis -lclangARCMigrate -lclangRewrite \
 -lclangEdit -lclangAST -lclangLex -lclangBasic \
 -lLLVMAsmParser -lLLVMTableGen -lLLVMDebugInfo -lLLVMX86AsmParser -lLLVMX86Disassembler -lLLVMX86CodeGen -lLLVMSelectionDAG -lLLVMAsmPrinter \
 -lLLVMJIT -lLLVMMCDisassembler -lLLVMMCParser -lLLVMInstrumentation -lLLVMInterpreter \
 -lLLVMCodeGen -lLLVMipo -lLLVMVectorize -lLLVMScalarOpts -lLLVMInstCombine -lLLVMLinker -lLLVMTransformUtils -lLLVMipa -lLLVMAnalysis -lLLVMArchive \
 -lLLVMBitReader -lLLVMBitWriter -lLLVMMCJIT -lLLVMRuntimeDyld -lLLVMExecutionEngine -lLLVMTarget -lLLVMMC -lLLVMObject -lLLVMCore -lLLVMSupport \
 -lLLVMX86Desc -lLLVMX86Info -lLLVMX86AsmPrinter -lLLVMX86Utils

all : build_executable compile_to_module-return compile_to_module

#build_executable :
# g++ `llvm-config --cxxflags` $(INCLUDES) -o build_executable build_executable.cpp $(LIBS) -ldl -lpthread #-Wl,-rpath-link /home/CLING/build/Debug+Asserts/lib

#compile_to_module :
# g++ `llvm-config --cxxflags` $(INCLUDES) -o compile_to_module compile_to_module.cpp $(LIBS) -ldl -lpthread

compile_to_module-return :
	g++ `llvm-config --cxxflags` $(INCLUDES) -o compile_to_module-return compile_to_module-return.cpp $(LIBS) -ldl -lpthread