search for: cmovbe

Displaying 9 results from an estimated 9 matches for "cmovbe".

Did you mean: movbe
2004 Sep 10
2
An assembly optimization and fix
...rlq mm3, 32 ; mm3 = 0:total_error_1 - movd ebx, mm0 ; ebx = total_error_0 movd ecx, mm3 ; ecx = total_error_1 - emms - mov eax, ebx ; eax = total_error_0 - cmp ecx, ebx + + xor ebx, ebx + xor ebp, ebp + inc ebx + cmp ecx, eax cmovb eax, ecx ; eax = min(total_error_0, total_error_1) + cmovbe ebp, ebx + inc ebx cmp edx, eax cmovb eax, edx ; eax = min(total_error_0, total_error_1, total_error_2) + cmovbe ebp, ebx + inc ebx cmp esi, eax cmovb eax, esi ; eax = min(total_error_0, total_error_1, total_error_2, total_error_3) + cmovbe ebp, ebx + inc ebx cmp edi, eax cmovb eax,...
2015 Jan 22
2
[LLVMdev] X86TargetLowering::LowerToBT
> On Jan 22, 2015, at 1:22 PM, Fiona Glaser <fglaser at apple.com> wrote: > > According to Agner’s docs, many CPUs have slower BT than TEST; Haswell has only 0.5 inverse throughput as opposed to 0.25, Atom has 1 instead of 0.5, and Silvermont can’t even dual-issue BT (it locks both ALUs). So while BT does seem have a shorter instruction encoding than TEST for TEST reg, imm32 where
2017 Apr 19
3
[cfe-dev] FE_INEXACT being set for an exact conversion from float to unsigned long long
Changing the list from cfe-dev to llvm-dev > On 20 Apr 2017, at 4:52 AM, Michael Clark <michaeljclark at mac.com> wrote: > > I’m getting close. I think it may be an issue with an individual intrinsic. I’m looking for the X86 lowering of Instruction::FPToUI. > > I found a comment around the rationale for using a conditional move versus a branch. I believe the predicate logic
2015 Jan 22
2
[LLVMdev] X86TargetLowering::LowerToBT
...<chris.sears at gmail.com> wrote: > > I think the partial update issue isn't really valid concern, Agner Fogg, p 142. I don't think LLVM is going to emit this fragment. > > ; Example 10.7. Partial register access > bt eax,2 ; modifies carry flag but not zero flag > cmovbe eax,ebx ; reads both carry flag and zero flag > > In cases like this, you may consider whether it is a programming error or a deliberate testing of two different conditions with a single instruction. > _______________________________________________ > LLVM Developers mailing list >...
2012 Mar 02
3
[LLVMdev] Access Violation using ExecutionEngine on 64-bit Windows 8 Consumer Preview
Hi everyone! I've faced a strange problem after updating to Windows 8 Consumer Preview recently. It seems that LLVM inserts 4 calls to the same function at the start of generated code. The function's disassembly (taken from nearby computer with Windows 7) is: 00000000773A0DD0 sub rsp,10h 00000000773A0DD4 mov qword ptr [rsp],r10 00000000773A0DD8 mov qword ptr
2012 Feb 14
0
[LLVMdev] Strange behaviour with x86-64 windows, bad call instruction address
Hi all, Some background: I'm working on a project to replace a custom VM with various components of llvm. We have everything running just peachy keen with one recent exception, one of our executables crashes when attempting run a JIT'd function. We have llvm building and running on 64 bit Windows and Linux, using Visual Studio 2008 on Windows and gcc on Linux, and we have the llvm
2017 Apr 20
4
[cfe-dev] FE_INEXACT being set for an exact conversion from float to unsigned long long
> This seems like it was done for perf reason (mispredict). Conditional-to-cmov transformation should keep > from introducing additional observable side-effects, and it's clear that whatever did this did not account > for floating point exception. That’s a very reasonable statement, but I’m not sure it corresponds to the way we have typically approached this sort of problem. In
2012 Feb 21
0
[LLVMdev] Strange behaviour with x86-64 windows, bad call instruction address
Hi all, me again! Well, after much hacking of code and thinking and frustration, I finally figured out what I was doing wrong. It turns out my initial attempts at using various gflags settings were causing VirtualAlloc to return GIANT addresses. In particular, the Application Verifier flag ( -vrf ), seems to cause VirtualAlloc to do what looks like top-down allocations and then llvm happily
2017 Apr 21
2
[cfe-dev] FE_INEXACT being set for an exact conversion from float to unsigned long long
I think it’s generally true that whenever branches can reliably be predicted branching is faster than a cmov that involves speculative execution, and I would guess that your assessment regarding looping on input values is probably correct. I believe the code that actually creates most of the transformation you’re interested in here is in SelectionDAGLegalize::ExpandNode() in LegalizeDAG.cpp. The