Displaying 9 results from an estimated 9 matches for "cmovbe".
2004 Sep 10
2
An assembly optimization and fix
...rlq mm3, 32 ; mm3 = 0:total_error_1
- movd ebx, mm0 ; ebx = total_error_0
movd ecx, mm3 ; ecx = total_error_1
- emms
- mov eax, ebx ; eax = total_error_0
- cmp ecx, ebx
+
+ xor ebx, ebx
+ xor ebp, ebp
+ inc ebx
+ cmp ecx, eax
cmovb eax, ecx ; eax = min(total_error_0, total_error_1)
+ cmovbe ebp, ebx
+ inc ebx
cmp edx, eax
cmovb eax, edx ; eax = min(total_error_0, total_error_1, total_error_2)
+ cmovbe ebp, ebx
+ inc ebx
cmp esi, eax
cmovb eax, esi ; eax = min(total_error_0, total_error_1, total_error_2, total_error_3)
+ cmovbe ebp, ebx
+ inc ebx
cmp edi, eax
cmovb eax,...
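The diff above keeps a running minimum in eax and, on the added lines, an index in ebp, with one cmp feeding both conditional moves. A minimal C sketch of that pattern follows. The function name and the array form are assumptions made for illustration and do not come from the patch. Because the index is selected on "below or equal" while the value is selected on "below", a tie goes to the later index.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not taken from the patch: returns the smallest value
 * in err[0..n-1] and stores its index in *idx_out. On ties the LAST index
 * wins, mirroring the cmovb/cmovbe pairing in the diff. */
static uint32_t min_with_index(const uint32_t *err, size_t n, unsigned *idx_out)
{
    assert(n > 0);
    uint32_t best = err[0];   /* plays the role of eax */
    unsigned best_idx = 0;    /* plays the role of ebp */

    for (size_t i = 1; i < n; i++) {
        uint32_t e = err[i];
        /* One comparison against the old minimum feeds both selects,
         * just as one cmp feeds both conditional moves. */
        int below = e < best;
        int below_or_equal = e <= best;
        best = below ? e : best;                             /* cmovb  */
        best_idx = below_or_equal ? (unsigned)i : best_idx;  /* cmovbe */
    }
    *idx_out = best_idx;
    return best;
}
```

A compiler may or may not turn the two ternaries into conditional moves; the sketch only shows the selection logic, not the generated code.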
2015 Jan 22
2
[LLVMdev] X86TargetLowering::LowerToBT
> On Jan 22, 2015, at 1:22 PM, Fiona Glaser <fglaser at apple.com> wrote:
>
> According to Agner’s docs, many CPUs have slower BT than TEST; Haswell has only 0.5 inverse throughput as opposed to 0.25, Atom has 1 instead of 0.5, and Silvermont can’t even dual-issue BT (it locks both ALUs). So while BT does seem to have a shorter instruction encoding than TEST for TEST reg, imm32 where
2017 Apr 19
3
[cfe-dev] FE_INEXACT being set for an exact conversion from float to unsigned long long
Changing the list from cfe-dev to llvm-dev
> On 20 Apr 2017, at 4:52 AM, Michael Clark <michaeljclark at mac.com> wrote:
>
> I’m getting close. I think it may be an issue with an individual intrinsic. I’m looking for the X86 lowering of Instruction::FPToUI.
>
> I found a comment around the rationale for using a conditional move versus a branch. I believe the predicate logic
2015 Jan 22
2
[LLVMdev] X86TargetLowering::LowerToBT
...<chris.sears at gmail.com> wrote:
>
> I think the partial update issue isn't really a valid concern (Agner Fog, p. 142). I don't think LLVM is going to emit this fragment.
>
> ; Example 10.7. Partial register access
> bt eax,2 ; modifies carry flag but not zero flag
> cmovbe eax,ebx ; reads both carry flag and zero flag
>
> In cases like this, you may consider whether it is a programming error or a deliberate testing of two different conditions with a single instruction.
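The quoted fragment depends on which flags each instruction touches: bt writes the carry flag and leaves the zero flag alone, while cmovbe moves when either flag is set. A small C model of just those two rules may make the hazard easier to see. The struct and function names are hypothetical, and the model ignores every other flag.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical two-flag model; real EFLAGS has more state than this. */
struct flags {
    bool cf;  /* carry flag */
    bool zf;  /* zero flag  */
};

/* bt value, bit: copies the selected bit into CF and leaves ZF as it was. */
static struct flags model_bt(struct flags in, uint32_t value, unsigned bit)
{
    struct flags out = in;
    out.cf = ((value >> (bit & 31u)) & 1u) != 0;
    return out;
}

/* cmovbe dst, src: moves when CF or ZF is set ("below or equal"). */
static uint32_t model_cmovbe(struct flags f, uint32_t dst, uint32_t src)
{
    return (f.cf || f.zf) ? src : dst;
}
```

In this model, a zero flag left set by some earlier instruction makes cmovbe move even when the tested bit is clear, which is the situation the quoted text asks the programmer to check for.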
2012 Mar 02
3
[LLVMdev] Access Violation using ExecutionEngine on 64-bit Windows 8 Consumer Preview
Hi everyone!
I've faced a strange problem after updating to Windows 8 Consumer
Preview recently. It seems that LLVM inserts 4 calls to the same
function at the start of generated code. The function's disassembly
(taken from nearby computer with Windows 7) is:
00000000773A0DD0 sub rsp,10h
00000000773A0DD4 mov qword ptr [rsp],r10
00000000773A0DD8 mov qword ptr
2012 Feb 14
0
[LLVMdev] Strange behaviour with x86-64 windows, bad call instruction address
Hi all,
Some background: I'm working on a project to replace a custom VM with various components of llvm. We have everything running just peachy keen with one recent exception: one of our executables crashes when attempting to run a JIT'd function. We have llvm building and running on 64 bit Windows and Linux, using Visual Studio 2008 on Windows and gcc on Linux, and we have the llvm
2017 Apr 20
4
[cfe-dev] FE_INEXACT being set for an exact conversion from float to unsigned long long
> This seems like it was done for perf reasons (mispredict). A conditional-to-cmov transformation should keep
> from introducing additional observable side-effects, and it's clear that whatever did this did not account
> for floating-point exceptions.
That’s a very reasonable statement, but I’m not sure it corresponds to the way we have typically approached this sort of problem.
In
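For context on the branch-versus-cmov point in this thread, here is a hedged C sketch of the usual way to convert a float to a 64-bit unsigned value when only a signed conversion is available. It is an illustration of the general technique, not LLVM's actual lowering, and the function name is made up. The branch ensures that only the conversion valid for the input's range is evaluated, whereas a select-style form would evaluate both conversions, including the one that is out of range for the input.

```c
#include <stdint.h>

/* Hypothetical helper: float -> uint64_t using only signed conversions.
 * Assumes 0 <= x < 2^64; other inputs are outside the scope of the sketch. */
static uint64_t f32_to_u64_branchy(float x)
{
    const float two63 = 0x1p63f;  /* 2^63, exactly representable as a float */

    if (x < two63) {
        /* Small range: the signed conversion is already correct. */
        return (uint64_t)(int64_t)x;
    }
    /* Large range: shift into signed range, convert, then restore the
     * top bit. Only this path is evaluated for large inputs. */
    return (uint64_t)(int64_t)(x - two63) ^ UINT64_C(0x8000000000000000);
}
```

For inputs at or above 2^63 the subtraction is exact in single precision, so this path does not round.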
2012 Feb 21
0
[LLVMdev] Strange behaviour with x86-64 windows, bad call instruction address
Hi all, me again!
Well, after much hacking of code and thinking and frustration, I finally figured out what I was doing wrong. It turns out my initial attempts at using various gflags settings were causing VirtualAlloc to return GIANT addresses. In particular, the Application Verifier flag (-vrf) seems to cause VirtualAlloc to do what looks like top-down allocations and then llvm happily
2017 Apr 21
2
[cfe-dev] FE_INEXACT being set for an exact conversion from float to unsigned long long
I think it’s generally true that whenever branches can reliably be predicted, branching is faster than a cmov that involves speculative execution, and I would guess that your assessment regarding looping on input values is probably correct.
I believe the code that actually creates most of the transformation you’re interested in here is in SelectionDAGLegalize::ExpandNode() in LegalizeDAG.cpp. The