similar to: [LLVMdev] Floating point compare instruction selection

Displaying 20 results from an estimated 1000 matches similar to: "[LLVMdev] Floating point compare instruction selection"

2005 Mar 16
0
[LLVMdev] Floating point compare instruction selection
On Wed, 16 Mar 2005, Morten Ofstad wrote: > Hello, > > I didn't get any reply to my previous mail about adding floating point > intrinsics to the X86 pattern instruction selector... And I could really need > some help. Sorry about that, it slipped through the cracks. :( > Anyway, I think my confusion was caused partly by an already > existing bug in the instruction
2005 Mar 11
0
[LLVMdev] FP Intrinsics
Update: I have been working on this all day, and I finally got it working more or less with the pattern instruction selector... However, the generated code is not very good, and I haven't implemented the expand to calls if the target does not support these FP instructions. As an example, in the following function the sub abs and compare compiles to 13 instructions! Also it has changed the
2005 Mar 11
5
[LLVMdev] FP Intrinsics
Hello, I am trying to make the FP intrinsics (abs, sin, cos, sqrt) I've added work with the X86ISelPattern, but I'm having some difficulties understanding what needs to be done. I assume I have to add new nodetypes for the FP instructions to SelectionDAGNodes.h, and make nodes for these in SelectionDAGLowering::visitCall when I find the intrinsic... The part I don't quite
2005 Mar 17
1
[LLVMdev] Floating point compare instruction selection
Chris Lattner wrote: > On Wed, 16 Mar 2005, Morten Ofstad wrote: >> The case which emits code for the special case of comparing against >> constant 0.0 does not return after generating it's code, so the normal >> compare is also generated! As far as I can tell it should return right >> after this: >> >> BuildMI(BB, X86::SAHF, 1); >> >>
2005 May 13
1
[LLVMdev] gmake check failures on FreeBSD
FAIL: /usr/home/llvm/obj/../test/Regression/CodeGen/X86/2004-02-22-Casts.llx: Assertion failed: (isNew && "Got into the map somehow?"), function AddLegalizedOperand, file /usr/home/llvm/obj/../lib/CodeGen/SelectionDAG/LegalizeDAG.cpp, line 79. .text .align 16 .globl test1 .type test1, @function test1: fldl 4(%esp) ftst
2013 Jul 19
2
[LLVMdev] SIMD instructions and memory alignment on X86
Hmm, maybe sse isn't being enabled so its falling back to emulating sqrt? On Thu, Jul 18, 2013 at 10:45 PM, Peter Newman <peter at uformia.com> wrote: > In the disassembly, I'm seeing three cases of > call 76719BA1 > > I am assuming this is the sqrt function as this is the only function > called in the LLVM IR. > > The code at 76719BA1 is: > >
2013 Jul 19
2
[LLVMdev] SIMD instructions and memory alignment on X86
That should map directly to sqrtpd which can't modify ecx. On Thu, Jul 18, 2013 at 10:27 PM, Peter Newman <peter at uformia.com> wrote: > Sorry, that should have been llvm.x86.sse2.sqrt.pd > > > On 19/07/2013 3:25 PM, Craig Topper wrote: > > What is "frep.x86.sse2.sqrt.pd". I'm only familiar with things prefixed > with "llvm.x86". >
2013 Jul 19
4
[LLVMdev] SIMD instructions and memory alignment on X86
Hmm, I'm not able to get those .ll files to compile if I disable SSE and I end up with SSE instructions(including sqrtpd) if I don't disable it. On Thu, Jul 18, 2013 at 10:53 PM, Peter Newman <peter at uformia.com> wrote: > Is there something specifically required to enable SSE? If it's not > detected as available (based from the target triple?) then I don't think
2013 Jul 19
0
[LLVMdev] SIMD instructions and memory alignment on X86
In the disassembly, I'm seeing three cases of call 76719BA1 I am assuming this is the sqrt function as this is the only function called in the LLVM IR. The code at 76719BA1 is: 76719BA1 push ebp 76719BA2 mov ebp,esp 76719BA4 sub esp,20h 76719BA7 and esp,0FFFFFFF0h 76719BAA fld st(0) 76719BAC fst dword ptr [esp+18h] 76719BB0 fistp
2013 Jul 19
3
[LLVMdev] fptoui calling a function that modifies ECX
Try adding ECX to the Defs of this part of lib/Target/X86/X86InstrCompiler.td like I've done below. I don't have a Windows machine to test myself. let Defs = [EAX, EDX, ECX, EFLAGS], FPForm = SpecialFP in { def WIN_FTOL_32 : I<0, Pseudo, (outs), (ins RFP32:$src), "# win32 fptoui", [(X86WinFTOL RFP32:$src)]>,
2013 Jul 19
0
[LLVMdev] SIMD instructions and memory alignment on X86
Is there something specifically required to enable SSE? If it's not detected as available (based from the target triple?) then I don't think we enable it specifically. Also it seems that it should handle converting to/from the vector types, although I can see it getting confused about needing to do that if it thinks SSE isn't available at all. On 19/07/2013 3:47 PM, Craig Topper
2013 Jul 19
2
[LLVMdev] fptoui calling a function that modifies ECX
I don't think that's going to work. On Fri, Jul 19, 2013 at 12:24 AM, Peter Newman <peter at uformia.com> wrote: > Thank you, I'm trying this now. > > > On 19/07/2013 5:23 PM, Craig Topper wrote: > > Try adding ECX to the Defs of this part of > lib/Target/X86/X86InstrCompiler.td like I've done below. I don't have a > Windows machine to test
2013 Jul 19
2
[LLVMdev] fptoui calling a function that modifies ECX
Here's my attempt at a fix. Adding Jakob to make sure I did this right. On Fri, Jul 19, 2013 at 2:34 AM, Peter Newman <peter at uformia.com> wrote: > That does appear to have worked. All my tests are passing now. > > I'll hand this out to our other devs & testers and make sure it's working > for them as well (not just on my machine). > > Thank you,
2013 Jul 19
0
[LLVMdev] fptoui calling a function that modifies ECX
Oh, excellent point, I agree. My bad. Now that I'm not assuming those are the sqrt, I see the sqrtpd's in the output. Also there are three fptoui's and there are 3 call instances. (Changing subject line again.) Now it looks like it's bug #13862 On 19/07/2013 4:51 PM, Craig Topper wrote: > I think those calls correspond to this > > %110 = fptoui double %109 to i32
2013 Jul 19
0
[LLVMdev] fptoui calling a function that modifies ECX
Thank you, I'm trying this now. On 19/07/2013 5:23 PM, Craig Topper wrote: > Try adding ECX to the Defs of this part of > lib/Target/X86/X86InstrCompiler.td like I've done below. I don't have > a Windows machine to test myself. > > let Defs = [EAX, EDX, ECX, EFLAGS], FPForm = SpecialFP in { > def WIN_FTOL_32 : I<0, Pseudo, (outs), (ins RFP32:$src), >
2013 Jul 19
0
[LLVMdev] fptoui calling a function that modifies ECX
That does appear to have worked. All my tests are passing now. I'll hand this out to our other devs & testers and make sure it's working for them as well (not just on my machine). Thank you, again. -- Peter N On 19/07/2013 5:45 PM, Craig Topper wrote: > I don't think that's going to work. > > > On Fri, Jul 19, 2013 at 12:24 AM, Peter Newman <peter at
2013 Jul 20
0
[LLVMdev] fptoui calling a function that modifies ECX
I've applied this and the test cases I have here continue to work, so it looks good to me. I've ran into another (seemingly unrelated) issue which I'll describe in a separate email to the dev list. -- Peter N On 20/07/2013 5:30 AM, Craig Topper wrote: > Here's my attempt at a fix. Adding Jakob to make sure I did this right. > > > On Fri, Jul 19, 2013 at 2:34 AM,
2012 Apr 04
4
[LLVMdev] Disabling x87 instructions for a sub-target
Hello there, I recently started working on the LLVM backend for a target that doesn't support x87 instructions. Currently, I am in the process of completely disabling some x87 instructions such as fcomi, fcompi,... for a specific sub-target. I also do not have SSE enabled for my sub-target, and llvm resorts to fcomi* instructions for FP compare instructions. Is there a way to bypass the
2015 Feb 13
2
[LLVMdev] trunk's optimizer generates slower code than 3.5
I submitted the problem report to clang's bugzilla but no one seems to care so I have to send it to the mailing list. clang 3.7 svn (trunk 229055 as the time I was to report this problem) generates slower code than 3.5 (Apple LLVM version 6.0 (clang-600.0.56) (based on LLVM 3.5svn)) for the following code. It is a "8 queens puzzle" solver written as an educational example. As
2015 Feb 14
2
[LLVMdev] trunk's optimizer generates slower code than 3.5
The regressions in the performance of generated code, introduced by the llvm 3.6 release, don't seem to be limited to this 8 queens puzzle" solver test case. See... http://www.phoronix.com/scan.php?page=article&item=llvm-clang-3.5-3.6-rc1&num=1 where a bit hit in the performance of the Sparse Matrix Multiply test of the SciMark v2.0 benchmark was observed as well as others.