similar to: [LLVMdev] llvm_fcmp_ord and llvm_fcmp_uno and assembly code generation

Displaying 20 results from an estimated 400 matches similar to: "[LLVMdev] llvm_fcmp_ord and llvm_fcmp_uno and assembly code generation"

2007 Oct 22
0
[LLVMdev] llvm_fcmp_ord and llvm_fcmp_uno and assembly code generation
Hi, Can you file a bugzilla on this? Thanks! Evan On Oct 19, 2007, at 3:50 AM, Török Edvin wrote: > Hi, > > The C backend in llc generates code like: > static inline int llvm_fcmp_ord(double X, double Y) { return X == X > && Y == Y; } > static inline int llvm_fcmp_uno(double X, double Y) { return X != X > || Y != Y; } > > First of all it generates a
2017 May 08
2
RFC: Element-atomic memory intrinsics
Hi Sanjoy, Responses inlined… > On May 8, 2017, at 12:49 PM, Sanjoy Das <sanjoy at playingwithpointers.com> wrote: > > Hi Daniel, > > [+CC Mehdi, Vedant for the auto upgrade issue] > > On Mon, May 8, 2017 at 7:54 AM, Daniel Neilson via llvm-dev > <llvm-dev at lists.llvm.org> wrote: >> **Method** >> >> Clearly we are going to have to teach
2001 Aug 20
1
Idletimeout patch, third attempt
Here is my third attempt at the idletimeout patch. I tried to address the points which Marcus Friedl brought up. It is actually bigger than the previous patches, but not as intrusive. It is big because it moves some stuff from serverloop.c to packet.c. - I moved all the logic to packet.c. This means that I also had to move the actual select() call, which used to be in serverloop.c to packet.c.
2017 May 08
3
RFC: Element-atomic memory intrinsics
Greetings all, I am picking up the work that was started in https://reviews.llvm.org/D27133 — adding support for an element-atomic memcpy/memset/memmove to LLVM. I would appreciate suggestions/thoughts/advice/comments on how to best proceed with this work in a way that will be acceptable to the LLVM group. I apologize in advance; this is going to be a long one... **Background** Loads/stores
2016 Mar 14
2
LLVM-3.8.0 libcxx in-tree build fails with cmath error ::signbit has not been declared
cmake -E cmake_progress_report llvm-3.8.0.src_bld_x86_64-rhel6.4-linux-gnu/CMakeFiles In file included from llvm-3.8.0.src/projects/libcxx/include/__hash_table:19:0, from llvm-3.8.0.src/projects/libcxx/src/hash.cpp:10: llvm-3.8.0.src/projects/libcxx/include/cmath:310:9: error: '::signbit' has not been declared using ::signbit; ^
2014 May 30
3
[LLVMdev] lit test suite on Windows always hangs.
I'm using Windows 8.1, and every time I run check-clang, I eventually end up with a bunch of hung processes. Generally this is an instance of clang.exe, a bunch of instances of FileCheck.exe, and occasionally an llc.exe and an opt.exe. Inside, the processes are all hung inside of calls to WriteFile() attempting to write to stdout. I notice some of the tests fail with output indicating that
2014 Oct 24
3
[LLVMdev] IndVar widening in IndVarSimplify causing performance regression on GPU programs
Hi, I noticed a significant performance regression (up to 40%) on some internal CUDA benchmarks (a reduced example presented below). The root cause of this regression seems that IndVarSimpilfy widens induction variables assuming arithmetics on wider integer types are as cheap as those on narrower ones. However, this assumption is wrong at least for the NVPTX64 target. Although the NVPTX64 target
2013 Mar 01
0
[LLVMdev] NVPTX CUDA_ERROR_NO_BINARY_FOR_GPU
Ok, as I said, the most precise way to figure out what's wrong is to emit LLVM IR first (use clang -emit-llvm ...) and check out how it differs from working examples, for instance, nvptx regression tests. ----- Original message ----- > I'm building this with llvm-c, and accessing these intrinsics via calling > the intrinsic as if it were a function. > > class F_SREG<string
2013 Mar 01
4
[LLVMdev] NVPTX CUDA_ERROR_NO_BINARY_FOR_GPU
I'm building this with llvm-c, and accessing these intrinsics via calling the intrinsic as if it were a function. class F_SREG<string OpStr, NVPTXRegClass regclassOut, Intrinsic IntOp> : NVPTXInst<(outs regclassOut:$dst), (ins), OpStr, [(set regclassOut:$dst, (IntOp))]>; def INT_PTX_SREG_TID_X : F_SREG<"mov.u32 \t$dst, %tid.x;",
2013 Mar 01
0
[LLVMdev] NVPTX CUDA_ERROR_NO_BINARY_FOR_GPU
Hi Timothy, I'm not sure what you mean by this working for other intrinsics, but in this case, I think you want the intrinsic name llvm.nvvm.read.ptx.sreg.tid.x. For me, this looks like: %x = call i32 @llvm.nvvm.read.ptx.sreg.tid.x() Pete On Fri, Mar 1, 2013 at 11:51 AM, Timothy Baldridge <tbaldridge at gmail.com> wrote: > I'm building this with llvm-c, and accessing these
2013 Mar 01
1
[LLVMdev] NVPTX CUDA_ERROR_NO_BINARY_FOR_GPU
The identifier INT_PTX_SREG_TID_X is the name of an instruction as the back-end sees it, and has very little to do with the name you should use in your IR. Your best bet is to look at the include/llvm/IR/IntrinsicsNVVM.td file and see the definitions for each intrinsic. Then, the name mapping is just: int_foo_bar -> llvm.foo.bar() int_ prefix becomes llvm., and all underscores turn into
2012 Jul 10
2
[LLVMdev] [NVPTX] CUDA inline PTX asm definitions scoping "{" "}" is broken
Hi, Looks like "{" and "}" are lost when trying to use the combination of Clang and NVPTX, which may result into clash of definitions of the function-scope and asm-scope. Here is an example: > cat test.cu __attribute__((device)) __attribute__((nv_linkonce_odr)) __inline__ int __any(int a) { int result; asm __volatile__ ("{ \n\t" ".reg .pred
2010 Mar 28
0
[LLVMdev] Which floating-point comparison?
On Sun, Mar 28, 2010 at 7:45 AM, Russell Wallace <russell.wallace at gmail.com> wrote: > I notice llvm provides both ordered and unordered variants of > floating-point comparison. Which of these is the right one to use by > default? I suppose the two criteria would be, in order of importance: > > 1. Which is more efficient (more directly maps to typical hardware)? You can
2019 Oct 08
2
PR43374 - when should comparing NaN values raise a floating point exception?
* Sanjay Patel via llvm-dev <llvm-dev at lists.llvm.org> [2019-10-01 09:44:54 -0400]: > Let's change the example to eliminate suspects: > #include <math.h> > int is_nan(float x) { > /* > The following subclauses provide macros that are quiet (non > floating-point exception raising) > versions of the relational operators, and other comparison
2017 Mar 01
2
[Codegen bug in LLVM 3.8?] br following `fcmp une` is present in ll, absent in asm
Hi, We seem to have found a bug in the LLVM 3.8 code generator. We are using MCJIT and have isolated working.ll and broken.ll after middle-end optimizations -- in the block merge128, notice that broken.ll has a fcmp une comparison to zero and a jump based on that branch: merge128: ; preds = %true71, %false72 %_rtB_724 = load %B_repro_T*, %B_repro_T**
2005 Aug 23
0
[LLVMdev] Marking source locations without interfering with optimization?
On Fri, 19 Aug 2005, Michael McCracken wrote: > I've been thinking of adding an instruction, and I'm following the > advice in the docs to consult the list before doing something rash. Always a good idea! :) Instead of adding an instruction, I'd suggest adding an intrinsic. You can mark intrinsics as not reading/writing to memory (see lib/Analysis/BasicAliasAnalysis.cpp for
2013 May 20
2
[LLVMdev] VCOMISS instruction in X86
Hi, I'm looking at scalar and packed instructions in X86. The instruction VCOMISS is scalar. May I remove SSEPackedSingle/SSEPackedDouble domain from it? defm VUCOMISS : sse12_ord_cmp<0x2E, FR32, X86cmp, f32, f32mem, loadf32, "ucomiss", SSEPackedSingle>, TB, VEX, VEX_LIG; defm VUCOMISD : sse12_ord_cmp<0x2E, FR64, X86cmp, f64,
2012 Jul 10
0
[LLVMdev] [NVPTX] CUDA inline PTX asm definitions scoping "{" "}" is broken
Dmitry, You might be better served by filing this as a bug (http://llvm.org/bugs/). Please include a test case and the steps to reproduce (i.e., what you've provided below). Chad On Jul 10, 2012, at 3:15 PM, Dmitry N. Mikushin wrote: > Hi, > > Looks like "{" and "}" are lost when trying to use the combination of Clang and NVPTX, which may result into clash of
2013 Mar 01
2
[LLVMdev] NVPTX CUDA_ERROR_NO_BINARY_FOR_GPU
I've written a compiler that outputs PTX code, the result seems fairly reasonable, but I'm not sure the intrinsics are getting compiled correctly. In addition, when I try load the module using CUDA, I get an error: CUDA_ERROR_NO_BINARY_FOR_GPU. I'm running this on a 2012 MBP with a 640M GPU. PTX Code (for a mandelbrot calculation): // // Generated by LLVM NVPTX Back-End //
2012 Jul 10
1
[LLVMdev] [NVPTX] CUDA inline PTX asm definitions scoping "{" "}" is broken
Yes, sure, good idea, because might be also Clang-related. http://llvm.org/bugs/show_bug.cgi?id=13322 2012/7/11 Chad Rosier <mcrosier at apple.com> > Dmitry, > You might be better served by filing this as a bug (http://llvm.org/bugs/). > Please include a test case and the steps to reproduce (i.e., what you've > provided below). > > Chad > > On Jul 10, 2012,