thr3ads.net - similar to: "[LLVMdev] Adding scheduling constraints to intrinsics"

Displaying 20 results from an estimated 20000 matches similar to: "[LLVMdev] Adding scheduling constraints to intrinsics"

[LLVMdev] Behaviour of NVPTX intrinsic

2014 Sep 30

The actual reason I wanted such an intrinsic is to solve a problem similar to this one in X86. Say I wanted to read the "mxcsr" register (which is the status register for SSE instructions) after a particular instruction; then I need a kind of barrier intrinsic which will not allow the arithmetic instructions to move around it. Or else I will be reading the status of some other

[LLVMdev] floating point exception and SSE2 instructions

2006 Apr 19

On Wed, 19 Apr 2006 19:28:34 +0100 Simon Burton <simon at arrowtheory.com> wrote: > > From what I remember, this is a bug in debian libc: > some floating point flags are set incorrectly causing SIGFPE. > Can't find the bug report ATM. Oh, it just showed up on numpy-discussion: http://sources.redhat.com/bugzilla/show_bug.cgi?id=10 """ #include

[LLVMdev] LLVM floating point rounding modes

2011 Jul 09

Hi, I am not sure if this is the right mailing list to ask my question; if not, please refer me to the proper one. Is there any support for rounding modes in LLVM floating point? I looked in the assembler reference manual, and it doesn't seem so. I am thinking about choosing LLVM as one of the backends for my programming language Babel-17 (www.babel-17.com). Babel-17 features interval

High CPU usage

2009 Sep 23

Hi Jeff, Hi Jean-Marc, I first modified the FPU control word to raise an exception whenever a denormal is used. Then I used the debugger to locate the exceptions and added VERY_SMALLs where they seem to fit well. Although I got CPU usage as low as 10%, I seriously lack knowledge of how things work inside speex. So just changing some code is not the best idea for me. My second attempt was to

[LLVMdev] Generate scalar SSE instructions instead of packed instructions

2013 Feb 21

On Thu, Feb 21, 2013 at 12:14 PM, Nadav Rotem <nrotem at apple.com> wrote: > You can change the input LLVM-IR. > > On Feb 21, 2013, at 7:16 AM, "Nowicki, Tyler" <tyler.nowicki at intel.com> > wrote: > > Hi, > > I am interested in evaluating the performance of packed vs scalar > double-precision floating point instructions on

[LLVMdev] ldmxcsr reordering issue

2014 Jan 28

Hi, I ran into trouble JIT-compiling x86 code when using Intrinsic::x86_sse_ldmxcsr. The target code must execute some SSE2 instructions with DAZ/FTZ modes enabled and others with DAZ/FTZ disabled. I'm trying to get this by emitting LDMXCSR instructions with proper flag words. It turned out, however, that the execution engine sometimes reorders these instructions with computational ones (say with

[LLVMdev] failures in test-suite for make TEST=simple

2012 Dec 13

I use the 'make TEST=simple' as a pre-commit test. I think that everybody should run these tests before committing to LLVM. On Dec 12, 2012, at 5:06 PM, reed kotler <rkotler at mips.com> wrote: > when I create the report, there are no failures in it. so maybe these are being filtered for known failures. > > On 12/12/2012 05:03 PM, reed kotler wrote: >> The first

[LLVMdev] failures in test-suite for make TEST=simple

2012 Dec 13

The first one failed on a diff: ******************** TEST (simple) 'sse.expandfft' FAILED! ******************** Execution Context Diff: /home/rkotler/llvmpb3/build/projects/test-suite/tools/fpcmp: Compared: 1.139094e-07 and 1.159249e-07 abs. diff = 2.015500e-09 rel.diff = 1.738626e-02 Out of tolerance: rel/abs: 1.600000e-02/0.000000e+00 ******************** TEST (simple)
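
The numbers in this excerpt can be checked by hand. The sketch below is not the fpcmp source; it assumes the relative difference is computed as |v1/v2 - 1| and that a pair fails only when it exceeds both the absolute and the relative tolerance. The function names are hypothetical.

```python
def relative_difference(v1: float, v2: float) -> float:
    """Assumed definition: |v1/v2 - 1|, or |v2/v1 - 1| when v2 is zero."""
    if v2 != 0.0:
        return abs(v1 / v2 - 1.0)
    if v1 != 0.0:
        return abs(v2 / v1 - 1.0)
    return 0.0  # both values are zero


def out_of_tolerance(v1: float, v2: float, rel_tol: float, abs_tol: float) -> bool:
    """True when the pair passes neither the absolute nor the relative check."""
    if v1 == v2:
        return False
    if abs(v1 - v2) <= abs_tol:
        return False  # within the absolute tolerance
    return relative_difference(v1, v2) > rel_tol


if __name__ == "__main__":
    # Values copied from the excerpt above.
    v1, v2 = 1.139094e-07, 1.159249e-07
    print("abs diff:", abs(v1 - v2))
    print("rel diff:", relative_difference(v1, v2))
    print("out of tolerance:", out_of_tolerance(v1, v2, rel_tol=1.6e-2, abs_tol=0.0))
```

Under these assumptions the relative difference comes out near the reported 1.738626e-02, which is above the 1.6e-02 relative tolerance, so the comparison is flagged even though the absolute difference is only about 2e-09.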

Adding FP environment register modeling for constrained FP nodes

2017 Feb 14

Hi Hal, Thanks for the guidance. I hope you don’t mind that I’m adding LLVMDev to this e-mail thread, as it seems as though it may be of general interest. I agree that duplicating the FP opcodes should be our goal. I just wasn’t sure that was entirely possible. I’ll try adding implicit defs in the way you’ve suggested, but I’m concerned that there may be code that relies on the TII for that

[LLVMdev] failures in test-suite for make TEST=simple

2012 Dec 13

when I create the report, there are no failures in it. so maybe these are being filtered for known failures. On 12/12/2012 05:03 PM, reed kotler wrote: > The first one failed on a diff: > ******************** TEST (simple) 'sse.expandfft' FAILED! > ******************** > Execution Context Diff: > /home/rkotler/llvmpb3/build/projects/test-suite/tools/fpcmp: Compared: >

[LLVMdev] floating point exception and SSE2 instructions

2006 Apr 19

On Thu, 20 Apr 2006, Simon Burton wrote: >>> From what I remember, this is a bug in debian libc: >> some floating point flags are set incorrectly causing SIGFPE. >> Can't find the bug report ATM. > > Oh, it just showed up on numpy-discussion: > http://sources.redhat.com/bugzilla/show_bug.cgi?id=10 > > """ > #include <fenv.h> > void

[LLVMdev] floating point exception and SSE2 instructions

2006 Apr 19

On Tue, 18 Apr 2006 23:27:39 -0700 Evan Cheng <evan.cheng at apple.com> wrote: > Hi Simon, > > The x86 backend does generate scalar SSE2 instructions. For your > example, it should emit something like: Oh, how did you get this? [...] > > There is nothing here that should cause an exception. Are you using a > release or cvs? CVS. From what I remember,

[LLVMdev] X86 - Help on fixing a poor code generation bug

2013 Dec 05

Hi all, I noticed that the x86 backend tends to emit unnecessary vector insert instructions immediately after SSE scalar fp instructions like addss/mulss. For example: ///////////////////////////////// __m128 foo(__m128 A, __m128 B) { return _mm_add_ss(A, B); } ///////////////////////////////// produces the sequence: addss %xmm0, %xmm1 movss %xmm1, %xmm0 which could be easily optimized into

[LLVMdev] failures in test-suite for make TEST=simple

2012 Dec 13

I forgot to mention that you can also run "make TEST=simple report", which will generate a nice report. Do you know why these tests fail? You can step into the test directory and run 'make TEST=simple' from there. It will save you some time. On Dec 12, 2012, at 4:04 PM, reed kotler <rkotler at mips.com> wrote: > I'm getting three failures. > > TEST-FAIL:

[LLVMdev] failures in test-suite for make TEST=simple

2012 Dec 13

I'm getting three failures. TEST-FAIL: exec /home/rkotler/llvmpb3/build/projects/test-suite/SingleSource/UnitTests/Vector/SSE/sse.expandfft TEST-RESULT-exec-time: user 0.3200 TEST-RESULT-exec-real-time: real 0.3172 TEST-FAIL: exec /home/rkotler/llvmpb3/build/projects/test-suite/SingleSource/UnitTests/Vector/SSE/sse.stepfft TEST-RESULT-exec-time: user 0.4000

[LLVMdev] Limit loop vectorizer to SSE

2013 Nov 15

Yes, I was just about to send out: DL->getABITypeAlignment(ScalarDataTy); The question is: “… ABI alignment for the target …”: is that getPrefTypeAlignment or getABITypeAlignment? I would have thought the latter. On Nov 15, 2013, at 4:12 PM, Hal Finkel <hfinkel at anl.gov> wrote: > ----- Original Message ----- >> From: "Arnold Schwaighofer"

[LLVMdev] Limit loop vectorizer to SSE

2013 Nov 15

----- Original Message ----- > From: "Arnold Schwaighofer" <aschwaighofer at apple.com> > To: "Joshua Klontz" <josh.klontz at gmail.com> > Cc: "LLVM Dev" <llvmdev at cs.uiuc.edu> > Sent: Friday, November 15, 2013 4:05:53 PM > Subject: Re: [LLVMdev] Limit loop vectorizer to SSE > > > Something like: > > index

[LLVMdev] Enabling Vector-select

2011 Oct 16

Hello everyone, I wanted to let everybody know that I am going to enable the support for vector-select by default later today. Details: Currently the LLVM code-generator only supports 'select' [1] instructions with a boolean condition. Vectorizing compilers, such as the Intel OpenCL Vectorizer and the GCC vectorizer, often use vector-select instructions to implement masks. This change

[LLVMdev] Limit loop vectorizer to SSE

2013 Nov 19

On 16/11/2013 7:58 AM, Nadav Rotem wrote: > > On Nov 15, 2013, at 12:36 PM, Renato Golin <renato.golin at linaro.org> wrote: > >> On 15 November 2013 20:24, Joshua Klontz <josh.klontz at gmail.com> wrote: >> >> Agreed, is there a pass that will insert a

[LLVMdev] Bug #16941

2013 Oct 26

Hi Dmitry, Yes, this is a known problem with legalizing vector masks. The type <8 x i1> is legalized to <8 x i16> on SSE, but your operands are legalized to <4 x i32>. Type-legalization is performed per-node and we don’t have a good way to support instructions that mix the mask and operand type. Why does ISPC generate illegal vector types? Does ISPC rely on the LLVM codegen to
