Displaying 20 results from an estimated 20000 matches similar to: "[LLVMdev] Adding scheduling constraints to intrinsics"
2014 Sep 30
2
[LLVMdev] Behaviour of NVPTX intrinsic
The actual reason I wanted such an intrinsic is to solve a problem
similar to this one on x86. Say I wanted to read the "mxcsr" register (which
is the status register for SSE instructions) after a particular
instruction, then I need a kind of barrier intrinsic which will not allow
the arithmetic instructions to move around it. Or else I will be reading
the status of some other
2006 Apr 19
0
[LLVMdev] floating point exception and SSE2 instructions
On Wed, 19 Apr 2006 19:28:34 +0100
Simon Burton <simon at arrowtheory.com> wrote:
>
> From what I remember, this is a bug in debian libc:
> some floating point flags are set incorrectly causing SIGFPE.
> Can't find the bug report ATM.
Oh, it just showed up on numpy-discussion:
http://sources.redhat.com/bugzilla/show_bug.cgi?id=10
"""
#include
2011 Jul 09
1
[LLVMdev] LLVM floating point rounding modes
Hi,
I am not sure if this is the right mailing list for my question; if not, please refer me to the proper one.
Is there any support for rounding modes in LLVM floating point? I looked in the assembler reference manual, and it doesn't seem so. I am thinking about choosing LLVM as one of the backends for my programming language Babel-17 (www.babel-17.com). Babel-17 features interval
2009 Sep 23
1
High CPU usage
Hi Jeff,
Hi Jean-Marc,
I first modified the FPU control word to raise an exception whenever a denormal is used. Then I used the debugger to locate the exceptions and added VERY_SMALLs where they seem to fit well.
Although I got CPU usage as low as 10%, I seriously lack knowledge of how things work inside speex. So just changing some code is not the best idea for me.
My second attempt was to
2013 Feb 21
2
[LLVMdev] Generate scalar SSE instructions instead of packed instructions
On Thu, Feb 21, 2013 at 12:14 PM, Nadav Rotem <nrotem at apple.com> wrote:
> You can change the input LLVM-IR.
>
> On Feb 21, 2013, at 7:16 AM, "Nowicki, Tyler" <tyler.nowicki at intel.com>
> wrote:
>
> Hi,
>
> I am interested in evaluating the performance of packed vs scalar
> double-precision floating point instructions on
2014 Jan 28
2
[LLVMdev] ldmxcsr reordering issue
Hi,
I ran into trouble JIT-compiling x86 code using Intrinsic::x86_sse_ldmxcsr.
The target code must execute some SSE2 instructions with DAZ/FTZ modes enabled and others with DAZ/FTZ disabled.
I'm trying to achieve this by emitting LDMXCSR instructions with the proper flag words.
It turned out, however, that the execution engine sometimes reorders these instructions with computational ones (say with
2012 Dec 13
1
[LLVMdev] failures in test-suite for make TEST=simple
I use the 'make TEST=simple' as a pre-commit test. I think that everybody should run these tests before committing to LLVM.
On Dec 12, 2012, at 5:06 PM, reed kotler <rkotler at mips.com> wrote:
> when I create the report, there are no failures in it. so maybe these are being filtered for known failures.
>
> On 12/12/2012 05:03 PM, reed kotler wrote:
>> The first
2012 Dec 13
2
[LLVMdev] failures in test-suite for make TEST=simple
The first one failed on a diff:
******************** TEST (simple) 'sse.expandfft' FAILED!
********************
Execution Context Diff:
/home/rkotler/llvmpb3/build/projects/test-suite/tools/fpcmp: Compared:
1.139094e-07 and 1.159249e-07
abs. diff = 2.015500e-09 rel.diff = 1.738626e-02
Out of tolerance: rel/abs: 1.600000e-02/0.000000e+00
******************** TEST (simple)
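The figures in the report above can be reproduced with a short tolerance check. This is only a sketch, not the fpcmp source: the function names are hypothetical, and it assumes the relative difference is measured against the second value and that a comparison passes when either the absolute or the relative difference is within its limit. Under those assumptions, 1.139094e-07 and 1.159249e-07 give a relative difference of about 1.74e-02, which exceeds the 1.6e-02 limit shown in the report.

```c
#include <stdbool.h>

/* Absolute value written by hand so the sketch needs no <math.h> or extra link flags. */
double abs_double(double x) { return x < 0.0 ? -x : x; }

/* Absolute difference between the two compared values. */
double abs_diff(double v1, double v2) { return abs_double(v1 - v2); }

/* Relative difference, measured against v2 (ASSUMPTION: chosen because it
 * reproduces the rel.diff figure in the report). Falls back to v1 when v2
 * is zero, and returns 0 when both values are zero. */
double rel_diff(double v1, double v2) {
    if (v2 != 0.0) return abs_double(v1 / v2 - 1.0);
    if (v1 != 0.0) return abs_double(v2 / v1 - 1.0);
    return 0.0;
}

/* ASSUMPTION: a pair passes if it is within EITHER the absolute or the
 * relative tolerance; it fails only when both limits are exceeded. */
bool within_tolerance(double v1, double v2, double rel_tol, double abs_tol) {
    return abs_diff(v1, v2) <= abs_tol || rel_diff(v1, v2) <= rel_tol;
}
```

Calling within_tolerance(1.139094e-07, 1.159249e-07, 1.6e-02, 0.0) returns false under these assumptions, matching the "Out of tolerance" line; loosening the relative limit to 2.0e-02 would make the same pair pass.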
2017 Feb 14
2
Adding FP environment register modeling for constrained FP nodes
Hi Hal,
Thanks for the guidance. I hope you don’t mind that I’m adding LLVMDev to this e-mail thread, as it seems as though it may be of general interest.
I agree that duplicating the FP opcodes should be our goal. I just wasn’t sure that was entirely possible. I’ll try adding implicit defs in the way you’ve suggested, but I’m concerned that there may be code that relies on the TII for that
2012 Dec 13
0
[LLVMdev] failures in test-suite for make TEST=simple
when I create the report, there are no failures in it. so maybe these
are being filtered for known failures.
On 12/12/2012 05:03 PM, reed kotler wrote:
> The first one failed on a diff:
> ******************** TEST (simple) 'sse.expandfft' FAILED!
> ********************
> Execution Context Diff:
> /home/rkotler/llvmpb3/build/projects/test-suite/tools/fpcmp: Compared:
>
2006 Apr 19
2
[LLVMdev] floating point exception and SSE2 instructions
On Thu, 20 Apr 2006, Simon Burton wrote:
>>> From what I remember, this is a bug in debian libc:
>> some floating point flags are set incorrectly causing SIGFPE.
>> Can't find the bug report ATM.
>
> Oh, it just showed up on numpy-discussion:
> http://sources.redhat.com/bugzilla/show_bug.cgi?id=10
>
> """
> #include <fenv.h>
> void
2006 Apr 19
2
[LLVMdev] floating point exception and SSE2 instructions
On Tue, 18 Apr 2006 23:27:39 -0700
Evan Cheng <evan.cheng at apple.com> wrote:
> Hi Simon,
>
> The x86 backend does generate scalar SSE2 instructions. For your
> example, it should emit something like:
Oh, how did you get this ?
[...]
>
> There is nothing here that should cause an exception. Are you using a
> release or cvs?
CVS.
From what I remember,
2013 Dec 05
3
[LLVMdev] X86 - Help on fixing a poor code generation bug
Hi all,
I noticed that the x86 backend tends to emit unnecessary vector insert
instructions immediately after sse scalar fp instructions like
addss/mulss.
For example:
/////////////////////////////////
__m128 foo(__m128 A, __m128 B) {
  return _mm_add_ss(A, B);
}
/////////////////////////////////
produces the sequence:
addss %xmm0, %xmm1
movss %xmm1, %xmm0
which could be easily optimized into
2012 Dec 13
0
[LLVMdev] failures in test-suite for make TEST=simple
I forgot to mention that you can also run "make TEST=simple report" which will generate a nice report.
Do you know why these tests fail? You can step into the test directory and run 'make TEST=simple' from there. It will save you some time.
On Dec 12, 2012, at 4:04 PM, reed kotler <rkotler at mips.com> wrote:
> I'm getting three failures.
>
> TEST-FAIL:
2012 Dec 13
2
[LLVMdev] failures in test-suite for make TEST=simple
I'm getting three failures.
TEST-FAIL: exec
/home/rkotler/llvmpb3/build/projects/test-suite/SingleSource/UnitTests/Vector/SSE/sse.expandfft
TEST-RESULT-exec-time: user 0.3200
TEST-RESULT-exec-real-time: real 0.3172
TEST-FAIL: exec
/home/rkotler/llvmpb3/build/projects/test-suite/SingleSource/UnitTests/Vector/SSE/sse.stepfft
TEST-RESULT-exec-time: user 0.4000
2013 Nov 15
2
[LLVMdev] Limit loop vectorizer to SSE
Yes,
I was just about to send out:
DL->getABITypeAlignment(ScalarDataTy);
The question is:
“… ABI alignment for the target …”
is that
getPrefTypeAlignment
or
getABITypeAlignment
I would have thought the latter.
On Nov 15, 2013, at 4:12 PM, Hal Finkel <hfinkel at anl.gov> wrote:
> ----- Original Message -----
>> From: "Arnold Schwaighofer"
2013 Nov 15
0
[LLVMdev] Limit loop vectorizer to SSE
----- Original Message -----
> From: "Arnold Schwaighofer" <aschwaighofer at apple.com>
> To: "Joshua Klontz" <josh.klontz at gmail.com>
> Cc: "LLVM Dev" <llvmdev at cs.uiuc.edu>
> Sent: Friday, November 15, 2013 4:05:53 PM
> Subject: Re: [LLVMdev] Limit loop vectorizer to SSE
>
>
> Something like:
>
> index
2011 Oct 16
3
[LLVMdev] Enabling Vector-select
Hello everyone,
I wanted to let everybody know that I am going to enable the support for vector-select by default later today.
Details:
Currently the LLVM code-generator only supports 'select' [1] instructions with a boolean condition. Vectorizing compilers, such as the Intel OpenCL Vectorizer and the GCC vectorizer, often use vector-select instructions to implement masks. This change
2013 Nov 19
0
[LLVMdev] Limit loop vectorizer to SSE
On 16/11/2013 7:58 AM, Nadav Rotem wrote:
>
> On Nov 15, 2013, at 12:36 PM, Renato Golin <renato.golin at linaro.org
> <mailto:renato.golin at linaro.org>> wrote:
>
>> On 15 November 2013 20:24, Joshua Klontz <josh.klontz at gmail.com
>> <mailto:josh.klontz at gmail.com>> wrote:
>>
>> Agreed, is there a pass that will insert a
2013 Oct 26
0
[LLVMdev] Bug #16941
Hi Dmitry,
Yes, this is a known problem with legalizing vector masks. The type <8 x i1> is legalized to <8 x i16> on SSE, but your operands are legalized to <4 x i32>. Type-legalization is performed per-node and we don’t have a good way to support instructions that mix the mask and operand type. Why does ISPC generate illegal vector types? Does ISPC rely on the LLVM codegen to