Displaying 20 results from an estimated 1000 matches similar to: "[LLVMdev] Making use of SSE intrinsics"
2008 May 20
0
[LLVMdev] Making use of SSE intrinsics
On Tue, May 20, 2008 at 5:03 AM, Nicolas Capens <nicolas at capens.net> wrote:
> LoadInst *x = new LoadInst(ptr_x, "", false, basicBlock);
>
> // y = rcpps(x) // FIXME
> StoreInst *storeResult = new StoreInst(y, ptr_y, false, basicBlock);
Using an IRBuilder, something like the following (uncompiled, but it's
at least approximately right):
Value* x =
2008 May 22
4
[LLVMdev] SSE intrinsic alignment bug?
Hi all,
I think I might have found a potential bug when using SSE intrinsics and
unaligned memory. Here's the code to reproduce it:
#include "llvm/Module.h"
#include "llvm/Intrinsics.h"
#include "llvm/Instructions.h"
#include "llvm/ModuleProvider.h"
#include "llvm/ExecutionEngine/JIT.h"
#include
2008 May 08
3
[LLVMdev] Vector code
Hi Nicolas (at least, I suspect your signing of your mail with "Anton" was not
intentional :-p),
> I assume that's the same as the online demo's "Show LLVM C++ API code"
> option (http://llvm.org/demo/)? I've tried that with a structure containing
> four floating-point components but it also appears to add them individually
> using extract/insert. Maybe
2008 May 08
0
[LLVMdev] Vector code
Hi Matthijs,
Yes, I've turned off the link-time optimizations (otherwise it just
propagates my constant vectors and immediately prints the result). :-)
Here's essentially what I try to generate:
void add(float z[4], float x[4], float y[4])
{
z[0] = x[0] + y[0];
z[1] = x[1] + y[1];
z[2] = x[2] + y[2];
z[3] = x[3] + y[3];
}
And here's part of the output from the online
2008 May 08
2
[LLVMdev] Vector code
LLVM does not automatically vectorize your scalar code (at least for
now). You have to write gcc generic vector code or use vector builtins.
Evan
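
A minimal sketch of what "gcc generic vector code" could look like for the
four-float add discussed in this thread. It assumes a GCC- or Clang-compatible
compiler that supports the vector_size attribute; the float4 typedef is a
name chosen here for illustration, not something from the thread, and the
arrays are copied with memcpy so that no alignment of x, y or z is assumed.

```cpp
#include <cstring>

// GCC/Clang generic vector extension: four packed floats (16 bytes).
// "float4" is an illustrative name, not part of any LLVM or GCC API.
typedef float float4 __attribute__((vector_size(16)));

// Same contract as the scalar add() above: z[i] = x[i] + y[i] for i in 0..3.
void add(float z[4], const float x[4], const float y[4])
{
    float4 vx, vy;
    // memcpy keeps the loads valid even if x and y are not 16-byte aligned.
    std::memcpy(&vx, x, sizeof vx);
    std::memcpy(&vy, y, sizeof vy);

    // One element-wise vector addition instead of four scalar ones.
    float4 vz = vx + vy;

    std::memcpy(z, &vz, sizeof vz);
}
```

Calling add(z, x, y) with x = {1, 2, 3, 4} and y = {10, 20, 30, 40} fills z
with the four element-wise sums. Whether the compiler lowers this to a single
packed add depends on the target and optimization level.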
On May 8, 2008, at 1:46 PM, Nicolas Capens wrote:
> Hi Matthijs,
>
> Yes, I've turned off the link-time optimizations (otherwise it just
> propagates my constant vectors and immediately prints the result). :-)
>
> Here's
2008 May 23
2
[LLVMdev] SSE intrinsic alignment bug?
Yep, I'm fixing it (and others) now. Good catch.
Evan
On May 22, 2008, at 4:59 PM, Dan Gohman wrote:
>
> On May 22, 2008, at 4:24 PM, Nicolas Capens wrote:
>>
>>
>> Since I’m fairly new to LLVM I’m not entirely sure if this is really
>> a bug or something I’m not doing correctly, or whether it’s already
>> being addressed. The following thread appears to
2008 May 22
2
[LLVMdev] SSE intrinsic alignment bug?
The intent here is that "in" and "out" are always aligned, by forcing
the stack pointer in the function that defines them to be aligned. On
some targets (darwin) the stack pointer is always 16-byte aligned; on
other targets there should be code in the function prologue to force
it to be aligned.
On May 22, 2008, at 4:36 PM, Nicolas Capens wrote:
> Small typo, for
2008 May 22
0
[LLVMdev] SSE intrinsic alignment bug?
On May 22, 2008, at 4:24 PM, Nicolas Capens wrote:
>
>
> Since I’m fairly new to LLVM I’m not entirely sure if this is really
> a bug or something I’m not doing correctly, or whether it’s already
> being addressed. The following thread appears to talk about
> something similar: http://thread.gmane.org/gmane.comp.compilers.llvm.devel/9476/focus=9478
Looking at LLVM's
2008 May 23
0
[LLVMdev] SSE intrinsic alignment bug?
Fixed.
Evan
On May 22, 2008, at 5:02 PM, Evan Cheng wrote:
> Yep, I'm fixing it (and others) now. Good catch.
>
> Evan
>
> On May 22, 2008, at 4:59 PM, Dan Gohman wrote:
>
>>
>> On May 22, 2008, at 4:24 PM, Nicolas Capens wrote:
>>>
>>>
>>> Since I’m fairly new to LLVM I’m not entirely sure if this is really
>>> a bug or
2008 May 22
0
[LLVMdev] SSE intrinsic alignment bug?
Small typo, for the correct assembly code I meant:
mov eax,dword ptr [esp+8]
movups xmm0,xmmword ptr [eax]
rcpps xmm1,xmm0
mov eax,dword ptr [esp+4]
movups xmmword ptr [eax],xmm1
ret
2008 May 09
0
[LLVMdev] Vector code
Hi Evan,
Please note that I'm not trying to compile from C code, I try to generate
functions at run-time directly. I want to keep it target-independent too, so
I can't use intrinsics either.
Cheers,
-Nicolas
-----Original Message-----
From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] On
Behalf Of Evan Cheng
Sent: Thursday, 08 May, 2008 23:31
To: LLVM
2013 Jan 01
0
[LLVMdev] git repository of the tutorial
On Sun, Dec 30, 2012 at 12:19 AM, Journeyer J. Joh
<oosaprogrammer at gmail.com> wrote:
> Hello,
>
> I just applied changes of LLVM 3.2 and it is tested with LLVM 3.2
> downloaded from the LLVM Download Page.
>
> I just worked for master branch only.
> The rest of the other branches need to be changed about this also.
> This will be done as soon as possible.
>
2013 Jan 01
1
[LLVMdev] git repository of the tutorial
Hello Peng Yu,
I found the same error on my Macbook Air.
This was my first attempt on Mac OS X.
Troubleshooting this might take some time.
The only thing I can say now is that klang has been tested successfully on Ubuntu with
- LLVM 3.2 official release on LLVM download page
- LLVM svn latest update
Clang compile produces an error message on MacOS X with the LLVM svn
latest update
I am trying to find the
2016 Jun 15
3
[Proposal][RFC] Strided Memory Access Vectorization
Sorry for the spam. Copy-paste didn't capture the Subject properly. Resending with the correct Subject so that the thread is captured properly.
-----Original Message-----
From: Saito, Hideki
Sent: Wednesday, June 15, 2016 1:39 PM
To: 'llvm-dev at lists.llvm.org' <llvm-dev at lists.llvm.org>
Subject: RE: [llvm-dev] [Proposal][RFC] Strided Memory Access
Ashutosh,
First,
2012 Dec 30
3
[LLVMdev] git repository of the tutorial
Hello,
I just applied changes of LLVM 3.2 and it is tested with LLVM 3.2
downloaded from the LLVM Download Page.
I just worked for master branch only.
The rest of the other branches need to be changed about this also.
This will be done as soon as possible.
(Before I fix this, if you look at the commits for this issue for
master branch, you can easily fix it and test for others also.)
Sorry for
2005 Apr 13
2
easy question: obtaining rw1080.exe
Dear All,
Can anyone please tell me where I can obtain uncompiled binary
installation files for R version 1.8. (i.e. rw1080.exe)?
I can only find the uncompiled source code on CRAN today.
Thank you,
Mary Wisz
msw@dmu.dk
2009 Mar 19
1
[LLVMdev] Implementing MMX and SSE shifts
Hi all,
Recently some great work has been done to implement vector shifts as
described in the language reference, and I'd like to contribute by
attempting to match these operations on x86 to MMX and SSE instructions
whenever possible.
I'm experienced in writing MMX and SSE assembly but I'm unfamiliar with how
LLVM performs instruction selection. So every bit of information to
2016 Jun 18
2
[Proposal][RFC] Strided Memory Access Vectorization
>Vectorizer's output should be as clean as vector code can be so that analyses and optimizers downstream can
>do a great job optimizing.
Guess I should clarify this philosophical position of mine. In terms of vector code optimization that complicates
the output of vectorizer:
If vectorizer is the best place to perform the optimization, it should do so.
This includes the cases like
2005 Jan 25
1
Regex Crashing R (perl = TRUE) (PR#7564)
R-developers,
I've encountered another perl library regex bug that causes a
segmentation fault in my Linux/Windows R session. I reduced the script
to the snippet below. (Apologies if this was fixed with bug 7479, but
this bug seems quite different).
string <- paste(rep("=", 10000), collapse = " ")
crash <- function(x) {
for (i in 1:5) {
x <-
2016 Jun 30
0
[Proposal][RFC] Strided Memory Access Vectorization
One common concern raised for cases where the Loop Vectorizer generates
bigger types than the target supports:
Based on VF, we currently check the cost and generate the expected set of
instruction[s] for the bigger type. This has two challenges: for bigger types
the cost is not always correct, and code generation may not generate efficient
instruction[s].
Probably can depend on the support provided by below RFC by