Displaying 20 results from an estimated 1000 matches similar to: "[LLVMdev] LLVM intrinsic for SSE ANDPS instruction"
2009 Dec 08
0
[LLVMdev] LLVM intrinsic for SSE ANDPS instruction
On Dec 8, 2009, at 11:18 AM, Zoltan Varga wrote:
> Hi,
>
> LLVM used to have an llvm.x86.and_ps intrinsic for the ANDPS instruction, but it seems to be gone, and it is a bit hard to
> synthesize it from vector instructions, since 'and' only works on vectors of integer types. Would a patch be accepted which adds this and related instructions back?
No. It won't be.
2009 Dec 08
2
[LLVMdev] LLVM intrinsic for SSE ANDPS instruction
Hi,
The arguments to the 'and' instruction must be integer types or vectors of integer types.
If I have a compiler whose source language has support for andps by having its own intrinsics,
then I would have to generate code to convert the float vector into an int vector before passing
it to llvm's and instruction, then convert the result back.
2009 Dec 08
0
[LLVMdev] LLVM intrinsic for SSE ANDPS instruction
Hi Zoltan,
I think the bitcast operation is rather painless to use. And if you want to be able to execute it on a float vector you could try putting the and operation in a function with inline linkage and that would be all that's needed to convert over and back. BTW, bitcasting is a no-op conversion in actual code.
--Sam Crow
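
The round trip described above (reinterpret the float bits as integers, AND them, reinterpret back) can be sketched in portable C. This is only an illustration of the idea, not the code LLVM generates: the helper names `and_float_bits` and `and_ps4` are made up here, and the sketch assumes a 32-bit IEEE-754 `float`, as on x86.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumption: float is a 32-bit IEEE-754 value, as on x86. */
static_assert(sizeof(float) == sizeof(uint32_t), "float must be 32 bits");

/* Hypothetical helper: bitwise AND of two floats via an integer round trip. */
static float and_float_bits(float a, float b)
{
    uint32_t ia, ib;
    memcpy(&ia, &a, sizeof ia);   /* reinterpret float bits as an integer */
    memcpy(&ib, &b, sizeof ib);
    ia &= ib;                     /* the integer 'and' */
    memcpy(&a, &ia, sizeof a);    /* reinterpret the result as a float */
    return a;
}

/* Hypothetical lane-wise version over four floats, like a <4 x float> operand. */
static void and_ps4(float dst[4], const float a[4], const float b[4])
{
    for (int i = 0; i < 4; ++i)
        dst[i] = and_float_bits(a[i], b[i]);
}
```

A typical use is clearing the sign bit: ANDing each lane with the bit pattern 0x7fffffff yields the absolute value. The `memcpy` calls only reinterpret bits, and compilers commonly lower them to no code at all, which matches the remark that the conversion is a no-op in the generated code.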
>
>From: Zoltan Varga <vargaz at gmail.com>
2010 May 11
2
[LLVMdev] How does SSEDomainFix work?
Hello. This is my 1st post.
I have tried SSE execution domain fixup pass.
But I am not able to see any improvements.
I expect the example below to use MOVDQA, PAND, etc.
(On nehalem, ANDPS is much slower than PAND)
Please tell me if I am doing something wrong.
Thank you.
Takumi
Host: i386-mingw32
Build: trunk at 103373
foo.ll:
define <4 x i32> @foo(<4 x i32> %x,
2010 May 11
0
[LLVMdev] How does SSEDomainFix work?
On May 10, 2010, at 9:07 PM, NAKAMURA Takumi wrote:
> Hello. This is my 1st post.
Welcome!
> I have tried SSE execution domain fixup pass.
> But I am not able to see any improvements.
Did you actually measure runtime, or did you look at assembly?
> I expect the example below to use MOVDQA, PAND, etc.
> (On nehalem, ANDPS is much slower than PAND)
Are you sure? The
2009 May 04
4
[LLVMdev] [PATCH] Add support for accessing the FS segment register on X86
Hi,
Here is an updated version of the patch using address space 257.
Zoltan
On Mon, May 4, 2009 at 11:36 PM, Shantonu Sen <ssen at apple.com> wrote:
> Maybe 257 would be better (or other unused), because of r70197, which gives
> special behavior for <256
>
> Shantonu Sen
> ssen at apple.com
>
> Sent from my Mac Pro
>
>
> On May 4, 2009,
2008 Dec 09
3
[LLVMdev] [PATH] Add sub.ovf/mul.ovf intrinsics
Hi,
Attached is the final version of the patch, adding the requested FIXME.
If this is ok, can somebody check it in?
thanks
Zoltan
On Tue, Dec 9, 2008 at 9:58 PM, Bill Wendling <isanbard at gmail.com> wrote:
> On Tue, Dec 9, 2008 at 6:11 AM, Zoltan Varga <vargaz at gmail.com> wrote:
>> Hi,
>>
2009 Sep 05
4
[LLVMdev] loads from a null address and optimizations
Hi,
I don't intentionally want to induce a trap; the load from null is created by
an llvm optimization pass from code like:
v = null;
.....
v.Call ();
Zoltan
On Sat, Sep 5, 2009 at 11:39 PM, Bill Wendling <isanbard at gmail.com> wrote:
> Hi Zoltan,
>
> We've come across this before where people meant to induce a trap by
> dereferencing a null. It
2009 May 04
3
[LLVMdev] [PATH] Fixes for the amd64 JIT code
Hi,
If this looks ok, could somebody check it in ?
thanks
Zoltan
Evan Cheng-2 wrote:
>
> Looks good. Thanks.
>
> Evan
>
> On May 1, 2009, at 8:40 AM, Zoltan Varga wrote:
>
>> Hi,
>>
>> The attached patch contains the following changes:
>>
>> * X86InstrInfo.cpp: Synchronize a few places with the code
2009 Jun 01
3
[LLVMdev] [PATH] Fix support for .umul.with.overflow on x86 + fix c binding
Hi,
The first patch fixes the implementation of umul.with.overflow on x86
which was throwing a 'Cannot yet select' error.
The second patch fixes the definition of LLVMTypeKind in the C binding by
syncing it with the c++ counterpart.
Please review and commit if it looks ok.
thanks
Zoltan
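
For readers unfamiliar with the intrinsic, here is a rough C analogue of what umul.with.overflow computes: a wrapped product plus an overflow flag. It relies on the GCC/Clang builtin `__builtin_umul_overflow`, so it assumes one of those compilers; the struct and function names are invented for this sketch and are not part of the patch.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Hypothetical struct mirroring the { result, overflow bit } pair. */
struct umul_result {
    unsigned value;   /* product, wrapped modulo 2^N on overflow */
    bool overflow;    /* true if the exact product did not fit */
};

/* Hypothetical wrapper around the GCC/Clang builtin. */
static struct umul_result umul_with_overflow(unsigned a, unsigned b)
{
    struct umul_result r;
    r.overflow = __builtin_umul_overflow(a, b, &r.value);
    return r;
}
```

Calling `umul_with_overflow(UINT_MAX, 2)` sets the overflow flag and leaves the wrapped product in `value`, while small operands such as 6 and 7 return the exact product with the flag clear.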
2009 May 05
1
[LLVMdev] [PATH] Fixes for the amd64 JIT code
Hi,
It looks like the problem was with the RIP relative addressing. The original patch mistakenly
removed the || DispForReloc part because I thought that the RIP relative addressing was done
by the SIB encodings, but it is actually done by the shorter ones.
The attached patch seems to work for me on linux and when simulating darwin
by forcing some variables in X86TargetMachine.cpp to their darwin
2009 Sep 05
3
[LLVMdev] loads from a null address and optimizations
Hi,
Currently, llvm treats the loads from a null address as unreachable code,
i.e.:
load i32* null
is transformed by some optimization pass into
unreachable
This presents problems in JIT compilers like mono which implement null
pointer checks by trapping SIGSEGV signals. It also
looks incorrect since it changes program behavior, which might be undefined
in general, but it is quite
2009 May 05
2
[LLVMdev] [PATH] Fixes for the amd64 JIT code
Hi Zoltan,
The part that determines whether SIB byte is needed caused a lot of
regressions last night (see Geryon-X86-64 etc.). I've reverted it for
now. Please take a look.
Thanks,
Evan
On May 4, 2009, at 3:49 PM, Evan Cheng wrote:
> Committed as revision 70929. Thanks.
>
> Evan
>
> On May 3, 2009, at 8:29 PM, vargaz wrote:
>
>>
>> Hi,
>>
>>
2009 May 04
1
[LLVMdev] [PATCH] Add support for accessing the FS segment register on X86
Hi,
If I'm writing a JIT, and want to access the TLS variables of the app containing the JIT, I can't
use thread_local since that only works for variables declared in LLVM IL and/or managed by
the ExecutionEngine. This patch, by contrast, allows a JIT to generate the TLS accesses itself, if
it knows the tls offset of the variable in question.
Zoltan
On Tue, May 5, 2009 at
2008 Dec 09
1
[LLVMdev] [PATH] Add sub.ovf/mul.ovf intrinsics
Hi,
The add.with.overflow intrinsics don't seem to work with constant arguments, i.e.
changing the call in add-with-overflow.ll to:
%t = call {i32, i1} @llvm.sadd.with.overflow.i32(i32 0, i32 0)
causes the following exception when running the codegen tests:
llc: DAGCombiner.cpp:646:
void<unnamed>::DAGCombiner::Run(llvm::CombineLevel): Assertion
`N->getValueType(0) ==
2009 Sep 14
3
[LLVMdev] merge request for 2.6
Hi,
Would it be possible to merge this commit:
http://llvm.org/viewvc/llvm-project?view=rev&revision=80960
to the llvm 2.6 branch ? Without it, incomplete unwind info is generated for
functions with 0 stack size.
thanks
Zoltan
2009 May 04
0
[LLVMdev] [PATCH] Add support for accessing the FS segment register on X86
Hello,
The preferred way to do TLS is to use the thread_local keyword.
There is x86-64 support for thread_local on ELF; if you need
it for other targets, I recommend looking at adapting it.
Dan
On May 4, 2009, at 2:59 PM, Zoltan Varga wrote:
> Hi,
>
> Here is an updated version of the patch using address space 257.
>
> Zoltan
>
> On Mon, May 4, 2009 at
2008 Dec 09
0
[LLVMdev] [PATH] Add sub.ovf/mul.ovf intrinsics
Applied. Thanks, Zoltan!
-bw
On Tue, Dec 9, 2008 at 1:12 PM, Zoltan Varga <vargaz at gmail.com> wrote:
> Hi,
>
> Attached is the final version of the patch, adding the requested
> FIXME. If this is ok, can
> somebody check it in ?
>
> thanks
>
> Zoltan
>
> On Tue, Dec 9, 2008 at 9:58 PM,
2009 May 05
0
[LLVMdev] [PATH] Fixes for the amd64 JIT code
Hi,
I can't reproduce these failures on my linux machine. The test machine seems to be
running darwin. I suspect that the problem might be with RIP relative addressing, or with
the encoding of R12/R13, but the code seems to handle the latter, since it checks for
ESP/EBP which is the same as R12/R13.
Zoltan
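
The remark that ESP/EBP are "the same as" R12/R13 can be illustrated with a small C sketch. It is based on the general x86-64 encoding rules rather than on the patch itself: only the low three bits of a register number go into the ModRM r/m field, so R12 (12) shares the value 4 with ESP and R13 (13) shares the value 5 with EBP. The enum and function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Hardware register numbers; bit 3 is carried by the REX prefix. */
enum { REG_RSP = 4, REG_RBP = 5, REG_R12 = 12, REG_R13 = 13 };

/* r/m == 100b signals "SIB byte follows", so a base register whose
   low three bits are 4 (RSP, R12) cannot be encoded without a SIB byte. */
static bool base_needs_sib(int reg)
{
    return (reg & 7) == 4;
}

/* With mod == 00b, r/m == 101b means disp32 (RIP-relative in 64-bit mode),
   so a base whose low three bits are 5 (RBP, R13) needs an explicit
   displacement, even if that displacement is zero. */
static bool base_needs_disp(int reg)
{
    return (reg & 7) == 5;
}
```

Under these rules a check written for ESP/EBP on the masked register number also covers R12/R13, which is consistent with what the message describes.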
On Tue, May 5, 2009 at 8:18 PM, Evan Cheng <evan.cheng at
2009 May 01
2
[LLVMdev] [PATH] Fixes for the amd64 JIT code
Hi,
The attached patch contains the following changes:
* X86InstrInfo.cpp: Synchronize a few places with the code in
X86CodeEmitter.cpp
* X86CodeEmitter.cpp: Avoid the longer SIB encoding on amd64 if it is not
needed.
Zoltan