Displaying 12 results from an estimated 12 matches for "movmsk".
2012 Sep 02
2
[LLVMdev] branch on vector compare?
Hi all, llvm newbie here.
I'm trying to branch based on a vector compare. I've found a slow way (below)
which goes through memory. Is there some idiom I'm missing so that it would use
for instance movmsk for SSE or vcmpgt & cr6 for altivec?
Or do I need to resort to calling the intrinsic directly?
Thanks,
Stephen.
%16 = fcmp ogt <4 x float> %15, %cr
%17 = extractelement <4 x i1> %16, i32 0
%18 = extractelement <4 x i1> %16, i32 1
%19 = extractelement <4 x i1>...
2013 Oct 21
2
[LLVMdev] Bug #16941
Nadav,
You are absolutely right, it's ISPC workload. I've checked SSE4 and it's
also severely affected.
We use intrinsics only for conversion <N x i32> <=> i32, i.e. movmsk.ps.
For the rest we use general LLVM instructions. And I actually would really
like to stick this way. We rely on LLVM's ability to produce efficient code
from general LLVM IR. Relying on intrinsics too much would be a crunch and
a path to nowhere for many reasons :)
What is the reason for thi...
2012 Sep 03
0
[LLVMdev] branch on vector compare?
Hi Stephen,
> Hi all, llvm newbie here.
welcome!
> I'm trying to branch based on a vector compare. I've found a slow way (below)
> which goes through memory. Is there some idiom I'm missing so that it would use
> for instance movmsk for SSE or vcmpgt & cr6 for altivec?
I don't think you are missing anything: LLVM IR has no support for horizontal
operations like or'ing the elements of a vector of boolean together. The code
generators do try to recognize a few idioms and synthesize horizontal
operations from them,...
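
For readers wondering what "calling the intrinsic directly" might look like, here is a minimal sketch in C. It is not taken from the thread. It assumes an x86 target with SSE and falls back to a scalar loop elsewhere; the helper names gt_mask4, any_gt and all_gt are hypothetical. The only library calls used are the standard SSE intrinsics _mm_loadu_ps, _mm_cmpgt_ps and _mm_movemask_ps.

```c
#if defined(__SSE__) || defined(_M_X64)
#include <xmmintrin.h>   /* SSE intrinsics */
#define HAVE_SSE 1
#else
#define HAVE_SSE 0
#endif

/* Hypothetical helper: returns a 4-bit mask where bit i is set
 * when a[i] > b[i] (an ordered compare, so NaN lanes yield 0). */
static int gt_mask4(const float a[4], const float b[4])
{
#if HAVE_SSE
    __m128 va = _mm_loadu_ps(a);          /* lane i holds a[i] */
    __m128 vb = _mm_loadu_ps(b);
    __m128 gt = _mm_cmpgt_ps(va, vb);     /* all-ones in each lane where a > b */
    return _mm_movemask_ps(gt);           /* collect the four lane sign bits */
#else
    /* Scalar fallback (assumption: used only where SSE is unavailable). */
    int mask = 0;
    for (int i = 0; i < 4; ++i)
        if (a[i] > b[i])
            mask |= 1 << i;
    return mask;
#endif
}

/* Branch conditions built on the integer mask: no round trip through memory. */
static int any_gt(const float a[4], const float b[4])
{
    return gt_mask4(a, b) != 0;
}

static int all_gt(const float a[4], const float b[4])
{
    return gt_mask4(a, b) == 0xF;
}
```

With a = {1, 5, 2, 0} and b = {3, 3, 3, 3}, only lane 1 compares greater, so gt_mask4 returns 2, any_gt is true and all_gt is false. The trade-off raised elsewhere in these threads still applies: this ties the code to one target's intrinsics instead of relying on generic IR.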
2013 Oct 21
0
[LLVMdev] Bug #16941
...On Oct 21, 2013, at 11:12 AM, Dmitry Babokin <babokin at gmail.com> wrote:
> Nadav,
>
> You are absolutely right, it's ISPC workload. I've checked SSE4 and it's also severely affected.
>
> We use intrinsics only for conversion <N x i32> <=> i32, i.e. movmsk.ps. For the rest we use general LLVM instructions. And I actually would really like to stick this way. We rely on LLVM's ability to produce efficient code from general LLVM IR. Relying on intrinsics too much would be a crunch and a path to nowhere for many reasons :)
>
> What is the reas...
2012 Sep 03
3
[LLVMdev] branch on vector compare?
> > which goes through memory. Is there some idiom I'm missing so that it would use
> > for instance movmsk for SSE or vcmpgt & cr6 for altivec?
>
> I don't think you are missing anything: LLVM IR has no support for horizontal
> operations like or'ing the elements of a vector of boolean together. The code
> generators do try to recognize a few idioms and synthesize horizontal
>...
2008 Dec 26
0
[LLVMdev] vector compare
On Dec 25, 2008, at 11:02 AM, Eli Friedman wrote:
> On Thu, Dec 25, 2008 at 1:54 AM, Eli Friedman
> <eli.friedman at gmail.com> wrote:
>> On Thu, Dec 25, 2008 at 1:28 AM, Claudio Basile <cbasile at tempo-da.com> wrote:
>>> Hi all,
>>>
>>> is there any way to compare two 128bit values?
>>> I have tried 3 different approaches
2013 Oct 21
2
[LLVMdev] Bug #16941
Nadav,
Could you please have a look at bug #16941 and let us know what you think
about it? It's a performance regression after one of your commits.
Thanks.
Dmitry.
2013 Oct 21
0
[LLVMdev] Bug #16941
Hi Dmitry.
This looks like an ISPC workload. ISPC works around a limitation in SelectionDAG, which does not know how to legalize mask types when both 128-bit and 256-bit registers are available, by expanding the mask to i32s and using intrinsics. Can you please verify that this regression only happens on AVX? Can you change ISPC to use intrinsics?
Thanks
Nadav
2008 Dec 25
2
[LLVMdev] vector compare
On Thu, Dec 25, 2008 at 1:54 AM, Eli Friedman <eli.friedman at gmail.com> wrote:
> On Thu, Dec 25, 2008 at 1:28 AM, Claudio Basile <cbasile at tempo-da.com> wrote:
>> Hi all,
>>
>> is there any way to compare two 128bit values?
>> I have tried 3 different approaches and they all fail with an internal
>> assertion.
>> I'm running llvm 2.4 on
2012 Sep 04
0
[LLVMdev] branch on vector compare?
On 04.09.2012 00:08, Stephen wrote:
>>> which goes through memory. Is there some idiom I'm missing so that it would use
>>> for instance movmsk for SSE or vcmpgt & cr6 for altivec?
>>
>> I don't think you are missing anything: LLVM IR has no support for horizontal
>> operations like or'ing the elements of a vector of boolean together. The code
>> generators do try to recognize a few idioms and synthesiz...
2013 Oct 21
2
[LLVMdev] Bug #16941
...21, 2013, at 11:12 AM, Dmitry Babokin <babokin at gmail.com> wrote:
>
> Nadav,
>
> You are absolutely right, it's ISPC workload. I've checked SSE4 and it's
> also severely affected.
>
> We use intrinsics only for conversion <N x i32> <=> i32, i.e. movmsk.ps.
> For the rest we use general LLVM instructions. And I actually would really
> like to stick this way. We rely on LLVM's ability to produce efficient code
> from general LLVM IR. Relying on intrinsics too much would be a crunch and
> a path to nowhere for many reasons :)
>
>...
2013 Oct 21
0
[LLVMdev] LLVMdev Digest, Vol 112, Issue 56
Nadav,
You are absolutely right, it's ISPC workload. I've checked SSE4 and it's
also severely affected.
We use intrinsics only for conversion <N x i32> <=> i32, i.e. movmsk.ps.
For the rest we use general LLVM instructions. And I actually would really
like to stick this way. We rely on LLVM's ability to produce efficient code
from general LLVM IR. Relying on intrinsics too much would be a crunch and
a path to nowhere for many reasons :)
What is the reason for thi...