similar to: [LLVMdev] vselect on ARM/NEON

Displaying 20 results from an estimated 4000 matches similar to: "[LLVMdev] vselect on ARM/NEON"

2012 Oct 11
1
[LLVMdev] vselect on ARM/NEON
If you mark VSELECT as 'expand' then it will be expanded to a sequence of AND/OR/XOR, which is pretty efficient (found in LegalizeVectorOps.cpp ExpandVSELECT). On Oct 11, 2012, at 11:05 AM, Jim Grosbach <grosbach at apple.com> wrote: > Seems reasonable to me. Plain 'SELECT' is already marked expand for vector types. I bet that just didn't get updates when VSELECT
2012 Oct 11
0
[LLVMdev] vselect on ARM/NEON
Seems reasonable to me. Plain 'SELECT' is already marked expand for vector types. I bet that just didn't get updates when VSELECT was introduced. -Jim On Oct 11, 2012, at 10:25 AM, Peter Couperus <peter.couperus at st.com> wrote: > Hello, > > We've run into a couple of cases where we'd like to use select on vector types, but vselect handling is absent from
2013 Mar 11
2
[LLVMdev] Bug in visitSIGN_EXTEND in DAGCombiner.cpp?
On Mar 8, 2013, at 2:29 PM, Nadav Rotem <nrotem at apple.com<mailto:nrotem at apple.com>> wrote: Hi Richard, visitSIGN_EXTEND() in DAGCombiner.cpp generates an ISD::SELECT even if VT is a vector, which causes ExpandSELECT() to assert during legalization. I think what's required is to have visitSIGN_EXTEND generate a VSELECT if VT is a vector… ISD::SELECT should be used for
2013 Mar 11
0
[LLVMdev] Bug in visitSIGN_EXTEND in DAGCombiner.cpp?
Hi Richard, > > I did… It originates from an icmp ne <2x i8>, zero initializer followed by a sext of the result 2x i1 to 2x i8. When we visit the SIGN_EXTEND, we generate the ISD::SELECT even though the selector and both operands are vectors. > It sounds like a bug in the dag combine optimization. If you send me the line number I will take a look. >> We should probably
2013 Mar 11
3
[LLVMdev] Bug in visitSIGN_EXTEND in DAGCombiner.cpp?
On Mar 11, 2013, at 9:41 AM, Nadav Rotem <nrotem at apple.com<mailto:nrotem at apple.com>> wrote: Hi Richard, I did… It originates from an icmp ne <2x i8>, zero initializer followed by a sext of the result 2x i1 to 2x i8. When we visit the SIGN_EXTEND, we generate the ISD::SELECT even though the selector and both operands are vectors. It sounds like a bug in the dag combine
2013 Mar 08
0
[LLVMdev] Bug in visitSIGN_EXTEND in DAGCombiner.cpp?
Hi Richard, > visitSIGN_EXTEND() in DAGCombiner.cpp generates an ISD::SELECT even if VT is a vector, which causes ExpandSELECT() to assert during legalization. > I think what's required is to have visitSIGN_EXTEND generate a VSELECT if VT is a vector… ISD::SELECT should be used for cases where the selector is a scalar, even if the operands are vector. If you found a case where SELECT
2013 Mar 08
2
[LLVMdev] Bug in visitSIGN_EXTEND in DAGCombiner.cpp?
visitSIGN_EXTEND() in DAGCombiner.cpp generates an ISD::SELECT even if VT is a vector, which causes ExpandSELECT() to assert during legalization. I think what's required is to have visitSIGN_EXTEND generate a VSELECT if VT is a vector… I've tried a local change that cures this particular assert, but uncovers another assert later, so I'm a bit uncertain if I'm heading off in the
2013 Mar 11
0
[LLVMdev] Bug in visitSIGN_EXTEND in DAGCombiner.cpp?
> > Line 4501 in trunk DAGCombiner.cpp… I changed the ISD::SELECT to the VT.isVector() ? ISD::VSELECT : ISD::SELECT... > Thanks. From the commit message I think that we should only run this optimization on scalars. >> Can you write down the input SDNode ? What types are inputs ? > > 0x107046d10: v2i8 = vselect 0x107046c10, 0x107046b10, 0x107045e10 [ID=-3]
2013 Aug 19
2
[LLVMdev] [X86] DAG Combine - VSELECT
Hi @ll, I am wondering about the use of !isBeforeLegalize in PerformSELECTCombine in the X86 backend. This defers all VSELECT related DAG combines until after the Legalizer has run. If the IR has already only legal types the second round of DAG combines is skipped and no VSELECT specified optimizations are performed at all. Is there a reason we don’t run the X86 DAG combiner before Type
2012 Sep 05
3
[LLVMdev] Unaligned vector memory access for ARM/NEON.
Hello Jim, Thank you for the response. I may be confused about the alignment rules here. I had been looking at the ARM RVCT Assembler Guide, which seems to indicate vld1.16 operates on 16-bit aligned data, unless I am misinterpreting their table (Table 5-11 in ARM DUI 0204H, pg 5-70,5-71). Prior to the table, It does mention the accesses need to be "element" aligned, where I took
2013 May 28
2
[LLVMdev] Error on VSELECT Dagcombiner with some architecture
Hi all, I met the error while compiling the code with vector type with some architecture. IR is as following. %cmp = icmp sgt <3 x i8> %x, zeroinitializer %sub = sub <3 x i8> zeroinitializer, %x %cond = select <3 x i1> %cmp, <3 x i8> %x, <3 x i8> %sub 'select' IR is converted to 'vselect' dag and is combined to 'sra (X, size(X)-1); xor
2013 Aug 20
1
[LLVMdev] [X86] DAG Combine - VSELECT
Can this optimization be moved to the lowering phase? LowerVSELECT() ? - Elena From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] On Behalf Of Eli Friedman Sent: Tuesday, August 20, 2013 03:56 To: Juergen Ributzka Cc: Benjamin Kramer; LLVM Developers Mailing List Subject: Re: [LLVMdev] [X86] DAG Combine - VSELECT On Mon, Aug 19, 2013 at 4:17 PM, Juergen
2012 Sep 05
2
[LLVMdev] Unaligned vector memory access for ARM/NEON.
Hello all, I am a first time writer here, but am a happy LLVM tinkerer. It is a pleasure to use :). We have come across some sub-optimal behavior when LLVM lowers loads for vectors with small integers, i.e. load <4 x i16>* %a, align 2, using a sequence of scalar loads rather than a single vld1 on armv7 linux with NEON. Looking at the code in svn, it appears the ARM backend is capable of
2012 Sep 06
2
[LLVMdev] Unaligned vector memory access for ARM/NEON.
Hello, Thanks again. We did try overestimating the alignment, and saw the vldr you reference here. It looks like a recent change (r161962?) did enable vld1 generation for this case (great!) on darwin, but not linux. I'm not sure if the effect of lowering load <4 x i16>* align 2 to vld1.16 this was intentional in this change or not. If so, my question is what is the preferable way to
2013 May 28
0
[LLVMdev] Error on VSELECT Dagcombiner with some architecture
Hi JinGu Kang, On 28/05/13 17:18, jingu kang wrote: > Hi all, > > I met the error while compiling the code with vector type with some > architecture. IR is as following. > > %cmp = icmp sgt <3 x i8> %x, zeroinitializer > %sub = sub <3 x i8> zeroinitializer, %x > %cond = select <3 x i1> %cmp, <3 x i8> %x, <3 x i8> %sub > >
2013 Aug 20
0
[LLVMdev] [X86] DAG Combine - VSELECT
On Mon, Aug 19, 2013 at 4:17 PM, Juergen Ributzka <juergen at apple.com> wrote: > I see. We still can use that shortcut to catch the simple case after type > legalization, but we could also do a more elaborate type check before type > legalization to enable it? > If you're going to write the code to check the types anyway, it's probably clearer to remove the
2013 Aug 19
3
[LLVMdev] [X86] DAG Combine - VSELECT
I see. We still can use that shortcut to catch the simple case after type legalization, but we could also do a more elaborate type check before type legalization to enable it? On Aug 19, 2013, at 4:13 PM, Eli Friedman <eli.friedman at gmail.com> wrote: > On Mon, Aug 19, 2013 at 3:34 PM, Juergen Ributzka <juergen at apple.com> wrote: > Hi @ll, > > I am wondering about the
2012 Sep 06
2
[LLVMdev] Unaligned vector memory access for ARM/NEON.
On Sep 6, 2012, at 2:48 PM, David Peixotto <dpeixott at codeaurora.org> wrote: > Hi Pete, > > We ran into the same issue with generating vector loads/stores for vectors > with less than word alignment. It seems we took a similar approach to > solving the problem by modifying the logic in allowsUnalignedMemoryAccesses. > > As you and Jim mentioned, it looks like the
2012 Sep 05
0
[LLVMdev] Unaligned vector memory access for ARM/NEON.
Hmmm. Well, it's entirely possible that it's LLVM that's confused about the alignment requirements here. :) I think I see, in general, where. I twiddled the IR to give it higher alignment (16 bytes) and get: extend: @ @extend @ BB#0: vldr d16, [r0] vmovl.s16 q8, d16 vstmia r1, {d16, d17} vldr d16, [r0, #8] add r0, r1, #16 vmovl.s16 q8, d16 vstmia
2013 Aug 19
0
[LLVMdev] [X86] DAG Combine - VSELECT
On Mon, Aug 19, 2013 at 3:34 PM, Juergen Ributzka <juergen at apple.com> wrote: > Hi @ll, > > I am wondering about the use of !isBeforeLegalize in PerformSELECTCombine > in the X86 backend. This defers all VSELECT related DAG combines until > after the Legalizer has run. If the IR has already only legal types the > second round of DAG combines is skipped and no VSELECT