Hello,

I've added a small combine to DAGCombiner that folds a chain of insertelt nodes when the chain is proved to fully overwrite the very first source vector. In that case, I assume a build_vector is better. It seems to be safe, but I don't know whether it is correctly implemented or whether it is already done somewhere else. Please find the patch attached.

Regards,
Ivan

-------------- next part --------------
A non-text attachment was scrubbed...
Name: insertveltfold.patch
Type: text/x-patch
Size: 1144 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20120217/cac3f349/attachment.bin>
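The patch itself is not reproduced in the archive, so the following is only a minimal sketch of the idea described above, written as a toy Python model rather than against the real SelectionDAG API. The class names (`Source`, `InsertElt`, `BuildVector`) and the function `fold_insert_chain` are hypothetical and chosen for illustration; they are not taken from the patch.

```python
from dataclasses import dataclass
from typing import Optional, Tuple, Union


@dataclass(frozen=True)
class Source:
    """Hypothetical stand-in for the first source vector of the chain."""
    name: str
    num_elts: int


@dataclass(frozen=True)
class InsertElt:
    """Toy insertelt: copy `vec`, then write `value` into lane `index`."""
    vec: Union[Source, "InsertElt"]
    value: str
    index: Optional[int]  # None models a non-constant lane index


@dataclass(frozen=True)
class BuildVector:
    """Toy build_vector: one scalar operand per lane."""
    elts: Tuple[str, ...]


def fold_insert_chain(node):
    """Return a BuildVector if the chain ending at `node` overwrites every
    lane of the first source vector; otherwise return `node` unchanged."""
    lanes = {}
    cur = node
    while isinstance(cur, InsertElt):
        if cur.index is None:
            return node  # unknown lane: full coverage cannot be proved
        # Walking backwards from the last insert, the first write seen for
        # a lane is the one that survives in the final vector.
        lanes.setdefault(cur.index, cur.value)
        cur = cur.vec
    if not isinstance(cur, Source):
        return node
    if set(lanes) != set(range(cur.num_elts)):
        return node  # some lane of the source is still live
    return BuildVector(tuple(lanes[i] for i in range(cur.num_elts)))


# A <2 x i16> source with both lanes overwritten, then a partial chain.
s1 = Source("s1", 2)
full = InsertElt(InsertElt(s1, "conv1", 0), "conv4", 1)
partial = InsertElt(s1, "conv1", 0)
print(fold_insert_chain(full))
print(fold_insert_chain(partial) is partial)
```

The sketch bails out on a non-constant lane index and on any lane that is never written, since in both cases the original vector may still contribute to the result. Whether the real patch handles these cases the same way is not known from the thread.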
Hi Ivan,

> I've added a little combining operation in DAGCombiner to fold a chain of
> insertelt nodes if that chain is proved to fully overwrite the very first source
> vector. In which case, I supposed a build_vector is better. It seems to be safe
> but I don't know if it is correctly implemented or if it is already done
> somewhere else. Please find attached the patch.

please also provide some testcases.

Thanks, Duncan.
On Feb 17, 2012, at 12:50 AM, Ivan Llopard wrote:

> Hello,
>
> I've added a little combining operation in DAGCombiner to fold a chain of insertelt nodes if that chain is proved to fully overwrite the very first source vector. In which case, I supposed a build_vector is better. It seems to be safe but I don't know if it is correctly implemented or if it is already done somewhere else. Please find attached the patch.

Hi Ivan,

This needs a testcase.

-Chris
Hello Chris, Duncan,
This is a small test case I checked on:
define <2 x i16> @iveltfold(<2 x i16> %s1) nounwind {
entry:
   %0 = extractelement <2 x i16> %s1, i32 0
   %conv = sext i16 %0 to i32
   %mul = mul nsw i32 %conv, 5793
   %conv1 = trunc i32 %mul to i16
   %1 = insertelement <2 x i16> %s1, i16 %conv1, i32 0
   %2 = extractelement <2 x i16> %1, i32 1
   %conv2 = sext i16 %2 to i32
   %mul3 = mul nsw i32 %conv2, 5793
   %conv4 = trunc i32 %mul3 to i16
   %3 = insertelement <2 x i16> %1, i16 %conv4, i32 1
   ret <2 x i16> %3
}
The insertelement chain is replaced by a single build_vector node. I have a
custom backend, but I also checked it with the x86 backend and got the
following results:
$ llc -march=x86 iveltfold.ll -o -
Before patch
     .globl    iveltfold
     .align    16, 0x90
     .type    iveltfold,@function
iveltfold:                              # @iveltfold
# BB#0:                                 # %entry
     movd    %xmm0, %eax
     imull    $5793, %eax, %ecx       # imm = 0x16A1
     pextrw    $4, %xmm0, %eax
     pinsrd    $0, %ecx, %xmm0
     imull    $5793, %eax, %eax       # imm = 0x16A1
     pinsrd    $2, %eax, %xmm0
     ret
.Ltmp0:
     .size    iveltfold, .Ltmp0-iveltfold
     .section    ".note.GNU-stack","",@progbits
After patch
     .globl    iveltfold
     .align    16, 0x90
     .type    iveltfold,@function
iveltfold:                              # @iveltfold
# BB#0:                                 # %entry
     pextrw    $4, %xmm0, %eax
     imull    $5793, %eax, %eax       # imm = 0x16A1
     movd    %xmm0, %ecx
     imull    $5793, %ecx, %ecx       # imm = 0x16A1
     movd    %ecx, %xmm0
     pinsrd    $2, %eax, %xmm0
     ret
.Ltmp0:
     .size    iveltfold, .Ltmp0-iveltfold
     .section    ".note.GNU-stack","",@progbits
Ivan