Hello Chris, Duncan,
Here is the small test case I checked the patch with:
define <2 x i16> @iveltfold(<2 x i16> %s1) nounwind {
entry:
%0 = extractelement <2 x i16> %s1, i32 0
%conv = sext i16 %0 to i32
%mul = mul nsw i32 %conv, 5793
%conv1 = trunc i32 %mul to i16
%1 = insertelement <2 x i16> %s1, i16 %conv1, i32 0
%2 = extractelement <2 x i16> %1, i32 1
%conv2 = sext i16 %2 to i32
%mul3 = mul nsw i32 %conv2, 5793
%conv4 = trunc i32 %mul3 to i16
%3 = insertelement <2 x i16> %1, i16 %conv4, i32 1
ret <2 x i16> %3
}
With the patch, the insertelement chain is replaced by a single
build_vector node. I have a custom backend, but I also checked it with
the x86 one and got the following results:
$ llc -march=x86 iveltfold.ll -o -
Before patch
.globl iveltfold
.align 16, 0x90
.type iveltfold,@function
iveltfold: # @iveltfold
# BB#0: # %entry
movd %xmm0, %eax
imull $5793, %eax, %ecx # imm = 0x16A1
pextrw $4, %xmm0, %eax
pinsrd $0, %ecx, %xmm0
imull $5793, %eax, %eax # imm = 0x16A1
pinsrd $2, %eax, %xmm0
ret
.Ltmp0:
.size iveltfold, .Ltmp0-iveltfold
.section ".note.GNU-stack","",@progbits
After patch
.globl iveltfold
.align 16, 0x90
.type iveltfold,@function
iveltfold: # @iveltfold
# BB#0: # %entry
pextrw $4, %xmm0, %eax
imull $5793, %eax, %eax # imm = 0x16A1
movd %xmm0, %ecx
imull $5793, %ecx, %ecx # imm = 0x16A1
movd %ecx, %xmm0
pinsrd $2, %eax, %xmm0
ret
.Ltmp0:
.size iveltfold, .Ltmp0-iveltfold
.section ".note.GNU-stack","",@progbits
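To make the condition concrete, here is a minimal standalone C++ sketch of the check described in the quoted message below: a chain of inserts can become one build_vector only if every lane of the first source vector gets overwritten. This is not the patch itself and uses no LLVM APIs. The names InsertStep and foldInsertChain are hypothetical, and plain ints stand in for the inserted scalar values.

```cpp
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical model of one insertelement step: write `value` into `lane`.
struct InsertStep {
    std::size_t lane;
    int value;
};

// Returns the operands of the equivalent build_vector if the chain
// overwrites every lane of the original vector, otherwise std::nullopt.
// `chain` is ordered from the first insert to the last one.
std::optional<std::vector<int>> foldInsertChain(
    std::size_t numLanes, const std::vector<InsertStep>& chain) {
    std::vector<std::optional<int>> lanes(numLanes);
    std::size_t filled = 0;

    // Walk from the last insert back to the first, so that the most
    // recent write to a lane is the one that is kept.
    for (auto it = chain.rbegin(); it != chain.rend(); ++it) {
        if (it->lane >= numLanes)
            return std::nullopt;  // out-of-range index: do not fold
        if (!lanes[it->lane]) {
            lanes[it->lane] = it->value;
            ++filled;
        }
    }

    // Some lane still comes from the original vector: do not fold.
    if (filled != numLanes)
        return std::nullopt;

    std::vector<int> operands;
    operands.reserve(numLanes);
    for (const auto& lane : lanes)
        operands.push_back(*lane);
    return operands;
}
```

For the test case above, the chain writes lanes 0 and 1 of a 2-lane vector, so the sketch reports it as foldable. A chain that writes only lane 0 is left alone, since lane 1 would still depend on the original vector.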
Ivan
On 17/02/2012 10:19, Chris Lattner wrote:
> On Feb 17, 2012, at 12:50 AM, Ivan Llopard wrote:
>
>> Hello,
>>
>> I've added a small combine to DAGCombiner that folds a chain of
>> insertelt nodes when that chain is proven to fully overwrite the very
>> first source vector. In that case, I assume a build_vector is better.
>> It seems to be safe, but I don't know whether it is implemented
>> correctly or whether it is already done somewhere else. Please find
>> the patch attached.
>
> Hi Ivan,
>
> This needs a testcase.
>
> -Chris