Displaying 4 results from an estimated 4 matches for "current_input".
2013 Aug 09
2
[LLVMdev] [RFC] Poor code generation for paired load
...+ 2 int_to_fp vs. 2 ldfloat
Scenario #2
Paired loads are available on the target. Truncate and shift instructions are useless (instructions 2., 4., and 5.).
Cost: ldi64 + 2 trunc + 1 shift vs. 1 ldpair
** To Reproduce **
Here is a way to reproduce the poor code generation for x86-64.
opt -sroa current_input.ll -S -o - | llc -O3 -o -
You will see 2 vmovd and 1 shrq that can be avoided as illustrated with the next command.
Here is nicer code, produced by modifying the input so that SROA generates friendlier code for this case.
opt -sroa mod_input.ll -S -o - | llc -O3 -o -
Basically the difference b...
2013 Aug 12
2
[LLVMdev] [RFC] Poor code generation for paired load
.... Truncate and shift instructions
>> are useless (instructions 2., 4., and 5.).
>> Cost: ldi64 + 2 trunc + 1 shift vs. 1 ldpair
>>
>>
>> ** To Reproduce **
>>
>> Here is a way to reproduce the poor code generation for x86-64.
>>
>> opt -sroa current_input.ll -S -o - | llc -O3 -o -
>>
>> You will see 2 vmovd and 1 shrq that can be avoided as illustrated with the
>> next command.
>>
>> Here is nicer code, produced by modifying the input so that SROA generates
>> friendlier code for this case.
>>
>> o...
2013 Aug 10
0
[LLVMdev] [RFC] Poor code generation for paired load
...Paired loads are available on the target. Truncate and shift instructions
> are useless (instructions 2., 4., and 5.).
> Cost: ldi64 + 2 trunc + 1 shift vs. 1 ldpair
>
>
> ** To Reproduce **
>
> Here is a way to reproduce the poor code generation for x86-64.
>
> opt -sroa current_input.ll -S -o - | llc -O3 -o -
>
> You will see 2 vmovd and 1 shrq that can be avoided as illustrated with the
> next command.
>
> Here is nicer code, produced by modifying the input so that SROA generates
> friendlier code for this case.
>
> opt -sroa mod_input.ll -S -o - | llc...
2013 Aug 12
0
[LLVMdev] [RFC] Poor code generation for paired load
...Paired loads are available on the target. Truncate and shift instructions
> are useless (instructions 2., 4., and 5.).
> Cost: ldi64 + 2 trunc + 1 shift vs. 1 ldpair
>
>
> ** To Reproduce **
>
> Here is a way to reproduce the poor code generation for x86-64.
>
> opt -sroa current_input.ll -S -o - | llc -O3 -o -
>
> You will see 2 vmovd and 1 shrq that can be avoided as illustrated with the
> next command.
>
> Here is nicer code, produced by modifying the input so that SROA generates
> friendlier code for this case.
>
> opt -sroa mod_input.ll -S -o - | llc...
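
The cost line quoted in each result (ldi64 + 2 trunc + 1 shift vs. 1 ldpair) can be illustrated with a small model. This is only a sketch, not the code from the thread: the function names are hypothetical, Python's struct module stands in for machine loads, and a little-endian layout is assumed (as on x86-64). It shows that one wide load followed by truncate and shift yields the same two 32-bit values as loading the two adjacent fields directly, which is why the extra instructions are redundant when a paired load exists.

```python
import struct

MASK32 = 0xFFFFFFFF


def split_via_wide_load(buf: bytes, offset: int = 0) -> tuple[int, int]:
    """Model of the slow pattern: 1 ldi64 + 2 trunc + 1 shift."""
    (wide,) = struct.unpack_from("<Q", buf, offset)  # one 64-bit load
    lo = wide & MASK32                               # trunc to the low 32 bits
    hi = (wide >> 32) & MASK32                       # shift, then trunc
    return lo, hi


def split_via_paired_load(buf: bytes, offset: int = 0) -> tuple[int, int]:
    """Model of the cheap pattern: one paired load of two adjacent 32-bit fields."""
    lo, hi = struct.unpack_from("<II", buf, offset)
    return lo, hi


if __name__ == "__main__":
    # Two adjacent 32-bit fields, stored little-endian (assumption).
    memory = struct.pack("<II", 0x11223344, 0xAABBCCDD)
    print(split_via_wide_load(memory) == split_via_paired_load(memory))
```

Both functions read the same eight bytes; the difference is only in how many operations are needed to get the two halves into separate values.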