thr3ads.net - search: "current

[LLVMdev] [RFC] Poor code generation for paired load

2013 Aug 09

2

...+ 2 int_to_fp vs. 2 ldfloat

Scenario #2
Paired loads are available on the target. Truncate and shift instructions are useless (instructions 2., 4., and 5.).
Cost: ldi64 + 2 trunc + 1 shift vs. 1 ldpair

** To Reproduce **

Here is a way to reproduce the poor code generation for x86-64.

opt -sroa current_input.ll -S -o - | llc -O3 -o -

You will see 2 vmovd and 1 shrq that can be avoided as illustrated with the next command.

Here is a nicer code produced by modifying the input so that SROA generates friendlier code for this case.

opt -sroa mod_input.ll -S -o - | llc -O3 -o -

Basically the difference b...
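The cost comparison in the snippet above (ldi64 + 2 trunc + 1 shift vs. 1 ldpair) can be pictured with a small C sketch. This is only an illustration, not the IR from the thread: the struct `pair32` and the functions `split_wide` and `split_direct` are hypothetical names, and the wide-load variant assumes a little-endian target such as x86-64.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical pair of adjacent 32-bit fields (names are illustrative). */
struct pair32 {
    uint32_t first;
    uint32_t second;
};

/* Wide-load form: one 64-bit load, then 2 truncates and 1 shift.
   Assumes a little-endian target such as x86-64. */
static void split_wide(const struct pair32 *p, uint32_t *first, uint32_t *second) {
    uint64_t wide;
    memcpy(&wide, p, sizeof wide);       /* single 64-bit load */
    *first = (uint32_t)wide;             /* truncate: low half */
    *second = (uint32_t)(wide >> 32);    /* shift, then truncate: high half */
}

/* Direct form: two adjacent 32-bit loads, which a target with paired
   loads could in principle fetch with one instruction. */
static void split_direct(const struct pair32 *p, uint32_t *first, uint32_t *second) {
    *first = p->first;
    *second = p->second;
}
```

Both functions return the same two values on a little-endian machine. The point made in the snippet is that the first form leaves extra truncate and shift work behind when the target could have loaded the two halves directly.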

[LLVMdev] [RFC] Poor code generation for paired load

2013 Aug 12

2

.... Truncate and shift instructions
>> are useless (instructions 2., 4., and 5.).
>> Cost: ldi64 + 2 trunc + 1 shift vs. 1 ldpair
>>
>>
>> ** To Reproduce **
>>
>> Here is a way to reproduce the poor code generation for x86-64.
>>
>> opt -sroa current_input.ll -S -o - | llc -O3 -o -
>>
>> You will see 2 vmovd and 1 shrq that can be avoided as illustrated with the
>> next command.
>>
>> Here is a nicer code produced by modifying the input so that SROA generates
>> friendlier code for this case.
>>
>> o...

[LLVMdev] [RFC] Poor code generation for paired load

2013 Aug 10

0

...Paired loads are available on the target. Truncate and shift instructions
> are useless (instructions 2., 4., and 5.).
> Cost: ldi64 + 2 trunc + 1 shift vs. 1 ldpair
>
>
> ** To Reproduce **
>
> Here is a way to reproduce the poor code generation for x86-64.
>
> opt -sroa current_input.ll -S -o - | llc -O3 -o -
>
> You will see 2 vmovd and 1 shrq that can be avoided as illustrated with the
> next command.
>
> Here is a nicer code produced by modifying the input so that SROA generates
> friendlier code for this case.
>
> opt -sroa mod_input.ll -S -o - | llc...

[LLVMdev] [RFC] Poor code generation for paired load

2013 Aug 12

0

...Paired loads are available on the target. Truncate and shift instructions
> are useless (instructions 2., 4., and 5.).
> Cost: ldi64 + 2 trunc + 1 shift vs. 1 ldpair
>
>
> ** To Reproduce **
>
> Here is a way to reproduce the poor code generation for x86-64.
>
> opt -sroa current_input.ll -S -o - | llc -O3 -o -
>
> You will see 2 vmovd and 1 shrq that can be avoided as illustrated with the
> next command.
>
> Here is a nicer code produced by modifying the input so that SROA generates
> friendlier code for this case.
>
> opt -sroa mod_input.ll -S -o - | llc...
