thr3ads.net - search: "ldfloat"

[LLVMdev] [RFC] Poor code generation for paired load

2013 Aug 09

2

...build field2 from chunk
6. %field2 = bitcast i32 %field2trunced to float // <-- build field2 from chunk

Scenario #1:
Floating point registers are on another register bank and register bank moves are almost as expensive as loads (instructions 3. and 6.).
Cost: ldi64 + 2 int_to_fp vs. 2 ldfloat

Scenario #2:
Paired loads are available on the target. Truncate and shift instructions are useless (instructions 2., 4., and 5.).
Cost: ldi64 + 2 trunc + 1 shift vs. 1 ldpair

** To Reproduce **

Here is a way to reproduce the poor code generation for x86-64.

opt -sroa current_input.ll -S -o - |...
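The two cost scenarios in the snippet above compare one wide integer load that is then split against direct per-field loads. A minimal C sketch of the two shapes may make the trade-off easier to see. It is an illustration only, not the code from the thread: the type `struct pair` and the functions `load_wide` and `load_fields` are hypothetical names, and the sketch assumes a little-endian target (such as the x86-64 mentioned in the snippet), a 32-bit `float`, and no padding between the two fields.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical two-float aggregate; the thread's actual type is not shown. */
struct pair { float field1; float field2; };

/* Wide form: one 64-bit load, then truncate/shift/truncate and two
 * bit-pattern reinterpretations (roughly "ldi64 + 2 trunc + 1 shift"
 * plus the integer-to-float register moves). */
void load_wide(const struct pair *p, float *f1, float *f2) {
    uint64_t chunk;
    uint32_t lo, hi;
    assert(sizeof *p == sizeof chunk);   /* layout assumption: no padding */
    memcpy(&chunk, p, sizeof chunk);     /* single 64-bit load */
    lo = (uint32_t)chunk;                /* truncate: low 32 bits */
    hi = (uint32_t)(chunk >> 32);        /* shift, then truncate: high 32 bits */
    memcpy(f1, &lo, sizeof lo);          /* reinterpret bits as float */
    memcpy(f2, &hi, sizeof hi);          /* reinterpret bits as float */
}

/* Per-field form: two float loads (roughly "2 ldfloat"), which a target
 * with paired loads could in principle cover with a single "ldpair". */
void load_fields(const struct pair *p, float *f1, float *f2) {
    *f1 = p->field1;
    *f2 = p->field2;
}
```

Under the stated assumptions both functions yield the same two values; the difference is only in which instructions a backend has to emit, which is the cost comparison the snippet describes.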

[LLVMdev] [RFC] Poor code generation for paired load

2013 Aug 12

2

...ld2trunced to float // <-- build
>> field2 from chunk
>>
>> Scenario #1:
>> Floating point registers are on another register bank and register bank
>> moves are almost as expensive as loads (instructions 3. and 6.).
>> Cost: ldi64 + 2 int_to_fp vs. 2 ldfloat
>>
>> Scenario #2:
>> Paired loads are available on the target. Truncate and shift instructions
>> are useless (instructions 2., 4., and 5.).
>> Cost: ldi64 + 2 trunc + 1 shift vs. 1 ldpair
>>
>>
>> ** To Reproduce **
>>
>> Here is a wa...

[LLVMdev] [RFC] Poor code generation for paired load

2013 Aug 10

0

...field2 = bitcast i32 %field2trunced to float // <-- build
> field2 from chunk
>
> Scenario #1:
> Floating point registers are on another register bank and register bank
> moves are almost as expensive as loads (instructions 3. and 6.).
> Cost: ldi64 + 2 int_to_fp vs. 2 ldfloat
>
> Scenario #2:
> Paired loads are available on the target. Truncate and shift instructions
> are useless (instructions 2., 4., and 5.).
> Cost: ldi64 + 2 trunc + 1 shift vs. 1 ldpair
>
>
> ** To Reproduce **
>
> Here is a way to reproduce the poor code generation for...

[LLVMdev] [RFC] Poor code generation for paired load

2013 Aug 12

0

...field2 = bitcast i32 %field2trunced to float // <-- build
> field2 from chunk
>
> Scenario #1:
> Floating point registers are on another register bank and register bank
> moves are almost as expensive as loads (instructions 3. and 6.).
> Cost: ldi64 + 2 int_to_fp vs. 2 ldfloat
>
> Scenario #2:
> Paired loads are available on the target. Truncate and shift instructions
> are useless (instructions 2., 4., and 5.).
> Cost: ldi64 + 2 trunc + 1 shift vs. 1 ldpair
>
>
> ** To Reproduce **
>
> Here is a way to reproduce the poor code generation for...
