Displaying 4 results from an estimated 4 matches for "ldfloat".
2013 Aug 09
2
[LLVMdev] [RFC] Poor code generation for paired load
...build field2 from chunk
6. %field2 = bitcast i32 %field2trunced to float // <-- build field2 from chunk
Scenario #1:
Floating-point registers are on another register bank, and register bank moves are almost as expensive as loads (instructions 3. and 6.).
Cost: ldi64 + 2 int_to_fp vs. 2 ldfloat
Scenario #2:
Paired loads are available on the target. Truncate and shift instructions are useless (instructions 2., 4., and 5.).
Cost: ldi64 + 2 trunc + 1 shift vs. 1 ldpair
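As a rough illustration of the two code shapes being compared, here is a minimal C sketch. It assumes a little-endian target and a hypothetical struct of two floats. The names `pair`, `load_via_chunk` and `load_direct` are invented for this example and do not come from the thread. The step numbers in the comments are inferred from the cost lines above (2, 4 and 5 are the truncates and the shift, 3 and 6 are the bitcasts), so treat them as an assumption.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical 8-byte struct with two float fields (not from the thread). */
struct pair { float field1; float field2; };

/* Wide-load shape: one 64-bit load, then each field is rebuilt from the chunk.
 * Assumes little-endian, so field1 sits in the low 32 bits. */
void load_via_chunk(const struct pair *p, float *f1, float *f2) {
    uint64_t chunk;
    memcpy(&chunk, p, sizeof chunk);           /* 1. load i64                 */
    uint32_t lo = (uint32_t)chunk;             /* 2. trunc to i32             */
    memcpy(f1, &lo, sizeof lo);                /* 3. bitcast i32 -> float     */
    uint32_t hi = (uint32_t)(chunk >> 32);     /* 4. shift, 5. trunc to i32   */
    memcpy(f2, &hi, sizeof hi);                /* 6. bitcast i32 -> float     */
}

/* Direct shape: two independent float loads (the "2 ldfloat" side). */
void load_direct(const struct pair *p, float *f1, float *f2) {
    *f1 = p->field1;
    *f2 = p->field2;
}
```

Both functions produce the same field values on a little-endian machine. The difference the scenarios point at is only in cost: the first shape pays for integer-to-float moves or for extra truncate and shift instructions, depending on the target.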
** To Reproduce **
Here is a way to reproduce the poor code generation for x86-64.
opt -sroa current_input.ll -S -o - |...
2013 Aug 12
2
[LLVMdev] [RFC] Poor code generation for paired load
...ld2trunced to float // <-- build
>> field2 from chunk
>>
>> Scenario #1:
>> Floating-point registers are on another register bank, and register bank
>> moves are almost as expensive as loads (instructions 3. and 6.).
>> Cost: ldi64 + 2 int_to_fp vs. 2 ldfloat
>>
>> Scenario #2:
>> Paired loads are available on the target. Truncate and shift instructions
>> are useless (instructions 2., 4., and 5.).
>> Cost: ldi64 + 2 trunc + 1 shift vs. 1 ldpair
>>
>>
>> ** To Reproduce **
>>
>> Here is a wa...
2013 Aug 10
0
[LLVMdev] [RFC] Poor code generation for paired load
...field2 = bitcast i32 %field2trunced to float // <-- build
> field2 from chunk
>
> Scenario #1:
> Floating-point registers are on another register bank, and register bank
> moves are almost as expensive as loads (instructions 3. and 6.).
> Cost: ldi64 + 2 int_to_fp vs. 2 ldfloat
>
> Scenario #2:
> Paired loads are available on the target. Truncate and shift instructions
> are useless (instructions 2., 4., and 5.).
> Cost: ldi64 + 2 trunc + 1 shift vs. 1 ldpair
>
>
> ** To Reproduce **
>
> Here is a way to reproduce the poor code generation for...
2013 Aug 12
0
[LLVMdev] [RFC] Poor code generation for paired load
...field2 = bitcast i32 %field2trunced to float // <-- build
> field2 from chunk
>
> Scenario #1:
> Floating-point registers are on another register bank, and register bank
> moves are almost as expensive as loads (instructions 3. and 6.).
> Cost: ldi64 + 2 int_to_fp vs. 2 ldfloat
>
> Scenario #2:
> Paired loads are available on the target. Truncate and shift instructions
> are useless (instructions 2., 4., and 5.).
> Cost: ldi64 + 2 trunc + 1 shift vs. 1 ldpair
>
>
> ** To Reproduce **
>
> Here is a way to reproduce the poor code generation for...