Displaying 3 results from an estimated 3 matches for "performvmulcombin".
2013 Feb 11 (0 replies): [LLVMdev] Is there any llvm neon intrinsic that maps to vmla.f32 instruction ?
Hi Bob, Seb, Renato,
My VMLA performance work was on Swift, rather than Cortex-A9.
Sebastian - is vmlx-forwarding really the only variable you changed between
your tests?
As far as I can see the VMLx forwarding attribute only exists to restrict
the application of one DAG combine optimization: PerformVMULCombine in
ARMISelLowering.cpp, which turns (A + B) * C into (A * C) + (B * C). This
combine only ever triggers when vmlx-forwarding is on. I'd usually expect
this to increase vmla formation, rather than decrease it, but under some
circumstances (e.g. when the (A * C) and (B * C) expressions have exis...
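For concreteness, here is a minimal C-level sketch of the two shapes that this combine trades between (illustrative only, not LLVM source; the instruction notes assume typical ARM NEON selection for scalar float and are not guaranteed output):

    /* Shape before the combine: one add feeding one multiply. */
    float before(float a, float b, float c) {
        return (a + b) * c;        /* typically vadd.f32 then vmul.f32 */
    }

    /* Shape after the combine: two products summed, exposing a
     * multiply-accumulate chain the backend can select as vmla.f32. */
    float after(float a, float b, float c) {
        return (a * c) + (b * c);  /* typically vmul.f32 then vmla.f32 */
    }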
2013 Feb 11 (2 replies): [LLVMdev] Is there any llvm neon intrinsic that maps to vmla.f32 instruction ?
In theory, the backend should choose the best instructions for the selected target processor. VMLA is not always the best choice. Lang Hames did some measurements a while back to come up with the current behavior, but I don't remember exactly what he found. CC'ing Lang.
On Feb 11, 2013, at 8:12 AM, Renato Golin <renato.golin at linaro.org> wrote:
> On 11 February 2013 15:51,
2013 Feb 12 (2 replies): [LLVMdev] Is there any llvm neon intrinsic that maps to vmla.f32 instruction ?
...ng.
Best Regards
Seb
My VMLA performance work was on Swift, rather than Cortex-A9.
Sebastian - is vmlx-forwarding really the only variable you changed between your tests?
As far as I can see the VMLx forwarding attribute only exists to restrict the application of one DAG combine optimization: PerformVMULCombine in ARMISelLowering.cpp, which turns (A + B) * C into (A * C) + (B * C). This combine only ever triggers when vmlx-forwarding is on. I'd usually expect this to increase vmla formation, rather than decrease it, but under some circumstances (e.g. when the (A * C) and (B * C) expressions have exis...