2010 Apr 09
[LLVMdev] compiler-rt's arm vfp o<= implementation
On 8 April 2010 02:28, Rodolph Perfetta <rodolph.perfetta at arm.com> wrote:
> movhi means mov if unsigned Higher
> movls means mov if unsigned Lower or Same
>
> so depending on the comparison result r0 holds 1 or 0
Thanks. Now that I understand the assembly, I think there's another problem.
l...
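As a rough illustration of the condition codes described in the quoted message, the C sketch below models the two flags that HI and LS read after an integer `cmp a, b`: C (set when a >= b as unsigned values, i.e. no borrow) and Z (set when a == b). The names `flags_t`, `cmp_flags`, `cond_hi`, `cond_ls`, and `select_hi_ls` are hypothetical, and the choice that `movhi` writes 1 and `movls` writes 0 is an assumption, since the quoted snippet does not show the operands. It models an integer compare only, not the VFP compare the thread is about.

```c
#include <stdint.h>

/* Hypothetical model of the two ARM flags that HI and LS depend on. */
typedef struct {
    int c; /* carry: set by cmp a, b when a >= b (unsigned, no borrow) */
    int z; /* zero:  set by cmp a, b when a == b */
} flags_t;

/* Flags as an integer `cmp a, b` would leave them (C and Z only). */
static flags_t cmp_flags(uint32_t a, uint32_t b) {
    flags_t f;
    f.c = (a >= b);
    f.z = (a == b);
    return f;
}

/* HI: unsigned higher  -> C set and Z clear. */
static int cond_hi(flags_t f) { return f.c && !f.z; }

/* LS: unsigned lower or same -> C clear or Z set. */
static int cond_ls(flags_t f) { return !f.c || f.z; }

/* Assumed sequence: cmp a, b ; movhi r0, #1 ; movls r0, #0 */
static uint32_t select_hi_ls(uint32_t a, uint32_t b) {
    flags_t f = cmp_flags(a, b);
    uint32_t r0 = 0;          /* placeholder; exactly one move below fires */
    if (cond_hi(f)) r0 = 1;   /* movhi r0, #1 */
    if (cond_ls(f)) r0 = 0;   /* movls r0, #0 */
    return r0;
}
```

Under these assumptions, `select_hi_ls(5, 3)` yields 1, while `select_hi_ls(3, 5)` and `select_hi_ls(3, 3)` both yield 0. HI and LS are exact complements, so one of the two moves always executes.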
2009 Nov 11
[LLVMdev] speed up memcpy intrinsic using ARM Neon registers
On Nov 11, 2009, at 3:27 AM, Rodolph Perfetta wrote:
>
> If you know about the alignment, maybe use structured load/store
> (vst1.64/vld1.64 {dn-dm}). You may also want to work on whole cache lines
> (64 bytes on A8). You can find more in this discussion:
> http://groups.google.com/group/beagleboard/browse_thread/thread/1...
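The suggestion to work on whole cache lines can be sketched in portable C as below. This is only an illustration of the loop structure, not the NEON code discussed in the thread: the function name `copy_by_cache_lines` and the constant `CACHE_LINE_BYTES` are hypothetical, the 64-byte figure is the Cortex-A8 value quoted above, and the fixed-size `memcpy` calls stand in for whatever wide load/store sequence a real implementation would use.

```c
#include <stddef.h>
#include <string.h>

/* Assumed cache-line size: the 64-byte Cortex-A8 figure quoted above. */
enum { CACHE_LINE_BYTES = 64 };

/* Copy n bytes, moving one whole cache line per loop iteration and
 * handling any remaining tail bytes separately. Buffers must not overlap. */
static void *copy_by_cache_lines(void *dst, const void *src, size_t n) {
    unsigned char *d = (unsigned char *)dst;
    const unsigned char *s = (const unsigned char *)src;

    while (n >= CACHE_LINE_BYTES) {
        /* Fixed-size copy: a compiler is free to lower this to wide
         * loads and stores on targets that have them. */
        memcpy(d, s, CACHE_LINE_BYTES);
        d += CACHE_LINE_BYTES;
        s += CACHE_LINE_BYTES;
        n -= CACHE_LINE_BYTES;
    }

    if (n > 0) {
        memcpy(d, s, n); /* tail shorter than one cache line */
    }
    return dst;
}
```

Whether this shape actually helps depends on alignment and on the target's memory system, so it would need to be measured on the hardware in question.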
2009 Nov 10
[LLVMdev] speed up memcpy intrinsic using ARM Neon registers
I tried to speed up Dhrystone on ARM Cortex-A8 by optimizing the
memcpy intrinsic. I used the Neon load multiple instruction to move up
to 48 bytes at a time. Over 15 scalar instructions collapsed down
into these 2 Neon instructions:
fldmiad r3, {d0, d1, d2, d3, d4, d5} @ SrcLine dhrystone.c 359
fstmiad r1, {d0, d1, d2, d3, d4, d5}
It seems like this should be faster. But I did