search for: rev64

Displaying 4 results from an estimated 4 matches for "rev64".

2018 Apr 26
1
[Constant Folder, InstCombine, ARM, AArch64] Question about constant folding of vector load
...ld like to optimize at the IR level. I'd like to turn an Arm/AArch64 table lookup intrinsic that takes a constant vector mask into a shufflevector instruction: vtbl1(V,mask) ~> shufflevector(V,undef,mask) The reason is that if the mask is {7,6,5,4,3,2,1,0}, then the backend will generate rev64 instructions instead. If the mask comes from a vld1 of a global constant I could fold it to allow the above instruction combining. My question is, does the constant folding of the vld1 seem a good thing to do in the general case, as a standalone transformation, or only when used as a mask for a t...
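The equivalence this snippet relies on can be checked with a small emulation. The sketch below is only an illustration in Python, not LLVM code: the helper names (vtbl1, shufflevector, rev64_bytes) are hypothetical stand-ins, vectors are modelled as lists of 8 byte values, and undef is modelled as None. It assumes the usual vtbl1 behaviour that an out-of-range index produces 0, which suggests the rewrite is only safe when every mask index is in the range 0..7.

```python
def vtbl1(table, mask):
    """Emulate an 8-byte table lookup: out-of-range indices yield 0."""
    return [table[i] if 0 <= i < len(table) else 0 for i in mask]


def shufflevector(v1, v2, mask):
    """Emulate a two-input shuffle; v2=None stands in for an undef operand."""
    n = len(v1)
    out = []
    for i in mask:
        if i < n:
            out.append(v1[i])          # lane taken from the first operand
        else:
            out.append(None if v2 is None else v2[i - n])  # second operand
    return out


def rev64_bytes(v):
    """Reverse the bytes inside each 64-bit (8-byte) chunk."""
    return [b for k in range(0, len(v), 8) for b in reversed(v[k:k + 8])]


V = [10, 11, 12, 13, 14, 15, 16, 17]
REVERSE_MASK = [7, 6, 5, 4, 3, 2, 1, 0]

# With an in-range mask, all three formulations agree.
lookup = vtbl1(V, REVERSE_MASK)
shuffle = shufflevector(V, None, REVERSE_MASK)
reversed_bytes = rev64_bytes(V)

# With an out-of-range index they differ: the lookup gives 0, the shuffle undef.
BAD_MASK = [8, 6, 5, 4, 3, 2, 1, 0]
lookup_bad = vtbl1(V, BAD_MASK)
shuffle_bad = shufflevector(V, None, BAD_MASK)
```

In this model the two forms diverge only in lanes whose index is 8 or larger, so a fold along these lines would presumably need to check the constant mask first.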
2017 Nov 17
2
[GlobalISel][AArch64] Toward flipping the switch for O0: Please give it a try!
...// %entry
> sub sp, sp, #16 // =16
> str x0, [sp, #8]
> ldr x0, [sp, #8]
> ld1 { v0.2s }, [x0]
> add sp, sp, #16 // =16
> ret
>
> With global-isel off, there is a rev64 instruction between the ld1 and the add, which fixes up the endianness of the vector.
>
> Oliver
>
> From: Oliver Stannard
> Sent: 17 November 2017 13:32
> To: 'qcolombet at apple.com <mailto:qcolombet at apple.com>'
> Cc: llvm-dev at lists.llvm.org <mailt...
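For readers unfamiliar with rev64, the following Python sketch models what the instruction does to the lanes of a vector: it reverses the order of the elements inside each 64-bit doubleword. The function name and the list-of-lanes representation are assumptions made for illustration; it says nothing about how LLVM decides when to emit the instruction.

```python
def rev64(lanes, lane_bits):
    """Reverse the order of lanes within each 64-bit doubleword.

    lanes     -- list of lane values, lowest lane first
    lane_bits -- lane width in bits (8, 16 or 32)
    """
    if lane_bits not in (8, 16, 32):
        raise ValueError("rev64 operates on 8-, 16- or 32-bit lanes")
    per_chunk = 64 // lane_bits
    if len(lanes) % per_chunk != 0:
        raise ValueError("vector must be a whole number of 64-bit chunks")
    out = []
    for start in range(0, len(lanes), per_chunk):
        out.extend(reversed(lanes[start:start + per_chunk]))
    return out


# A .2s vector (two 32-bit lanes, as in the ld1 above): the lanes swap.
two_s = rev64([0xAAAAAAAA, 0xBBBBBBBB], 32)

# A .4s vector: each 64-bit half is swapped independently.
four_s = rev64([0, 1, 2, 3], 32)
```

For the v0.2s register in the quoted listing, this amounts to swapping the two 32-bit lanes.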
2017 Nov 27
2
[GlobalISel][AArch64] Toward flipping the switch for O0: Please give it a try!
...// %entry
> sub sp, sp, #16 // =16
> str x0, [sp, #8]
> ldr x0, [sp, #8]
> ld1 { v0.2s }, [x0]
> add sp, sp, #16 // =16
> ret
>
> With global-isel off, there is a rev64 instruction between the ld1 and the add, which fixes up the endianness of the vector.
>
> Oliver
>
> From: Oliver Stannard
> Sent: 17 November 2017 13:32
> To: 'qcolombet at apple.com <mailto:qcolombet at apple.com>'
> Cc: llvm-dev at lists.llvm.org <mailt...
2017 Nov 14
6
[GlobalISel][AArch64] Toward flipping the switch for O0: Please give it a try!
To give an update here, we actually are not missing a mapping. The code complains because we are copying an fp16 into a gpr32, and that shouldn’t be done with a copy (the default mapping). I extended the repairing code to issue G_ANYEXT in those cases instead of asserting. However, I now have to teach instruction select about those ANYEXT, otherwise we’ll fall back in that case. But that’s a