search for: ld1w

Displaying 3 results from an estimated 3 matches for "ld1w".

2016 Nov 25
2
[RFC] Supporting ARM's SVE in LLVM
...ypass the need for that "constant" to be constant at all, i.e., the use of `incw/incp`. Since you can fail half-way through, the amount by which you need to increment the induction variable is not even known at run time! Meaning, that's not a constant at all!

Example:

    a[i] = b[ c[i] ];

    ld1w z0.s, p0/z, [ c, i, lsl 2 ]
    ld1w z1.s, p0/z, [ b, z0.s, sxtw 2 ]

Now, the z0.s load may have failed with a seg fault somewhere, and it's up to the FFR to tell brka/brkb how to deal with this. Each iteration will have:

* The same vector length *per process* for accessing c[]
* A potentially...
2016 Nov 22
3
[RFC] Supporting ARM's SVE in LLVM
Hi Renato, Sorry for the delay in responding. We've been busy rethinking some of our changes after the feedback we've received thus far (particularly from the devmeeting). The incremental patches will use our revised design (which should be less invasive), and I'll be updating our document to match. On 16/11/2016, 12:46, "Renato Golin" <renato.golin at linaro.org>
2016 Nov 04
2
[RFC] Supporting ARM's SVE in LLVM
...med into the `whilelo` instruction.

```nasm
SimpleReduction:
// BB#0:
        subs    w9, w1, #1
        b.lt    .LBB0_4
// BB#1:
        add     x9, x9, #1
        mov     x8, xzr
        whilelo p0.s, xzr, x9
        mov     z0.s, #0
.LBB0_2:
        ld1w    {z1.s}, p0/z, [x0, x8, lsl #2]
        incw    x8
        add     z0.s, p0/m, z0.s, z1.s
        whilelo p0.s, x8, x9
        b.mi    .LBB0_2
// BB#3:
        ptrue   p0.s
        uaddv   d0, p0, z0.s
        fmov    w0, s0
        ret
.LBB0_4:
        mov...
```
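
For readers less familiar with SVE, the predicated loop in the `SimpleReduction` listing can be approximated in scalar C. This is only a sketch under stated assumptions: `VL`, `whilelo`, and `simple_reduction` are illustrative names, the lane count of 4 is arbitrary (SVE leaves the vector length to the hardware), and the early-exit return value of 0 is assumed because the listing is truncated at `.LBB0_4`.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed lane count for illustration; real SVE hardware picks the length. */
#define VL 4

/* Models `whilelo p0.s, base, limit`: lane k is active while base + k < limit.
   Returns nonzero if the first lane is active, the condition `b.mi` tests. */
static int whilelo(uint8_t pred[VL], uint64_t base, uint64_t limit)
{
    for (size_t k = 0; k < VL; k++)
        pred[k] = (uint8_t)(base + k < limit);
    return pred[0];
}

/* Scalar model of SimpleReduction: sums a[0..n) with a predicated loop. */
int32_t simple_reduction(const int32_t *a, int32_t n)
{
    if (n <= 0)          /* subs/b.lt guard; returning 0 is an assumption */
        return 0;

    uint64_t count = (uint64_t)n;   /* x9: trip count */
    uint64_t i = 0;                 /* x8: induction variable */
    uint32_t acc[VL] = {0};         /* z0.s: per-lane partial sums */
    uint8_t pred[VL];               /* p0.s: governing predicate */

    whilelo(pred, 0, count);
    do {
        for (size_t k = 0; k < VL; k++)   /* ld1w + predicated add */
            if (pred[k])
                acc[k] += (uint32_t)a[i + k];
        i += VL;                          /* incw x8 */
    } while (whilelo(pred, i, count));    /* whilelo + b.mi */

    uint32_t sum = 0;                     /* ptrue + uaddv: horizontal add */
    for (size_t k = 0; k < VL; k++)
        sum += acc[k];
    return (int32_t)sum;
}
```

Because inactive lanes are skipped by the predicate, the final partial vector needs no scalar tail loop, which is the property the `whilelo`-based code relies on.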