search for: uint8x16

Displaying 2 results from an estimated 2 matches for "uint8x16".

Did you mean: uint16
2018 Apr 07
0
SCEV and LoopStrengthReduction Formulae
...ty much everywhere in DSP functions. It’d be pretty nice if the compiler could do it too. There is one alternate approach that I recall, which looks like this: Original code (example, pseudocode): int add_delta_256(uint8 *in1, uint8 *in2) { int accum = 0; for (int i = 0; i < 16; ++i) { uint8x16 a = load16(in1 + i *16); // NOTE: takes an extra addressing op because x86 uint8x16 b = load16(in2 + i *16); // NOTE: takes an extra addressing op because x86 accum += psadbw(a, b); } return accum; } end of loop: inc i cmp i, 16 jl loop LSR’d code: int add_delta_256(uint8 *in1, uint8 *...
2018 Apr 03
4
SCEV and LoopStrengthReduction Formulae
I am attempting to implement a minor loop strength reduction optimization for targets that support compare and jump fusion, specifically TTI::canMacroFuseCmp(). My approach might be wrong; however, I am soliciting the idea for feedback, so that I can implement this correctly. My plan is to add a Supplemental LSR formula to LoopStrengthReduce.cpp that optimizes the following case, but perhaps