Displaying 3 results from an estimated 3 matches for "ammort".
2013 Nov 14
0
[LLVMdev] Vectorization of loops with conditional dereferencing
...clusters is small, the worst case scenario will run twice as many
times (the initial check can be vectorized, as it's a read-only loop), but
the latter cannot. In the best case scenario, you'll have only a handful of
loops, which will run in parallel.
Worst case: n/VF + n
Best case: n/VF + ammortized n/VF
For VF == 2,
* best case is as fast as scalar code, considering overheads.
* worst case is 50% slower
For VF == 4,
* best case is 50% faster than scalar code
* worst case is 25% slower
And all that depends on each workload, so it'll change for every different
set of arguments,...
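
The arithmetic behind those percentages can be checked with a small sketch. It assumes the cost model is exactly the two formulas quoted above, normalized so that a scalar loop over n elements costs 1.0, and it ignores the overheads the post mentions. The function names worst_case_ratio and best_case_ratio are hypothetical and do not come from the thread or from LLVM.

```c
#include <assert.h>

/* Cost of the two-pass scheme relative to a scalar loop of n iterations
 * (scalar cost = 1.0). Pass 1 is the vectorized read-only check, n/VF.
 * Pass 2 is the conditional work. Overheads are ignored (assumption). */

/* Worst case from the post: n/VF + n, i.e. pass 2 runs fully scalar. */
double worst_case_ratio(unsigned vf) {
    assert(vf > 0);
    return 1.0 / vf + 1.0;
}

/* Best case from the post: n/VF + amortized n/VF. */
double best_case_ratio(unsigned vf) {
    assert(vf > 0);
    return 1.0 / vf + 1.0 / vf;
}
```

A ratio above 1.0 means slower than scalar: VF == 2 gives 1.0 (best) and 1.5 (worst), VF == 4 gives 0.5 (best) and 1.25 (worst). Under this reading, "50% faster" for VF == 4 means half the scalar running time.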
2013 Nov 14
3
[LLVMdev] Vectorization of loops with conditional dereferencing
...ll, the worst case scenario will run twice as many times (the initial check can be vectorized, as it's a read-only loop), but the latter cannot. In the best case scenario, you'll have only a handful of loops, which will run in parallel.
>
> Worst case: n/VF + n
> Best case: n/VF + ammortized n/VF
>
> For VF == 2,
> * best case is as fast as scalar code, considering overheads.
> * worst case is 50% slower
>
> For VF == 4,
> * best case is 50% faster than scalar code
> * worst case is 25% slower
>
> And all that depends on each workload, so it...
2013 Nov 01
6
[LLVMdev] Vectorization of loops with conditional dereferencing
Nadav, Arnold, et al.,
I have a number of loops that I would like us to be able to autovectorize (common, for example, in n-body inter-particle force kernels), and the problem is that they look like this:
for (int i = 0; i < N; ++i) {
  if (r[i] > 0)
    v += m[i]*...;
}
where, as written, m[i] is not accessed unless the condition is true. The general problem (as is noted by the loop