search for: a_i_j

Displaying 1 result from an estimated 1 matches for "a_i_j".

2016 Jul 11
2
extra loads in nested for-loop
...ould reuse %1 This loading from a[i][j] happens again for each iteration and seems quite inefficient. I changed the C code to explicitly do the load of a[i][j] outside of the innermost loop and that (as would be expected) eliminates the extra load:

void f1( InArray c, InArray a, InArray b ) {
  int a_i_j;
  #pragma clang loop unroll_count(UNROLL_DIM)
  for(int i=0;i<DIM;i++){
    #pragma clang loop unroll_count(UNROLL_DIM)
    for(int j=0;j<DIM;j++) {
      a_i_j = a[i][j];
      #pragma clang loop unroll_count(UNROLL_DIM)
      for(int k=0;k<DIM;k++) {
        c[i][k] = c[i]...
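The hoisting described in the snippet can be sketched as a self-contained comparison. This is only an illustration under stated assumptions: the value of DIM, the InArray typedef, the multiply-accumulate loop body, and the names f_inner_load, f_hoisted_load and check_variants are all hypothetical, since the snippet is truncated before the loop body and shows none of these definitions. The unroll pragmas are left out for brevity. One plausible reason a compiler keeps the load inside the innermost loop is that it cannot prove c and a do not overlap, so a store to c[i][k] might change a[i][j]; the thread excerpt itself does not confirm this.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical definitions: the snippet shows neither DIM nor InArray. */
#define DIM 4
typedef int InArray[DIM][DIM];

/* Variant 1: a[i][j] is read inside the innermost loop. */
void f_inner_load(InArray c, InArray a, InArray b)
{
    for (int i = 0; i < DIM; i++)
        for (int j = 0; j < DIM; j++)
            for (int k = 0; k < DIM; k++)
                c[i][k] += a[i][j] * b[j][k];   /* assumed loop body */
}

/* Variant 2: a[i][j] is loaded once per (i, j) into a local. */
void f_hoisted_load(InArray c, InArray a, InArray b)
{
    for (int i = 0; i < DIM; i++) {
        for (int j = 0; j < DIM; j++) {
            int a_i_j = a[i][j];                /* hoisted load */
            for (int k = 0; k < DIM; k++)
                c[i][k] += a_i_j * b[j][k];     /* assumed loop body */
        }
    }
}

/* Runs both variants on distinct (non-overlapping) arrays and
   returns 1 if they produce identical results, 0 otherwise. */
int check_variants(void)
{
    InArray a, b, c1, c2;

    for (int i = 0; i < DIM; i++) {
        for (int j = 0; j < DIM; j++) {
            a[i][j] = i + 2 * j + 1;
            b[i][j] = 3 * i - j;
        }
    }
    memset(c1, 0, sizeof c1);
    memset(c2, 0, sizeof c2);

    f_inner_load(c1, a, b);
    f_hoisted_load(c2, a, b);

    assert(c1[0][0] == c2[0][0]);               /* quick spot check */
    return memcmp(c1, c2, sizeof c1) == 0;
}
```

The two variants agree only when c does not overlap a. If the same array were passed for both, the hoisted version would keep using the old value of a[i][j] while the first version would observe the updated one, which is why the manual hoist is a change the programmer has to vouch for.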