thr3ads.net - search: "float

Displaying 2 results from an estimated 2 matches for "float_n".

LoopVectorize fails to vectorize more complex loops

2018 Jul 07

...the SVN repository from Jun 2018)? typedef short TYPE; TYPE data[1400][1200]; void kernel_covariance(int m, int n, TYPE mean[1200]) { int i, j, k; for (j = 0; j < m; j++) { mean[j] = 0.0; for (i = 0; i < n; i++) mean[j] += data[j][i]; //mean[j] /= float_n; } // This loop gets vectorized for (i = 0; i < n; i++) for (j = 0; j < m; j++) data[i][j] -= mean[j]; /* // This loop doesn't get vectorized either: for (i = 0; i < m; i++) for (j = i; j < m; j++) { cov[i][j] = 0.0; fo...

Vectorization width not correct using #pragma clang loop vectorize_width

2018 Sep 20

Hello, I'm trying to set the vector width using #pragma clang loop vectorize_width(32) but I'm getting width 8 for the following kernel: #define M 128 #define N 128 #define SQRT_FUN(x) sqrtf(x) int main(int argc, char** argv) { /* Variable declaration/allocation. */ double float_n = (double)N; double data[N*M]; double corr[M*M]; double mean[M]; double stddev[M]; uint32_t i,j,k; /*Initialize array(s). */ #pragma clang loop vectorize_width(1) //no vectorize for (i = 0; i < N*M; i++) { data[i] = (50.0)*i; } kernel_1: #pragm...