Displaying 6 results from an estimated 6 matches for "__svml_sin_4".
2018 Jul 04
2
[RFC][VECLIB] how should we legalize VECLIB calls?
...case where
> "-vectorizer-maximize-bandwidth" option is enabled and vectorizer is
> forced to generate the wider VF, and hence it may generate a call to
> __svml_sin_* which may not exist.
>
> Are you expecting the vectorizer to lower the calls i.e. __svml_sin_8 to
> two __svml_sin_4 calls ?
>
> Regards,
> Ashutosh
>
If an accurate cost model was in place (which there isn't), then an
"unsupported" vectorization factor should only be selected if it was
forced. However, in this case __svml_sin_8 is the same cost as
__svml_sin_4, so the loop vectorizer...
2018 Jun 29
2
[RFC][VECLIB] how should we legalize VECLIB calls?
...s.llvm.org
Subject: RE: [RFC][VECLIB] how should we legalize VECLIB calls?
Hi Saito,
At AMD we have our own vector library and faced similar problems. We followed the SVML path and had the vectorizer generate the respective vector calls. When the vectorizer generates those calls directly, i.e. __svml_sin_4 or __amdlibm_sin_4, later passes can identify the vector library call only by string matching. I'm not sure that is the proper way; maybe, instead of generating the library calls directly, it is better to generate some standard call (maybe intrinsics) and lower it later. A late IR pass can be introd...
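To make the "string matching" concern concrete, here is a minimal sketch of what a later pass would have to do to recover the library, scalar function, and vectorization factor from a call name. It assumes a "__<lib>_<func>_<vf>" naming pattern inferred only from the two names quoted above; real library symbols may be spelled differently. The struct veclib_call and parse_veclib_name are hypothetical and are not part of LLVM.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical record describing a recognized vector-library call. */
struct veclib_call {
    char lib[16];   /* library tag, e.g. "svml" or "amdlibm" */
    char func[16];  /* scalar function name, e.g. "sin" */
    unsigned vf;    /* vectorization factor (lane count) */
};

/* Returns 1 and fills *out if name looks like "__<lib>_<func>_<vf>",
   otherwise returns 0. This is a sketch: sscanf's %u is permissive
   (it skips whitespace and accepts a sign), so a real matcher would
   need to be stricter. */
static int parse_veclib_name(const char *name, struct veclib_call *out) {
    assert(name != NULL && out != NULL);
    int consumed = 0;
    int fields = sscanf(name, "__%15[a-z]_%15[a-z0-9]_%u%n",
                        out->lib, out->func, &out->vf, &consumed);
    if (fields != 3)
        return 0;
    /* Require the whole string to be consumed and a nonzero lane count. */
    return name[consumed] == '\0' && out->vf > 0;
}
```

With this pattern, "__svml_sin_4" and "__amdlibm_sin_4" both parse to func "sin" with vf 4, while a name that does not follow the assumed pattern is rejected. That fragility is the point being raised in the message.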
2018 Jun 29
2
[RFC][VECLIB] how should we legalize VECLIB calls?
Illustrative Example:
clang -fveclib=SVML -O3 svml.c -mavx
#include <math.h>
void foo(double *a, int N){
  int i;
  #pragma clang loop vectorize_width(8)
  for (i=0;i<N;i++){
    a[i] = sin(i);
  }
}
Currently, this results in a call to <8 x double> __svml_sin8(<8 x double>) after the vectorizer.
This is the 8-element SVML sin() called with an 8-element argument. On the surface,
2018 Jul 02
8
[RFC][VECLIB] how should we legalize VECLIB calls?
...
>
> At AMD we have our own version of vector library and faced
> similar problems, we followed the SVML path and from
> vectorizer generated the respective vector calls. When
> vectorizer generates the respective calls i.e __svml_sin_4
> or __amdlibm_sin_4, later one can perform only string
> matching to identify the vector lib call. I’m not sure
> it’s the proper way, may be instead of generating
> respective calls it’s better to generate some standard
>...
2018 Jul 02
2
[RFC][VECLIB] how should we legalize VECLIB calls?
...>
>> Hi Saito,
>>
>>
>>
>> At AMD we have our own version of vector library and faced similar
>> problems, we followed the SVML path and from vectorizer generated the
>> respective vector calls. When vectorizer generates the respective calls i.e
>> __svml_sin_4 or __amdlibm_sin_4, later one can perform only string matching
>> to identify the vector lib call. I’m not sure it’s the proper way, may be
>> instead of generating respective calls it’s better to generate some
>> standard call (may be intrinsics) and lower it later. A late IR pas...
2018 Jul 02
2
[RFC][VECLIB] how should we legalize VECLIB calls?
...egalize VECLIB calls?
>
>
>
> Hi Saito,
>
>
>
> At AMD we have our own version of vector library and faced similar
> problems, we followed the SVML path and from vectorizer generated the
> respective vector calls. When vectorizer generates the respective calls i.e
> __svml_sin_4 or __amdlibm_sin_4, later one can perform only string matching
> to identify the vector lib call. I’m not sure it’s the proper way, may be
> instead of generating respective calls it’s better to generate some
> standard call (may be intrinsics) and lower it later. A late IR pass can be
>...