Displaying 8 results from an estimated 8 matches for "__m256d".
2018 Jan 10
1
Suggestions on code generation for SIMD
Thanks Serge! This means that for every new intrinsic set, a systematic
change has to be made to LLVM to support it, right? The change would include
frontend changes, IR instruction set changes, as well as low-level code
generation changes?
On Tue, Jan 9, 2018 at 12:39 AM, serge guelton via llvm-dev <
llvm-dev at lists.llvm.org> wrote:
> > The vast majority of the
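To make the layering in this question concrete, here is a small, hedged sketch that uses an existing AVX intrinsic as a stand-in (the choice of _mm256_max_pd is illustrative and not taken from the thread): the user-facing intrinsic is defined in the vendor header on top of a Clang builtin, Clang lowers the builtin to an LLVM IR intrinsic (or to plain IR for simple operations), and the X86 backend finally selects a machine instruction for it.

// Sketch only: _mm256_max_pd stands in for "a new intrinsic".
// Header layer:  immintrin.h defines _mm256_max_pd on top of the Clang
//                builtin __builtin_ia32_maxpd256.
// IR layer:      Clang lowers that builtin to llvm.x86.avx.max.pd.256.
// Backend layer: the X86 target selects vmaxpd for that intrinsic.
#include <immintrin.h>

__m256d max4(__m256d a, __m256d b) {
  return _mm256_max_pd(a, b);
}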
2017 Sep 13
2
RFC phantom memory intrinsic
Hi Michael,
>Interesting approach but how do you handle more complex offsets, e.g., when the pointer is part of an aggregate? Only one offset does not seem enough to handle generic cases.
Yes, correct, this slightly changed example does not work.
#include <x86intrin.h>
__m256d vsht_d4_fold(const double* ptr, unsigned long long i) {
__m256d foo = (__m256d){ ptr[i], ptr[i+1], ptr[i+2], ptr[i+3] };
return __builtin_shufflevector( foo, foo, 3, 3, 2, 2 );
}
But with the aggregate case it is a new level of complexity; should we
care about it? There might be some logic that...
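For concreteness, here is a hypothetical variant of the example above in which the pointer is part of an aggregate; the struct and function names are made up for illustration and do not appear in the thread.

#include <x86intrin.h>

/* Hypothetical aggregate case: the pointer now lives inside a struct, so
   every load is fed through an extra member access. */
struct wrap { const double *ptr; };

__m256d vsht_d4_fold_agg(const struct wrap *w, unsigned long long i) {
  __m256d foo = (__m256d){ w->ptr[i], w->ptr[i+1], w->ptr[i+2], w->ptr[i+3] };
  return __builtin_shufflevector(foo, foo, 3, 3, 2, 2);
}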
2017 Sep 13
2
RFC phantom memory intrinsic
...resting approach but how do you handle more complex offsets, e.g., when the pointer is part of an aggregate? Only one offset does not seem enough to handle generic cases.
>> Yes, correct, this slightly changed example does not work.
>> #include <x86intrin.h>
>>
>> __m256d vsht_d4_fold(const double* ptr, unsigned long long i) {
>> __m256d foo = (__m256d){ ptr[i], ptr[i+1], ptr[i+2], ptr[i+3] };
>> return __builtin_shufflevector( foo, foo, 3, 3, 2, 2 );
>> }
>> But with the aggregate case it is a new level of complexity, should we
>>...
2017 Sep 26
0
RFC phantom memory intrinsic
...but how do you handle more complex offsets, e.g., when the pointer is part of an aggregate? Only one offset does not seem enough to handle generic cases.
>>> Yes, correct, this slightly changed example does not work.
>>> #include <x86intrin.h>
>>>
>>> __m256d vsht_d4_fold(const double* ptr, unsigned long long i) {
>>> __m256d foo = (__m256d){ ptr[i], ptr[i+1], ptr[i+2], ptr[i+3] };
>>> return __builtin_shufflevector( foo, foo, 3, 3, 2, 2 );
>>> }
>>> But with the aggregate case it is a new level of complexity,...
2017 Sep 26
2
RFC phantom memory intrinsic
...n the pointer is part of an aggregate? Only one offset does not seem
>>>>> enough to handle generic cases.
>>>>
>>>> Yes, correct, this slightly changed example does not work.
>>>> #include <x86intrin.h>
>>>>
>>>> __m256d vsht_d4_fold(const double* ptr, unsigned long long i) {
>>>> __m256d foo = (__m256d){ ptr[i], ptr[i+1], ptr[i+2], ptr[i+3] };
>>>> return __builtin_shufflevector( foo, foo, 3, 3, 2, 2 );
>>>> }
>>>> But with the aggregate case it is a new leve...
2017 Sep 26
0
RFC phantom memory intrinsic
...he pointer is part of an aggregate? Only one offset does not seem
>>>>>> enough to handle generic cases.
>>>>> Yes, correct, this slightly changed example does not work.
>>>>> #include <x86intrin.h>
>>>>>
>>>>> __m256d vsht_d4_fold(const double* ptr, unsigned long long i) {
>>>>> __m256d foo = (__m256d){ ptr[i], ptr[i+1], ptr[i+2], ptr[i+3] };
>>>>> return __builtin_shufflevector( foo, foo, 3, 3, 2, 2 );
>>>>> }
>>>>> But with the aggregate cas...
2017 Sep 12
3
RFC phantom memory intrinsic
Hi,
As a solution for PR21780, I plan to add new functionality to restore
memory operations that were once deleted; in this particular case it is
the load operations that were deleted by InstCombine. Please note that
once a load has been removed there is no way to restore it, and that
prevents us from vectorizing the shuffle operation. There are probably
more similar issues where this approach could
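The example discussed elsewhere in this thread makes the problem concrete; the comments below are a sketch of the failure mode, with the lane analysis inferred from the shuffle mask rather than quoted from the excerpt.

#include <x86intrin.h>

__m256d vsht_d4_fold(const double* ptr, unsigned long long i) {
  /* Four scalar loads feed a build vector... */
  __m256d foo = (__m256d){ ptr[i], ptr[i+1], ptr[i+2], ptr[i+3] };
  /* ...but the shuffle reads only elements 2 and 3, so InstCombine deletes
     the loads of ptr[i] and ptr[i+1]. Once those loads are gone they cannot
     be restored, and the sequence can no longer be recombined into a single
     wide load feeding the shuffle. */
  return __builtin_shufflevector(foo, foo, 3, 3, 2, 2);
}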
2018 Nov 30
2
[RFC] Re-implementing -fveclib with OpenMP
Hi all,
I am submitting the following RFC [1] to re-implement -fveclib via OpenMP constructs. The RFC was discussed during a round table at the last LLVM developer meeting, and presented during the BoF [2].
The proposal is published on Phabricator, to keep track of the comments, and it is now ready for review by a wider audience after being polished by Hal Finkel and Hideki