Displaying 5 results from an estimated 5 matches for "int_fetch_add".
2012 Feb 07
2
[LLVMdev] Vectorization: Next Steps
...i] = 0;
>
> for (unsigned i = 0; i < n; i++)
> count[src[i]]++;
>
> start[0] = 0;
> for (unsigned i = 1; i < buckets; i++)
> start[i] = start[i - 1] + count[i - 1];
>
> #pragma assert parallel
> for (unsigned i = 0; i < n; i++) {
> unsigned loc = int_fetch_add(start + src[i], 1);
Should this be:
unsigned loc = int_fetch_add(start[src[i]], 1);
> dst[loc] = src[i];
> }
>
>
> The 1st loop is trivially parallel. I think Polly would recognize
> this and do good things.
This case is trivial.
But keep in mind that unsigned loop ivs a...
2012 Feb 07
0
[LLVMdev] Vectorization: Next Steps
...i = 0; i < n; i++)
>> count[src[i]]++;
>>
>> start[0] = 0;
>> for (unsigned i = 1; i < buckets; i++)
>> start[i] = start[i - 1] + count[i - 1];
>>
>> #pragma assert parallel
>> for (unsigned i = 0; i < n; i++) {
>> unsigned loc = int_fetch_add(start + src[i], 1);
>> dst[loc] = src[i];
>> }
> Should this be:
>
> unsigned loc = int_fetch_add(start[src[i]], 1);
Our intrinsic wants a pointer, so either int_fetch_add(start + src[i], 1)
or int_fetch_add(&start[src[i]], 1) will work.
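
To illustrate the point about the pointer argument, here is a minimal sketch in C11. It assumes int_fetch_add behaves like an atomic fetch-and-add that returns the previous value; the names fetch_add_unsigned and same_address are hypothetical stand-ins, not the intrinsic discussed in the thread.

```c
#include <stdatomic.h>

/* Hypothetical stand-in for int_fetch_add (assumption: it atomically adds
   `inc` to *p and returns the value that was stored before the add). */
unsigned fetch_add_unsigned(atomic_uint *p, unsigned inc) {
    return atomic_fetch_add(p, inc);
}

/* `start + k` and `&start[k]` name the same element, so either form can be
   passed where a pointer is expected. `start[k]` alone would pass the value. */
int same_address(atomic_uint *start, unsigned k) {
    return (start + k) == &start[k];
}
```

Under that assumption, two successive calls on the same slot hand out consecutive positions, which is what the scatter loop relies on.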
>> The 1st loop is tri...
2012 Feb 06
0
[LLVMdev] Vectorization: Next Steps
...r (unsigned i = 0; i < buckets; i++)
    count[i] = 0;
for (unsigned i = 0; i < n; i++)
    count[src[i]]++;
start[0] = 0;
for (unsigned i = 1; i < buckets; i++)
    start[i] = start[i - 1] + count[i - 1];
#pragma assert parallel
for (unsigned i = 0; i < n; i++) {
    unsigned loc = int_fetch_add(start + src[i], 1);
    dst[loc] = src[i];
}
The 1st loop is trivially parallel. I think Polly would recognize this and
do good things.
The 2nd loop has a race condition that can be handled by using an atomic
increment provided by the architecture, if the compiler knows about such
things. I don...
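
As a rough sketch of the atomic-increment idea for the counting loop, the following uses C11 <stdatomic.h> in place of an architecture-specific increment. The function name histogram_atomic is hypothetical, and the code assumes every key in src is less than buckets. The loops are written sequentially here; the atomic add is what would keep the counts correct if the iterations of the second loop were run concurrently.

```c
#include <stdatomic.h>

/* Hypothetical helper: count occurrences of each key in src.
   Assumption: src[i] < buckets for all i, and count has `buckets` entries. */
void histogram_atomic(const unsigned *src, unsigned n,
                      atomic_uint *count, unsigned buckets) {
    /* 1st loop: independent stores, trivially parallel. */
    for (unsigned i = 0; i < buckets; i++)
        atomic_init(&count[i], 0u);

    /* 2nd loop: two iterations may hit the same bucket, so a plain
       count[src[i]]++ would race; an atomic add avoids the lost update. */
    for (unsigned i = 0; i < n; i++)
        atomic_fetch_add(&count[src[i]], 1u);
}
```

The trade-off is contention: if many keys fall into the same bucket, the atomic adds on that bucket serialize.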
2012 Feb 06
7
[LLVMdev] Vectorization: Next Steps
On Sat, Feb 4, 2012 at 2:27 PM, Hal Finkel <hfinkel at anl.gov> wrote:
> On Fri, 2012-02-03 at 20:59 -0800, Preston Briggs wrote:
>> so are building a dependence graph for a complete function. Of
>> course, such a thing is useful for vectorization and all sorts of
>> other dependence-based loop transforms.
>>
>> I'm looking at the problem in two parts:
2012 Feb 08
1
[LLVMdev] Vectorization: Next Steps
...tools people commonly use annotations
to provide information about live-in, live-out values, the sizes of
arrays, ... It makes perfect sense to state the absence of dependences.
>>> #pragma assert parallel
>>> for (unsigned i = 0; i < n; i++) {
>>> unsigned loc = int_fetch_add(&start[src[i]], 1);
>>> dst[loc] = src[i];
>>> }
>>
>> As the int_fetch_add is side effect free, it is fully
>> polyhedral. It can be implemented with the relevant LLVM
>> atomicrmw instruction [1]. Polly does not yet allow atomicrmw instructions,
>...