thr3ads.net - llvm dev - [LLVMdev] Vectorizing global struct pointers [Feb 2013]

If this information is useful, please help other people find it:
Share via:

Renato Golin

2013-Feb-05 15:51 UTC

[LLVMdev] Vectorizing global struct pointers

Hi all,

One of the reasons the Livermore Loops couldn't be vectorized is that it
was using global structures to hold the arrays. Today, I'm investigating
why is that so and how to fix it.

My investigation brought me
to LoopVectorizationLegality::canVectorizeMemory():

      if (WriteObjects.count(*it)) {
        DEBUG(dbgs() << "LV: Found a possible read/write
reorder:"
              << **it <<"\n");
        return false;
      }

In the first pass, it registers all underlying objects for writes, than it
does it again for reads, if the value was already there, it's a conflict.

However, the read is from Foo.bl / Foo.cl and the write to Foo.al, so why
is GetUnderlyingObjects() returning the same objects/pointers?

A quick look at it revealed me the problem:

llvm::GetUnderlyingObject(Value *V, const DataLayout *TD, unsigned
MaxLookup) yields:

-> GEPOperator *GEP = dyn_cast<GEPOperator>(V)
-> V = GEP->getPointerOperand();
-> GlobalAlias *GA = dyn_cast<GlobalAlias>(V)
-> V = GA->getAliasee();
return V;

In this case, V is a reference to the structure, not the element. It seems
to me that assigning the pointer operand from GEP is too simplistic.
Either GetUnderlyingObject() should store the indices to return the correct
object, or GetUnderlyingObjects() should create a special case for it (as
it does with selects and phi nodes).

Does that make sense?

cheers,
--renato

PS:

A simplified version of the IR:

%struct.anon = type { [256 x i64], [256 x i64], [256 x i64] }

@Foo = common global %struct.anon zeroinitializer, align 8

...

  %arrayidx = getelementptr inbounds %struct.anon* @Foo, i32 0, i32 1, i32
%idxprom
  %0 = load i64* %arrayidx, align 8
  %arrayidx2 = getelementptr inbounds %struct.anon* @Foo, i32 0, i32 2, i32
%idxprom
  %1 = load i64* %arrayidx2, align 8
  %mul = mul nsw i64 %1, %0
  %arrayidx4 = getelementptr inbounds %struct.anon* @Foo, i32 0, i32 0, i32
%idxprom
  store i64 %mul, i64* %arrayidx4, align 8
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20130205/373cd8d0/attachment.html>

Arnold Schwaighofer

2013-Feb-05 16:38 UTC

head link

[LLVMdev] Vectorizing global struct pointers

If I understand you correctly, conceptually you want two different objects to be
returned for Foo.bl and Foo.al?

Here is my take on this (take this with a grain of salt, Dan is the expert on
this):

http://llvm.org/docs/GetElementPtr.html#what-happens-if-an-array-index-is-out-of-bounds

LLVM's semantic allows for arrays to be accessed out of bounds - this allows
you to walk from the first array  { [256 x i64] #<<, [256 x i64], [256 x
i64] }  to the second  { [256 x i64], [256 x i64]#<<, [256 x i64] }. I
believe, one reason for having it defined this way is to be able to handle
C's zero (variable) length arrays. LLVM's concept of memory is untyped
(http://llvm.org/docs/LangRef.html#pointeraliasing). You can get types through
TBAA.
We would need to strengthen TBAA to handle this for C.



On Feb 5, 2013, at 9:51 AM, Renato Golin <renato.golin at linaro.org>
wrote:
> Hi all,
> 
> One of the reasons the Livermore Loops couldn't be vectorized is that
it was using global structures to hold the arrays. Today, I'm investigating
why is that so and how to fix it.
> 
> My investigation brought me to
LoopVectorizationLegality::canVectorizeMemory():
> 
>      if (WriteObjects.count(*it)) {
>        DEBUG(dbgs() << "LV: Found a possible read/write
reorder:"
>              << **it <<"\n");
>        return false;
>      }
> 
> In the first pass, it registers all underlying objects for writes, than it
does it again for reads, if the value was already there, it's a conflict.
> 
> However, the read is from Foo.bl / Foo.cl and the write to Foo.al, so why
is GetUnderlyingObjects() returning the same objects/pointers?
> 
> A quick look at it revealed me the problem:
> 
> llvm::GetUnderlyingObject(Value *V, const DataLayout *TD, unsigned
MaxLookup) yields:
> 
> -> GEPOperator *GEP = dyn_cast<GEPOperator>(V)
> -> V = GEP->getPointerOperand();
> -> GlobalAlias *GA = dyn_cast<GlobalAlias>(V)
> -> V = GA->getAliasee();
> return V;
> 
> In this case, V is a reference to the structure, not the element. It seems
to me that assigning the pointer operand from GEP is too simplistic. Either
GetUnderlyingObject() should store the indices to return the correct object, or
GetUnderlyingObjects() should create a special case for it (as it does with
selects and phi nodes).
> 
> Does that make sense?
> 
> cheers,
> --renato
> 
> PS:
> 
> A simplified version of the IR:
> 
> %struct.anon = type { [256 x i64], [256 x i64], [256 x i64] }
> 
> @Foo = common global %struct.anon zeroinitializer, align 8
> 
> ...
> 
>  %arrayidx = getelementptr inbounds %struct.anon* @Foo, i32 0, i32 1, i32
%idxprom
>  %0 = load i64* %arrayidx, align 8
>  %arrayidx2 = getelementptr inbounds %struct.anon* @Foo, i32 0, i32 2, i32
%idxprom
>  %1 = load i64* %arrayidx2, align 8
>  %mul = mul nsw i64 %1, %0
>  %arrayidx4 = getelementptr inbounds %struct.anon* @Foo, i32 0, i32 0, i32
%idxprom
>  store i64 %mul, i64* %arrayidx4, align 8
> 
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

Renato Golin

2013-Feb-05 17:12 UTC

head link

[LLVMdev] Vectorizing global struct pointers

On 5 February 2013 16:38, Arnold Schwaighofer <aschwaighofer at
apple.com>wrote:
> If I understand you correctly, conceptually you want two different objects
> to be returned for Foo.bl and Foo.al?
>
Not necessarily. The vectorizer is implemented expecting that the objects
will be different, but that's a limitation on the vectorizer itself.

However, changing the vectorizer to recognize GEP offsets is probably not
the best course of action, but by the time we get the object back, we have
no information if it was a GEP, and if so, what was its offset.

Maybe I could work back (via uses) and identify the GEP and store the
offset with the object locally, so that the list would not think different
members are the same object. But I wanted to know first if the underlying
object concept is correct. From the docs you sent me, it seems it is.

TBAA seems a bit too much for this case, though.

cheers,
--renato
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20130205/b46eb697/attachment.html>

Hal Finkel

2013-Feb-05 17:22 UTC

head link

[LLVMdev] Vectorizing global struct pointers

----- Arnold Schwaighofer <aschwaighofer at apple.com>
wrote:> If I understand you correctly, conceptually you want two different objects
to be returned for Foo.bl and Foo.al?
> 
> Here is my take on this (take this with a grain of salt, Dan is the expert
on this):
> 
>
http://llvm.org/docs/GetElementPtr.html#what-happens-if-an-array-index-is-out-of-bounds
> 
> LLVM's semantic allows for arrays to be accessed out of bounds - this
allows you to walk from the first array { [256 x i64] #<<, [256 x i64],
[256 x i64] } to the second { [256 x i64], [256 x i64]#<<, [256 x i64] }.
I believe, one reason for having it defined this way is to be able to handle
C's zero (variable) length arrays. LLVM's concept of memory is untyped
(http://llvm.org/docs/LangRef.html#pointeraliasing). You can get types through
TBAA.
> We would need to strengthen TBAA to handle this for C.
> 
I think that the potential for overlap is indeed there, but don't we already
insert runtime overlap checks as necessary? This seems like it would just be
another such case.

 -Hal
> 
> 
> On Feb 5, 2013, at 9:51 AM, Renato Golin <renato.golin at linaro.org>
wrote:
> 
> > Hi all,
> > 
> > One of the reasons the Livermore Loops couldn't be vectorized is
that it was using global structures to hold the arrays. Today, I'm
investigating why is that so and how to fix it.
> > 
> > My investigation brought me to
LoopVectorizationLegality::canVectorizeMemory():
> > 
> > if (WriteObjects.count(*it)) {
> > DEBUG(dbgs() << "LV: Found a possible read/write
reorder:"
> > << **it <<"n");
> > return false;
> > }
> > 
> > In the first pass, it registers all underlying objects for writes,
than it does it again for reads, if the value was already there, it's a
conflict.
> > 
> > However, the read is from Foo.bl / Foo.cl and the write to Foo.al, so
why is GetUnderlyingObjects() returning the same objects/pointers?
> > 
> > A quick look at it revealed me the problem:
> > 
> > llvm::GetUnderlyingObject(Value *V, const DataLayout *TD, unsigned
MaxLookup) yields:
> > 
> > -> GEPOperator *GEP = dyn_cast<GEPOperator>(V)
> > -> V = GEP->getPointerOperand();
> > -> GlobalAlias *GA = dyn_cast<GlobalAlias>(V)
> > -> V = GA->getAliasee();
> > return V;
> > 
> > In this case, V is a reference to the structure, not the element. It
seems to me that assigning the pointer operand from GEP is too simplistic.
Either GetUnderlyingObject() should store the indices to return the correct
object, or GetUnderlyingObjects() should create a special case for it (as it
does with selects and phi nodes).
> > 
> > Does that make sense?
> > 
> > cheers,
> > --renato
> > 
> > PS:
> > 
> > A simplified version of the IR:
> > 
> > %struct.anon = type { [256 x i64], [256 x i64], [256 x i64] }
> > 
> > @Foo = common global %struct.anon zeroinitializer, align 8
> > 
> > ...
> > 
> > %arrayidx = getelementptr inbounds %struct.anon* @Foo, i32 0, i32 1,
i32 %idxprom
> > %0 = load i64* %arrayidx, align 8
> > %arrayidx2 = getelementptr inbounds %struct.anon* @Foo, i32 0, i32 2,
i32 %idxprom
> > %1 = load i64* %arrayidx2, align 8
> > %mul = mul nsw i64 %1, %0
> > %arrayidx4 = getelementptr inbounds %struct.anon* @Foo, i32 0, i32 0,
i32 %idxprom
> > store i64 %mul, i64* %arrayidx4, align 8
> > 
> > _______________________________________________
> > LLVM Developers mailing list
> > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
> > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
> 
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
-- 
Hal Finkel
Postdoctoral Appointee
Leadership Computing Facility
Argonne National Laboratory

Apparently Analagous Threads

Search for more possibly parallel threads

llvm dev - Feb 2013 - [LLVMdev] Vectorizing global struct pointers

[LLVMdev] Vectorizing global struct pointers

[LLVMdev] Vectorizing global struct pointers

[LLVMdev] Vectorizing global struct pointers

[LLVMdev] Vectorizing global struct pointers

Apparently Analagous Threads