On Jan 26, 2012, at 12:54 PM, Devang Patel wrote:

> On Jan 26, 2012, at 11:15 AM, Dan Gohman wrote:
>
>> or what optimizers must do to preserve it.
>
> The number one reason behind metadata is to have a mechanism to track
> values while being completely transparent to the optimizer. If you want
> a guarantee from the optimizer to preserve certain semantics about the
> way metadata is used (e.g., to describe a range of values), then
> metadata is not an appropriate mechanism.

If the optimizer makes no guarantees whatsoever, then metadata is not
appropriate for anything.

For example, the metadata used by TBAA today is not safe. Imagine an
optimization pass which takes two allocas that are used in
non-overlapping regions and rewrites all uses of one to use the other,
to reduce the stack size. By LLVM IR rules alone, this would seem to be
a valid semantics-preserving transformation. But if the loads and stores
for the two allocas have different TBAA type tags, the tags will say
NoAlias for memory references that do in fact alias.

The only reason why TBAA doesn't have a problem with this today is that
LLVM doesn't happen to implement optimizations which break it yet. But
there are no guarantees.

Dan
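To make the hazard concrete, here is a minimal hypothetical IR sketch
(the function, value, and tag names are invented, and it uses the
metadata syntax of the day):

  define void @f() {
  entry:
    %a = alloca i32
    %b = alloca i32
    ; All uses of %a precede all uses of %b, so a stack-packing pass
    ; could legally rewrite every use of %b to use %a instead.
    store i32 1, i32* %a, !tbaa !1       ; tagged "int"
    store i32 2, i32* %b, !tbaa !2       ; tagged "other int"
    ret void
  }

  !0 = metadata !{ metadata !"tbaa root" }
  !1 = metadata !{ metadata !"int", metadata !0 }
  !2 = metadata !{ metadata !"other int", metadata !0 }

After the rewrite, both stores hit the same stack slot, yet their tags
still tell alias analysis that the two accesses can never alias.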
On Thu, 2012-01-26 at 14:10 -0800, Dan Gohman wrote:

> On Jan 26, 2012, at 12:54 PM, Devang Patel wrote:
>>
>> On Jan 26, 2012, at 11:15 AM, Dan Gohman wrote:
>>
>>> or what optimizers must do to preserve it.
>>
>> The number one reason behind metadata is to have a mechanism to track
>> values while being completely transparent to the optimizer. If you want
>> a guarantee from the optimizer to preserve certain semantics about the
>> way metadata is used (e.g., to describe a range of values), then
>> metadata is not an appropriate mechanism.
>
> If the optimizer makes no guarantees whatsoever, then metadata is
> not appropriate for anything.
>
> For example, the metadata used by TBAA today is not safe. Imagine an
> optimization pass which takes two allocas that are used in
> non-overlapping regions and rewrites all uses of one to use the other,
> to reduce the stack size. By LLVM IR rules alone, this would seem to
> be a valid semantics-preserving transformation. But if the loads
> and stores for the two allocas have different TBAA type tags, the
> tags will say NoAlias for memory references that do in fact alias.
>
> The only reason why TBAA doesn't have a problem with this today is
> that LLVM doesn't happen to implement optimizations which break it
> yet. But there are no guarantees.

On that thought, is there any way that my autovectorization pass could
invalidate the TBAA metadata (in a harmful way) when it fuses two
memory-adjacent loads or stores? Currently, it performs this fusion by
first cloning the first instruction (which I think will pick up its
metadata), then changing the instruction's type and operands as
necessary. This fusion will only take place if the two instructions
have the same LLVM type, but currently there is no check of the
associated metadata.

 -Hal

-- 
Hal Finkel
Postdoctoral Appointee
Leadership Computing Facility
Argonne National Laboratory
On Jan 26, 2012, at 3:02 PM, Hal Finkel wrote:

> On Thu, 2012-01-26 at 14:10 -0800, Dan Gohman wrote:
>>
>> If the optimizer makes no guarantees whatsoever, then metadata is
>> not appropriate for anything.
>>
>> For example, the metadata used by TBAA today is not safe. Imagine an
>> optimization pass which takes two allocas that are used in
>> non-overlapping regions and rewrites all uses of one to use the other,
>> to reduce the stack size. By LLVM IR rules alone, this would seem to
>> be a valid semantics-preserving transformation. But if the loads
>> and stores for the two allocas have different TBAA type tags, the
>> tags will say NoAlias for memory references that do in fact alias.
>>
>> The only reason why TBAA doesn't have a problem with this today is
>> that LLVM doesn't happen to implement optimizations which break it
>> yet. But there are no guarantees.
>
> On that thought, is there any way that my autovectorization pass could
> invalidate the TBAA metadata (in a harmful way) when it fuses two
> memory-adjacent loads or stores? Currently, it performs this fusion by
> first cloning the first instruction (which I think will pick up its
> metadata), then changing the instruction's type and operands as
> necessary. This fusion will only take place if the two instructions
> have the same LLVM type, but currently there is no check of the
> associated metadata.

Yes, it sounds like this could indeed cause TBAA tags to become
invalid, because it extends the range of memory that a given TBAA tag
is associated with.

Dan
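For illustration, the fusion Hal describes might produce something like
the following hypothetical sketch (invented names; the two scalar loads
are assumed adjacent in memory):

  ; Before: two adjacent scalar loads carrying different tags.
  %x = load float* %p, !tbaa !1          ; tagged "float"
  %y = load float* %q, !tbaa !2          ; tagged "other"

  ; After fusion, cloning the first instruction and its metadata:
  %pv = bitcast float* %p to <2 x float>*
  %v = load <2 x float>* %pv, !tbaa !1   ; !1 now also covers %q's memory

The single surviving tag now describes a wider region of memory than it
did before, which is exactly the invalidation Dan points out.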
On Jan 26, 2012, at 2:10 PM, Dan Gohman wrote:

> On Jan 26, 2012, at 12:54 PM, Devang Patel wrote:
>>
>> On Jan 26, 2012, at 11:15 AM, Dan Gohman wrote:
>>
>>> or what optimizers must do to preserve it.
>>
>> The number one reason behind metadata is to have a mechanism to track
>> values while being completely transparent to the optimizer. If you want
>> a guarantee from the optimizer to preserve certain semantics about the
>> way metadata is used (e.g., to describe a range of values), then
>> metadata is not an appropriate mechanism.
>
> If the optimizer makes no guarantees whatsoever, then metadata is
> not appropriate for anything.

Are you sure? :)

> For example, the metadata used by TBAA today is not safe. Imagine an
> optimization pass which takes two allocas that are used in
> non-overlapping regions and rewrites all uses of one to use the other,
> to reduce the stack size. By LLVM IR rules alone, this would seem to
> be a valid semantics-preserving transformation. But if the loads
> and stores for the two allocas have different TBAA type tags, the
> tags will say NoAlias for memory references that do in fact alias.

Then this is a serious bug in the way TBAA is using MDNodes, not in the
design of MDNodes. My understanding was that if any other pass changed
the values tracked by an MDNode for TBAA, then TBAA would make a
conservative decision. However, you're saying that it may lead to
miscompiled code, which is unfortunate.

If you need a data structure to communicate some information, and you
need a guarantee from each transformation pass in between to preserve
the correctness of that information, then you need some other explicit
mechanism (maybe the way debug info used to be encoded in the old
days?).

- Devang
On Jan 27, 2012, at 11:20 AM, Devang Patel wrote:

> On Jan 26, 2012, at 2:10 PM, Dan Gohman wrote:
>>
>> On Jan 26, 2012, at 12:54 PM, Devang Patel wrote:
>>>
>>> On Jan 26, 2012, at 11:15 AM, Dan Gohman wrote:
>>>
>>>> or what optimizers must do to preserve it.
>>>
>>> The number one reason behind metadata is to have a mechanism to track
>>> values while being completely transparent to the optimizer. If you
>>> want a guarantee from the optimizer to preserve certain semantics
>>> about the way metadata is used (e.g., to describe a range of values),
>>> then metadata is not an appropriate mechanism.
>>
>> If the optimizer makes no guarantees whatsoever, then metadata is
>> not appropriate for anything.
>
> Are you sure? :)

Show me an example of a supposedly valid use of metadata, and I'll show
you a valid optimization which breaks that metadata.

>> For example, the metadata used by TBAA today is not safe. Imagine an
>> optimization pass which takes two allocas that are used in
>> non-overlapping regions and rewrites all uses of one to use the other,
>> to reduce the stack size. By LLVM IR rules alone, this would seem to
>> be a valid semantics-preserving transformation. But if the loads
>> and stores for the two allocas have different TBAA type tags, the
>> tags will say NoAlias for memory references that do in fact alias.
>
> Then this is a serious bug in the way TBAA is using MDNodes, not in
> the design of MDNodes. My understanding was that if any other pass
> changed the values tracked by an MDNode for TBAA, then TBAA would make
> a conservative decision. However, you're saying that it may lead to
> miscompiled code, which is unfortunate.

It's not possible to do metadata-based TBAA and avoid this problem.

> If you need a data structure to communicate some information, and you
> need a guarantee from each transformation pass in between to preserve
> the correctness of that information, then you need some other explicit
> mechanism (maybe the way debug info used to be encoded in the old
> days?).

Any other explicit annotation mechanism would have the same problem as
metadata. If the optimizer doesn't know about it, the optimizer is
liable to make changes that break it.

Dan