On Jan 26, 2012, at 12:54 PM, Devang Patel wrote:

> On Jan 26, 2012, at 11:15 AM, Dan Gohman wrote:
>
>> or what optimizers must do to preserve it.
>
> The number one reason behind metadata is to have a mechanism to track
> values while being completely transparent to the optimizer. If you want
> a guarantee from the optimizer to preserve certain semantics about the
> way metadata is used (e.g., to describe a range of values), then
> metadata is not an appropriate mechanism.

If the optimizer makes no guarantees whatsoever, then metadata is not
appropriate for anything.

For example, the metadata used by TBAA today is not safe. Imagine an
optimization pass which takes two allocas that are used in
non-overlapping regions and rewrites all uses of one to use the other,
to reduce the stack size. By LLVM IR rules alone, this would seem to be
a valid semantics-preserving transformation. But if the loads and stores
for the two allocas have different TBAA type tags, the tags will say
NoAlias for memory references that do in fact alias.

The only reason why TBAA doesn't have a problem with this today is that
LLVM doesn't happen to implement optimizations which break it yet. But
there are no guarantees.

Dan
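To make the hazard concrete, here is a minimal hypothetical IR sketch
(the function, value, and tag names are invented, and it uses the
metadata syntax of the day):

  define void @f() {
  entry:
    %a = alloca i32
    %b = alloca i32
    ; All uses of %a precede all uses of %b, so a stack-packing pass
    ; could legally rewrite every use of %b to use %a instead.
    store i32 1, i32* %a, !tbaa !1       ; tagged "int"
    store i32 2, i32* %b, !tbaa !2       ; tagged "other int"
    ret void
  }

  !0 = metadata !{ metadata !"tbaa root" }
  !1 = metadata !{ metadata !"int", metadata !0 }
  !2 = metadata !{ metadata !"other int", metadata !0 }

After the rewrite, both stores hit the same stack slot, yet their tags
still tell alias analysis that the two accesses can never alias.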
On Thu, 2012-01-26 at 14:10 -0800, Dan Gohman wrote:

> On Jan 26, 2012, at 12:54 PM, Devang Patel wrote:
>>
>> On Jan 26, 2012, at 11:15 AM, Dan Gohman wrote:
>>
>>> or what optimizers must do to preserve it.
>>
>> The number one reason behind metadata is to have a mechanism to track
>> values while being completely transparent to the optimizer. If you want
>> a guarantee from the optimizer to preserve certain semantics about the
>> way metadata is used (e.g., to describe a range of values), then
>> metadata is not an appropriate mechanism.
>
> If the optimizer makes no guarantees whatsoever, then metadata is
> not appropriate for anything.
>
> For example, the metadata used by TBAA today is not safe. Imagine an
> optimization pass which takes two allocas that are used in
> non-overlapping regions and rewrites all uses of one to use the other,
> to reduce the stack size. By LLVM IR rules alone, this would seem to
> be a valid semantics-preserving transformation. But if the loads
> and stores for the two allocas have different TBAA type tags, the
> tags will say NoAlias for memory references that do in fact alias.
>
> The only reason why TBAA doesn't have a problem with this today is
> that LLVM doesn't happen to implement optimizations which break it
> yet. But there are no guarantees.

On that thought, is there any way that my autovectorization pass could
invalidate the TBAA metadata (in a harmful way) when it fuses two
memory-adjacent loads or stores? Currently, it performs this fusion by
first cloning the first instruction (which I think will pick up its
metadata), then changing the instruction's type and operands as
necessary. This fusion will only take place if the two instructions
have the same LLVM type, but currently there is no check of the
associated metadata.

 -Hal

-- 
Hal Finkel
Postdoctoral Appointee
Leadership Computing Facility
Argonne National Laboratory
On Jan 26, 2012, at 3:02 PM, Hal Finkel wrote:

> On Thu, 2012-01-26 at 14:10 -0800, Dan Gohman wrote:
>>
>> If the optimizer makes no guarantees whatsoever, then metadata is
>> not appropriate for anything.
>>
>> For example, the metadata used by TBAA today is not safe. Imagine an
>> optimization pass which takes two allocas that are used in
>> non-overlapping regions and rewrites all uses of one to use the other,
>> to reduce the stack size. By LLVM IR rules alone, this would seem to
>> be a valid semantics-preserving transformation. But if the loads
>> and stores for the two allocas have different TBAA type tags, the
>> tags will say NoAlias for memory references that do in fact alias.
>>
>> The only reason why TBAA doesn't have a problem with this today is
>> that LLVM doesn't happen to implement optimizations which break it
>> yet. But there are no guarantees.
>
> On that thought, is there any way that my autovectorization pass could
> invalidate the TBAA metadata (in a harmful way) when it fuses two
> memory-adjacent loads or stores? Currently, it performs this fusion by
> first cloning the first instruction (which I think will pick up its
> metadata), then changing the instruction's type and operands as
> necessary. This fusion will only take place if the two instructions
> have the same LLVM type, but currently there is no check of the
> associated metadata.

Yes, it sounds like this could indeed cause TBAA tags to become
invalid, because it extends the range of memory that a given TBAA tag
is associated with.

Dan
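For illustration, the fusion Hal describes might produce something like
the following hypothetical sketch (invented names; the two scalar loads
are assumed adjacent in memory):

  ; Before: two adjacent scalar loads carrying different tags.
  %x = load float* %p, !tbaa !1          ; tagged "float"
  %y = load float* %q, !tbaa !2          ; tagged "other"

  ; After fusion, cloning the first instruction and its metadata:
  %pv = bitcast float* %p to <2 x float>*
  %v = load <2 x float>* %pv, !tbaa !1   ; !1 now also covers %q's memory

The single surviving tag now describes a wider region of memory than it
did before, which is exactly the invalidation Dan points out.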
On Jan 26, 2012, at 2:10 PM, Dan Gohman wrote:

> On Jan 26, 2012, at 12:54 PM, Devang Patel wrote:
>>
>> On Jan 26, 2012, at 11:15 AM, Dan Gohman wrote:
>>
>>> or what optimizers must do to preserve it.
>>
>> The number one reason behind metadata is to have a mechanism to track
>> values while being completely transparent to the optimizer. If you want
>> a guarantee from the optimizer to preserve certain semantics about the
>> way metadata is used (e.g., to describe a range of values), then
>> metadata is not an appropriate mechanism.
>
> If the optimizer makes no guarantees whatsoever, then metadata is
> not appropriate for anything.

Are you sure? :)

> For example, the metadata used by TBAA today is not safe. Imagine an
> optimization pass which takes two allocas that are used in
> non-overlapping regions and rewrites all uses of one to use the other,
> to reduce the stack size. By LLVM IR rules alone, this would seem to
> be a valid semantics-preserving transformation. But if the loads
> and stores for the two allocas have different TBAA type tags, the
> tags will say NoAlias for memory references that do in fact alias.

Then this is a serious bug in the way TBAA is using MDNodes, not in the
design of MDNodes. My understanding was that if any other pass changed
the values tracked by an MDNode for TBAA, then TBAA would make a
conservative decision. However, you're saying that it may lead to
miscompiled code, which is unfortunate.

If you need a data structure to communicate some information, and you
need a guarantee from each transformation pass in between to preserve
the correctness of that information, then you need some other explicit
mechanism (maybe the way debug info used to be encoded in the old
days?).

- Devang
On Jan 27, 2012, at 11:20 AM, Devang Patel wrote:

> On Jan 26, 2012, at 2:10 PM, Dan Gohman wrote:
>>
>> On Jan 26, 2012, at 12:54 PM, Devang Patel wrote:
>>>
>>> On Jan 26, 2012, at 11:15 AM, Dan Gohman wrote:
>>>
>>>> or what optimizers must do to preserve it.
>>>
>>> The number one reason behind metadata is to have a mechanism to track
>>> values while being completely transparent to the optimizer. If you
>>> want a guarantee from the optimizer to preserve certain semantics
>>> about the way metadata is used (e.g., to describe a range of values),
>>> then metadata is not an appropriate mechanism.
>>
>> If the optimizer makes no guarantees whatsoever, then metadata is
>> not appropriate for anything.
>
> Are you sure? :)

Show me an example of a supposedly valid use of metadata, and I'll show
you a valid optimization which breaks that metadata.

>> For example, the metadata used by TBAA today is not safe. Imagine an
>> optimization pass which takes two allocas that are used in
>> non-overlapping regions and rewrites all uses of one to use the other,
>> to reduce the stack size. By LLVM IR rules alone, this would seem to
>> be a valid semantics-preserving transformation. But if the loads
>> and stores for the two allocas have different TBAA type tags, the
>> tags will say NoAlias for memory references that do in fact alias.
>
> Then this is a serious bug in the way TBAA is using MDNodes, not in
> the design of MDNodes. My understanding was that if any other pass
> changed the values tracked by an MDNode for TBAA, then TBAA would make
> a conservative decision. However, you're saying that it may lead to
> miscompiled code, which is unfortunate.

It's not possible to do metadata-based TBAA and avoid this problem.

> If you need a data structure to communicate some information, and you
> need a guarantee from each transformation pass in between to preserve
> the correctness of that information, then you need some other explicit
> mechanism (maybe the way debug info used to be encoded in the old
> days?).

Any other explicit annotation mechanism would have the same problem as
metadata. If the optimizer doesn't know about it, the optimizer is
liable to make changes that break it.

Dan