On Jan 27, 2012, at 11:20 AM, Devang Patel wrote:

> On Jan 26, 2012, at 2:10 PM, Dan Gohman wrote:
>
>> On Jan 26, 2012, at 12:54 PM, Devang Patel wrote:
>>>
>>> On Jan 26, 2012, at 11:15 AM, Dan Gohman wrote:
>>>
>>>> or what optimizers must do to preserve it.
>>>
>>> The number one reason behind metadata is to have a mechanism to track values while being completely transparent to the optimizer. If you want a guarantee from the optimizer to preserve certain semantics about the way metadata is used (e.g. to describe a range of values), then metadata is not the appropriate mechanism.
>>
>> If the optimizer makes no guarantees whatsoever, then metadata is
>> not appropriate for anything.
>
> Are you sure ? :)

Show me an example of a supposedly valid use of metadata, and I'll show
you a valid optimization which breaks that metadata.

>> For example, the metadata used by TBAA today is not safe. Imagine an
>> optimization pass which takes two allocas that are used in
>> non-overlapping regions and rewrites all uses of one to use the other,
>> to reduce the stack size. By LLVM IR rules alone, this would seem to
>> be a valid semantics-preserving transformation. But if the loads
>> and stores for the two allocas have different TBAA type tags, the
>> tags will say NoAlias for memory references that do in fact alias.
>
> Then, this is a serious bug in the way TBAA is using MDNodes, not in the design of MDNodes. My understanding was that if any other pass changes values tracked by an MDNode for TBAA, then TBAA would make a conservative decision. However, you're saying that it may lead to miscompiled code, which is unfortunate.

It's not possible to do metadata-based TBAA and avoid this problem.

> If you need a data structure to communicate some information, and you need a guarantee from each transformation pass in between to preserve the correctness of that information, then you need some other explicit mechanism (maybe the way debug info used to be encoded in the old days?).

Any other explicit annotation mechanism would have the same problem as
metadata. If the optimizer doesn't know about it, the optimizer is
liable to make changes that break it.

Dan
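For concreteness, the alloca scenario Dan describes might look like the following hand-written sketch in roughly 3.0-era IR syntax. The stack-slot-sharing rewrite, TBAA tag names, and metadata node numbers here are hypothetical, not the output of any actual pass:

```llvm
; Two allocas whose live ranges never overlap, accessed under
; different TBAA tags.
%i = alloca i32
%f = alloca float
store i32 1, i32* %i, !tbaa !1        ; tagged "int"
%fv = load float* %f, !tbaa !2        ; tagged "float"

; A pass that reasons only from IR semantics may reuse %i's slot
; for %f to shrink the stack frame:
%p = bitcast i32* %i to float*
store i32 1, i32* %i, !tbaa !1
%fv2 = load float* %p, !tbaa !2
; The two accesses now touch the same memory, but their tags still
; claim NoAlias, so a later pass may freely reorder them.

!0 = metadata !{metadata !"tbaa root"}
!1 = metadata !{metadata !"int", metadata !0}
!2 = metadata !{metadata !"float", metadata !0}
```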
[ removing cfe-dev from the cc list ]

On Jan 27, 2012, at 1:31 PM, Dan Gohman wrote:

> On Jan 27, 2012, at 11:20 AM, Devang Patel wrote:
>
>> On Jan 26, 2012, at 2:10 PM, Dan Gohman wrote:
>>
>>> On Jan 26, 2012, at 12:54 PM, Devang Patel wrote:
>>>>
>>>> On Jan 26, 2012, at 11:15 AM, Dan Gohman wrote:
>>>>
>>>>> or what optimizers must do to preserve it.
>>>>
>>>> The number one reason behind metadata is to have a mechanism to track values while being completely transparent to the optimizer. If you want a guarantee from the optimizer to preserve certain semantics about the way metadata is used (e.g. to describe a range of values), then metadata is not the appropriate mechanism.
>>>
>>> If the optimizer makes no guarantees whatsoever, then metadata is
>>> not appropriate for anything.
>>
>> Are you sure ? :)
>
> Show me an example of a supposedly valid use of metadata, and I'll show
> you a valid optimization which breaks that metadata.

Your argument of "if TBAA cannot use it, then nobody can" neither helps the discussion nor helps avoid confusion. If nobody uses MDNodes the way they are designed, then eventually they will disappear from LLVM. Don't worry.

It is simple: MDNodes are meant to convey optional information about values. In other words, if an optimization pass is not aware of certain information communicated through an MDNode, then the pass should never cause miscompilation of the code. One example is !nontemporal hints. I could give other examples, but that is not the point.

>>> For example, the metadata used by TBAA today is not safe. Imagine an
>>> optimization pass which takes two allocas that are used in
>>> non-overlapping regions and rewrites all uses of one to use the other,
>>> to reduce the stack size. By LLVM IR rules alone, this would seem to
>>> be a valid semantics-preserving transformation. But if the loads
>>> and stores for the two allocas have different TBAA type tags, the
>>> tags will say NoAlias for memory references that do in fact alias.
>>
>> Then, this is a serious bug in the way TBAA is using MDNodes, not in the design of MDNodes. My understanding was that if any other pass changes values tracked by an MDNode for TBAA, then TBAA would make a conservative decision. However, you're saying that it may lead to miscompiled code, which is unfortunate.
>
> It's not possible to do metadata-based TBAA and avoid this problem.

... well, you know more about TBAA, and you realize that TBAA is not _required_ to use MDNodes.

>> If you need a data structure to communicate some information, and you need a guarantee from each transformation pass in between to preserve the correctness of that information, then you need some other explicit mechanism (maybe the way debug info used to be encoded in the old days?).
>
> Any other explicit annotation mechanism would have the same problem as
> metadata. If the optimizer doesn't know about it, the optimizer is
> liable to make changes that break it.

-
Devang
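The !nontemporal hint Devang cites is a good instance of purely optional metadata: a pass that ignores or drops it can only cost performance, never correctness. A minimal hand-written fragment in roughly 3.0-era syntax (the values are illustrative):

```llvm
; A streaming store marked non-temporal: the backend may emit a
; cache-bypassing (movnt-style) store for it. A pass that is unaware
; of the hint, or that drops it, still produces correct code --
; it merely loses the cache-behavior optimization.
store <4 x float> %val, <4 x float>* %ptr, align 16, !nontemporal !0

!0 = metadata !{i32 1}
```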
On Jan 27, 2012, at 3:40 PM, Devang Patel <dpatel at apple.com> wrote:

> [ removing cfe-dev from the cc list ]
>
> On Jan 27, 2012, at 1:31 PM, Dan Gohman wrote:
>
>> On Jan 27, 2012, at 11:20 AM, Devang Patel wrote:
>>
>>> On Jan 26, 2012, at 2:10 PM, Dan Gohman wrote:
>>>
>>>> On Jan 26, 2012, at 12:54 PM, Devang Patel wrote:
>>>>>
>>>>> On Jan 26, 2012, at 11:15 AM, Dan Gohman wrote:
>>>>>
>>>>>> or what optimizers must do to preserve it.
>>>>>
>>>>> The number one reason behind metadata is to have a mechanism to track values while being completely transparent to the optimizer. If you want a guarantee from the optimizer to preserve certain semantics about the way metadata is used (e.g. to describe a range of values), then metadata is not the appropriate mechanism.
>>>>
>>>> If the optimizer makes no guarantees whatsoever, then metadata is
>>>> not appropriate for anything.
>>>
>>> Are you sure ? :)
>>
>> Show me an example of a supposedly valid use of metadata, and I'll show
>> you a valid optimization which breaks that metadata.
>
> Your argument of "if TBAA cannot use it, then nobody can" neither helps the discussion nor helps avoid confusion. If nobody uses MDNodes the way they are designed, then eventually they will disappear from LLVM. Don't worry.
>
> It is simple: MDNodes are meant to convey optional information about values. In other words, if an optimization pass is not aware of certain information communicated through an MDNode, then the pass should never cause miscompilation of the code. One example is !nontemporal hints. I could give other examples, but that is not the point.

My understanding of metadata was that it can be discarded by optimizations, or any other transformations, without affecting correctness; however, discarded != ignored. That is, the optimization is expected to either update the metadata appropriately for whatever transformations it performed, or delete the metadata entirely. Is that incorrect?

-Jim

>>>> For example, the metadata used by TBAA today is not safe. Imagine an
>>>> optimization pass which takes two allocas that are used in
>>>> non-overlapping regions and rewrites all uses of one to use the other,
>>>> to reduce the stack size. By LLVM IR rules alone, this would seem to
>>>> be a valid semantics-preserving transformation. But if the loads
>>>> and stores for the two allocas have different TBAA type tags, the
>>>> tags will say NoAlias for memory references that do in fact alias.
>>>
>>> Then, this is a serious bug in the way TBAA is using MDNodes, not in the design of MDNodes. My understanding was that if any other pass changes values tracked by an MDNode for TBAA, then TBAA would make a conservative decision. However, you're saying that it may lead to miscompiled code, which is unfortunate.
>>
>> It's not possible to do metadata-based TBAA and avoid this problem.
>
> ... well, you know more about TBAA, and you realize that TBAA is not _required_ to use MDNodes.
>
>>> If you need a data structure to communicate some information, and you need a guarantee from each transformation pass in between to preserve the correctness of that information, then you need some other explicit mechanism (maybe the way debug info used to be encoded in the old days?).
>>
>> Any other explicit annotation mechanism would have the same problem as
>> metadata. If the optimizer doesn't know about it, the optimizer is
>> liable to make changes that break it.
>
> -
> Devang
>
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu          http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
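Jim's "update or delete, but never ignore" convention can be made concrete with a hand-written before/after IR sketch (hypothetical; the tag numbers are made up, and the conservative choice shown is one possible policy, not what any particular pass does):

```llvm
; Before: a CSE-style pass finds these two loads redundant.
%a = load i32* %p, !tbaa !1
%b = load i32* %p, !tbaa !2

; After: if the pass replaces all uses of %b with %a, keeping !1
; unchanged would assert something about %b's former uses that the
; original program never claimed. The safe options are to attach a
; tag valid for both accesses, or to drop the metadata entirely:
%a = load i32* %p          ; tag dropped -- conservatively correct
```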
On Jan 27, 2012, at 3:40 PM, Devang Patel wrote:

> [ removing cfe-dev from the cc list ]
>
> On Jan 27, 2012, at 1:31 PM, Dan Gohman wrote:
>
>> On Jan 27, 2012, at 11:20 AM, Devang Patel wrote:
>>
>>> On Jan 26, 2012, at 2:10 PM, Dan Gohman wrote:
>>>
>>>> On Jan 26, 2012, at 12:54 PM, Devang Patel wrote:
>>>>>
>>>>> On Jan 26, 2012, at 11:15 AM, Dan Gohman wrote:
>>>>>
>>>>>> or what optimizers must do to preserve it.
>>>>>
>>>>> The number one reason behind metadata is to have a mechanism to track values while being completely transparent to the optimizer. If you want a guarantee from the optimizer to preserve certain semantics about the way metadata is used (e.g. to describe a range of values), then metadata is not the appropriate mechanism.
>>>>
>>>> If the optimizer makes no guarantees whatsoever, then metadata is
>>>> not appropriate for anything.
>>>
>>> Are you sure ? :)
>>
>> Show me an example of a supposedly valid use of metadata, and I'll show
>> you a valid optimization which breaks that metadata.
>
> Your argument of "if TBAA cannot use it, then nobody can" neither helps the discussion nor helps avoid confusion. If nobody uses MDNodes the way they are designed, then eventually they will disappear from LLVM. Don't worry.

TBAA is one example. My value-range example was another. Here are some more:

dbg: If someone mutates an instruction in place, it may not produce the
value that the metadata was intended for.

fpaccuracy: If the optimizer CSEs two operations with different
accuracies, it might happen to keep the less accurate one.

prof: If a branch condition is inverted, the frequency data is backward.

nontemporal: If the optimizer cache-blocks a loop, it could greatly
reduce the amount of time before a stored value is reloaded.

Not all of these are miscompiles, but the point is that valid but naive optimizations can cause them to be actively misleading -- profiling data that says the exact opposite of what the code does, or debuggers that think they know the value of a variable at a given location but have the wrong value.

Dan
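Dan's prof example is easy to see in IR form. A hand-written sketch in roughly 3.0-era syntax (the weights and node numbers are illustrative):

```llvm
; A branch annotated as heavily biased toward %fast:
br i1 %c, label %fast, label %slow, !prof !0

!0 = metadata !{metadata !"branch_weights", i32 1000, i32 1}

; A pass that inverts the condition must also swap the weights:
%nc = xor i1 %c, true
br i1 %nc, label %slow, label %fast, !prof !1

!1 = metadata !{metadata !"branch_weights", i32 1, i32 1000}
; Carrying !0 over unchanged would tell every later consumer of the
; profile data the exact opposite of the measured behavior.
```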