On Jan 24, 2012, at 3:39 PM, Devang Patel wrote:

>>> I have only one real comment -- this violates the contract and spirit of LLVM's metadata design. You're specifically encoding semantics in metadata, but the principle of metadata is that a program with all metadata stripped has the same behavior as one with the metadata still in place.
>
> This is a simplified understanding of semantics. As I understand, the expected metadata design behavior is that optimizer/transformations are not responsible to preserve any _relationship_ between a User and a MDNode. For example, if a MDNode is "using" a User then optimizer can remove the User without bothering about what happens to the MDNode.

Right.

> Same way, If MDNode is attached to an Instruction then optimizer can mutate, delete or replace the Instruction while completely ignoring attached MDNode.

However, this isn't necessarily true. For example, it would seem to be within the spirit of LLVM's metadata design to describe the range of values that a given instruction might have. However, if the optimizer mutates the instruction (and preserves program correctness by mutating its operand instructions to compensate), then that metadata could easily become incorrect.

Right now, there aren't any rules about what metadata can do, or what optimizers must do to preserve it. It's sort of the "head in the sand" level of conceptual maturity.

Dan

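The stale-range concern in the message above can be sketched with a toy model. This is only an illustration under stated assumptions: `AddInstr`, `Range`, and both rewrite functions are hypothetical names invented here, not LLVM's API, and the "instruction" is reduced to adding a constant to an input.

```cpp
#include <cassert>
#include <optional>

// Hypothetical toy model, not LLVM's API.
struct Range { long lo; long hi; };  // half-open interval [lo, hi)

struct AddInstr {
    long addend;                 // the instruction computes input + addend
    std::optional<Range> range;  // attached metadata: claimed range of the result
};

long evaluate(const AddInstr &inst, long input) { return input + inst.addend; }

// Does the metadata's claim hold for this input? No metadata means no claim.
bool rangeHolds(const AddInstr &inst, long input) {
    if (!inst.range) return true;
    assert(inst.range->lo < inst.range->hi);
    long v = evaluate(inst, input);
    return inst.range->lo <= v && v < inst.range->hi;
}

// A pass that mutates the instruction and ignores the attached metadata.
// (Assumed: the rest of the program is adjusted elsewhere to compensate.)
void rewriteIgnoringMetadata(AddInstr &inst, long newAddend) {
    inst.addend = newAddend;
}

// The conservative alternative: mutate and drop the metadata.
void rewriteDroppingMetadata(AddInstr &inst, long newAddend) {
    inst.addend = newAddend;
    inst.range.reset();
}
```

With an input in [0, 10) and `addend == 1`, a claimed range of [1, 11) is accurate. After `rewriteIgnoringMetadata(inst, 0)` the same metadata claims something false for input 0, whereas `rewriteDroppingMetadata` leaves no claim behind to be wrong.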
On Jan 26, 2012, at 11:15 AM, Dan Gohman wrote:

> On Jan 24, 2012, at 3:39 PM, Devang Patel wrote:
>
>>>> I have only one real comment -- this violates the contract and spirit of LLVM's metadata design. You're specifically encoding semantics in metadata, but the principle of metadata is that a program with all metadata stripped has the same behavior as one with the metadata still in place.
>>
>> This is a simplified understanding of semantics. As I understand, the expected metadata design behavior is that optimizer/transformations are not responsible to preserve any _relationship_ between a User and a MDNode. For example, if a MDNode is "using" a User then optimizer can remove the User without bothering about what happens to the MDNode.
>
> Right.
>
>> Same way, If MDNode is attached to an Instruction then optimizer can mutate, delete or replace the Instruction while completely ignoring attached MDNode.
>
> However, this isn't necessarily true. For example, it would seem to be
> within the spirit of LLVM's metadata design to describe the range of values
> that a given instruction might have. However, if the optimizer mutates the
> instruction (and preserves program correctness by mutating its operand
> instructions to compensate), then that metadata could easily become
> incorrect. Right now, there aren't any rules about what metadata can do,
> or what optimizers must do to preserve it.

The number one reason behind metadata is to have a mechanism to track values while being completely transparent to optimizer. If you want a guarantee from the optimizer to preserve certain semantics about the way metadata is used (e.g. say to describe range of values) then metadata is not appropriate mechanism.

- Devang

On Jan 26, 2012, at 12:54 PM, Devang Patel wrote:

> On Jan 26, 2012, at 11:15 AM, Dan Gohman wrote:
>
>> or what optimizers must do to preserve it.
>
> The number one reason behind metadata is to have a mechanism to track values while being completely transparent to optimizer. If you want a guarantee from the optimizer to preserve certain semantics about the way metadata is used (e.g. say to describe range of values) then metadata is not appropriate mechanism.

If the optimizer makes no guarantees whatsoever, then metadata is not appropriate for anything.

For example, the metadata used by TBAA today is not safe. Imagine an optimization pass which takes two allocas that are used in non-overlapping regions and rewrites all uses of one to use the other, to reduce the stack size. By LLVM IR rules alone, this would seem to be a valid semantics-preserving transformation. But if the loads and stores for the two allocas have different TBAA type tags, the tags will say NoAlias for memory references that do in fact alias.

The only reason why TBAA doesn't have a problem with this today is that LLVM doesn't happen to implement optimizations which break it yet. But there are no guarantees.

Dan
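
The alloca-merging scenario above can be sketched with a toy model. This is a simplified illustration, not LLVM's API or its actual TBAA algorithm: `Access`, `mergeSlots`, and the other names are hypothetical, aliasing is reduced to "same stack slot", and differing type tags are assumed to mean NoAlias.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical toy model, not LLVM's API.
struct Access {
    int slot;             // which alloca (stack slot) this load/store touches
    std::string tbaaTag;  // attached metadata: a type tag; empty means untagged
};

// Ground truth in this model: two accesses alias iff they touch the same slot.
bool actuallyAlias(const Access &a, const Access &b) {
    return a.slot == b.slot;
}

// Tag-based answer (assumed rule): differing non-empty tags mean NoAlias.
bool tagsClaimNoAlias(const Access &a, const Access &b) {
    return !a.tbaaTag.empty() && !b.tbaaTag.empty() && a.tbaaTag != b.tbaaTag;
}

// A stack-shrinking pass: rewrite every use of slot `from` to use slot `to`.
// Assumed valid by IR rules because the slots' uses do not overlap.
// It never looks at the metadata.
void mergeSlots(std::vector<Access> &accesses, int from, int to) {
    assert(from != to);
    for (Access &a : accesses)
        if (a.slot == from) a.slot = to;
}

// True if the tags claim NoAlias for some pair that does in fact alias.
bool tagsAreUnsound(const std::vector<Access> &accesses) {
    for (std::size_t i = 0; i < accesses.size(); ++i)
        for (std::size_t j = i + 1; j < accesses.size(); ++j)
            if (actuallyAlias(accesses[i], accesses[j]) &&
                tagsClaimNoAlias(accesses[i], accesses[j]))
                return true;
    return false;
}
```

Starting from one access tagged "int" on slot 0 and one tagged "float" on slot 1, `tagsAreUnsound` is false. After `mergeSlots(accesses, 1, 0)` both accesses touch slot 0 while keeping their different tags, so the tags now claim NoAlias for references that alias.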