On Thu, Oct 16, 2008 at 12:35 PM, Mike Stump <mrs at apple.com> wrote:
>> It doesn't really change the issue, though; we want the merging to be
>> a front-end option, and we still need a solution which handles
>> variables that gets marked by the optimizer.
>
> I think so. If we could get C/C++ to just bless merging and then just
> support that and ignore legacy standards and legacy code, we might be
> able to leave it as is.

The only allowance I can think of that's general enough to allow
everything LLVM knows how to do at the moment is "the result of an
equality comparison between two pointers to objects with distinct base
objects is undefined". (See
http://lists.cs.uiuc.edu/pipermail/llvmdev/2008-October/017769.html and
http://lists.cs.uiuc.edu/pipermail/llvmdev/2008-October/017747.html
for the examples I'm thinking of.) I strongly doubt we can get away
with that.

Here's a more concrete version of the solution I'm proposing: we add a
new optional marking to constant globals, say "mergeable". There are
two reasonable semantics: one is that the result of equality
comparisons of a pointer into this global with a pointer into any
other similarly marked global is undefined. This is actually slightly
more aggressive than what the current standard seems to allow even for
string constants, but it seems reasonable. The more conservative
definition of the semantics is just to say that the compiler chooses
whether any pair of mergeable globals are distinct objects, which is
roughly how the current C99 standard defines string merging.

This has the following effects on current optimizers: constmerge only
merges globals marked mergeable. Only mergeable constants are emitted
into mergeable sections in assembly. If we use the conservative
definition of mergeable, fix any code that assumes distinct globals
don't get merged, like the code that folds equality comparisons
between distinct globals to false. Optionally, add a new optimization
step which marks constants as mergeable when their address isn't taken.

-Eli
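
For reference, the distinction under discussion can be seen in plain C. The following is only a sketch: the names `greeting_a`, `greeting_b`, `named_globals_equal` and `literals_equal` are made up for illustration and do not come from the thread. Two named constant globals are distinct objects, so comparing their addresses has to give false even when their contents match, whereas C99 leaves it unspecified whether two identical string literals share storage.

```c
#include <stdbool.h>

/* Two named constant globals with identical contents (hypothetical names). */
static const char greeting_a[] = "hello";
static const char greeting_b[] = "hello";

/* Distinct named objects: a conforming compiler must make this false,
 * which is why such globals cannot be merged unless they are marked. */
bool named_globals_equal(void)
{
    return &greeting_a[0] == &greeting_b[0];
}

/* Identical string literals: whether they occupy the same storage is
 * unspecified, so either result is allowed here. */
bool literals_equal(void)
{
    const char *p = "hello";
    const char *q = "hello";
    return p == q;
}
```

Under the conservative semantics proposed above, a global marked "mergeable" would behave like the string literals in `literals_equal` rather than like the named arrays in `named_globals_equal`.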
On Oct 16, 2008, at 1:57 PM, Eli Friedman wrote:
> I strongly doubt we can get away with that.

Yeah, we agree on that one. I was just thinking about the const case.

> Here's a more concrete version of the solution I'm proposing: we add a
> new optional marking to constant globals, say "mergeable". There are
> two reasonable semantics: one is that the result of equality
> comparisons of a pointer into this global with a pointer into any
> other similarly marked global is undefined.

undefined means the code is free to rm -rf /, is this what you meant?
I'm against that.

The usual wording is sufficient:

  It is unspecified whether such a variable has an address distinct
  from that of any other object in the program

to reuse the concept and words from 8.4p8 (n2461). This isn't quite
right, but we know what is meant by it. The implementation is free to
merge it with any other mergeable object.

> This has the following effects on current optimizers: constmerge only
> merges globals marked mergeable. Only mergeable constants are emitted
> into mergeable sections in assembly. If we use the conservative
> definition of mergeable, fix any code that assumes distinct globals
> don't get merged, like the code that folds equality comparisons
> between distinct globals to false. Optionally, add a new optimization
> step which marks constants as mergeable when their address isn't taken.

:-) Sounds like a plan.
On Thu, Oct 16, 2008 at 3:26 PM, Mike Stump <mrs at apple.com> wrote:
> On Oct 16, 2008, at 1:57 PM, Eli Friedman wrote:
>> I strongly doubt we can get away with that.
>
> Yeah, we agree on that one. I was just thinking about the const case.
>
>> Here's a more concrete version of the solution I'm proposing: we add a
>> new optional marking to constant globals, say "mergeable". There are
>> two reasonable semantics: one is that the result of equality
>> comparisons of a pointer into this global with a pointer into any
>> other similarly marked global is undefined.
>
> undefined means the code is free to rm -rf /, is this what you meant?
> I'm against that.
>
> The usual wording is sufficient:
>
>   It is unspecified whether such a variable has an address distinct
>   from that of any other object in the program

Okay... works, but there are some strange edge cases; take the
following program:

  static const int a = 0, b = &a == &b;

(a and b get used later, including their addresses)

In C mode, the front-end must evaluate &a == &b because complex
expressions in initializers generally can't be trusted to the backend.
However, once it evaluates this, it can't mark either a or b
mergeable; if it did, it would introduce a logical inconsistency with
other code that compared the addresses of a and b.

-Eli