On 15.10.2008, at 16.43, Duncan Sands wrote:
>> True, but note that it is the address of a variable that is used, not
>> the value.
>
> Yes, but why do you think they should get a different address? I can
> understand that it is surprising that they do, but determining whether
> this is legal or not requires reading the language standard. Hopefully
> a language lawyer can chime in and say whether this transform is valid
> or not.

I agree the whole construction is a little bit strange (stupid even). It is, however, a common way to specify context identity in one Objective-C pattern (although I don't think anyone actually uses initialized const variables, I was just playing with them to see how compilers put stuff in segments).

I do think, however, that it's a bit dangerous to combine static constants across compilation units.

---
Tatu Vaajalahti
Tampere, Finland
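The context-identity pattern described in this message can be sketched in plain C roughly as follows. This is only an illustration, not code from the thread: the names `kWidthContext`, `kHeightContext`, `identify_context` and `run_context_checks` are hypothetical, and the tokens are deliberately left non-const so that only their addresses carry meaning.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical context tokens: only their addresses matter, never their values. */
static char kWidthContext;
static char kHeightContext;

/* Hypothetical dispatcher: maps a context pointer back to the token it came from.
   Returns 1 for the width token, 2 for the height token, 0 for anything else. */
int identify_context(const void *context)
{
    if (context == &kWidthContext)
        return 1;
    if (context == &kHeightContext)
        return 2;
    return 0;
}

/* Self-check: the pattern only works if the two tokens have distinct addresses. */
void run_context_checks(void)
{
    assert(identify_context(&kWidthContext) == 1);
    assert(identify_context(&kHeightContext) == 2);
    assert(identify_context(NULL) == 0);
}
```

If a compiler merged the two tokens into one object, the second branch of `identify_context` could never be reached, which is why the pattern depends on distinct objects having distinct addresses.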
On Oct 15, 2008, at 6:58 AM, Tatu Vaajalahti wrote:
>> Yes, but why do you think they should get a different address? I can
>> understand that it is surprising that they do, but determining whether
>> this is legal or not requires reading the language standard. Hopefully
>> a language lawyer can chime in and say whether this transform is valid
>> or not.
>
> I agree the whole construction is a little bit strange (stupid even).
> It is, however, a common way to specify context identity in one
> Objective-C pattern (although I don't think anyone actually uses
> initialized const variables, I was just playing with them to see how
> compilers put stuff in segments).
>
> I do think, however, that it's a bit dangerous to combine static
> constants across compilation units.

GCC does the same thing with strings in some cases. You shouldn't depend on this behavior if you want portable code. If you avoid marking the global variable const, you should have better luck.

-Chris
On 15.10.2008, at 18.28, Chris Lattner wrote:
> GCC does the same things with strings in some cases. You shouldn't
> depend on this behavior if you want portable code. If you avoid
> marking the global variable const, you should have better luck.

ACK! Thank you all for your answers!

---
Tatu Vaajalahti
Tampere, Finland
On Oct 15, 2008, at 8:28 AM, Chris Lattner wrote:
> On Oct 15, 2008, at 6:58 AM, Tatu Vaajalahti wrote:
>>> Yes, but why do you think they should get a different address? I can
>>> understand that it is surprising that they do, but determining whether
>>> this is legal or not requires reading the language standard. Hopefully
>>> a language lawyer can chime in and say whether this transform is valid
>>> or not.
>>
>> I agree the whole construction is a little bit strange (stupid even).
>> It is, however, a common way to specify context identity in one
>> Objective-C pattern (although I don't think anyone actually uses
>> initialized const variables, I was just playing with them to see how
>> compilers put stuff in segments).
>>
>> I do think, however, that it's a bit dangerous to combine static
>> constants across compilation units.
>
> GCC does the same things with strings in some cases. You shouldn't
> depend on this behavior if you want portable code. If you avoid
> marking the global variable const, you should have better luck.

You all are wrong. Amazingly so.

First, string literals and objects are different. String literals are defined like this:

    2 Whether all string literals are distinct (that is, are stored in
      nonoverlapping objects) is implementation-defined.

That applies _only_ to string literals, absolutely nothing else. Objects are defined like so:

    Two pointers of the same type compare equal if and only if they are
    both null, both point to the same object or function, or both point
    one past the end of the same array.

This means they _must_ compare !=, if they are different objects. Whether they are the same object or not is answered by the notion of linkage:

    8 An identifier used in more than one translation unit can
      potentially refer to the same entity in these translation units
      depending on the linkage (_basic.link_) of the identifier specified
      in each translation unit.
    2 A name is said to have linkage when it might denote the same
      object, reference, function, type, template, namespace or value as
      a name introduced by a declaration in another scope:

To be pedantically clear, entity includes objects:

    3 An entity is a value, object, subobject, base class subobject,
      array element, variable, function, instance of a function,
      enumerator, type, class member, template, or namespace.

Now, you ask, how can we be sure these have no linkage across translation units? Because:

    3 A name having namespace scope (_basic.scope.namespace_) has
      internal linkage if it is the name of

      --an object, reference, function or function template that is
        explicitly declared static or,

We know that they do not denote the same object because the rules that tell us when they do are not met:

    9 Two names that are the same (clause _basic_) and that are declared
      in different scopes shall denote the same object, reference,
      function, type, enumerator, template or namespace if

      --both names have external linkage or else both names have
        internal linkage and are declared in the same translation unit;
        and

      --both names refer to members of the same namespace or to members,
        not by inheritance, of the same class; and

      --when both names denote functions, the function types are
        identical for purposes of overloading; and

      --when both names denote function templates, the signatures
        (_temp.over.link_) are the same.

We know that they cannot have linkage across translation units because:

      --When a name has external linkage, the entity it denotes can be
        referred to by names from scopes of other translation units or
        from other scopes of the same translation unit.

      --When a name has internal linkage, the entity it denotes can be
        referred to by names from other scopes in the same translation
        unit.

Welcome to C and C++ 101. I'm amazed that this isn't as plain as day to anyone that works on a compiler. Kinda basic stuff.
Ignorance of the rules doesn't mean you can't just read the words of the standard. You don't have to guess. The standard is meant to be fairly accessible:

    Every byte has a unique address.

    1 The fundamental storage unit in the C++ memory model is the byte.

    5 Unless it is a bit-field (_class.bit_), a most derived object shall
      have a non-zero size and shall occupy one or more bytes of storage.

So, let me state it this way: the address _must_ be different. If you can't tell that they are not, you are free to have them be the same.
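The distinction drawn in this message between string literals and named objects can be sketched in C as below. This is a minimal illustration under stated assumptions, not code from the thread: the names `first`, `second`, `objects_share_address`, `literals_share_address` and `run_identity_checks` are hypothetical, and both objects live in a single translation unit for simplicity, whereas the original report concerned statics in separate translation units.

```c
#include <assert.h>
#include <stdbool.h>

/* Two distinct objects with internal linkage and identical values. */
static const int first = 1;
static const int second = 1;

/* Distinct objects: the pointer-equality rule requires this to be false. */
bool objects_share_address(void)
{
    return &first == &second;
}

/* String literals: identical literals may or may not share storage,
   so either result is permitted here. */
bool literals_share_address(void)
{
    const char *p = "context";
    const char *q = "context";
    return p == q;
}

/* Self-check: only the named objects give a guaranteed answer. */
void run_identity_checks(void)
{
    assert(!objects_share_address());
    (void)literals_share_address(); /* deliberately not asserted */
}
```

Only the first comparison has a guaranteed result; code that relies on the outcome of the second is not portable.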
On Wed, Oct 15, 2008 at 8:28 AM, Chris Lattner <clattner at apple.com> wrote:
>> I do think however that it's a bit dangerous to combine static constants
>> across compilation units.
>
> GCC does the same things with strings in some cases. You shouldn't
> depend on this behavior if you want portable code.

Combining is explicitly allowed for strings in C:

6.5.2.5p8: "String literals, and compound literals with const-qualified types, need not designate distinct objects."

This isn't allowed for distinct declarations.

6.5.9p6: "Two pointers compare equal if and only if both are null pointers, both are pointers to the same object (including a pointer to an object and a subobject at its beginning) or function, both are pointers to one past the last element of the same array object, or one is a pointer to one past the end of one array object and the other is a pointer to the start of a different array object that happens to immediately follow the first array object in the address space."

6.2.4: "An object exists, has a constant address, and retains its last-stored value throughout its lifetime."

6.7: "A definition of an identifier is a declaration for that identifier that:
-- for an object, causes storage to be reserved for that object;"

There isn't any other reasonable interpretation of the standard. Also, the only semantics that "const" has in C is that writing to an object with a const-qualified type is illegal.

> If you avoid
> marking the global variable const, you should have better luck.

Not marking the variable const only makes the problem more obscure. Testcase in C:

    static char x = 1, y = 1;
    int c() { char* u = &x; char* v = &y; return u == v; }
    int d() { return x+y; }

Running this through

    llvm-gcc -O0 -emit-llvm -c -x c - -o - | opt -mem2reg -globalopt -constmerge -instcombine

has precisely the same effect: c returns 1.

It's conceivable that globalopt+constmerge could do even crazier stuff.
Potential example: suppose we have two mallocs, and store the allocated pointers into globals. GlobalOpt knows how to turn mallocs into statically allocated globals. Then suppose there's exactly one store to each of these mallocs: GlobalOpt knows how to turn these into constant globals. Then, ConstMerge or the AsmPrinter will actually merge them, so the computed address ends up being the same. Therefore, we conclude that malloc(1) == malloc(1) can be true in some situations! Now, this doesn't actually work because the malloc elimination transformation isn't quite aggressive enough, but making this work wouldn't require any controversial changes.

This bug actually manifests itself in two places: one is ConstantMerge, the other is the AsmPrinter. It's non-trivial to fix because it's really a design bug: we assume that constant==mergeable, which simply isn't true. There are a few different ways of fixing this; however, I think the only real option is to add a new "mergeable" linkage type.

-Eli
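The property that the hypothetical malloc scenario would violate can be written down as a small C check. This is a simplified sketch, not the testcase from the thread: it keeps the pointers in locals rather than globals, and the function name `live_allocations_are_distinct` is an assumption made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Returns true when two simultaneously live allocations have distinct
   addresses. If either allocation fails there is nothing to compare,
   so the check passes vacuously. */
bool live_allocations_are_distinct(void)
{
    char *p = malloc(1);
    char *q = malloc(1);
    bool distinct = true;

    if (p != NULL && q != NULL)
        distinct = (p != q);

    free(p); /* free(NULL) is a no-op, so this is safe on failure */
    free(q);
    return distinct;
}
```

A conforming implementation should make this return true however aggressively the allocations are optimized, since both objects are live at the point of comparison.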