Hi Jianzhou,

> I misunderstood C99 ISO, such behaviors are defined not when types
> have the same sizes, but when they are same (compatible) types with
> signed or qualified extension (this is much stronger than being of
> same sizes), or reading char by char:
>
> 7 An object shall have its stored value accessed only by an lvalue
> expression that has one of the following types:
> - a type compatible with the effective type of the object,
> [...]
> - a type that is the signed or unsigned type corresponding to the
>   effective type of the object,
> [...]
> - an aggregate or union type that includes one of the aforementioned
>   types among its members (including, recursively, a member of a
>   subaggregate or contained union), or
> - a character type.
> (sec 6.5, items 6 and 7, page 67-68,
> http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1124.pdf)

LLVM does not have any such restrictions.

> If LLVM IR is weaker than these C restrictions, then I have the
> following questions about when GEP is undefined:

In your examples, it is not GEP that would be undefined, but a load or
store from the GEP. GEP just offsets the memory address. In C too it
is not invalid to offset or cast a pointer; it is loading from or storing
to the cast or offset pointer that may be invalid.

> 1) Can I load a value partially or overlapped with other stored
> values? For example, if the stored values are of type [10*i32], and we
> cast i32* to {i8, i4, float} *, can we successfully load each fields
> via the addresses from GEPs?

Yes, except that as previously mentioned this is invalid for the i4 if
the original value was not set by performing an i4 store.

> Since IR allows to define data layout of
> targets (size and alignment for types), does whether such GEPs
> undefined depend on its data layout?

As I mentioned, there is no problem with GEPs being undefined.

> 2) C allows characters as the least granularity when loading. Does
> LLVM have the same assumption?

LLVM doesn't have a notion of "character". Currently all processors that LLVM
targets are capable of addressing an octet (8 bits), but nothing smaller. This
means that the smallest granularity is currently i8.

Ciao,

Duncan.
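To illustrate the C side of the distinction drawn above (computing an address is harmless, only the access through it can be invalid), here is a minimal sketch in C. The helper names `byte_address`, `load_byte` and `load_float` are hypothetical and exist only for this illustration. It relies on the character-type clause of the quoted 6.5 paragraph 7, and on `memcpy` copying an object byte by byte.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Computing an address reads and writes nothing, much like a GEP in
   LLVM IR. (Hypothetical helper, for illustration only.) */
static const unsigned char *byte_address(const void *base, size_t offset)
{
    return (const unsigned char *)base + offset;
}

/* Reading through a character type is allowed for any object
   (the "a character type" clause of C99 6.5p7). */
static unsigned char load_byte(const void *base, size_t offset)
{
    return *byte_address(base, offset);
}

/* Reading a float out of storage that was written as integers:
   dereferencing a cast float* here would break the effective-type
   rule, so the bytes are copied with memcpy instead. */
static float load_float(const void *base, size_t offset)
{
    float result;
    memcpy(&result, byte_address(base, offset), sizeof result);
    return result;
}
```

With an array declared as `uint32_t words[10]`, `load_byte(words, 3)` and `load_float(words, sizeof(uint32_t))` are both well-defined C, whereas `*(float *)&words[1]` is not, even though all three compute their address in the same way.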
On Fri, Jul 9, 2010 at 3:44 AM, Duncan Sands <baldrick at free.fr> wrote:
> [...]
> LLVM doesn't have a notion of "character". Currently all processors that LLVM
> targets are capable of addressing an octet (8 bits), but nothing smaller. This
> means that the smallest granularity is currently i8.

Thanks. To understand how load/store work in LLVM, I looked into the
interpreter. There, the target information is defined in the TargetData
class, and aggregate types (like struct and array) compute the correct
padding and alignment from TargetData before any memory access.

But I still have one question. According to the code, the
visitLoad/visitStore functions used by the interpreter do not allow
accessing aggregate types; only simple types are legal. Likewise, the
GenericValue that the interpreter uses to store values in memory only
considers simple types (int, float, typ*), and 'getOperandValue' also
treats aggregate constants as illegal when converting them into
GenericValue.

My input *.ll files do have load/store instructions on structs (used in
functions); the loaded values are used as return values so that they are
not deleted. The interpreter (lli -force-interpreter) can still
interpret the code without any error or warning about aggregate values
in memory accesses. I was wondering whether lli (2.7) does any
transformation to flatten aggregate types into simple types, and to split
such loads/stores into a sequence of accesses on primitive values. Thanks.

--
Jianzhou
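To make concrete what "splitting an aggregate load/store into a sequence on primitive values" would mean, here is a minimal sketch in C. It is not taken from lli or the interpreter sources and makes no claim about what lli does. `struct pair` is a hypothetical C counterpart of an IR type `{ i32, float }`, and the two function names are made up for illustration. The compiler's `offsetof` plays the role that the padding and alignment information from TargetData would play.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical C counterpart of the IR aggregate { i32, float }. */
struct pair {
    int32_t i;
    float   f;
};

/* "Flattened" store: one scalar store per field, each at the field's
   offset inside the aggregate. */
static void store_pair_flattened(unsigned char *memory, struct pair value)
{
    memcpy(memory + offsetof(struct pair, i), &value.i, sizeof value.i);
    memcpy(memory + offsetof(struct pair, f), &value.f, sizeof value.f);
}

/* "Flattened" load: one scalar load per field, reassembled into the
   aggregate afterwards. */
static struct pair load_pair_flattened(const unsigned char *memory)
{
    struct pair result;
    memcpy(&result.i, memory + offsetof(struct pair, i), sizeof result.i);
    memcpy(&result.f, memory + offsetof(struct pair, f), sizeof result.f);
    return result;
}
```

A buffer of `sizeof(struct pair)` bytes is enough to round-trip a value through `store_pair_flattened` and `load_pair_flattened`. Only scalar accesses ever touch memory, which is the property the question is asking about.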
On Fri, Jul 9, 2010 at 2:35 PM, Jianzhou Zhao <jianzhou at seas.upenn.edu> wrote:
> [...]
> Actually my input *.ll files have load/store instructions on struct
> (used in functions), which are taken as return values to ensure not to
> be deleted. Interpreters (lli -force-interpreter ) can still
> interpreter the code without any error/warning about meeting aggregate
> values for memory accessing. I was wondering if lli (2.7) does any
> transformation to flatten aggregate types into simple types, and split
> such loads/stores into a sequence on primitive values. Thanks.

Sorry, lli does not allow such loads/stores with -force-interpreter.
(I had just given the wrong option...) With the option set, it reports:

LLVM ERROR: Cannot load value of type { i32, float }!

--
Jianzhou