Rodney M. Bates
2014-Apr-17 19:41 UTC
[LLVMdev] More Qs about llvm IR to access struct fields
Suppose I have compiled a record equivalent to:

  struct ST { short s; char c3, c4; } STvar;

and llvm IR:

  %STvar = alloca { i16, i8, i8 }

According to "The Often Misunderstood GEP Instruction", I can construct an artificial llvm type to supply to GEP. (The "ugly GEP"?) So I can think of three ways to access STvar.c3. These are in order of increasing easiness for me to produce, but decreasing, by my guess, useful information content for llvm. Also, I understand 3) would be the only way, if c3 didn't start and end on byte boundaries.

  ; 1) Use a struct field number:
  %1 = getelementptr { i16, i8, i8 }* %STvar, i32 0, i32 1 ; {i8*}: &struct_member_no_1
  %2 = load i8* %1

  ; 2) Use a byte number:
  %3 = getelementptr [ 4 x i8 ]* %STvar, i32 0, i32 2 ; {i8*}: &byte_number_2
  %4 = load i8* %3

  ; 3) Use a word number, then shift & mask within:
  %5 = getelementptr [ 1 x i32 ]* %STvar, i32 0, i32 0 ; {i32*}: &word_number_0
  %6 = load i32* %5 ; {i32}: word_number_0
  %7 = lshr i32 %6, 8 ; {i32}: right_justified
  %8 = and i32 %7, 255 ; {i32}: irrelevant_bits_zeroed

Question 1:
Does this choice affect the optimization possibilities?

Now suppose I had provided debug metadata describing the original struct, and the llvm infrastructure decided to keep STvar.c3 in a hardware machine register for a while.

Question 2:
Would the final debug information reflect that STvar.c3 is in a register, for the different ways of coding the llvm IR?

--
Rodney Bates
rodney.m.bates at acm.org
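For readers who think in C rather than IR, the three access patterns can be sketched roughly as follows. This is an illustration and not part of the original post: the helper names (c3_by_field, c3_by_byte, c3_by_word) are made up, and the code assumes that short is 2 bytes and that struct ST occupies exactly 4 bytes with no padding, which holds on common ABIs but is not guaranteed. The word variant's shift depends on byte order: a shift of 8, as in 3), matches a big-endian target, while a little-endian target would need 16.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct ST { short s; char c3, c4; };

/* 1) Field access: the compiler sees the struct layout directly. */
static unsigned char c3_by_field(const struct ST *p) {
    return (unsigned char)p->c3;
}

/* 2) Byte offset: view the struct as an array of bytes. */
static unsigned char c3_by_byte(const struct ST *p) {
    const unsigned char *bytes = (const unsigned char *)p;
    return bytes[offsetof(struct ST, c3)];
}

/* 3) Word load, then shift and mask. Assumes the struct is one 32-bit word. */
static unsigned char c3_by_word(const struct ST *p) {
    uint32_t word;
    const uint32_t probe = 1;
    /* The first byte of the value 1 is 1 only on a little-endian host. */
    const int little = (*(const unsigned char *)&probe == 1);
    const size_t off = offsetof(struct ST, c3);
    unsigned shift;

    assert(sizeof(struct ST) == sizeof word);
    memcpy(&word, p, sizeof word);   /* avoids strict-aliasing problems */

    /* Bit position of byte number `off` inside the loaded word. */
    shift = (unsigned)(little ? 8 * off : 8 * (sizeof word - 1 - off));
    return (unsigned char)((word >> shift) & 0xff);
}
```

All three helpers should return the same value for any given struct ST; they differ only in how much of the layout is visible at the point of access.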
Krzysztof Parzyszek
2014-Apr-18 12:40 UTC
[LLVMdev] More Qs about llvm IR to access struct fields
On 4/17/2014 2:41 PM, Rodney M. Bates wrote:
>
> Question 1:
> Does this choice affect the optimization possibilities?

Depends on how advanced the alias analysis is these days (it is mostly the alias analysis that could suffer). I haven't looked into the AA code much, but from what I remember, it could be affected by how the data addresses are constructed. Look into the TBAA metadata and whether there is any expectation of correlation between the GEP structure and the TBAA information. (Sorry, I don't have exact answers off the top of my head.)

-Krzysztof

--
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation
On Fri, Apr 18, 2014 at 5:40 AM, Krzysztof Parzyszek <kparzysz at codeaurora.org> wrote:

> On 4/17/2014 2:41 PM, Rodney M. Bates wrote:
>
>> Question 1:
>> Does this choice affect the optimization possibilities?
>
> Depends on how advanced the alias analysis is these days (it is mostly the alias analysis that could suffer). I haven't looked into the AA code much, but from what I remember, it could be affected by how the data addresses are constructed. Look into the TBAA metadata and whether there is any expectation of correlation between the GEP structure and the TBAA information. (Sorry, I don't have exact answers off the top of my head.)

LLVM's builtin type system provides MustAlias information, and TBAA metadata provides NoAlias information. The two complement each other, and do not conceptually interact (though they do interact in practice somewhat due to sharing the same AliasAnalysis API).

If two memory accesses have the same base and the same gep indices, then they're accessing the same memory, and you can do the MustAlias family of optimizations -- eliminating one of the accesses. But if two memory accesses have different bases or different gep indices, you can't immediately prove anything.

If two memory accesses have TBAA tags which indicate non-aliasing, then they're accessing different memory, and you can do the NoAlias family of optimizations -- reordering the accesses with respect to each other. But if the two memory accesses have the same tag, you can't immediately prove anything.

The AliasAnalysis difference between 1) and 2) above is that 1) has stronger LLVM type information, so it has more MustAlias information. However, these days BasicAA can compute MustAlias information by analyzing everything as byte offsets too, so there's probably no difference in practice.

The main practical advantage of 1) is that it gives SROA a suggestion for how to split up an aggregate in memory.
There's no semantic difference, and in theory SROA could infer where the fields are by looking at the actual memory access (and in theory this would make it more powerful), but relying on the type is a convenient heuristic.

3) is less nice because the load is wider, which could produce false dependencies if there are nearby accesses to other fields of the struct. But the optimizer will probably convert 3) into something like 2) or 1) pretty quickly, so it's probably not hugely different in practice.

Dan
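To make the byte-offset view concrete, here is a toy model of an alias query over (base, offset, size) triples. It is only a sketch under stated assumptions and is not LLVM's BasicAA: the struct access type and the toy_alias function are hypothetical names, the result names merely mirror LLVM's alias result kinds, and the rule that disjoint ranges off the same base give NO_ALIAS is an addition of this sketch rather than something stated in the thread.

```c
#include <assert.h>
#include <stddef.h>

enum alias_result { NO_ALIAS, MAY_ALIAS, PARTIAL_ALIAS, MUST_ALIAS };

/* One memory access, reduced to a base object plus a byte range. */
struct access {
    const void *base;   /* underlying object, e.g. the alloca for STvar */
    size_t offset;      /* byte offset from the base */
    size_t size;        /* number of bytes accessed */
};

/* Toy query: compares byte ranges only, ignoring types and TBAA tags. */
static enum alias_result toy_alias(struct access a, struct access b) {
    assert(a.size > 0 && b.size > 0);

    if (a.base != b.base)
        return MAY_ALIAS;       /* different bases: nothing proven here */
    if (a.offset == b.offset && a.size == b.size)
        return MUST_ALIAS;      /* exactly the same bytes */
    if (a.offset + a.size <= b.offset || b.offset + b.size <= a.offset)
        return NO_ALIAS;        /* same base, disjoint byte ranges */
    return PARTIAL_ALIAS;       /* same base, overlapping byte ranges */
}
```

In this model, variants 1) and 2) both reduce to the access (STvar, 2, 1), so they are MUST_ALIAS with each other. Variant 3) becomes (STvar, 0, 4), which overlaps an access to s at (STvar, 0, 2); that overlap is the kind of false dependency the wider load can introduce.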