> On Thu, Aug 10, 2017 at 12:22 AM, Nema, Ashutosh via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>> I’m not sure how transforming GEP offset to i8 type will help alias
>> analysis & SROA for the mentioned test case.
>
> It should neither help nor hinder AA or SROA -- the two GEPs (the complex one and the simple one) are equivalent.
> Since memory isn't typed in LLVM, having the GEP in terms of %struct.ABC does not provide any extra information.

Memory is somewhat typed, since if you store something with a type and load the same location with a different type that's not valid (let's call it poison).

Also, BasicAA has the following rule, with constants c1 and c2, and arbitrary values x, y:
a[x][c1] no-alias a[y][c2] if:
the distance between c1 and c2 is sufficient to guarantee that the accesses will be disjoint due to ending up in different array slots.

For this rule it's important to know what's the size of each array element. This information is lost if GEPs are flattened.

But I agree that LLVM itself doesn't exploit types for AA extensively. For example, a pointer based in a struct field may alias another field of the same struct, even if at C/C++ level that's probably not allowed.

Nuno
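As an illustration of the claim that the typed GEP and the flattened i8 GEP are equivalent, here is a minimal C sketch. The layout of `struct ABC` is an assumption (the thread does not show the real %struct.ABC), and `typed_addr` / `flat_addr` are hypothetical helper names. One helper computes a field address through typed indexing, the other through a single byte offset from the base.

```c
#include <stddef.h>

/* Hypothetical layout; the real %struct.ABC is not shown in the thread. */
struct ABC {
    int a;
    int b[4];
    double c;
};

/* Typed address computation, analogous to a GEP over %struct.ABC. */
static void *typed_addr(struct ABC *base, size_t i, size_t j) {
    return &base[i].b[j];
}

/* Flattened computation: a single byte offset, analogous to a GEP over i8. */
static void *flat_addr(struct ABC *base, size_t i, size_t j) {
    size_t off = i * sizeof(struct ABC)      /* skip i whole structs */
               + offsetof(struct ABC, b)     /* reach field b        */
               + j * sizeof(int);            /* reach element j of b */
    return (char *)base + off;
}
```

Both helpers return the same address for any in-bounds i and j. What the flattened form no longer carries is the element and field sizes that went into computing `off`.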
Thanks Nuno & Sanjoy for the inputs.

As you mentioned the flattened GEPs should neither help nor hinder AA & SROA. It's good to keep type based GEPs. I'll make the change and submit for review.

Regards,
Ashutosh

-----Original Message-----
From: Nuno Lopes [mailto:nunoplopes at sapo.pt]
Sent: Thursday, August 10, 2017 11:28 PM
To: 'Sanjoy Das' <sanjoy at google.com>; Nema, Ashutosh <Ashutosh.Nema at amd.com>; 'llvm-dev' <llvm-dev at lists.llvm.org>
Subject: RE: [llvm-dev] InstCombine GEP
Hi,

On Thu, Aug 10, 2017 at 10:58 AM, Nuno Lopes via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>> On Thu, Aug 10, 2017 at 12:22 AM, Nema, Ashutosh via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>>> I’m not sure how transforming GEP offset to i8 type will help alias
>>> analysis & SROA for the mentioned test case.
>>
>> It should neither help nor hinder AA or SROA -- the two GEPs (the complex one and the simple one) are equivalent.
>> Since memory isn't typed in LLVM, having the GEP in terms of %struct.ABC does not provide any extra information.
>
> Memory is somewhat typed, since if you store something with a type and load the same location with a different type that's not valid (let's call it poison).

That may be true in C++, but I'm not sure if we want that to be true in LLVM IR. We would not be able to inline memcpy's if that were true, for one thing (e.g. https://godbolt.org/g/2VVJHU). Unless you're talking about TBAA metadata?

> Also, BasicAA has the following rule, with constants c1 and c2, and arbitrary values x, y:
> a[x][c1] no-alias a[y][c2] if:
> the distance between c1 and c2 is sufficient to guarantee that the accesses will be disjoint due to ending up in different array slots.
> For this rule it's important to know what's the size of each array element. This information is lost if GEPs are flattened.

Do you mean to say that in LLVM IR we will conclude ptr0 and ptr1 don't alias:

  int a[4][4];
  ptr0 = &a[x][3];
  ptr1 = &a[y][7];

If so, that doesn't match my understanding -- I was under the impression that in LLVM IR x = 2, y = 1 will give us must-alias between ptr0 and ptr1.

-- Sanjoy

>
> But I agree that LLVM itself doesn't exploit types for AA extensively. For example, a pointer based in a struct field may alias another field of the same struct, even if at C/C++ level that's probably not allowed.
>
> Nuno
>
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
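The x = 2, y = 1 case in the message above can be checked with plain index arithmetic. The following is a minimal sketch, assuming row-major layout and no bounds check on the inner index (which is how the thread describes GEP indexing). `flat_index` is a hypothetical helper. It works on indices only and never forms the out-of-bounds pointer `&a[1][7]`, which would not be valid C.

```c
#include <stddef.h>

/* Flat element offset of a[row][col] in an array whose rows hold `cols`
 * elements. The inner index is deliberately not range-checked. */
static size_t flat_index(size_t row, size_t col, size_t cols) {
    return row * cols + col;
}
```

For `int a[4][4]`, `flat_index(2, 3, 4)` and `flat_index(1, 7, 4)` both evaluate to 11, so the two index pairs name the same element.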
> On Thu, Aug 10, 2017 at 10:58 AM, Nuno Lopes wrote:
>>> On Thu, Aug 10, 2017 at 12:22 AM, Nema, Ashutosh via llvm-dev
>>> <llvm-dev at lists.llvm.org> wrote:
>>>> I’m not sure how transforming GEP offset to i8 type will help alias
>>>> analysis & SROA for the mentioned test case.
>>>
>>> It should neither help nor hinder AA or SROA -- the two GEPs (the
>>> complex one and the simple one) are equivalent.
>>> Since memory isn't typed in LLVM, having the GEP in terms of %struct.ABC
>>> does not provide any extra information.
>>
>> Memory is somewhat typed, since if you store something with a type and
>> load the same location with a different type that's not valid (let's call
>> it poison).
>
> That may be true in C++, but I'm not sure if we want that to be true
> in LLVM IR. We would not be able to inline memcpy's if that were
> true, for one thing (e.g. https://godbolt.org/g/2VVJHU). Unless
> you're talking about TBAA metadata?

Ah, that's a very good point. This is a simplified version of your example: https://godbolt.org/g/RyZYga

memcpy is transformed into a store of an int, which is then loaded as float.

Well, at least according to LLVM semantics, memory records the last stored type size, such that it's invalid to store an i12 and load an i13. Not sure why this restriction in the semantics is actually needed, though. If you read a smaller/larger type than what was stored, you may end up with some padding bits (poison). That's it.

>> Also, BasicAA has the following rule, with constants c1 and c2, and
>> arbitrary values x, y:
>> a[x][c1] no-alias a[y][c2] if:
>> the distance between c1 and c2 is sufficient to guarantee that the
>> accesses will be disjoint due to ending up in different array slots.
>> For this rule it's important to know what's the size of each array
>> element. This information is lost if GEPs are flattened.
>
> Do you mean to say that in LLVM IR we will conclude ptr0 and ptr1 don't
> alias:
>
>   int a[4][4];
>   ptr0 = &a[x][3];
>   ptr1 = &a[y][7];
>
> If so, that doesn't match my understanding -- I was under the
> impression that in LLVM IR x = 2, y = 1 will give us must-alias
> between ptr0 and ptr1.

No, in this case it won't conclude no-alias, since 3 % 4 == 7 % 4. LLVM is not that aggressive in exploiting UB. Anyway, concluding no-alias here would only be possible if the GEP index had the inrange attribute.

The example is more like this:

  int a[4][5];
  p = &a[x][0];
  q = &a[y][1];

With access sizes sp, sq, respectively:
If the access through p ends before q (offset of q >= sp) and the access through q doesn't go beyond the array limit (sq <= 5*sizeof(int) - 1*sizeof(int)), then it's no-alias.

By flattening a GEP, you lose the information about the size of each of the array/struct constituents. Hence this proof rule doesn't apply and you would get may-alias for the example above.

Another interesting conclusion is that LLVM is being quite nice by allowing accesses to multiple array/struct fields through the address of one of them.

The code is here:
http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Analysis/BasicAliasAnalysis.cpp?revision=310766&view=markup#l1349
(you may need to scroll back to line 1294 or even to the beginning of that function to see where all the data comes from)

Nuno
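The two conditions in the `int a[4][5]` example can be written out as a small predicate. This is only a sketch of the rule as worded in the message above. It does not attempt to mirror the conditions in BasicAliasAnalysis.cpp, and the names `rows_no_alias` and `ranges_overlap` are hypothetical. Offsets and sizes are in bytes, measured from the start of a row.

```c
#include <stdbool.h>
#include <stddef.h>

/* Both accesses index the same array of rows. The row indices (x, y) are
 * unknown; only the constant byte offsets within a row are known, with
 * off_p < off_q. Returns true when the accesses cannot overlap for any
 * choice of rows. */
static bool rows_no_alias(size_t off_p, size_t size_p,
                          size_t off_q, size_t size_q,
                          size_t row_size) {
    /* The access through p ends before the offset of q. */
    bool p_ends_before_q = off_p < off_q && off_p + size_p <= off_q;
    /* The access through q does not run past the end of its row. */
    bool q_stays_in_row = off_q + size_q <= row_size;
    return p_ends_before_q && q_stays_in_row;
}

/* Reference check: do byte ranges [a, a+na) and [b, b+nb) overlap? */
static bool ranges_overlap(size_t a, size_t na, size_t b, size_t nb) {
    return a < b + nb && b < a + na;
}
```

With `row_size = 5 * sizeof(int)`, `off_p = 0` and `off_q = sizeof(int)`, the predicate holds when `sp <= sizeof(int)` and `sq <= 4 * sizeof(int)`, matching the bounds given in the message. `row_size` is exactly the piece of information that a flattened GEP no longer exposes.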