Wojciech Matyjewicz
2007-Nov-13 22:13 UTC
[LLVMdev] BasicAliasAnalysis and out-of-bound GEP indices
Hi! While investigating into the PR1782 I spent some time analyzing BasicAliasAnalysis.cpp. While the mentioned problem should be fixed now (I hope), I have discovered some other possibilities for a bug to occur. In the case of checking for aliasing of two pointer values, where at least one of them is a GEP instruction with out-of-bound indices, BasicAliasAnalysis can return NoAlias, even if the values point the same memory cell. I'm not sure if it's an error, as out-of-bound indexing can bring undefined result (as stated in LangRef.html). Please, take a look at this code (BasicAA results are in comments). --- target datalayout "e-p:32:32:32-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:32:64-f32:32:32-f64:32:64-v64:64:64-v128:128:128-a0:0:64-f80:32:32" %ty1 = type [2 x [2 x i8]] %ty2 = type [2 x {i8, [2 x i8]}] @G = external global %ty1 @H = external global %ty2 ; alias(%a, %b, 1, 1) returns NoAlias, ; although they point to the same memory cell define void @fun1() { entry: %a = getelementptr %ty1* @G, i32 0, i32 1, i32 -2 %b = bitcast %ty1* @G to i8* ret void } ; alias(%c, %d, 1, 1) returns NoAlias, ; although they may point to the same memory cell (if %x == -2) define void @fun2(i32 %x) { entry: %c = getelementptr %ty1* @G, i32 0, i32 1, i32 %x %d = bitcast %ty1* @G to i8* ret void } ; alias(%e, %f, 1, 1) returns NoAlias, ; although they may point to the same memory cell (if %z == 2) define void @fun3(i32 %z) { entry: %e = getelementptr %ty2* @H, i32 0, i32 1, i32 0 %f = getelementptr %ty2* @H, i32 0, i32 0, i32 1, i32 %z ret void } --- Should we treat this as a bug (I may attempt to prepare a patch then)?. I suppose these are real corner cases, rather improbable to occur in real-world programs. However, something similiar (but no so weird) can be produced from the attached C source. It is a reduced version of preprocessed ehci-hcd.c file (from Linux kernel; compilation of this file discovered bug #1782). --Wojtek -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: ehci-hcd-reduced.i URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20071113/f3bdbf58/attachment.ksh>
Owen Anderson
2007-Nov-13 23:19 UTC
[LLVMdev] BasicAliasAnalysis and out-of-bound GEP indices
It's an optimization opportunity! When behavior is undefined, we're free to interpret it to be "whatever makes optimization easiest." If the two do actually happen to alias, well, it's the programmer's fault anyways, because they were doing something undefined! --Owen On Nov 13, 2007, at 4:13 PM, Wojciech Matyjewicz wrote:> Hi! > > While investigating into the PR1782 I spent some time analyzing > BasicAliasAnalysis.cpp. While the mentioned problem should be fixed > now > (I hope), I have discovered some other possibilities for a bug to > occur. > > In the case of checking for aliasing of two pointer values, where at > least one of them is a GEP instruction with out-of-bound indices, > BasicAliasAnalysis can return NoAlias, even if the values point the > same > memory cell. I'm not sure if it's an error, as out-of-bound indexing > can > bring undefined result (as stated in LangRef.html). > > Please, take a look at this code (BasicAA results are in comments). > --- > target datalayout > "e-p:32:32:32-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:32:64-f32:32:32- > f64:32:64-v64:64:64-v128:128:128-a0:0:64-f80:32:32" > > %ty1 = type [2 x [2 x i8]] > %ty2 = type [2 x {i8, [2 x i8]}] > > @G = external global %ty1 > @H = external global %ty2 > > ; alias(%a, %b, 1, 1) returns NoAlias, > ; although they point to the same memory cell > define void @fun1() { > entry: > %a = getelementptr %ty1* @G, i32 0, i32 1, i32 -2 > %b = bitcast %ty1* @G to i8* > ret void > } > > ; alias(%c, %d, 1, 1) returns NoAlias, > ; although they may point to the same memory cell (if %x == -2) > define void @fun2(i32 %x) { > entry: > %c = getelementptr %ty1* @G, i32 0, i32 1, i32 %x > %d = bitcast %ty1* @G to i8* > ret void > } > > ; alias(%e, %f, 1, 1) returns NoAlias, > ; although they may point to the same memory cell (if %z == 2) > define void @fun3(i32 %z) { > entry: > %e = getelementptr %ty2* @H, i32 0, i32 1, i32 0 > %f = getelementptr %ty2* @H, i32 0, i32 0, i32 1, i32 %z > ret void > } > --- > > Should we treat this as a bug (I may attempt to prepare a patch > then)?. > I suppose these are real corner cases, rather improbable to occur in > real-world programs. However, something similiar (but no so weird) can > be produced from the attached C source. It is a reduced version of > preprocessed ehci-hcd.c file (from Linux kernel; compilation of this > file discovered bug #1782). > > --Wojtek > struct device; > > struct usb_bus { > struct device *controller; > }; > > struct usb_hcd { > struct usb_bus self; > > unsigned long hcd_priv[0]; > }; > > struct ehci_hcd; > > static inline __attribute__((always_inline)) struct ehci_hcd > *hcd_to_ehci (struct usb_hcd *hcd) > { > return (struct ehci_hcd *) (hcd->hcd_priv); > } > > static inline __attribute__((always_inline)) struct usb_hcd > *ehci_to_hcd (struct ehci_hcd *ehci) > { > return ({ const typeof( ((struct usb_hcd *)0)->hcd_priv ) *__mptr = > ((void *) ehci); (struct usb_hcd *)( (char *)__mptr - > __builtin_offsetof(struct usb_hcd,hcd_priv) );}); > } > > struct device *x; > > static int ehci_pci_setup(struct usb_hcd *hcd) > { > struct ehci_hcd *ehci = hcd_to_ehci(hcd); > // [...] > // store of expression added in order to prevent it from being DCE'ed > x = ehci_to_hcd(ehci)->self.controller; > // [...] > } > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev-------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/pkcs7-signature Size: 2555 bytes Desc: not available URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20071113/4dcec5ff/attachment.bin>
Daniel Berlin
2007-Nov-15 07:36 UTC
[LLVMdev] BasicAliasAnalysis and out-of-bound GEP indices
Sadly, this will break a very common idiom. In GCC, we discovered it to be common enough that it broke a *bunch* of C code. In particular, you will break struct foo { int a; char name[0]; } bar = malloc(sizeof (struct foo) + strlen("thisismyname") + 1); strcpy(bar->name, "thisismyname"); It only started turning up when we started doing higher level loop opts and used alias info in dependence testing. It would end up reversing or interchanging loops around these things which while legal, broke enough software that we got yelled at. So we special case the [0] at end of struct case. On 11/13/07, Owen Anderson <resistor at mac.com> wrote:> It's an optimization opportunity! > > When behavior is undefined, we're free to interpret it to be "whatever > makes optimization easiest." If the two do actually happen to alias, > well, it's the programmer's fault anyways, because they were doing > something undefined! > > --Owen > > On Nov 13, 2007, at 4:13 PM, Wojciech Matyjewicz wrote: > > > Hi! > > > > While investigating into the PR1782 I spent some time analyzing > > BasicAliasAnalysis.cpp. While the mentioned problem should be fixed > > now > > (I hope), I have discovered some other possibilities for a bug to > > occur. > > > > In the case of checking for aliasing of two pointer values, where at > > least one of them is a GEP instruction with out-of-bound indices, > > BasicAliasAnalysis can return NoAlias, even if the values point the > > same > > memory cell. I'm not sure if it's an error, as out-of-bound indexing > > can > > bring undefined result (as stated in LangRef.html). > > > > Please, take a look at this code (BasicAA results are in comments). > > --- > > target datalayout > > "e-p:32:32:32-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:32:64-f32:32:32- > > f64:32:64-v64:64:64-v128:128:128-a0:0:64-f80:32:32" > > > > %ty1 = type [2 x [2 x i8]] > > %ty2 = type [2 x {i8, [2 x i8]}] > > > > @G = external global %ty1 > > @H = external global %ty2 > > > > ; alias(%a, %b, 1, 1) returns NoAlias, > > ; although they point to the same memory cell > > define void @fun1() { > > entry: > > %a = getelementptr %ty1* @G, i32 0, i32 1, i32 -2 > > %b = bitcast %ty1* @G to i8* > > ret void > > } > > > > ; alias(%c, %d, 1, 1) returns NoAlias, > > ; although they may point to the same memory cell (if %x == -2) > > define void @fun2(i32 %x) { > > entry: > > %c = getelementptr %ty1* @G, i32 0, i32 1, i32 %x > > %d = bitcast %ty1* @G to i8* > > ret void > > } > > > > ; alias(%e, %f, 1, 1) returns NoAlias, > > ; although they may point to the same memory cell (if %z == 2) > > define void @fun3(i32 %z) { > > entry: > > %e = getelementptr %ty2* @H, i32 0, i32 1, i32 0 > > %f = getelementptr %ty2* @H, i32 0, i32 0, i32 1, i32 %z > > ret void > > } > > --- > > > > Should we treat this as a bug (I may attempt to prepare a patch > > then)?. > > I suppose these are real corner cases, rather improbable to occur in > > real-world programs. However, something similiar (but no so weird) can > > be produced from the attached C source. It is a reduced version of > > preprocessed ehci-hcd.c file (from Linux kernel; compilation of this > > file discovered bug #1782). > > > > --Wojtek > > struct device; > > > > struct usb_bus { > > struct device *controller; > > }; > > > > struct usb_hcd { > > struct usb_bus self; > > > > unsigned long hcd_priv[0]; > > }; > > > > struct ehci_hcd; > > > > static inline __attribute__((always_inline)) struct ehci_hcd > > *hcd_to_ehci (struct usb_hcd *hcd) > > { > > return (struct ehci_hcd *) (hcd->hcd_priv); > > } > > > > static inline __attribute__((always_inline)) struct usb_hcd > > *ehci_to_hcd (struct ehci_hcd *ehci) > > { > > return ({ const typeof( ((struct usb_hcd *)0)->hcd_priv ) *__mptr > > ((void *) ehci); (struct usb_hcd *)( (char *)__mptr - > > __builtin_offsetof(struct usb_hcd,hcd_priv) );}); > > } > > > > struct device *x; > > > > static int ehci_pci_setup(struct usb_hcd *hcd) > > { > > struct ehci_hcd *ehci = hcd_to_ehci(hcd); > > // [...] > > // store of expression added in order to prevent it from being DCE'ed > > x = ehci_to_hcd(ehci)->self.controller; > > // [...] > > } > > _______________________________________________ > > LLVM Developers mailing list > > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev > > > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev > > >
Seemingly Similar Threads
- [LLVMdev] BasicAliasAnalysis and out-of-bound GEP indices
- [LLVMdev] BasicAliasAnalysis and out-of-bound GEP indices
- [LLVMdev] BasicAliasAnalysis and out-of-bound GEP indices
- [LLVMdev] BasicAliasAnalysis and out-of-bound GEP indices
- [PATCH] EHCI/Xen: propagate controller reset information to hypervisor