Stephan Falke
2011-Aug-10 14:37 UTC
[LLVMdev] Handling of pointer difference in llvm-gcc and clang
Hi, We are developing a bounded model checker for C/C++ programs (http://baldur.iti.kit.edu/llbmc/) that operates on LLVM's intermediate representation. While checking a C++ program that uses STL containers we noticed that llvm-gcc and clang handle pointer differences in disagreeing ways. Consider the following C function: int f(int *p, int *q) { return q - p; } Here's the LLVM code generated by llvm-gcc (2.9): define i32 @f(i32* %p, i32* %q) nounwind readnone { entry: %0 = ptrtoint i32* %q to i32 %1 = ptrtoint i32* %p to i32 %2 = sub nsw i32 %0, %1 %3 = ashr exact i32 %2, 2 ret i32 %3 } And here is what clang (2.9) produces: define i32 @f(i32* %p, i32* %q) nounwind readnone { %1 = ptrtoint i32* %q to i32 %2 = ptrtoint i32* %p to i32 %3 = sub i32 %1, %2 %4 = ashr exact i32 %3, 2 ret i32 %4 } Thus, llvm-gcc added the nsw flag to the sub, whereas clang didn't. We think that clang is right and llvm-gcc is wrong: it could be the case that p and q point into the same array, that q is 0x80000000, and that p is 0x7FFFFFFE. Then the sub results in a signed overflow, i.e., sub with nsw is a trap value. Is this a bug in llvm-gcc? --Stephan
Duncan Sands
2011-Aug-10 16:13 UTC
[LLVMdev] Handling of pointer difference in llvm-gcc and clang
Hi Stephan,> We are developing a bounded model checker for C/C++ programs > (http://baldur.iti.kit.edu/llbmc/) that operates on LLVM's intermediate > representation. While checking a C++ program that uses STL containers > we noticed that llvm-gcc and clang handle pointer differences in > disagreeing ways. > > Consider the following C function: > int f(int *p, int *q) > { > return q - p; > } > > Here's the LLVM code generated by llvm-gcc (2.9): > define i32 @f(i32* %p, i32* %q) nounwind readnone { > entry: > %0 = ptrtoint i32* %q to i32 > %1 = ptrtoint i32* %p to i32 > %2 = sub nsw i32 %0, %1 > %3 = ashr exact i32 %2, 2 > ret i32 %3 > } > > And here is what clang (2.9) produces: > define i32 @f(i32* %p, i32* %q) nounwind readnone { > %1 = ptrtoint i32* %q to i32 > %2 = ptrtoint i32* %p to i32 > %3 = sub i32 %1, %2 > %4 = ashr exact i32 %3, 2 > ret i32 %4 > } > > Thus, llvm-gcc added the nsw flag to the sub, whereas clang didn't. > > We think that clang is right and llvm-gcc is wrong: it could be the > case that p and q point into the same array, that q is 0x80000000, and > that p is 0x7FFFFFFE. Then the sub results in a signed overflow, i.e., > sub with nsw is a trap value. > > Is this a bug in llvm-gcc?in llvm-gcc (and dragonegg) this is coming directly from GCC's gimple: f (int * p, int * q) { long int D.2718; long int D.2717; long int p.1; long int q.0; int D.2714; <bb 2>: q.0_2 = (long int) q_1(D); p.1_4 = (long int) p_3(D); D.2717_5 = q.0_2 - p.1_4; D.2718_6 = D.2717_5 /[ex] 4; D.2714_7 = (int) D.2718_6; return D.2714_7; } Signed overflow in the difference of two long int (ptrdiff_t) values results in undefined behaviour according to the GCC type system, which is where the nsw flag comes from. The C front-end generates this gimple in the pointer_diff routine. The above is basically a direct transcription of what pointer_diff does. In short, I don't know if this is right or wrong; but if it is wrong it seems to be a bug in GCC's C frontend. Ciao, Duncan.
Florian Merz
2011-Aug-10 16:43 UTC
[LLVMdev] Handling of pointer difference in llvm-gcc and clang
Hi Duncan, Am Wednesday, 10. August 2011, 18:13:16 schrieb Duncan Sands:> Hi Stephan, > > > We are developing a bounded model checker for C/C++ programs > > (http://baldur.iti.kit.edu/llbmc/) that operates on LLVM's intermediate > > representation. While checking a C++ program that uses STL containers > > we noticed that llvm-gcc and clang handle pointer differences in > > disagreeing ways. > > > > Consider the following C function: > > int f(int *p, int *q) > > { > > > > return q - p; > > > > } > > > > Here's the LLVM code generated by llvm-gcc (2.9): > > define i32 @f(i32* %p, i32* %q) nounwind readnone { > > > > entry: > > %0 = ptrtoint i32* %q to i32 > > %1 = ptrtoint i32* %p to i32 > > %2 = sub nsw i32 %0, %1 > > %3 = ashr exact i32 %2, 2 > > ret i32 %3 > > > > } > > > > And here is what clang (2.9) produces: > > define i32 @f(i32* %p, i32* %q) nounwind readnone { > > > > %1 = ptrtoint i32* %q to i32 > > %2 = ptrtoint i32* %p to i32 > > %3 = sub i32 %1, %2 > > %4 = ashr exact i32 %3, 2 > > ret i32 %4 > > > > } > > > > Thus, llvm-gcc added the nsw flag to the sub, whereas clang didn't. > > > > We think that clang is right and llvm-gcc is wrong: it could be the > > case that p and q point into the same array, that q is 0x80000000, and > > that p is 0x7FFFFFFE. Then the sub results in a signed overflow, i.e., > > sub with nsw is a trap value. > > > > Is this a bug in llvm-gcc? > > in llvm-gcc (and dragonegg) this is coming directly from GCC's gimple: > > f (int * p, int * q) > { > long int D.2718; > long int D.2717; > long int p.1; > long int q.0; > int D.2714; > > <bb 2>: > q.0_2 = (long int) q_1(D); > p.1_4 = (long int) p_3(D); > D.2717_5 = q.0_2 - p.1_4; > D.2718_6 = D.2717_5 /[ex] 4; > D.2714_7 = (int) D.2718_6; > return D.2714_7; > > } > > Signed overflow in the difference of two long int (ptrdiff_t) values > results in undefined behaviour according to the GCC type system, which is > where the nsw flag comes from. > > The C front-end generates this gimple in the pointer_diff routine. The > above is basically a direct transcription of what pointer_diff does. > > In short, I don't know if this is right or wrong; but if it is wrong it > seems to be a bug in GCC's C frontend. > > Ciao, Duncan.In section 6.5.6 (Additive operators) the C standard says that the result of subtracting two pointers should have type ptrdiff_t, which is signed. I could not find any clear statement about this particular case in the standard, but to me it does not make much sense to treat the subtraction as signed just because the result is. As Stephan mentioned above treating the subtraction as signed declares the result of subtracting pointers smaller than 0x7F...F from pointers larger than 0x80...0 as undefined. This can hardly be intendend. Regards, Florian
Jack Howarth
2011-Aug-10 17:12 UTC
[LLVMdev] Handling of pointer difference in llvm-gcc and clang
On Wed, Aug 10, 2011 at 06:13:16PM +0200, Duncan Sands wrote:> Hi Stephan, > > > We are developing a bounded model checker for C/C++ programs > > (http://baldur.iti.kit.edu/llbmc/) that operates on LLVM's intermediate > > representation. While checking a C++ program that uses STL containers > > we noticed that llvm-gcc and clang handle pointer differences in > > disagreeing ways. > > > > Consider the following C function: > > int f(int *p, int *q) > > { > > return q - p; > > } > > > > Here's the LLVM code generated by llvm-gcc (2.9): > > define i32 @f(i32* %p, i32* %q) nounwind readnone { > > entry: > > %0 = ptrtoint i32* %q to i32 > > %1 = ptrtoint i32* %p to i32 > > %2 = sub nsw i32 %0, %1 > > %3 = ashr exact i32 %2, 2 > > ret i32 %3 > > } > > > > And here is what clang (2.9) produces: > > define i32 @f(i32* %p, i32* %q) nounwind readnone { > > %1 = ptrtoint i32* %q to i32 > > %2 = ptrtoint i32* %p to i32 > > %3 = sub i32 %1, %2 > > %4 = ashr exact i32 %3, 2 > > ret i32 %4 > > } > > > > Thus, llvm-gcc added the nsw flag to the sub, whereas clang didn't. > > > > We think that clang is right and llvm-gcc is wrong: it could be the > > case that p and q point into the same array, that q is 0x80000000, and > > that p is 0x7FFFFFFE. Then the sub results in a signed overflow, i.e., > > sub with nsw is a trap value. > > > > Is this a bug in llvm-gcc? > > in llvm-gcc (and dragonegg) this is coming directly from GCC's gimple: > > f (int * p, int * q) > { > long int D.2718; > long int D.2717; > long int p.1; > long int q.0; > int D.2714; > > <bb 2>: > q.0_2 = (long int) q_1(D); > p.1_4 = (long int) p_3(D); > D.2717_5 = q.0_2 - p.1_4; > D.2718_6 = D.2717_5 /[ex] 4; > D.2714_7 = (int) D.2718_6; > return D.2714_7; > > } > > Signed overflow in the difference of two long int (ptrdiff_t) values results in > undefined behaviour according to the GCC type system, which is where the nsw > flag comes from. > > The C front-end generates this gimple in the pointer_diff routine. The above is > basically a direct transcription of what pointer_diff does. > > In short, I don't know if this is right or wrong; but if it is wrong it seems > to be a bug in GCC's C frontend.Shouldn't we cc this over to the gcc mailing list for clarification then? Jack> > Ciao, Duncan. > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev