Zvonimir Rakamaric
2014-Apr-01 04:37 UTC
[LLVMdev] LLVM is doing something a bit weird in this example (which messes up DSA)
Thanks for your help John... Yup, I looked at Local.cpp even before I fired off my question to the mailing list. Take a peek here at line 464: https://github.com/llvm-mirror/poolalloc/blob/master/lib/DSA/Local.cpp Based on my understanding of this line, if AtomicCmpXchgInst does not return a pointer type, nothing gets merged. And in the example I posted, a pointer value is indeed not returned (but rather an i64 value "masking" an i32* value). And so DSA does not do the right thing in this case... -- Zvonimir -- http://zvonimir.info http://soarlab.org/ On Mon, Mar 31, 2014 at 10:29 PM, John Criswell <criswell at illinois.edu> wrote:> On 3/31/14, 6:09 PM, Zvonimir Rakamaric wrote: >> >> Hi all, >> >> I have yet another DSA-related question :), and I would appreciate >> your help. Actually, the following example generates some interesting >> potential issues in the LLVM IR too. >> >> Here is the example in C: >> #define CAS(x,y,z) __atomic_compare_exchange_n(x,&(y),z,true,0,0) >> >> int main() { >> int *x = 0; >> int y = 0; >> int *z = x; >> CAS(&z,x,&y); // if (z == x) z = &y; >> assert(*z == y); >> return 0; >> } >> >> Now, when compiled into LLVM IR 3.4 (with -mem2reg), an interesting >> thing happens in this LLVM IR excerpt: >> %1 = bitcast i32** %z to i64* >> %2 = bitcast i32** %x to i64* >> %3 = bitcast i32** %.atomictmp to i64* >> %4 = load i64* %2, align 8 >> %5 = load i64* %3, align 8 >> %6 = cmpxchg i64* %1, i64 %4, i64 %5 monotonic >> >> More specifically, there is this %2 bitcast and the subsequent %4 load >> that effectively turned an i32* pointer value into an i64 integer >> value without using ptrtoint instruction. >> My first question is whether that is even allowed in LLVM IR? >> It feels like ptrtoint gets bypassed somehow, which does not seem right. > > > I believe this optimization is correct (although it obscures source-level > types information) on systems for which pointers are 64 bits in size (e.g., > x86_64). > > > >> >> Now, if this LLVM IR code is indeed fine, then we have a problem with >> DSA when it gets to this cmpxchg instruction since DSA then does not >> know that the second and third arguments to cmpxchg are in fact >> pointers. Messy stuff... > > > Have you looked at the local DSGraphs to verify that DSA is not inferring > the pointer type for the fields accessed by the cmpxchg? I would think that > DSA can handle this sort of thing, although I can think of two reasons that > it may not: > > a) It may not properly understand the semantics of the cmpxchg instruction. > A quick check into the Local.cpp file could confirm this. > > b) DSA may be "losing" the pointer through the bitcast. I think we've > modified DSA to not do that, but again, it's possible. > > Can you verify that DSA is not inferring a link for offset 0 in %1? > > -- John T. > >> >> Your help would be greatly appreciated... >> >> Thanks! >> >> Cheers, >> -- Zvonimir >> >> -- >> http://zvonimir.info >> http://soarlab.org/ >> _______________________________________________ >> LLVM Developers mailing list >> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu >> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev > >
Zvonimir Rakamaric
2014-Apr-04 04:56 UTC
[LLVMdev] LLVM is doing something a bit weird in this example (which messes up DSA)
Hi John, Sorry for bugging you again, but I would just like to know if you are maybe looking into this to give us a hand. Coming back to our code: %1 = bitcast i32** %z to i64* %2 = bitcast i32** %x to i64* %3 = bitcast i32** %.atomictmp to i64* %4 = load i64* %2, align 8 %5 = load i64* %3, align 8 %6 = cmpxchg i64* %1, i64 %4, i64 %5 monotonic And after studying DSA some more, I think that DSA "loses" a pointer in the load instruction %4 (or maybe bitcast %2). Anyhow, I am not a DSA expert, but rather just an avid user :), and so your help would be greatly appreciated. One potential solution that comes to my mind would be to mark any pointers for which one level of indirection gets removed with bitcast using DSA's PtrToIntMarker (since that is what can ultimately happen with such pointers). Do you think that would help? Thanks! Best, -- Zvonimir -- http://zvonimir.info http://soarlab.org/ On Mon, Mar 31, 2014 at 10:37 PM, Zvonimir Rakamaric <zvonimir at cs.utah.edu> wrote:> Thanks for your help John... > > Yup, I looked at Local.cpp even before I fired off my question to the > mailing list. > > Take a peek here at line 464: > https://github.com/llvm-mirror/poolalloc/blob/master/lib/DSA/Local.cpp > > Based on my understanding of this line, if AtomicCmpXchgInst does not > return a pointer type, nothing gets merged. And in the example I > posted, a pointer value is indeed not returned (but rather an i64 > value "masking" an i32* value). And so DSA does not do the right thing > in this case... > > -- Zvonimir > > > -- > http://zvonimir.info > http://soarlab.org/ > > > On Mon, Mar 31, 2014 at 10:29 PM, John Criswell <criswell at illinois.edu> wrote: >> On 3/31/14, 6:09 PM, Zvonimir Rakamaric wrote: >>> >>> Hi all, >>> >>> I have yet another DSA-related question :), and I would appreciate >>> your help. Actually, the following example generates some interesting >>> potential issues in the LLVM IR too. >>> >>> Here is the example in C: >>> #define CAS(x,y,z) __atomic_compare_exchange_n(x,&(y),z,true,0,0) >>> >>> int main() { >>> int *x = 0; >>> int y = 0; >>> int *z = x; >>> CAS(&z,x,&y); // if (z == x) z = &y; >>> assert(*z == y); >>> return 0; >>> } >>> >>> Now, when compiled into LLVM IR 3.4 (with -mem2reg), an interesting >>> thing happens in this LLVM IR excerpt: >>> %1 = bitcast i32** %z to i64* >>> %2 = bitcast i32** %x to i64* >>> %3 = bitcast i32** %.atomictmp to i64* >>> %4 = load i64* %2, align 8 >>> %5 = load i64* %3, align 8 >>> %6 = cmpxchg i64* %1, i64 %4, i64 %5 monotonic >>> >>> More specifically, there is this %2 bitcast and the subsequent %4 load >>> that effectively turned an i32* pointer value into an i64 integer >>> value without using ptrtoint instruction. >>> My first question is whether that is even allowed in LLVM IR? >>> It feels like ptrtoint gets bypassed somehow, which does not seem right. >> >> >> I believe this optimization is correct (although it obscures source-level >> types information) on systems for which pointers are 64 bits in size (e.g., >> x86_64). >> >> >> >>> >>> Now, if this LLVM IR code is indeed fine, then we have a problem with >>> DSA when it gets to this cmpxchg instruction since DSA then does not >>> know that the second and third arguments to cmpxchg are in fact >>> pointers. Messy stuff... >> >> >> Have you looked at the local DSGraphs to verify that DSA is not inferring >> the pointer type for the fields accessed by the cmpxchg? I would think that >> DSA can handle this sort of thing, although I can think of two reasons that >> it may not: >> >> a) It may not properly understand the semantics of the cmpxchg instruction. >> A quick check into the Local.cpp file could confirm this. >> >> b) DSA may be "losing" the pointer through the bitcast. I think we've >> modified DSA to not do that, but again, it's possible. >> >> Can you verify that DSA is not inferring a link for offset 0 in %1? >> >> -- John T. >> >>> >>> Your help would be greatly appreciated... >>> >>> Thanks! >>> >>> Cheers, >>> -- Zvonimir >>> >>> -- >>> http://zvonimir.info >>> http://soarlab.org/ >>> _______________________________________________ >>> LLVM Developers mailing list >>> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu >>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev >> >>
Zvonimir Rakamaric
2014-Apr-04 22:39 UTC
[LLVMdev] LLVM is doing something a bit weird in this example (which messes up DSA)
Quick update on this: it appears that DSA is correctly marking the nodes in question (in my example below) as IntToPtr nodes even though inttoptr operation is in fact happening implicitly through bitcast. So we are now leveraging (in our SMACK verifier) the IntToPtr flag to compute conservative alias information and hence preserve soundness of the verifier in these kinds of situations. Hopefully that will be good enough... -- http://zvonimir.info http://soarlab.org/ On Thu, Apr 3, 2014 at 10:56 PM, Zvonimir Rakamaric <zvonimir at cs.utah.edu> wrote:> Hi John, > > Sorry for bugging you again, but I would just like to know if you are > maybe looking into this to give us a hand. > > Coming back to our code: > %1 = bitcast i32** %z to i64* > %2 = bitcast i32** %x to i64* > %3 = bitcast i32** %.atomictmp to i64* > %4 = load i64* %2, align 8 > %5 = load i64* %3, align 8 > %6 = cmpxchg i64* %1, i64 %4, i64 %5 monotonic > > And after studying DSA some more, I think that DSA "loses" a pointer > in the load instruction %4 (or maybe bitcast %2). > > Anyhow, I am not a DSA expert, but rather just an avid user :), and so > your help would be greatly appreciated. > > One potential solution that comes to my mind would be to mark any > pointers for which one level of indirection gets removed with bitcast > using DSA's PtrToIntMarker (since that is what can ultimately happen > with such pointers). Do you think that would help? > > Thanks! > > Best, > -- Zvonimir > > > -- > http://zvonimir.info > http://soarlab.org/ > > > On Mon, Mar 31, 2014 at 10:37 PM, Zvonimir Rakamaric > <zvonimir at cs.utah.edu> wrote: >> Thanks for your help John... >> >> Yup, I looked at Local.cpp even before I fired off my question to the >> mailing list. >> >> Take a peek here at line 464: >> https://github.com/llvm-mirror/poolalloc/blob/master/lib/DSA/Local.cpp >> >> Based on my understanding of this line, if AtomicCmpXchgInst does not >> return a pointer type, nothing gets merged. And in the example I >> posted, a pointer value is indeed not returned (but rather an i64 >> value "masking" an i32* value). And so DSA does not do the right thing >> in this case... >> >> -- Zvonimir >> >> >> -- >> http://zvonimir.info >> http://soarlab.org/ >> >> >> On Mon, Mar 31, 2014 at 10:29 PM, John Criswell <criswell at illinois.edu> wrote: >>> On 3/31/14, 6:09 PM, Zvonimir Rakamaric wrote: >>>> >>>> Hi all, >>>> >>>> I have yet another DSA-related question :), and I would appreciate >>>> your help. Actually, the following example generates some interesting >>>> potential issues in the LLVM IR too. >>>> >>>> Here is the example in C: >>>> #define CAS(x,y,z) __atomic_compare_exchange_n(x,&(y),z,true,0,0) >>>> >>>> int main() { >>>> int *x = 0; >>>> int y = 0; >>>> int *z = x; >>>> CAS(&z,x,&y); // if (z == x) z = &y; >>>> assert(*z == y); >>>> return 0; >>>> } >>>> >>>> Now, when compiled into LLVM IR 3.4 (with -mem2reg), an interesting >>>> thing happens in this LLVM IR excerpt: >>>> %1 = bitcast i32** %z to i64* >>>> %2 = bitcast i32** %x to i64* >>>> %3 = bitcast i32** %.atomictmp to i64* >>>> %4 = load i64* %2, align 8 >>>> %5 = load i64* %3, align 8 >>>> %6 = cmpxchg i64* %1, i64 %4, i64 %5 monotonic >>>> >>>> More specifically, there is this %2 bitcast and the subsequent %4 load >>>> that effectively turned an i32* pointer value into an i64 integer >>>> value without using ptrtoint instruction. >>>> My first question is whether that is even allowed in LLVM IR? >>>> It feels like ptrtoint gets bypassed somehow, which does not seem right. >>> >>> >>> I believe this optimization is correct (although it obscures source-level >>> types information) on systems for which pointers are 64 bits in size (e.g., >>> x86_64). >>> >>> >>> >>>> >>>> Now, if this LLVM IR code is indeed fine, then we have a problem with >>>> DSA when it gets to this cmpxchg instruction since DSA then does not >>>> know that the second and third arguments to cmpxchg are in fact >>>> pointers. Messy stuff... >>> >>> >>> Have you looked at the local DSGraphs to verify that DSA is not inferring >>> the pointer type for the fields accessed by the cmpxchg? I would think that >>> DSA can handle this sort of thing, although I can think of two reasons that >>> it may not: >>> >>> a) It may not properly understand the semantics of the cmpxchg instruction. >>> A quick check into the Local.cpp file could confirm this. >>> >>> b) DSA may be "losing" the pointer through the bitcast. I think we've >>> modified DSA to not do that, but again, it's possible. >>> >>> Can you verify that DSA is not inferring a link for offset 0 in %1? >>> >>> -- John T. >>> >>>> >>>> Your help would be greatly appreciated... >>>> >>>> Thanks! >>>> >>>> Cheers, >>>> -- Zvonimir >>>> >>>> -- >>>> http://zvonimir.info >>>> http://soarlab.org/ >>>> _______________________________________________ >>>> LLVM Developers mailing list >>>> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu >>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev >>> >>>