via llvm-dev
2018-Apr-27 15:53 UTC
[llvm-dev] [DbgInfo] Potential bug in location list address ranges
As Adrian said, we'd need to see the source of foo() to assess what the location-list for bar ought to be.

Without actually going to look, I would guess that 'poplt' is considered a conditional move, therefore r4's contents are not guaranteed after it executes (i.e. it is a clobber). If one operand of 'poplt' is 'pc' then of course it is also a conditional indirect branch (which is probably but not necessarily a return). This combination might be worth handling differently for location-list purposes.

But this is a tricky area, and we'd need to consider the consequences carefully.

--paulr

From: aprantl at apple.com [mailto:aprantl at apple.com]
Sent: Friday, April 27, 2018 11:22 AM
To: Son Tuan VU
Cc: Robinson, Paul; Vedant Kumar; dblaikie at gmail.com; llvm-dev
Subject: Re: [DbgInfo] Potential bug in location list address ranges

On Apr 27, 2018, at 7:48 AM, Son Tuan VU <sontuan.vu119 at gmail.com> wrote:

> Hi all,
>
> Consider this ARM assembly code of a C function:
>
> 00008124 <foo>:
>     8124: push    {r4, r6, r7, lr}
>     8126: add     r7, sp, #8
>     8128: mov     r4, r0
>     812a: ldrsb.w r0, [r2]
>     812e: cmp     r0, #1
>     8130: itt     lt
>     8132: movlt   r0, #85 ; 0x55
>     8134: poplt   {r4, r6, r7, pc} // a function return
>
>     8136: ldrb.w  ip, [r1, #3]
>     813a: ldrb.w  lr, [r4, #3]
>     813e: movs    r0, #85 ; 0x55
>     8140: cmp     lr, ip
>     8142: bne.n   8168 <foo+0x44>
>
>     8144: ldrb.w  ip, [r1, #2]
>     8148: ldrb    r3, [r4, #2]
>     814a: cmp     r3, ip
>     814c: it      ne
>     814e: popne   {r4, r6, r7, pc} // a function return
>
>     8150: ldrb.w  ip, [r1, #1]
>     8154: ldrb    r3, [r4, #1]
>     8156: cmp     r3, ip
>     8158: bne.n   8168 <foo+0x44>
>
>     815a: ldrb    r1, [r1, #0]
>     815c: ldrb    r3, [r4, #0]
>     815e: cmp     r3, r1
>     8160: ittt    eq
>     8162: moveq   r0, #3
>     8164: strbeq  r0, [r2, #0]
>     8166: moveq   r0, #170 ; 0xaa
>     8168: pop     {r4, r6, r7, pc} // a function return
>
> I have a variable bar and here's its corresponding DWARF DIE:
>
> <2><3b>: Abbrev Number: 3 (DW_TAG_formal_parameter)
>    <3c>   DW_AT_location  : 0x0 (location list)
>    <40>   DW_AT_name      : (indirect string, offset: 0x9e): bar
>    <44>   DW_AT_decl_file : 1
>    <45>   DW_AT_decl_line : 34
>    <46>   DW_AT_type      : <0x153>
>
> // Its location list
>     00000000 00008124 0000812a (DW_OP_reg0 (r0))
>     0000000b 0000812a 00008136 (DW_OP_reg4 (r4))
>     00000016 <End of list>
>
> As you can see, it says that we can find bar in r4 from 0x812a to 0x8134 (poplt). However, this is only true when the cmp instruction at 0x812e yields less than (lt). So if the value in r0 is greater than 1 (which is the case of my input), we should still be able to read the value of bar from r4 in the remainder of the function.
>
> I don't know if we can consider this a bug, because I don't even know what should be the correct location information for bar. However, in this case, since the conditional instruction that clobbers r4 is a function return, I'd expect to read the value of bar from r4 in the remainder of the function.

I can't tell for sure whether the debug info is correct without also seeing the source code, but as a general point: Debug information is must-information that holds over all paths through the program. Debug information that is only accurate for some paths is a bug. A serious bug, because if the user can't rely on the debug info to be correct in some cases, they can't rely on any of the debug info to be correct.

-- adrian

> If the conditional instruction poplt was addlt r4, r0, 3 for example, what should be the correct location list of bar?
>
> For now, my only idea is to check if the clobbering MI is a conditional return in DbgValueHistoryCalculator which computes the end address of a location list entry. But I do not feel like this is the correct fix though.
>
> Looking forward to hearing your thoughts on this,
>
> Thank you for reading this,
>
> Son Tuan Vu
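To make the complaint concrete, here is a minimal C sketch of how a consumer such as a debugger might resolve bar's location from the two location-list entries quoted above. The names loc_entry, bar_loclist and lookup_reg are made up for illustration and do not come from any DWARF library. The sketch assumes the addresses are already absolute and that each entry's end address is exclusive, which is how DWARF defines location-list ranges.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One location-list entry: the variable lives in `reg` for pc in [begin, end).
 * Hypothetical type, used only for this illustration. */
struct loc_entry {
    uint32_t begin; /* first covered address (inclusive) */
    uint32_t end;   /* first address past the range (exclusive) */
    int reg;        /* ARM core register number, e.g. 4 for r4 */
};

/* The two entries from the location list in the message above. */
const struct loc_entry bar_loclist[] = {
    { 0x8124, 0x812a, 0 }, /* DW_OP_reg0 (r0) */
    { 0x812a, 0x8136, 4 }, /* DW_OP_reg4 (r4) */
};

/* Return the register holding the variable at `pc`,
 * or -1 if no entry covers that address. */
int lookup_reg(const struct loc_entry *list, size_t n, uint32_t pc)
{
    assert(list != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (pc >= list[i].begin && pc < list[i].end)
            return list[i].reg;
    }
    return -1;
}
```

With these entries, lookup_reg returns 4 for pc 0x8134 (the poplt) and -1 for pc 0x813a, even though the ldrb.w at 0x813a still reads bar through r4 on the path where the poplt was not executed.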
Son Tuan VU via llvm-dev
2018-Apr-27 17:29 UTC
[llvm-dev] [DbgInfo] Potential bug in location list address ranges
Thank you all for taking a look at this. I pasted the C source then deleted it because I was afraid that it was too long to read...

Here's the code of *foo*. Its real name is *verifyPIN*. The variable *bar* is *userPin*.

int verifyPIN(char *userPin, char *cardPin, int *cpt)
{
    int i;
    int status;
    int diff;

    if (*cpt > 0) {
        status = 0x55;
        diff = 0x55;

        for (i = 0; i < 4; i++) {
            if (userPin[i] != cardPin[i]) {
                diff = 0xAA;
            }
        }

        if (diff == 0x55) {
            status = 0xAA;
        }
        else {
            status = 0x55;
        }

        if (status == 0xAA) {
            *cpt = 3;
            return 0xAA;
        } else {
            *cpt--;
            return 0x55;
        }
    }

    return 0x55;
}

@paul: Yes you are right, I have investigated the backend and it all starts at *IfConversionPass*. *r4* is clobbered by *poplt*, and there's no logic to handle conditional instructions in *DbgValueHistoryCalculator*, thus the issue at the binary level.

Son Tuan Vu
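The idea floated in this thread (do not let a conditional return that clobbers the register end the location range) can be sketched as a toy model in C. This is not LLVM's DbgValueHistoryCalculator code: toy_insn, foo_insns, range_end and the skip_cond_return flag are hypothetical names, the scan is linear and ignores branches, and it simply assumes that a predicated pop into pc really is a return, which Paul notes is probable but not guaranteed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy instruction record; the fields are illustrative, not LLVM's MachineInstr. */
struct toy_insn {
    uint32_t addr;     /* address of the instruction */
    uint32_t size;     /* encoded size in bytes */
    bool clobbers_reg; /* may write the tracked register (r4 here) */
    bool predicated;   /* executes only if its condition holds */
    bool is_return;    /* leaves the function when it executes */
};

/* Instructions of foo from 0x812a on that matter for r4; the non-clobbering
 * instructions in between are omitted for brevity. */
const struct toy_insn foo_insns[] = {
    { 0x812a, 4, false, false, false }, /* ldrsb.w r0, [r2] */
    { 0x8134, 2, true,  true,  true  }, /* poplt {r4, r6, r7, pc} */
    { 0x814e, 2, true,  true,  true  }, /* popne {r4, r6, r7, pc} */
    { 0x8168, 2, true,  false, true  }, /* pop   {r4, r6, r7, pc} */
};

/* Return the exclusive end address of the range in which the tracked register
 * is assumed to still hold the variable. With skip_cond_return set, a
 * predicated return that clobbers the register does not end the range:
 * either it executes and control leaves the function, or it does not execute
 * and the register is unchanged. */
uint32_t range_end(const struct toy_insn *insns, size_t n, bool skip_cond_return)
{
    assert(insns != NULL && n > 0);
    for (size_t i = 0; i < n; i++) {
        const struct toy_insn *mi = &insns[i];
        if (!mi->clobbers_reg)
            continue;
        if (skip_cond_return && mi->predicated && mi->is_return)
            continue;
        return mi->addr + mi->size; /* range ends right after the clobber */
    }
    /* No clobber found: the range runs to the end of the scanned code. */
    return insns[n - 1].addr + insns[n - 1].size;
}
```

Without the flag the model reproduces the reported end address 0x8136; with it, the range extends to 0x816a, just past the final unconditional pop. A predicated non-return clobber such as the addlt r4, r0, 3 from the original question would still end the range in both modes, because execution falls through with r4 possibly modified.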
Son Tuan VU via llvm-dev
2018-May-07 16:16 UTC
[llvm-dev] [DbgInfo] Potential bug in location list address ranges
Hello,

Has anyone taken a look at this bug? I really want to fix this, but as Paul pointed out, this requires a lot of care...

Thank you for your help

Son Tuan Vu