Compiling a simple automaton created by GNU bison with -O1 or -O2 resulted in the following machine code:

/*-----------------------------.
| yyreduce -- Do a reduction.  |
`-----------------------------*/
yyreduce:
  /* yyn is the number of a rule to reduce with.  */
  yylen = yyr2[yyn];
   0x0000000000400c14 <rpcalc_parse+628>: mov    r15d,r14d
   0x0000000000400c17 <rpcalc_parse+631>: movzx  r12d,BYTE PTR [r15+0x4015e2]
   0x0000000000400c1f <rpcalc_parse+639>: mov    eax,0x1
   0x0000000000400c24 <rpcalc_parse+644>: mov    r13,rax
   0x0000000000400c27 <rpcalc_parse+647>: sub    r13,r12
   0x0000000000400c2a <rpcalc_parse+650>: mov    eax,r13d    // assignment to eax zero-extends into rax

  /* If YYLEN is nonzero, implement the default value of the action:
     `$$ = $1'.

     Otherwise, the following line sets YYVAL to garbage.
     This behavior is undocumented and Bison
     users should not rely upon it.  Assigning to YYVAL
     unconditionally makes the parser a bit smaller, and it avoids a
     GCC warning that YYVAL may be used uninitialized.  */
  yyval = yyvsp[1-yylen];
=> 0x0000000000400c2d <rpcalc_parse+653>: movsd  xmm0,QWORD PTR [rbx+rax*8]
   0x0000000000400c32 <rpcalc_parse+658>: movsd  QWORD PTR [rbp-0x808],xmm0

As far as I understand it, assigning to eax zero-extends to rax. However, eax holds the result of "1-yylen", which is expected to be negative, so it should be sign-extended before its value is used as rax. Indexing "in the wrong direction" causes a segfault at the instruction indicated by '=>'.

Here's the disassembly from -O0, which does a sign extension (movsxd):

/*-----------------------------.
| yyreduce -- Do a reduction.  |
`-----------------------------*/
yyreduce:
  /* yyn is the number of a rule to reduce with.  */
  yylen = yyr2[yyn];
   0x0000000000401069 <+1945>: movsxd rax,DWORD PTR [rbp-0x80c]
   0x0000000000401070 <+1952>: movzx  ecx,BYTE PTR [rax*1+0x401f0f]
   0x0000000000401078 <+1960>: mov    DWORD PTR [rbp-0x824],ecx
   0x000000000040107e <+1966>: mov    ecx,0x1

  /* If YYLEN is nonzero, implement the default value of the action:
     `$$ = $1'.

     Otherwise, the following line sets YYVAL to garbage.
     This behavior is undocumented and Bison
     users should not rely upon it.  Assigning to YYVAL
     unconditionally makes the parser a bit smaller, and it avoids a
     GCC warning that YYVAL may be used uninitialized.  */
  yyval = yyvsp[1-yylen];
   0x0000000000401083 <+1971>: sub    ecx,DWORD PTR [rbp-0x824]
   0x0000000000401089 <+1977>: movsxd rax,ecx
   0x000000000040108c <+1980>: mov    rdx,QWORD PTR [rbp-0x800]
   0x0000000000401093 <+1987>: movsd  xmm0,QWORD PTR [rdx+rax*8]
   0x0000000000401098 <+1992>: movsd  QWORD PTR [rbp-0x820],xmm0

yylen is of type YYSIZE_T, which is a macro that expands to size_t or 'unsigned int'. Perhaps clang/LLVM considers "1-yylen" to be unsigned? Am I completely off-base?

This is
clang version 3.0 (trunk 127463)
Target: x86_64-unknown-linux-gnu
Thread model: posix

Csaba
--
GCS a+ e++ d- C++ ULS$ L+$ !E- W++ P+++$ w++$ tv+ b++ DI D++ 5++
The Tao of math: The numbers you can count are not the real numbers.
Life is complex, with real and imaginary parts.
"Ok, it boots. Which means it must be bug-free and perfect." -- Linus Torvalds
"People disagree with me. I just ignore them." -- Linus Torvalds
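The signed-versus-unsigned question raised above can be illustrated with a small C sketch. It is not taken from the generated parser: the helper names index_signed and index_unsigned are hypothetical, and the sketch assumes a platform where int and unsigned int are 32 bits wide (as on x86_64 Linux). It only shows how the type of yylen changes the 64-bit index that "1-yylen" turns into; it does not establish which type the parser in this report actually uses.

```c
#include <stdint.h>

/* Hypothetical helper: the index computed when yylen is a signed int.
   1 - yylen is evaluated as int, then sign-extended to 64 bits,
   which corresponds to the movsxd seen in the -O0 listing. */
static int64_t index_signed(int yylen) {
    return (int64_t)(1 - yylen);
}

/* Hypothetical helper: the index computed when yylen is unsigned int.
   The literal 1 is converted to unsigned int, the subtraction wraps
   modulo 2^32, and the widening to 64 bits is a zero-extension. */
static int64_t index_unsigned(unsigned int yylen) {
    return (int64_t)(1 - yylen);
}
```

For yylen equal to 2, index_signed returns -1 while index_unsigned returns 4294967295. If yylen really were unsigned int, a zero-extended index would be what the C rules ask for; if it is a plain int, a sign extension is expected.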
On Sat, Mar 19, 2011 at 1:44 AM, Csaba Raduly <rcsaba at gmail.com> wrote:
> [...]
> yylen is of type YYSIZE_T, which is a macro that expands to size_t or
> 'unsigned int'. Perhaps clang/LLVM considers "1-yylen" to be unsigned?
> Am I completely off-base?
>
> This is
> clang version 3.0 (trunk 127463)
> Target: x86_64-unknown-linux-gnu
> Thread model: posix

Please file a bug in Bugzilla, attach the complete preprocessed source, and show the full steps required to reproduce the issue. You might be right, but it's hard to tell without context.

-Eli
Created http://llvm.org/bugs/show_bug.cgi?id=9512

Csaba