Katsuhiro Ueno
2013-Aug-02 18:26 UTC
[LLVMdev] bug of tail call optimization on x86 target
Dear LLVM developers,
I am a developer of SML#, an ML-style functional programming language
developed at Tohoku University. Currently we are intending to use
LLVM as the backend of our SML# compiler in our upcoming release, and
have rewritten our frontend and runtime so that they can cooperate
with LLVM. LLVM works extremely fine with our SML# compiler. We are
grateful to LLVM community for providing such an excellent software.
However, when generating x86 code with enabling tail call optimization,
unfortunately I found a bug that destroys callee-save registers.
The following is the bug I found and the attached file is the fix of
this bug.
The following is the LLVM IR code that causes the bug:
; bug.ll
target triple = "i686-apple-darwin12.4.0"
declare fastcc void @foo(i32, i32, i32, i32, i32, i32)
declare i32* @bar(i32*)
define fastcc void @hoge(i32 %b) nounwind {
%a = alloca i32
store i32 0, i32* %a
%d = tail call i32* @bar(i32* %a) nounwind
store i32 %b, i32* %d
tail call fastcc void @foo(i32 1, i32 2, i32 3, i32 4, i32 5, i32 6) nounwind
ret void
}
LLVM produces the following x86 code against the above source:
% llc -tailcallopt -filetype=asm -disable-cfi -o - bug.ll
...
_hoge: ## @hoge
## BB#0:
subl $16, %esp
pushl %esi ;;;; this preserves %esi, but
subl $40, %esp
movl %ecx, %esi
movl $0, 40(%esp) ;;;; this overwrites preserved %esi
leal 40(%esp), %eax
movl %eax, (%esp)
calll _bar
movl %esi, (%eax)
movl 60(%esp), %eax
movl $6, 60(%esp)
movl $5, 56(%esp)
movl $4, 52(%esp)
movl $3, 48(%esp)
movl %eax, 44(%esp)
movl $1, %ecx
movl $2, %edx
addl $40, %esp
popl %esi ;;;; this always fails to restore %esi
jmp _foo ## TAILCALL
In the above code, 2nd instruction "pushl %esi" saves a callee-save
register %esi to the stack, but 5th instruction "movl $0, 40(%esp)"
overwrites the saved %esi. Eventually at the epilogue "popl %esi"
fails to restore the previous %esi.
This bug is due to wrong computation of stack object indices by the
x86 backend. The attached patch indicates the wrong points. Due to
integral promotion, explicit conversion to signed integer is needed
at those points.
Eliminating tail calls (with arbitrary number of arguments) is mandatory
for practical functional languages. I would appreciate it if this bug
would be fixed in the next release.
Best regards,
Katsuhiro Ueno
-------------- next part --------------
A non-text attachment was scrubbed...
Name: x86-tailcallopt-fix.diff
Type: application/octet-stream
Size: 1790 bytes
Desc: not available
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20130803/b8ef92ae/attachment.obj>
Hi Katsuhiro, Thanks very much for taking the time to dig into this and produce a patch. Is there any chance you could make a couple of tweaks?> This bug is due to wrong computation of stack object indices by the > x86 backend. The attached patch indicates the wrong points. Due to > integral promotion, explicit conversion to signed integer is needed > at those points.I'm not convinced that's the best solution, at least conceptually. SlotSize really is an unsigned quantity, and though it's unlikely we'd like 0x80000000 to be interpreted as positive, rather than negative if it ever does occur. Also, CreateFixedObject seems to take an int64_t I'd suggest: + ISelLowering line 2459: cast FPDiff to int64_t. + ISelLowering line 3327: cast SlotSize to int64_t. + FrameLowering: declare TailCallReturnAddrDelta as int64_t and subtract SlotSize? It would also be really good if you could convert your IR into a test-case (like the others in "test/CodeGen/X86/"). We like to have a test-case for every commit if it's humanly possible. Cheers. Tim.
Katsuhiro Ueno
2013-Aug-03 03:08 UTC
[LLVMdev] bug of tail call optimization on x86 target
Hi Tim, Thank you for your quick response.> I'm not convinced that's the best solution, at least conceptually. > SlotSize really is an unsigned quantity, and though it's unlikely we'd > like 0x80000000 to be interpreted as positive, rather than negative if > it ever does occur.I totally agree with you. I rewrote my fix and made a test case according to your suggestion. All of them are included in the attached file. Thanks and regards, Katsuhiro Ueno -------------- next part -------------- A non-text attachment was scrubbed... Name: x86-tailcallopt-fix-2.diff Type: application/octet-stream Size: 2947 bytes Desc: not available URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20130803/901297c7/attachment.obj>