NAKAMURA Takumi
2011-May-17 13:28 UTC
[LLVMdev] [cfe-dev] x86_64-pc-win32 ABI var arg code gen bug? Is the bitcode correct? Or is it the code gen?
Andrew,

That is not a clang issue.

I think, in practice, {rcx, rdx, r8, r9} might not need to be spilled
to the "home area" in that case, because va_arg would not touch the
first 4 args.
Let me know if you have issues.

I know it must be suboptimal; the "home area" would be vacant in any
case, AFAIK. It would be better if the 4 args were spilled into the
home area. Working on this might be harder, I guess. Thank you.

...Takumi

2011/5/17 Andrew Fish <afish at apple.com>:
> It looks like for x86_64-pc-win32 the compiler does not generate the correct
> code? It looks like the spill of the argument registers to the 32-byte
> callers shadow space is not in the bitcode?
>
> I have some code (attached as v.c):
>
> int
> ShellPrintHiiEx (
>   int Col,
>   int Row,
>   const char *Language,
>   const void *HiiFormatStringId,
>   const void *HiiFormatHandle,
>   ...
>   )
> {
>   VA_LIST Marker;
>   int Value;
>
>   VA_START (Marker, HiiFormatHandle);
>   Value = ReturnMarker (Marker);
>   VA_END(Marker);
>
>   return Value;
> }
>
> clang -ccc-host-triple x86_64-pc-win32 -emit-llvm -S v.c
>
> declare void @llvm.va_start(i8*) nounwind
> declare void @llvm.va_end(i8*) nounwind
>
> define i32 @ShellPrintHiiEx(i32 %Col, i32 %Row, i8* %Language, i8*
> %HiiFormatStringId, i8* %HiiFormatHandle, ...) nounwind {
>   %1 = alloca i32, align 4
>   %2 = alloca i32, align 4
>   %3 = alloca i8*, align 8
>   %4 = alloca i8*, align 8
>   %5 = alloca i8*, align 8
>   %Marker = alloca i8*, align 8
>   %Value = alloca i32, align 4
>   store i32 %Col, i32* %1, align 4
>   store i32 %Row, i32* %2, align 4
>   store i8* %Language, i8** %3, align 8
>   store i8* %HiiFormatStringId, i8** %4, align 8
>   store i8* %HiiFormatHandle, i8** %5, align 8
>   %6 = bitcast i8** %Marker to i8*
>   call void @llvm.va_start(i8* %6)
>   %7 = load i8** %Marker, align 8
>   %8 = call i32 @ReturnMarker(i8* %7)
>   store i32 %8, i32* %Value, align 4
>   %9 = bitcast i8** %Marker to i8*
>   call void @llvm.va_end(i8* %9)
>   %10 = load i32* %Value, align 4
>   ret i32 %10
> }
>
> So for x86_64-pc-win32 Col (%rcx), Row (%rdx), Language (%r8), and
> HiiFormatStringId (%r9) should be spilled to the 32-byte space allocated on
> the callers stack? Looks like they are being spilled locally?
>
> Does this mean the bitcode needs to be generated differently for
> x86_64-pc-win32, or does magic occur when code is generated and there is a
> bug in that chunk of code?
>
> clang -ccc-host-triple x86_64-pc-win32 -S v.c
>
>         .globl  ShellPrintHiiEx
>         .align  16, 0x90
> ShellPrintHiiEx:                        # @ShellPrintHiiEx
> # BB#0:
>         pushq   %rbp
> .Ltmp4:
>         movq    %rsp, %rbp
> .Ltmp5:
>         subq    $80, %rsp
> .Ltmp6:
>         movq    48(%rbp), %rax
>         movl    %ecx, -4(%rbp)
>         movl    %edx, -8(%rbp)
>         movq    %r8, -16(%rbp)
>         movq    %r9, -24(%rbp)
>         movq    %rax, -32(%rbp)
>         leaq    48(%rbp), %rax
>         movq    %rax, -40(%rbp)
>         movq    %rax, %rcx
>         callq   ReturnMarker
>         movl    %eax, -44(%rbp)
>         addq    $80, %rsp
>         popq    %rbp
>         ret
>
> Col (%rcx), Row (%rdx), Language (%r8), and HiiFormatStringId (%r9), are
> spilled to wrong location.
>
> Thanks,
>
> Andrew Fish
>
> cc.exe /FAcs output showing spill to callers stack:
>
> _TEXT   SEGMENT
> Value$ = 32
> Marker$ = 40
> Col$ = 64
> Row$ = 72
> Language$ = 80
> HiiFormatStringId$ = 88
> HiiFormatHandle$ = 96
> ShellPrintHiiEx PROC NEAR
> ; 78   : {
> $LN3:
>   00030 4c 89 4c 24 20   mov   QWORD PTR [rsp+32], r9
>   00035 4c 89 44 24 18   mov   QWORD PTR [rsp+24], r8
>   0003a 89 54 24 10      mov   DWORD PTR [rsp+16], edx
>   0003e 89 4c 24 08      mov   DWORD PTR [rsp+8], ecx
>   00042 48 83 ec 38      sub   rsp, 56 ; 00000038H
> ; 79   : VA_LIST Marker;
> ; 80   : int Value;
> ; 81   :
> ; 82   : VA_START (Marker, HiiFormatHandle);
>   00046 48 8d 44 24 68   lea   rax, QWORD PTR HiiFormatHandle$[rsp+8]
>   0004b 48 89 44 24 28   mov   QWORD PTR Marker$[rsp], rax
> ; 83   : Value = ReturnMarker (Marker);
>   00050 48 8b 4c 24 28   mov   rcx, QWORD PTR Marker$[rsp]
>   00055 e8 00 00 00 00   call  ReturnMarker
>   0005a 89 44 24 20      mov   DWORD PTR Value$[rsp], eax
> ; 84   : VA_END(Marker);
>   0005e 48 c7 44 24 28
>         00 00 00 00      mov   QWORD PTR Marker$[rsp], 0
> ; 85   :
> ; 86   : return Value;
>   00067 8b 44 24 20      mov   eax, DWORD PTR Value$[rsp]
> ; 87   : }
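The offsets in the cc.exe listing quoted above follow a simple pattern, which can be sketched in C. This is only an illustration under stated assumptions: integer or pointer arguments of at most 8 bytes, and offsets measured from RSP at function entry, where [rsp+0] holds the return address. The names win64_entry_offset, win64_frame_offset and win64_is_register_arg are hypothetical helpers, not part of clang, LLVM, or any ABI header.

```c
#include <assert.h>

/* Assumed Win64 constants: 8-byte stack slots, four register arguments. */
enum { WIN64_SLOT_SIZE = 8, WIN64_REG_ARGS = 4 };

/* Offset from RSP at function entry of the stack slot that belongs to
 * argument arg_index (0-based). Slot 0 starts right after the return
 * address, so args 0..3 land in the 32-byte home area at +8..+32 and
 * arg 4 is the first one the caller actually stored, at +40. */
static int win64_entry_offset(int arg_index)
{
    assert(arg_index >= 0);
    return WIN64_SLOT_SIZE * (arg_index + 1);
}

/* The same slot once the prologue has lowered RSP by frame_bytes
 * (for example the "sub rsp, 56" in the listing). */
static int win64_frame_offset(int arg_index, int frame_bytes)
{
    return win64_entry_offset(arg_index) + frame_bytes;
}

/* Non-zero when the argument arrives in rcx, rdx, r8 or r9. */
static int win64_is_register_arg(int arg_index)
{
    return arg_index < WIN64_REG_ARGS;
}
```

With frame_bytes set to 56, win64_frame_offset(0, 56) gives 64 and win64_frame_offset(4, 56) gives 96, which agrees with Col$ = 64 and HiiFormatHandle$ = 96 in the listing.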
Andrew Fish
2011-May-17 19:26 UTC
[LLVMdev] [cfe-dev] x86_64-pc-win32 ABI var arg code gen bug? Is the bitcode correct? Or is it the code gen?
On May 17, 2011, at 6:28 AM, NAKAMURA Takumi wrote:

> Andrew,
>
> That is not a clang issue.
>
> I think, in practice, {rcx, rdx, r8, r9} might not need to be spilled
> to "home area" in that case,
> because va_arg would not touch former 4 args.
> Lemme know if you had issues.
>

I'm seeing a code gen issue with x86_64-pc-win32-darwin for this test case, so I was looking at the assembly for code gen and ABI issues. I agree that for this test case the spill is code that could be optimized out.

It looks like my bug was related to sizeof (va_list) being incorrect for my triple; it was probably set to struct __va_list_tag and not char*. I tried a local fix, but I think that fix was broken. I noticed that the top of tree has fixed the issue.

Thank you for the quick response, it was helpful for me to fully understand what is going on.

Thanks,

Andrew
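To illustrate the char* form of va_list mentioned above: when va_list is a plain char*, fetching the next argument amounts to reading at the cursor and then advancing it by one 8-byte slot. The following is a minimal sketch in portable C that emulates this over an ordinary buffer rather than a real call frame. The names cursor_va_list, slot_store_int, cursor_next_int and sum_slots are hypothetical and exist only for this illustration; real code should keep using <stdarg.h>.

```c
#include <assert.h>
#include <string.h>

/* Assumed slot width: every argument occupies one 8-byte slot. */
enum { SLOT_SIZE = 8 };

/* Hypothetical stand-in for a char*-style va_list: just a byte cursor. */
typedef char *cursor_va_list;

/* Write an int into the start of a zeroed 8-byte slot. */
static void slot_store_int(char *slot, int value)
{
    memset(slot, 0, SLOT_SIZE);
    memcpy(slot, &value, sizeof value);
}

/* Read the int at the cursor, then step the cursor to the next slot. */
static int cursor_next_int(cursor_va_list *ap)
{
    int value;
    memcpy(&value, *ap, sizeof value);
    *ap += SLOT_SIZE;
    return value;
}

/* Sum count ints stored in consecutive slots starting at first_slot. */
static int sum_slots(char *first_slot, int count)
{
    cursor_va_list ap = first_slot;
    int total = 0;

    assert(count >= 0);
    for (int i = 0; i < count; i++) {
        total += cursor_next_int(&ap);
    }
    return total;
}
```

The cursor itself is pointer-sized, so a front end that instead assumes a larger struct-based va_list would get sizeof (va_list) wrong for such a target, which is the kind of mismatch described in the message.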