Hi everyone,

I'm interested in variadic functions and how LLVM handles them. I discovered that the Clang frontend does a great job of lowering va_arg (precisely, __builtin_va_arg) into target-dependent code. I have also seen the va_arg instruction that exists at the IR level. From what I found, the IR va_arg does not currently support all platforms, but since 2009 it seems that Windows 64-bit is partially supported. So I tried to play with it and ran into the following issue.

On Windows 64-bit, when passing arguments to a variadic function, the first four parameters are passed in registers and the others on the stack. The stack arguments are 8-byte aligned (I guess it's related to the ABI). For example, debugging the IR code at the end of this message gives the following state right before the call. We clearly see the 8-byte alignment.

rcx      : <i64>  -6778056391233182162
rdx      : <i8*>  0x13E1A4
r8       : <i64*> 0x50f070
r9       : <i64*> 0x50d830
0x2EE070 : <i64*> 0x50d830
0x2EE078 : <i32>  16
0x2EE080 : <i32>  10
0x2EE088 : <i32>  10
0x2EE090 : <i64*> 0x50ee40

When using va_arg (IR) to retrieve these parameters, it does not respect the alignment and tries to access the parameters as if they were contiguous in memory.

%0 = va_arg i8* %ap2, i64* ; OK
%1 = va_arg i8* %ap2, i64* ; OK
%2 = va_arg i8* %ap2, i64* ; OK (0x2EE070)
%3 = va_arg i8* %ap2, i32  ; OK (0x2EE078)
%4 = va_arg i8* %ap2, i32  ; Wrong ! 0x2EE07C
%5 = va_arg i8* %ap2, i32  ; Wrong ! 0x2EE080
%6 = va_arg i8* %ap2, i64* ; Wrong ! 0x2EE084

The result can be seen by running the IR code at the end.

E:\test>clang test.ll -o test.exe
E:\test>test.exe
values : n2 = 16, dna = 0, dnb = 10

n2, dna and dnb are, respectively, the three i32 variables.

Does anyone know how to fix this? The alignment attribute on the variadic function does nothing, and VAArgInst does not support setAlignment() the way AllocaInst does.
During my research, I found that when a VAArgInst is lowered in SelectionDAG::expandVAArg(), the alignment information is retrieved from the va_arg SDNode, and the lowering is wrong (in this case). The alignment is set in SelectionDAGBuilder::visitVAArg(), which creates a VAArg DAG node using DL.getABITypeAlignment(I.getType()). DL.getABITypeAlignment(I.getType()) returns 4 if the type is i32 and 8 if it is i64. For testing, I forced it to 8 and the IR example below worked fine.

Is there some kind of attribute to force function parameters to be laid out contiguously? Or could it be that the va_arg alignment is wrongly derived from DL.getABITypeAlignment?

Thanks in advance for your help.

Regards,
Gaël

Here is the IR code for testing:

target datalayout = "e-m:w-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-pc-windows-msvc18.0.0"

%struct.va_list = type { i8* }

$"str" = comdat any

@"str" = linkonce_odr unnamed_addr constant [38 x i8] c"values : n2 = %d, dna = %d, dnb = %d\0A\00", comdat, align 1

declare i32 @printf(i8*, ...) #1
declare void @llvm.va_start(i8*)
declare void @llvm.va_end(i8*)

; Function Attrs: nounwind uwtable
define i32 @main() #0 {
  %r = alloca i64
  %a = alloca i64
  %b = alloca i64
  %t = alloca i64
  %rPty = alloca i64*
  %aPty = alloca i64*
  %bPty = alloca i64*
  %tPty = alloca i64*
  store i64* %r, i64** %rPty
  store i64* %a, i64** %aPty
  store i64* %b, i64** %bPty
  store i64* %t, i64** %tPty
  %rLoad = load i64*, i64** %rPty
  %aLoad = load i64*, i64** %aPty
  %bLoad = load i64*, i64** %bPty
  %tLoad = load i64*, i64** %tPty
  %ret = alloca i64
  %retPty = alloca i64*
  store i64* %ret, i64** %retPty
  %load = load i64*, i64** %retPty
  %bit = bitcast i64* %load to i8*
  call void (i64, i8*, ...) @variadiquefunc(i64 -6778056391233182162, i8* %bit, i64* %rLoad, i64* %aLoad, i64* %bLoad, i32 16, i32 10, i32 10, i64* %tLoad)
  ret i32 0
}

define internal void @variadiquefunc(i64 %p, i8* %pp, ...) {
entry:
  %ap = alloca %struct.va_list
  %ap2 = bitcast %struct.va_list* %ap to i8*
  call void @llvm.va_start(i8* %ap2)
  %0 = va_arg i8* %ap2, i64*
  %1 = va_arg i8* %ap2, i64*
  %2 = va_arg i8* %ap2, i64*
  %3 = va_arg i8* %ap2, i32
  %4 = va_arg i8* %ap2, i32
  %5 = va_arg i8* %ap2, i32
  %6 = va_arg i8* %ap2, i64*
  %7 = call i32 (i8*, ...) @printf(i8* getelementptr inbounds ([38 x i8], [38 x i8]* @"str", i32 0, i32 0), i32 %3, i32 %4, i32 %5)
  call void @llvm.va_end(i8* %ap2)
  ret void
}

attributes #0 = { nounwind uwtable "disable-tail-calls"="false"
  "less-precise-fpmad"="false" "no-frame-pointer-elim"="false"
  "no-infs-fp-math"="false" "no-nans-fp-math"="false"
  "stack-protector-buffer-size"="8" "target-cpu"="x86-64"
  "target-features"="+fxsr,+mmx,+sse,+sse2" "unsafe-fp-math"="false"
  "use-soft-float"="false" }
attributes #1 = { "disable-tail-calls"="false"
  "less-precise-fpmad"="false" "no-frame-pointer-elim"="false"
  "no-infs-fp-math"="false" "no-nans-fp-math"="false"
  "stack-protector-buffer-size"="8" "target-cpu"="x86-64"
  "target-features"="+fxsr,+mmx,+sse,+sse2" "unsafe-fp-math"="false"
  "use-soft-float"="false" }
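The misread described in the message above can be modelled without LLVM at all. The following C++ sketch is only a simplified model under stated assumptions, not LLVM's lowering code: the helper names `make_win64_area` and `read_i32_args` are hypothetical, the vararg area is a plain byte buffer, and the unused upper half of each 8-byte slot is zeroed purely to make the effect visible. Each i32 is copied to the start of its slot, which is where the low half lives on little-endian x86-64.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper: lay out i32 varargs one per 8-byte slot, as the
// Win64 convention does. The upper 4 bytes of each slot are left as zero
// here (an assumption made only for this illustration).
std::vector<unsigned char> make_win64_area(const std::vector<std::int32_t>& args) {
    std::vector<unsigned char> area(args.size() * 8, 0);
    for (std::size_t i = 0; i < args.size(); ++i) {
        std::memcpy(area.data() + i * 8, &args[i], sizeof(std::int32_t));
    }
    return area;
}

// Hypothetical helper: read `count` i32 values from the area, advancing the
// cursor by `stride` bytes after each read. A stride of 4 models a cursor
// that advances by the i32 size; a stride of 8 models whole Win64 slots.
std::vector<std::int32_t> read_i32_args(const std::vector<unsigned char>& area,
                                        std::size_t count, std::size_t stride) {
    std::vector<std::int32_t> out;
    for (std::size_t i = 0; i < count; ++i) {
        std::int32_t value = 0;
        std::memcpy(&value, area.data() + i * stride, sizeof value);
        out.push_back(value);
    }
    return out;
}
```

With the three i32 arguments 16, 10, 10 from the example, a 4-byte stride yields 16, 0, 10, the same values as the reported program output, while an 8-byte stride yields 16, 10, 10.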
Hi Gaël,

On 20 April 2016 at 13:23, Gaël Jobin via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> Is there some kind of attributes to force function parameters to be
> aligned contiguously? Or could it be that va_arg alignment is wrongly
> made using DL.getABITypeAlignment?

It looks like support for the Windows ABI hasn't really been added to LLVM's va_arg instruction at all, so it's falling back to the generic implementation in SelectionDAG. To change that, you'll probably want to hook into X86TargetLowering::LowerVAARG. You ought to be able to override the alignment of 32-bit types there. (Beware, it'll never be able to handle the full breadth of C/C++ use-cases; there's a reason Clang implements it directly, and it's not purely for efficiency.)

Alternatively, you might be able to get away with always doing an i64 va_arg and truncating the result if you control the front-end and don't want to fully expand va_arg.

Cheers.

Tim.
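The "always read an i64 and truncate" alternative can be sketched in C++ as follows. This is a model of the idea only, not front-end or LLVM code: `SlotCursor`, `next_slot`, and `next_i32` are hypothetical names, and the vararg area is assumed to be an array of 8-byte slots as in the Win64 layout discussed in this thread.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical cursor over a Win64-style vararg area (8 bytes per argument).
struct SlotCursor {
    const unsigned char* pos;
};

// Always consume one whole 8-byte slot, as an i64 va_arg would.
std::uint64_t next_slot(SlotCursor& cursor) {
    std::uint64_t raw = 0;
    std::memcpy(&raw, cursor.pos, sizeof raw);
    cursor.pos += sizeof raw;
    return raw;
}

// Narrow the slot to 32 bits, discarding whatever sits in the upper half.
std::int32_t next_i32(SlotCursor& cursor) {
    return static_cast<std::int32_t>(static_cast<std::uint32_t>(next_slot(cursor)));
}
```

Because the cursor always moves by a full slot, the next read stays in step regardless of the narrower type requested; the cost is that every narrower type needs its own conversion from the 64-bit value.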
Both LLVM and GCC provide a workaround for using va_list with the MS ABI on x64. Use

  __builtin_ms_va_list ap;
  __builtin_ms_va_start (ap, n);
  __builtin_ms_va_end (ap);

instead of

  __builtin_va_list ap;
  __builtin_va_start (ap, n);
  __builtin_va_end (ap);

I use this workaround when building my UEFI firmware code, and it works. More details are in the links below:
http://lists.llvm.org/pipermail/llvm-dev/2016-January/093778.html
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=50818

Clang 3.8 handles this better than GCC 7.0: Clang 3.8 emits a compiler error like the one below, explicitly rejecting the standard va_list builtins in an MS ABI function, whereas GCC gives no warning and leaves the user confused.

test.c:15:3: error: 'va_start' used in Win64 ABI function
  va_start (Marker, Format);
  ^

Steven Shi
Intel\SSG\STO\UEFI Firmware

> -----Original Message-----
> From: llvm-dev [mailto:llvm-dev-bounces at lists.llvm.org] On Behalf Of Tim
> Northover via llvm-dev
> Sent: Thursday, April 21, 2016 4:56 AM
> To: Gaël Jobin <gael.jobin at switzerlandmail.ch>
> Cc: LLVM Developers Mailing List <llvm-dev at lists.llvm.org>
> Subject: Re: [llvm-dev] va_arg on Windows 64
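As a rough illustration of the workaround above, here is a small variadic function written against the MS-ABI builtins. Several things in it are assumptions rather than part of the message: the function `sum_ints` and the `EXAMPLE_*` macros are hypothetical, the function is marked with `__attribute__((ms_abi))` so that it actually uses the MS calling convention, arguments are fetched with `__builtin_va_arg`, and a `<cstdarg>` fallback is included only so the sketch still builds on targets without these builtins.

```cpp
#include <cassert>
#include <cstdarg>

#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
// MS-ABI variant, using the builtins named in the message above.
#define EXAMPLE_MS_ABI __attribute__((ms_abi))
typedef __builtin_ms_va_list example_va_list;
#define EXAMPLE_VA_START(ap, last) __builtin_ms_va_start(ap, last)
#define EXAMPLE_VA_ARG(ap, type) __builtin_va_arg(ap, type)
#define EXAMPLE_VA_END(ap) __builtin_ms_va_end(ap)
#else
// Fallback (assumption): plain <cstdarg> on other compilers and targets.
#define EXAMPLE_MS_ABI
typedef va_list example_va_list;
#define EXAMPLE_VA_START(ap, last) va_start(ap, last)
#define EXAMPLE_VA_ARG(ap, type) va_arg(ap, type)
#define EXAMPLE_VA_END(ap) va_end(ap)
#endif

// Hypothetical example: sum `count` int arguments passed as varargs.
EXAMPLE_MS_ABI long long sum_ints(int count, ...) {
    example_va_list ap;
    EXAMPLE_VA_START(ap, count);
    long long total = 0;
    for (int i = 0; i < count; ++i) {
        total += EXAMPLE_VA_ARG(ap, int);
    }
    EXAMPLE_VA_END(ap);
    return total;
}
```

A call such as `sum_ints(5, 1, 2, 3, 4, 5)` passes more variadic arguments than fit in the four MS-ABI argument registers, so it exercises both the register and the stack part of the argument list.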
Thank you for your response.

> It looks like support for the Windows ABI hasn't really been added to
> LLVM's va_arg instruction at all, so it's falling back to the generic
> implementation in SelectionDAG. To change that, you'll probably want
> to hook into X86TargetLowering::LowerVAARG.

Yes, in the end I changed the behavior so that a new VAArg node with the correct alignment is generated when the Win64 calling convention is used. I don't know if it's correct, but it works.

    SDValue X86TargetLowering::LowerVAARG(SDValue Op, SelectionDAG &DAG) const {
      assert(Subtarget->is64Bit() &&
             "LowerVAARG only handles 64-bit va_arg!");
      assert(Op.getNode()->getNumOperands() == 4);

      MachineFunction &MF = DAG.getMachineFunction();
      if (Subtarget->isCallingConvWin64(MF.getFunction()->getCallingConv())) {
        // The Win64 ABI uses char* instead of a structure.

        // Win64 ABI stores the arguments on the stack with an alignment of 8 bytes
        SDValue V = DAG.getVAArg(Op.getNode()->getVTList().VTs[0],
                                 SDLoc(Op),
                                 Op.getOperand(0),
                                 Op.getOperand(1),
                                 Op.getOperand(2),
                                 8);
        return DAG.expandVAArg(V.getNode());
      }
      ...

I searched everywhere (especially in LowerCall) for a function that would give me this stack argument alignment of 8, but I didn't find one, so I hardcoded the value.

By the way, I didn't find any official documentation that describes argument alignment on the stack for a Win64 call.

> You ought to be able to
> override the alignment of 32-bit types there. (Beware, it'll never be
> able to handle the full breadth of C/C++ use-cases, there's a reason
> Clang implements it directly and it's not purely for efficiency).

What do you mean by "C/C++ use-cases"? Will va_arg never be "fully working"? Why?

> Alternatively, you might be able to get away with always doing an i64
> va_arg and truncating the result if you control the front-end and
> don't want to fully expand va_arg.

Yes, I tried to implement this solution, but given the different types, it quickly becomes complicated.

Regards,
Gaël