Hello guys,
     Thanks for taking the time to help while you are busy.
     I am working on an open source project. It supports a shader language,
and I want a JIT feature, so I am using LLVM.
     But I have found that LLVM's ABI and calling convention do not
interoperate with MSVC. For example, I have the following code:
struct float4 { float x, y, z, w; };
struct float4x4 { float4 x, y, z, w; };
float4 fetch_vs( float4x4* mat ){ return mat->y; }
Caller:
// ...
float4x4 mat; // Initialized
float4 ret = fetch(&mat); // fetch is JITed by LLVM
float4 ret_vs = fetch_vs(&mat);
// ...
Callee(LLVM):
%vec4 = type { float, float, float, float }
%mat44 = type { %vec4, %vec4, %vec4, %vec4 }
define %vec4 @fetch( %mat44* %m ) {
       %matval = load %mat44* %m
       ; index 1 selects the second member (y), matching mat->y
       %v1 = extractvalue %mat44 %matval, 1
       ret %vec4 %v1
 }
     But when fetch is generated by LLVM and the JIT-ed function is called
from MSVC-compiled code, the program *crashes*.
     I traced into both implementations. The disassembly is:
*    Caller:*
        float4x4 f;
        float4 b = fetch(&f);
    // Call the function. The first argument (pushed last) is the address of
    // a caller-allocated temporary for the result; the second is &f.
    013C1428  lea         eax,[ebp-48h]
    013C142B  push        eax
    013C142C  lea         ecx,[ebp-138h]
    013C1432  push        ecx
    013C1433  call        fetch (13C11D6h)
    013C1438  add         esp,8
    // Copy result to another temporary variable.
    013C143B  mov         edx,dword ptr [eax]
    013C143D  mov         dword ptr [ebp-150h],edx
    013C1443  mov         ecx,dword ptr [eax+4]
    013C1446  mov         dword ptr [ebp-14Ch],ecx
    013C144C  mov         edx,dword ptr [eax+8]
    013C144F  mov         dword ptr [ebp-148h],edx
    013C1455  mov         eax,dword ptr [eax+0Ch]
    013C1458  mov         dword ptr [ebp-144h],eax
    013C145E  mov         ecx,dword ptr [ebp-150h]
    // Copy the second temporary to variable 'b'
    013C1464  mov         dword ptr [ebp-60h],ecx
    013C1467  mov         edx,dword ptr [ebp-14Ch]
    013C146D  mov         dword ptr [ebp-5Ch],edx
    013C1470  mov         eax,dword ptr [ebp-148h]
    013C1476  mov         dword ptr [ebp-58h],eax
    013C1479  mov         ecx,dword ptr [ebp-144h]
    013C147F  mov         dword ptr [ebp-54h],ecx
*    Callee( 'fetch_vs' MSVC ):*
      float4 __cdecl fetch( float4x4* mat ){
// Prologue: set up the frame and fill the locals with 0xCC (debug build)
002C13B0  push        ebp
002C13B1  mov         ebp,esp
002C13B3  sub         esp,0C0h
002C13B9  push        ebx
002C13BA  push        esi
002C13BB  push        edi
002C13BC  lea         edi,[ebp-0C0h]
002C13C2  mov         ecx,30h
002C13C7  mov         eax,0CCCCCCCCh
002C13CC  rep stos    dword ptr es:[edi]
        return mat->y;
// Copy to address of first temporary variable.
002C13CE  mov         eax,dword ptr [mat]
002C13D1  add         eax,10h
002C13D4  mov         ecx,dword ptr [ebp+8]
002C13D7  mov         edx,dword ptr [eax]
002C13D9  mov         dword ptr [ecx],edx
002C13DB  mov         edx,dword ptr [eax+4]
002C13DE  mov         dword ptr [ecx+4],edx
002C13E1  mov         edx,dword ptr [eax+8]
002C13E4  mov         dword ptr [ecx+8],edx
002C13E7  mov         eax,dword ptr [eax+0Ch]
002C13EA  mov         dword ptr [ecx+0Ch],eax
002C13ED  mov         eax,dword ptr [ebp+8]
    }
002C13F0  pop         edi
002C13F1  pop         esi
002C13F2  pop         ebx
002C13F3  mov         esp,ebp
002C13F5  pop         ebp
002C13F6  ret
    *Callee( 'fetch' LLVM ):*
010B0010  mov         eax,dword ptr [esp+4]
010B0014  mov         ecx,dword ptr [esp+8]
010B0018  movss       xmm0,dword ptr [ecx+1Ch]
010B001D  movss       dword ptr [eax+0Ch],xmm0
010B0022  movss       xmm0,dword ptr [ecx+18h]
010B0027  movss       dword ptr [eax+8],xmm0
010B002C  movss       xmm0,dword ptr [ecx+10h]
010B0031  movss       xmm1,dword ptr [ecx+14h]
010B0036  movss       dword ptr [eax+4],xmm1
010B003B  movss       dword ptr [eax],xmm0
010B003F  ret         4
// The function contains no push/pop and the stack space is managed by the
// caller, so why is "ret 4" generated?
It unbalances the stack across the function call. The rest of the code
looks fine.
*My questions are:*
1. Why does it generate "ret 4" instead of a plain "ret" that leaves esp
unmodified?
2. Does LLVM not support the Microsoft Visual C++ compiler directly?
3. I want to integrate the LLVM JIT into an MSVC build. Is that possible?
What should I do, for example generate an adapter function? Is there any
porting document for it?
4. Does Clang support Microsoft's calling convention? If it does, how does
it work? I traced into it, but the code base is too large.
5. Does it support MinGW directly?
6. Does x64 work if your solution is applied?
Thank you very much !
Best regards,
Your fans: Ye
-- 
Ye Wu
CELL: +86 13671730301
2011/11/1 空明流转 <wuye9036 at gmail.com>:
[snip]
> 1. Why does it generate "ret 4" instead of a plain "ret" that leaves esp
> unmodified?

From what I can find, that follows the MinGW calling convention (which
is generally the same as cdecl, but not quite the same when returning a
struct). Looks like it's an oversight; I'd suggest filing a bug.

> 2. Does LLVM not support the Microsoft Visual C++ compiler directly?
>
> 3. I want to integrate the LLVM JIT into an MSVC build. Is that possible?
> What should I do, for example generate an adapter function? Is there any
> porting document for it?
> 4. Does Clang support Microsoft's calling convention? If it does, how does
> it work? I traced into it, but the code base is too large.

People have used LLVM with MSVC; making your calls across compilers
simpler generally helps.

> 5. Does it support MinGW directly?

LLVM support for MinGW is generally more mature than the MSVC support.

> 6. Does x64 work if your solution is applied?

The Windows 64-bit calling convention should work... it hasn't gotten
much testing as far as I know, though.

-Eli
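
To make the stack imbalance behind question 1 concrete, here is a small C++ sketch of the ESP bookkeeping around one call. It only models the numbers visible in the disassembly quoted in this thread: the caller pushes 8 bytes and runs `add esp,8` afterwards, the MSVC-compiled callee ends in `ret`, and the LLVM-generated callee ends in `ret 4`. The names `esp_delta` and `check_thread_cases` are made up for illustration and are not part of LLVM or MSVC.

```cpp
#include <cassert>

// Net change of ESP (in bytes) across one 32-bit x86 call sequence.
//   pushed_bytes: argument bytes the caller pushes before `call`
//   callee_pop:   N in the callee's `ret N` (0 for a plain `ret`)
//   caller_pop:   N in the caller's `add esp,N` after the call
// Hypothetical helper: this is plain arithmetic, not real code generation.
constexpr int esp_delta(int pushed_bytes, int callee_pop, int caller_pop) {
    return -pushed_bytes        // caller pushes the arguments
           - 4                  // `call` pushes the return address
           + 4 + callee_pop     // `ret N` pops the return address plus N
           + caller_pop;        // caller removes its arguments
}

// Runtime check of the two combinations seen in the thread.
inline void check_thread_cases() {
    assert(esp_delta(8, 0, 8) == 0);  // MSVC caller, MSVC callee (`ret`)
    assert(esp_delta(8, 4, 8) == 4);  // MSVC caller, LLVM callee (`ret 4`)
}
```

A result of 0 means the call is balanced. A result of 4 means ESP ends up 4 bytes higher than before the call, because both sides removed the hidden result pointer from the stack.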
空明流转 <wuye9036 at gmail.com> writes:

[snip]
> But I have found that LLVM's ABI and calling convention do not
> interoperate with MSVC. For example, I have the following code:
[snip]

I think that you're hitting this problem:

http://www.llvm.org/bugs/show_bug.cgi?id=5058

Another feature of the MSVC calling convention not directly supported by
LLVM:

http://www.llvm.org/bugs/show_bug.cgi?id=5064

HTH
空明流转 <wuye9036 at gmail.com> writes:

[snip]
> 1. Why does it generate "ret 4" instead of a plain "ret" that leaves esp
> unmodified?

Already answered in my previous message.

> 2. Does LLVM not support the Microsoft Visual C++ compiler directly?

LLVM doesn't support the full Microsoft ABI, especially when passing or
returning structs (or C++ objects). The workaround is to always pass and
return pointers to structs.

> 3. I want to integrate the LLVM JIT into an MSVC build. Is that possible?
> What should I do, for example generate an adapter function? Is there any
> porting document for it?

Using the JIT from MSVC is no harder than doing the same with any other
compiler (g++, for instance), except for the problem noted in the previous
response.

> 4. Does Clang support Microsoft's calling convention? If it does, how does
> it work? I traced into it, but the code base is too large.

Because Clang is based on LLVM, it does not support the MSVC calling
convention, due to the limitations noted in the answer to question 2.

> 5. Does it support MinGW directly?

I'm afraid that you will discover compatibility problems between Clang and
MinGW due to the struct return problem: MinGW does it right, because it
follows the MS C ABI.

> 6. Does x64 work if your solution is applied?

Sorry, I have no experience with LLVM and Windows calling conventions on
x64.
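
The workaround from the answer to question 2 (pass and return structs through pointers) might look like the following on the C++ side. This is only a sketch: `fetch_fn`, `fetch_via_pointer`, and `call_fetch` are hypothetical names, and `fetch_via_pointer` is an ordinary C++ stand-in for the JIT-ed function so that the example runs without LLVM. In a real program the function pointer would come from the JIT, and the generated function would take the two pointers explicitly and return void.

```cpp
#include <cassert>

struct float4   { float x, y, z, w; };
struct float4x4 { float4 x, y, z, w; };

// Hypothetical signature for the JIT-ed function: the result is written
// through `out`, so no struct is returned by value across the
// compiler boundary.
typedef void (*fetch_fn)(float4* out, const float4x4* mat);

// Stand-in for the JIT-ed code (assumption: it copies the second row).
void fetch_via_pointer(float4* out, const float4x4* mat) {
    *out = mat->y;
}

// Thin adapter that restores the by-value interface for C++ callers.
float4 call_fetch(fetch_fn fn, const float4x4& mat) {
    float4 result;
    fn(&result, &mat);
    return result;
}
```

Only plain pointers cross the boundary between the two code generators, so neither side has to agree on how a struct return value is passed back.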
空明流转 <wuye9036 at gmail.com> writes:

> Could I wrap LLVM with MinGW and expose a C API to be called from MSVC?
>
> And in MinGW, I would change the signature float4 foo( float44 ) to
> float4* foo( float4*, float44* )?
>
> Is that OK?

If you pass and return the structs through pointers, you don't need MinGW
in the middle; you can use MSVC directly. Please note that using MinGW will
not fix the LLVM sret problem.
空明流转 <wuye9036 at gmail.com> writes:

> Sorry, still a question: if the returned struct is 8 bytes, I remember it
> is returned in EAX:EDX. Does LLVM work in this case?

Sorry, I can't remember. My guess is "no", but I'm not sure. Maybe someone
on the LLVM mailing list (CC'ed) knows. But it doesn't seem a good idea to
implement something that will break as soon as you add another data member
to the struct.