Hello,
I think I've found a bug in the calling convention support for
X86-64/Win64.  It doesn't correctly save and restore the XMM registers
in the function prolog/epilog.  (The problem only exists on Win64, since
Linux and Mac OS use a calling convention in which these registers are
volatile and not callee-saved.)
X86RegisterInfo::getCalleeSavedRegs(), when called for a Win64 target,
does return an array of registers which includes X86::XMM6 through
X86::XMM15.
However, the prolog/epilog code does not seem to be able to handle
saving these registers correctly.
Firstly, in PEI::CalculateCalleeSavedRegisters() in
CodeGen/PrologEpilogInserter.cpp, the call to
Fn.getRegInfo().isPhysRegUsed(Reg) always seems to return true for
all of the XMM registers if the Function being emitted makes any
function calls whatsoever, and so it tries to save all the callee-saved
XMM registers even when none are actually used.
Further, the prolog/epilog emitter doesn't know how to correctly save  
and restore the XMM registers on the stack.  If outputting assembly,  
it tries to emit  "PUSH XMM6" and such; no such instruction exists,  
and this does not assemble.   If JIT'ting, it incorrectly emits PUSH  
instructions for other registers which happen to share bit encodings  
with the XMM registers (XMM6 becomes ESI, etc.)
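
To illustrate the encoding collision, here is a minimal standalone
sketch (not LLVM code; encodePushReg is a hypothetical helper I made up
for this mail).  It assumes the emitter only looks at the 4-bit hardware
register number, which is the same number for XMM6 and for RSI/ESI, so
the register class is lost by the time the opcode byte is formed:

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper, not an LLVM API: encode "PUSH r64" from a 4-bit
// hardware register number. The opcode is 0x50 + the low three bits of
// the number; registers 8-15 additionally need a REX.B prefix (0x41).
std::vector<std::uint8_t> encodePushReg(unsigned hwRegNum) {
  assert(hwRegNum < 16 && "x86-64 has 16 registers per class");
  std::vector<std::uint8_t> bytes;
  if (hwRegNum >= 8)
    bytes.push_back(0x41);  // REX.B selects r8-r15
  bytes.push_back(static_cast<std::uint8_t>(0x50 + (hwRegNum & 7)));
  return bytes;
}

// XMM6 and RSI both have hardware register number 6, so passing the
// number for XMM6 yields the single byte 0x56, which is "push rsi".
// XMM8 has number 8, which yields 0x41 0x50, which is "push r8".
```

Nothing in those bytes says "XMM", which matches what I see from the
JIT.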
Since there is no PUSH XMM* instruction, what needs to be done is to
adjust the stack pointer directly and then use MOVAPS to store the
registers to, and reload them from, the stack.
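
Roughly, the sequence I have in mind looks like the following.  This is
only a sketch that prints Intel-syntax assembly text; emitXmmSaves and
emitXmmRestores are hypothetical names, not existing LLVM functions.
It assumes RSP is 16-byte aligned once the adjustment has been made,
since MOVAPS faults on an unaligned address (MOVUPS would be needed
otherwise):

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: reserve 16 bytes per XMM register with one SUB,
// then store each register with MOVAPS at increasing offsets.
std::vector<std::string> emitXmmSaves(const std::vector<unsigned>& xmmRegs) {
  std::vector<std::string> out;
  if (xmmRegs.empty())
    return out;  // nothing to save, leave RSP alone
  const unsigned size = 16 * static_cast<unsigned>(xmmRegs.size());
  out.push_back("sub rsp, " + std::to_string(size));
  unsigned offset = 0;
  for (unsigned reg : xmmRegs) {
    assert(reg < 16 && "XMM0 through XMM15 only");
    out.push_back("movaps [rsp+" + std::to_string(offset) + "], xmm" +
                  std::to_string(reg));
    offset += 16;
  }
  return out;
}

// Hypothetical helper: the mirror image for the epilog. Reload each
// register from the same offset, then release the save area with ADD.
std::vector<std::string> emitXmmRestores(const std::vector<unsigned>& xmmRegs) {
  std::vector<std::string> out;
  if (xmmRegs.empty())
    return out;
  const unsigned size = 16 * static_cast<unsigned>(xmmRegs.size());
  unsigned offset = 0;
  for (unsigned reg : xmmRegs) {
    assert(reg < 16 && "XMM0 through XMM15 only");
    out.push_back("movaps xmm" + std::to_string(reg) + ", [rsp+" +
                  std::to_string(offset) + "]");
    offset += 16;
  }
  out.push_back("add rsp, " + std::to_string(size));
  return out;
}
```

For XMM6 and XMM7 this gives "sub rsp, 32" followed by two MOVAPS
stores in the prolog, and two MOVAPS loads followed by "add rsp, 32" in
the epilog.  A real fix would presumably fold the save area into the
normal frame layout rather than use a separate SUB/ADD pair.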
Is this already a known issue?  Other than adding a custom calling
convention, all of my experience with LLVM has been as a client, so
I'm not sure how to proceed in fixing this.  If anyone could provide
pointers, I'd appreciate it.
-Craig
--
Craig Smith
National Instruments