Larry Gritz
2014-Sep-03 18:39 UTC
[LLVMdev] Questions on the llvm 'vector' types and resulting SIMD instructions
If I generate IR using 'vector' types, for example, if my code assembles IR like this:

    define <4 x float> @simd_mul(<4 x float>, <4 x float>) {
        %3 = fmul <4 x float> %0, %1
        ret <4 x float> %3
    }

I assume that when I JIT, it will generate the best SIMD instructions available on the host it's running on? For example, when running on a machine supporting SSE, it does seem to generate SSE instructions, and this successfully turns into a function callable from C with a signature that looks like

    __m128 simd_mul (__m128 a, __m128 b);

But the vector documentation is a little sketchy, and I am not sure about a few things:

* Will it really autodetect and use the best SIMD available on my machine? (For example, SSE4.2 vs SSE2, etc.?) Is there anything I need to tell the JIT or the ExecutionEngine to make it use a particular instruction set? (The only case I care about is to generate the best code for the host it's currently running on.)

* Is there any difference in vector functionality of old JIT versus MCJIT? (Yes, I know that starting in 3.6, it'll be only MCJIT.)

* What happens if it runs on a machine without SSE? Is using vectors an error, or will it just generate the equivalent scalar code automatically? If it generates scalar code, what is the function signature, as it would appear to be called from a C function, on a machine without __m128?

* What happens to vector types of length not equal to the machine's SIMD length? If I defined a <3 x float>, would it always generate scalar code, or would it pad to a 4 x float and generate SSE instructions? Or is it not even allowed?

Thanks, and apologies if I've missed the documentation where all this is spelled out.

--
Larry Gritz
lg at larrygritz.com
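For readers who want to try the C-side view of this function without setting up a JIT, here is a minimal sketch. It assumes GCC or Clang, whose `vector_size` extension stands in for `__m128`; the `float4` typedef is a name chosen for illustration and is not from the post above. It shows only the element-wise semantics of the `fmul`, not what the LLVM JIT will emit on a given host.

```cpp
// Assumes GCC or Clang: vector_size is a compiler extension, not standard C++.
// float4 is a hypothetical name for a vector of four 32-bit float lanes.
typedef float float4 __attribute__((vector_size(16)));

// Element-wise multiply, the C++ analogue of `fmul <4 x float> %0, %1`.
// The compiler picks the instructions; the source does not name any SIMD ISA.
float4 simd_mul(float4 a, float4 b) {
    return a * b;
}
```

Calling `simd_mul` with `{1, 2, 3, 4}` and `{2, 2, 0.5, -1}` yields `{2, 4, 1.5, -4}`, one product per lane.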
Renato Golin
2014-Sep-03 19:50 UTC
Hi Larry,

I'll try to answer a few of your questions, but other folks will know more...

On 3 September 2014 19:39, Larry Gritz <lg at larrygritz.com> wrote:
> * Will it really autodetect and use the best SIMD available on my machine? (For example, SSE4.2 vs SSE2, etc.?) Is there anything I need to tell the JIT or the ExecutionEngine to make it use a particular instruction set? (The only case I care about is to generate the best code for the host it's currently running on.)

If you don't specify -mfpu/-mcpu, LLVM will try to guess the best it can. Some archs (x86) are better than others (ARM) at that, but it should never generate bad code (i.e. AVX on an SSE machine). At most, it'll guess conservatively and maybe generate SSE code on AVX machines, but not the other way around.

> * Is there any difference in vector functionality of old JIT versus MCJIT? (Yes, I know that starting in 3.6, it'll be only MCJIT.)

I don't think so. Both use the same passes and back-ends, so I'd be surprised if they did.

As obvious as it sounds, I'd heavily encourage you not to use the old JIT. Not only have we deleted it for good, but it was never that good on all architectures, so you'll be stuck with an ageing, unsupported and possibly broken JIT technology.

> * What happens if it runs on a machine without SSE? Is using vectors an error, or will it just generate the equivalent scalar code automatically? If it generates scalar code, what is the function signature, as it would appear to be called from a C function, on a machine without __m128?
> * What happens to vector types of length not equal to the machine's SIMD length? If I defined a <3 x float>, would it always generate scalar code, or would it pad to a 4xfloat and generate SSE instructions? Or is it not even allowed?

The answer to both questions is: it depends.

Obviously, <3 x float> is not a legal type on any machine, so LLVM tends to either expand it to a larger vector or split it into multiple vectors, etc. There are some IR passes that do all that, including serialization of vector code, but your mileage may vary on how well different back-ends support everything. Since you're fiddling with IR and JIT, you should make your choices based on what each one supports.

Back-ends have a late legalization phase, where they scan the DAG (after IR lowering) and legalize types (e.g. i64 into i32+i32 on 32-bit archs), so depending on the IR you give the back-end, it may know how to legalize some types, but not others. Be careful. And, as usual, if you find any odd behaviour, please report it to the list or in bugzilla.

cheers,
--renato
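To make the two legalization strategies mentioned above concrete, here is a conceptual sketch in plain C++ of (a) widening a 3-element vector to 4 lanes and (b) splitting a 64-bit add into two 32-bit halves. It only imitates the arithmetic by hand; it is not LLVM's legalizer, and the names `mul3_widened`, `add64_split` and `U64Parts` are hypothetical, chosen for illustration.

```cpp
#include <array>
#include <cstdint>

// (a) Widening: treat <3 x float> as <4 x float> with one padding lane.
std::array<float, 3> mul3_widened(const std::array<float, 3>& a,
                                  const std::array<float, 3>& b) {
    // Pad both inputs to 4 lanes; the value in the extra lane is never observed.
    float wa[4] = {a[0], a[1], a[2], 0.0f};
    float wb[4] = {b[0], b[1], b[2], 0.0f};
    float wr[4];
    for (int i = 0; i < 4; ++i) {  // stands in for a single 4-wide multiply
        wr[i] = wa[i] * wb[i];
    }
    std::array<float, 3> r = {{wr[0], wr[1], wr[2]}};  // drop the padding lane
    return r;
}

// (b) Splitting: a 64-bit integer held as two 32-bit halves.
struct U64Parts {
    std::uint32_t lo;
    std::uint32_t hi;
};

U64Parts add64_split(U64Parts x, U64Parts y) {
    U64Parts r;
    r.lo = x.lo + y.lo;                             // wraps modulo 2^32
    std::uint32_t carry = (r.lo < x.lo) ? 1u : 0u;  // wrap-around means carry out
    r.hi = x.hi + y.hi + carry;                     // propagate carry to high half
    return r;
}
```

For example, `add64_split` applied to 0x1FFFFFFFF and 0x200000001 (as lo/hi pairs) gives lo = 0 and hi = 4, that is 0x400000000. Whether a real back-end widens, splits or scalarizes a given type depends on the target, as the message above says.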
Jim Grosbach
2014-Sep-03 20:43 UTC
> On Sep 3, 2014, at 12:50 PM, Renato Golin <renato.golin at linaro.org> wrote:
>
> On 3 September 2014 19:39, Larry Gritz <lg at larrygritz.com> wrote:
>> * Will it really autodetect and use the best SIMD available on my machine? (For example, SSE4.2 vs SSE2, etc.?) Is there anything I need to tell the JIT or the ExecutionEngine to make it use a particular instruction set? (The only case I care about is to generate the best code for the host it's currently running on.)
>
> If you don't specify -mfpu/-mcpu, LLVM will try to guess the best it
> can. Some archs (x86) are better than others (ARM) at that, but it
> should never generate bad code (i.e. AVX on an SSE machine). At most,
> it'll guess conservatively and maybe generate SSE code on AVX
> machines, but not the other way around.

LLVM does provide the facilities to do that, but it’s not completely automatic. It’s very easy to do, however. The createTargetMachine() method takes a parameter for the CPU, which you set when creating the target machine. There’s a utility function, sys::getHostCPUName(), which will return a suitable value for that parameter. There’s been discussion about making that the default behavior for the common case where the compilation host and the execution target are the same machine, but there’s nothing firm yet.

>> * Is there any difference in vector functionality of old JIT versus MCJIT? (Yes, I know that starting in 3.6, it'll be only MCJIT.)
>
> I don't think so. Both use the same passes and back-ends, so I'd be
> surprised if they did.

The old JIT had somewhat spotty support for anything newer than SSSE3. The MCJIT should be a strict improvement here.