On Jul 31, 2008, at 7:22 AM, Nicolas Capens wrote:
> In the same breath I’d also like to kindly ask if someone could have
> a look at the reverse operations, namely trunc from 128 to 64 bit
> using movdq2q, and 128 to 32 and 64 to 32 using movd. This also
> seems related to Bug 2585. Thanks again.

The operations you're describing can be represented as insertelement
and extractelement in LLVM IR.

I don't know of anyone actively working on MMX tuning for LLVM, so
if you'd like to see it improve, consider yourself encouraged to
get involved directly :-).

Dan

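A minimal sketch of the lane semantics behind Dan's suggestion, written as plain C++ rather than against the LLVM API. The type aliases and function names (V2i64, V4i32, widen64to128 and so on) are hypothetical and exist only for illustration. The 32-bit lane order assumes the little-endian layout used on x86. Whether the x86 backend actually selects movq2dq, movdq2q or movd for these patterns is the open question in this thread, and nothing below claims that it does.

```cpp
#include <array>
#include <cstdint>

// Hypothetical stand-ins for LLVM vector types (illustration only).
using V2i64 = std::array<std::uint64_t, 2>;  // models <2 x i64>
using V4i32 = std::array<std::uint32_t, 4>;  // models <4 x i32>

// Models: insertelement <2 x i64> zeroinitializer, i64 %x, i32 0
// This is the 64 -> 128 bit widening with the upper lane zeroed.
V2i64 widen64to128(std::uint64_t x) {
    V2i64 v{};  // zeroinitializer
    v[0] = x;   // insert into lane 0
    return v;
}

// Models: extractelement <2 x i64> %v, i32 0
// This is the 128 -> 64 bit narrowing.
std::uint64_t narrow128to64(const V2i64& v) {
    return v[0];
}

// Models: bitcast <2 x i64> %v to <4 x i32>
// Lane order assumes little-endian layout, as on x86.
V4i32 bitcastToV4i32(const V2i64& v) {
    return V4i32{
        static_cast<std::uint32_t>(v[0]),
        static_cast<std::uint32_t>(v[0] >> 32),
        static_cast<std::uint32_t>(v[1]),
        static_cast<std::uint32_t>(v[1] >> 32)};
}

// Models: bitcast to <4 x i32>, then extractelement ..., i32 0
// This is the 128 -> 32 bit narrowing.
std::uint32_t narrow128to32(const V2i64& v) {
    return bitcastToV4i32(v)[0];
}

// Models the 64 -> 32 bit narrowing: keep the low 32 bits.
std::uint32_t narrow64to32(std::uint64_t x) {
    return static_cast<std::uint32_t>(x);
}
```

Each function corresponds to one of the conversions Nicolas lists. Widening followed by narrowing returns the original 64-bit value, and both 32-bit narrowings return its low half.
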
On 31-Jul-08, at 2:38 PM, Dan Gohman wrote:
> On Jul 31, 2008, at 7:22 AM, Nicolas Capens wrote:
>> In the same breath I’d also like to kindly ask if someone could have
>> a look at the reverse operations, namely trunc from 128 to 64 bit
>> using movdq2q, and 128 to 32 and 64 to 32 using movd. This also
>> seems related to Bug 2585. Thanks again.
>
> The operations you're describing can be represented as insertelement
> and extractelement in LLVM IR.
>
> I don't know of anyone actively working on MMX tuning for LLVM, so
> if you'd like to see it improve, consider yourself encouraged to
> get involved directly :-).

I noticed that, when doing operations on 64-bit vectors, MMX
instructions are often emitted even when SSE3 is available. Is this
really the intent, or is it just that SSE versions of certain patterns
have not been added, so instruction selection falls back to the MMX
versions? Using MMX (or x87, for that matter) is discouraged on modern
microarchitectures if you can get away with SSE.

--
Stefanus Du Toit <stefanus.dutoit at rapidmind.com>
RapidMind Inc.
phone: +1 519 885 5455 x116 -- fax: +1 519 885 1463

On Thu, Jul 31, 2008 at 2:50 PM, Stefanus Du Toit <stefanus.dutoit at rapidmind.com> wrote:
> On 31-Jul-08, at 2:38 PM, Dan Gohman wrote:
>> On Jul 31, 2008, at 7:22 AM, Nicolas Capens wrote:
>>> In the same breath I'd also like to kindly ask if someone could have
>>> a look at the reverse operations, namely trunc from 128 to 64 bit
>>> using movdq2q, and 128 to 32 and 64 to 32 using movd. This also
>>> seems related to Bug 2585. Thanks again.
>>
>> The operations you're describing can be represented as insertelement
>> and extractelement in LLVM IR.
>>
>> I don't know of anyone actively working on MMX tuning for LLVM, so
>> if you'd like to see it improve, consider yourself encouraged to
>> get involved directly :-).
>
> I noticed that, when doing operations on 64-bit vectors, MMX
> instructions are often emitted even when SSE3 is available. Is this
> really the intent or is it just that SSE versions of certain patterns
> have not been added, and therefore it falls back to MMX versions? It's
> not really encouraged to use MMX (or x87 for that matter) on modern
> microarchitectures if you can get away with SSE.

Just off the top of my head, I'd say that the pattern probably hasn't
been added. You're right that we should use SSE whenever it is
available. Could you send an example of a program that uses MMX when
the equivalent SSE instruction is available?

-bw

Hi Dan,

Yes, they could be represented with insertelement and extractelement, but I
don't think they actually generate optimal code using movq2dq and such.
Otherwise both bugs 2584 and 2585 would be fixed.

Anyway, I'm actually already encouraged to get involved myself. I'm quite
experienced with MMX and SSE, but I'm still trying to learn more about how
LLVM does instruction selection and such.

By the way, I noticed that movq2dq and such are missing from the intrinsics
as well. Maybe I could make myself useful by starting to add them? Do you
know whether http://llvm.org/docs/ExtendingLLVM.html#intrinsic is still a
good description of how to get started?

Thank you,

Nicolas

-----Original Message-----
From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] On Behalf Of Dan Gohman
Sent: Thursday, 31 July, 2008 23:39
To: LLVM Developers Mailing List
Subject: Re: [LLVMdev] Generating movq2dq using IRBuilder

On Jul 31, 2008, at 7:22 AM, Nicolas Capens wrote:
> In the same breath I'd also like to kindly ask if someone could have
> a look at the reverse operations, namely trunc from 128 to 64 bit
> using movdq2q, and 128 to 32 and 64 to 32 using movd. This also
> seems related to Bug 2585. Thanks again.

The operations you're describing can be represented as insertelement
and extractelement in LLVM IR.

I don't know of anyone actively working on MMX tuning for LLVM, so
if you'd like to see it improve, consider yourself encouraged to
get involved directly :-).

Dan

_______________________________________________
LLVM Developers mailing list
LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

Hi Stefanus,

I'm not sure that using MMX instructions for operations on 64-bit vectors
is so terrible. With x86-64 you have double the registers, but that comes
at the cost of longer instruction encodings, so there's probably no benefit
to using SSE. Or am I missing something?

Cheers,

Nicolas

On Jul 31, 2008, at 10:46 PM, Nicolas Capens wrote:
> Hi Dan,
>
> Yes, they could be represented with insertelement and extractelement, but I
> don't think they actually generate optimal code using movq2dq and such.
> Otherwise both bugs 2584 and 2585 would be fixed.
>
> Anyway, I'm actually already encouraged to get involved myself. I'm quite
> experienced with MMX and SSE, but I'm still trying to learn more about how
> LLVM does instruction selection and such.

Nice. Perhaps Dan's talk at the Developer Meeting was helpful. :-)

> By the way, I noticed that movq2dq and such are missing from the intrinsics
> as well. Maybe I could make myself useful by starting to add them? Do you
> know whether http://llvm.org/docs/ExtendingLLVM.html#intrinsic is still a
> good description of how to get started?

No. We don't need intrinsics for most of the vector instructions. They can
and should be selected from the right combination of vector_shuffle,
extractelement, etc. The right way to go about this is to figure out why
x86 isn't doing what's expected. That probably involves fixing 2584 / 2585
and perhaps more.

We usually handle SSE selections very well. Unfortunately, MMX has not
received much love. Perhaps you can take the lead on that? :-)

Evan

> Thank you,
>
> Nicolas

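To illustrate Evan's point that a generic shuffle can describe these moves without a dedicated intrinsic, here is a small sketch in plain C++ that models a two-input shuffle over two 64-bit lanes. It is not LLVM code. The names V2i64, shuffleV2 and keepLow64 are hypothetical, and the mask convention (indices 0..1 pick from the first operand, 2..3 from the second) is an assumption modelled on LLVM's shufflevector instruction.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for <2 x i64> (illustration only).
using V2i64 = std::array<std::uint64_t, 2>;

// Models a two-operand shuffle on <2 x i64>.
// Mask values 0..1 select a lane of `a`; values 2..3 select a lane of `b`.
V2i64 shuffleV2(const V2i64& a, const V2i64& b, const std::array<int, 2>& mask) {
    V2i64 result{};
    for (std::size_t i = 0; i < 2; ++i) {
        const int m = mask[i];
        result[i] = (m < 2) ? a[static_cast<std::size_t>(m)]
                            : b[static_cast<std::size_t>(m - 2)];
    }
    return result;
}

// "Keep the low 64 bits and zero the upper 64 bits" expressed as a shuffle
// with an all-zero second operand and mask <0, 2>.
V2i64 keepLow64(const V2i64& v) {
    const V2i64 zero{};
    const std::array<int, 2> mask{0, 2};
    return shuffleV2(v, zero, mask);
}
```

The operation is fully described by the operands and the mask, so in principle a backend can match it to a single move instruction. Whether the x86 backend did so at the time is what bugs 2584 and 2585 are about.
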