Hi Shahid,

Thank you so much for your response. Your suggested approach is what I am using right now. However, the overhead seems a little high because we are introducing two more instructions. I was wondering if there is a cheaper way to do it.

Best,
Zhi

On Mon, May 4, 2015 at 2:12 AM, Shahid, Asghar-ahmad <Asghar-ahmad.Shahid at amd.com> wrote:

> Hi Zhi,
>
> If I understand your question correctly, yes, you can do it by using the
> IRBuilder’s CreateVectorSplat() API.
>
>   /// \brief Return a vector value that contains \arg V broadcasted to \p
>   /// NumElts elements.
>   Value *CreateVectorSplat(unsigned NumElts, Value *V, const Twine &Name = "")
>
> For your case, the Value V will be your loaded value %0 and NumElts
> will be 2.
>
> So after
>
>   %0 = load double* %x, align 4, !tbaa !0
>
> you will get a sequence of LLVM IR:
>
>   %1 = insertelement <2 x double> %0, …
>   %2 = shufflevector <2 x double> %1, …
>
> %2 will be your desired value.
>
> Regards,
> Shahid
>
> From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] On Behalf Of zhi chen
> Sent: Monday, May 04, 2015 1:29 PM
> To: LLVM Dev
> Subject: [LLVMdev] Load value and broadcast in LLVM
>
> Is it possible to load a value into a vector register and broadcast it in
> LLVM?
>
> For example, for the following address %x
>
>   %x = getelementptr inbounds %struct._Ray* %ray, i32 0, i32 0, i32 0
>
> instead of loading the value at %x into a scalar register %0:
>
>   %0 = load double* %x, align 4, !tbaa !0
>
> I want to load it into a <2 x double> vector register %1 and make both
> elements of %1 be the value at %x.
>
> I guess one way to do this is to make getelementptr return a <2 x i32>*
> address, where the two addresses in the <2 x i32> are the same. But I don't
> know if it is possible to do this in LLVM.
>
> Any help would be appreciated.
>
> Best,
> Zhi
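
For readers following the quoted explanation, the two-instruction splat sequence can be modelled in plain C++. This is only a sketch of the semantics of insertelement followed by shufflevector with an all-zero mask. It does not use LLVM's API, and the names Vec, insert_element, shuffle_vector, splat and check_splat are hypothetical helpers introduced here for illustration. The shuffle model is simplified to a single input vector, and the "undef" lanes are modelled as zeros.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for an LLVM <N x double> value (not an LLVM type).
template <std::size_t N>
using Vec = std::array<double, N>;

// Models `insertelement`: returns a copy of `v` with lane `idx` set to `x`.
template <std::size_t N>
Vec<N> insert_element(Vec<N> v, double x, std::size_t idx) {
  v[idx] = x;
  return v;
}

// Models `shufflevector` for masks that only select lanes of the first
// operand (assumption: the second operand is undef and never selected).
template <std::size_t N>
Vec<N> shuffle_vector(const Vec<N>& v, const std::array<std::size_t, N>& mask) {
  Vec<N> out{};
  for (std::size_t i = 0; i < N; ++i) {
    out[i] = v[mask[i]];
  }
  return out;
}

// Splat: put the scalar in lane 0, then shuffle with an all-zero mask so
// every result lane reads lane 0.
template <std::size_t N>
Vec<N> splat(double x) {
  Vec<N> undef_model{};                    // zeros stand in for undef lanes
  std::array<std::size_t, N> zero_mask{};  // {0, 0, ..., 0}
  return shuffle_vector(insert_element(undef_model, x, 0), zero_mask);
}

// Small self-check mirroring the thread's case: load a double, splat to 2 lanes.
inline void check_splat() {
  double memory = 3.5;
  const double* x = &memory;
  const double loaded = *x;  // corresponds to the scalar load of %x
  const Vec<2> v = splat<2>(loaded);
  assert(v[0] == 3.5 && v[1] == 3.5);
}
```

Calling check_splat() from a main function exercises the model. The point is that only lane 0 of the intermediate vector is ever read, which is why a backend is free to match the whole sequence as one broadcast.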
Hi Zhi,

At the IR level, yes, there is an overhead of two more instructions. However, as Michel has pointed out, the backend may fold them into a single instruction wherever such an instruction is available.

Regards,
Shahid

From: zhi chen [mailto:zchenhn at gmail.com]
Sent: Monday, May 04, 2015 10:32 PM
To: Shahid, Asghar-ahmad
Cc: LLVM Dev
Subject: Re: [LLVMdev] Load value and broadcast in LLVM

Hi Shahid,

Thank you so much for your response. Your suggested approach is what I am using right now. However, the overhead seems a little high because we are introducing two more instructions. I was wondering if there is a cheaper way to do it.

Best,
Zhi
Zhi - If your IR is not ending up as the expected splat instructions (simple AVX examples below), please file a bug.

$ cat broadcast.ll
define <2 x double> @v2f64(double* %d) {
  %ld = load double, double* %d
  %v = insertelement <2 x double> undef, double %ld, i32 0
  %sh = shufflevector <2 x double> %v, <2 x double> undef, <2 x i32> <i32 0, i32 0>
  ret <2 x double> %sh
}

define <4 x double> @v4f64(double* %d) {
  %ld = load double, double* %d
  %v = insertelement <4 x double> undef, double %ld, i32 0
  %sh = shufflevector <4 x double> %v, <4 x double> undef, <4 x i32> <i32 0, i32 0, i32 0, i32 0>
  ret <4 x double> %sh
}

$ ./llc broadcast.ll -o - -mattr=avx
_v2f64:                                 ## @v2f64
        vmovddup        (%rdi), %xmm0   ## xmm0 = mem[0,0]
        retq

_v4f64:                                 ## @v4f64
        vbroadcastsd    (%rdi), %ymm0
        retq

On Mon, May 4, 2015 at 12:12 PM, Shahid, Asghar-ahmad <Asghar-ahmad.Shahid at amd.com> wrote:

> Hi Zhi,
>
> At the IR level, yes, there is an overhead of two more instructions.
> However, as Michel has pointed out, the backend may fold them into a
> single instruction wherever such an instruction is available.
>
> Regards,
> Shahid
>
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
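
A similar experiment can be started from C++ source instead of hand-written IR. The sketch below assumes a GCC or Clang compiler, since the vector_size attribute is a compiler extension and not standard C++. The names v2f64, load_splat and check_load_splat are hypothetical and chosen here only to mirror the IR example. Whether the compiler emits a single broadcast instruction for it depends on the target and flags, so the generated assembly should be inspected rather than assumed.

```cpp
#include <cassert>

// GCC/Clang vector extension (assumption: not available on other compilers):
// a 16-byte vector holding two doubles, analogous to <2 x double> in LLVM IR.
typedef double v2f64 __attribute__((vector_size(16)));

// Hypothetical helper: load one double from memory and broadcast it to
// both lanes of the vector.
inline v2f64 load_splat(const double* p) {
  const double d = *p;  // the scalar load
  v2f64 v = {d, d};     // both lanes take the loaded value
  return v;
}

// Small self-check: both lanes must equal the value stored in memory.
inline void check_load_splat() {
  double memory = 2.25;
  const v2f64 r = load_splat(&memory);
  assert(r[0] == 2.25);
  assert(r[1] == 2.25);
}
```

Compiling a file containing load_splat to assembly (for example with the compiler's -S option and AVX enabled) gives output that can be compared against the llc result shown in the message above.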