Displaying 20 results from an estimated 700 matches similar to: "[LLVMdev] Patch to synthesize x86 hadd instructions; need help with the tablegen bits"
2011 Sep 21
0
[LLVMdev] Patch to synthesize x86 hadd instructions; need help with the tablegen bits
Hi Duncan,
On Wed, Sep 21, 2011 at 1:24 PM, Duncan Sands <baldrick at free.fr> wrote:
> This patch synthesizes haddps/haddpd/hsubps/hsubpd instructions from floating
> point additions and subtractions of appropriate vector shuffles. To do this I
> introduced new x86 FHADD and FHSUB opcodes. These need to be wired up somehow
> in the .td file to the appropriate
2011 Sep 22
3
[LLVMdev] Patch to synthesize x86 hadd instructions; need help with the tablegen bits
Hi Bruno,
> Some comments:
>
> + // Try to synthesize horizontal adds from adds of shuffles.
> + if (((Subtarget->hasSSE3() && (VT == MVT::v4f32 || VT == MVT::v2f64)) ||
> + (Subtarget->hasAVX() && (VT == MVT::v8f32 || VT == MVT::v4f64))) &&
> + isHorizontalBinOp(LHS, RHS, true))
>
> 1) You probably want to do something like:
>
2011 Sep 22
0
[LLVMdev] Patch to synthesize x86 hadd instructions; need help with the tablegen bits
The output of the avx-hadd program is
3 11 7 15
Preston
-----Original Message-----
From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] On Behalf Of Duncan Sands
Sent: Thursday, September 22, 2011 3:14 PM
To: Bruno Cardoso Lopes
Cc: LLVMdev
Subject: Re: [LLVMdev] Patch to synthesize x86 hadd instructions; need help with the tablegen bits
Hi Bruno,
> Some comments:
2020 Aug 20
2
Question about llvm vectors
Hi Craig,
Thank you very much for your answer.
I did not want to discuss exactly the semantic and name of one operation
but instead raise the question "would it be beneficial to have more vector
builtins?".
You wrote that the compiler will recognize a pattern and replace it by
__builtin_ia32_haddps when possible, but how can I be sure of that? I would
have to disassemble the generated
2011 Feb 25
2
[LLVMdev] X86 LowerVECTOR_SHUFFLE Question
In ToT, LowerVECTOR_SHUFFLE for x86 has this code:
if (X86::isUNPCKLMask(SVOp))
getTargetShuffleNode(getUNPCKLOpcode(VT), dl, VT, V1, V2, DAG);
why would this not be:
if (X86::isUNPCKLMask(SVOp))
return SVOp;
I'm trying to add support for VUNPCKL and am getting into trouble
because the existing code ends up creating:
VUNPCKLPS
load
load
which is badness come selection
2011 Feb 26
0
[LLVMdev] X86 LowerVECTOR_SHUFFLE Question
David Greene <dag at cray.com> writes:
> In ToT, LowerVECTOR_SHUFFLE for x86 has this code:
>
> if (X86::isUNPCKLMask(SVOp))
> getTargetShuffleNode(getUNPCKLOpcode(VT), dl, VT, V1, V2, DAG);
>
> why would this not be:
>
> if (X86::isUNPCKLMask(SVOp))
> return SVOp;
Ok, I discovered that Bruno did this in revisions 112934, 112942 and
113020 but the logs
2020 Aug 19
2
Question about llvm vectors
Hi,
I love llvm vectors, yet I wonder why some advanced vector operations are
specific to some CPU targets?
Let me take an example:
/// Horizontally adds the adjacent pairs of values contained in two
/// 128-bit vectors of [4 x float].
///
/// \headerfile <x86intrin.h>
///
/// This intrinsic corresponds to the <c> VHADDPS </c> instruction.
///
/// \param __a
/// A
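For readers unfamiliar with the operation the header comment above describes, here is a minimal scalar sketch of a 128-bit horizontal add on two [4 x float] vectors. The names `Vec4` and `hadd_ps_model` are hypothetical and used only for illustration; this models the lane arithmetic and is not the intrinsic itself.

```cpp
#include <array>

// Hypothetical alias for a 128-bit vector of [4 x float].
using Vec4 = std::array<float, 4>;

// Scalar model of a horizontal add: adjacent pairs of `a` fill the low
// half of the result, adjacent pairs of `b` fill the high half.
Vec4 hadd_ps_model(const Vec4& a, const Vec4& b) {
    return {a[0] + a[1], a[2] + a[3], b[0] + b[1], b[2] + b[3]};
}
```

With a = {1, 2, 3, 4} and b = {5, 6, 7, 8}, this model yields {3, 7, 11, 15}.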
2015 May 04
2
[LLVMdev] [RFC][PATCH] Adding absd/hadd/sad intrinsics
Hi Asghar-Ahmed,
I saw your last ping - sorry, I'm away on vacation and back on Wednesday.
Generally, I'm not sure that having both absd/hadd and sad are compatible
with the discussions going on in other threads, for example my thread about
min and max.
Given that those two intrinsics are fairly trivial to match, I don't see
the need to have two different canonical forms.
James
On
2015 May 05
2
[LLVMdev] [RFC][PATCH] Adding absd/hadd/sad intrinsics
On 4 May 2015 at 08:37, Shahid, Asghar-ahmad
<Asghar-ahmad.Shahid at amd.com> wrote:
> My worry is regarding the query for cost calculation for specific SAD
> instructions such as ‘psad’ (X86) or ‘usad’ (ARM) in Loop Vectorizer.
Hi Shahid,
The vectorizer's cost model has the ability to return different costs
for the same instruction based on the arguments (scalar/vector,
2015 May 01
2
[LLVMdev] [RFC][PATCH] Adding absd/hadd/sad intrinsics
Hi All,
I would like to introduce intrinsics to generate efficient code for the 'absolute differences', 'horizontal add'
and 'sum of absolute differences' idioms used by user programs.
Identifying these idioms at a lower level (codegen) is complex. These idioms can be identified in LV/SLP
and vectorized using the above intrinsics to generate better code.
Proposal:
1. Add
2015 May 05
1
[LLVMdev] [RFC][PATCH] Adding absd/hadd/sad intrinsics
Hi Renato,
Thanks for your response. My concern was actually this. For example, take vector type V8i16 on X86 target
With llvm.sad() intrinsic:
VC1 (Vector Cost) = Cost associated with "PSAD" instruction.
W/ llvm.absd() and llvm.hadd()
VC2 = Cost associated with "absolute diff" + "horizontal add" ( ??? )
As I will be querying with getIntrinsicCost(ID) for these
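The relationship between the two cost cases above can be sketched with a scalar model. This assumes that the absolute-difference step is element-wise |a - b| and that the horizontal-add step reduces the vector to a single sum; the exact semantics of the proposed intrinsics are not shown in this excerpt, and the names `V8i16`, `absd`, `hsum` and `sad` below are hypothetical.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Hypothetical alias for a vector of eight 16-bit integers.
using V8i16 = std::array<int16_t, 8>;

// Element-wise absolute difference (assumed semantics).
// The subtraction is done in int; extreme inputs could exceed int16_t.
V8i16 absd(const V8i16& a, const V8i16& b) {
    V8i16 r{};
    for (std::size_t i = 0; i < r.size(); ++i)
        r[i] = static_cast<int16_t>(std::abs(a[i] - b[i]));
    return r;
}

// Horizontal sum of all lanes (assumed semantics: a full reduction).
int32_t hsum(const V8i16& v) {
    int32_t total = 0;
    for (int16_t x : v)
        total += x;
    return total;
}

// Sum of absolute differences computed directly in one pass.
int32_t sad(const V8i16& a, const V8i16& b) {
    int32_t total = 0;
    for (std::size_t i = 0; i < a.size(); ++i)
        total += std::abs(a[i] - b[i]);
    return total;
}
```

Under these assumptions `hsum(absd(a, b))` and `sad(a, b)` compute the same value, which is why the question reduces to whether a single fused cost or the sum of two separate costs is reported.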
2013 Nov 23
2
[LLVMdev] [PATCH] Detect Haswell subarchitecture (i.e. using -march=native)
I agree with Tim, you need to implement a GetCpuIDAndInfoEx function in
Host.cpp and pass the correct value to ecx. Also you need to verify that 7
is a valid leaf because an invalid leaf is defined to return the highest
supported leaf on that processor. So if a processor supports say leaf 6 and
not leaf 7, then an access to leaf 7 will return the data from leaf 6 causing
unrelated bits to be
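The leaf check described above can be sketched as follows. This is a minimal illustration, not the code in Host.cpp: `CpuidRegs`, `CpuidFn` and `hasAVX2` are hypothetical names, and the CPUID query is passed in as a callback so the logic can be exercised without executing the instruction. It relies on leaf 0 returning the highest supported basic leaf in EAX and on AVX2 being reported in bit 5 of EBX for leaf 7, sub-leaf 0.

```cpp
#include <cstdint>
#include <functional>

// Hypothetical register bundle returned by a CPUID query.
struct CpuidRegs {
    uint32_t eax, ebx, ecx, edx;
};

// Hypothetical callback: performs CPUID with EAX = leaf and ECX = subleaf.
using CpuidFn = std::function<CpuidRegs(uint32_t leaf, uint32_t subleaf)>;

// Returns true only if leaf 7 is valid and reports AVX2.
bool hasAVX2(const CpuidFn& cpuid) {
    // Leaf 0 returns the highest supported basic leaf in EAX.
    const uint32_t maxLeaf = cpuid(0, 0).eax;
    if (maxLeaf < 7)
        return false;  // Leaf 7 is invalid; its output must not be trusted.
    // Leaf 7, sub-leaf 0: EBX bit 5 indicates AVX2.
    return ((cpuid(7, 0).ebx >> 5) & 1u) != 0;
}
```

Checking the maximum leaf first means a processor that stops at leaf 6 is never misread as supporting AVX2 because of unrelated bits.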
2015 May 06
2
[LLVMdev] [RFC][PATCH] Adding absd/hadd/sad intrinsics
Hi Renato,
That’s right. I agree with your *pattern vs complexity* thinking.
So I would drop llvm.sad() and go ahead with the remaining two.
Does it make sense in general?
Regards,
Shahid
> -----Original Message-----
> From: Renato Golin [mailto:renato.golin at linaro.org]
> Sent: Tuesday, May 05, 2015 8:40 PM
> To: Shahid, Asghar-ahmad
> Cc: James Molloy; llvmdev at
2014 Sep 19
2
[LLVMdev] predicates vs. requirements [TableGen, X86InstrInfo.td]
> -----Original Message-----
> From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu]
> On Behalf Of Tom Stellard
> Sent: 19 September 2014 01:36
> To: Sanjay Patel
> Cc: llvmdev at cs.uiuc.edu
> Subject: Re: [LLVMdev] predicates vs. requirements [TableGen,
> X86InstrInfo.td]
>
> On Thu, Sep 18, 2014 at 03:25:07PM -0600, Sanjay Patel wrote:
>
2015 May 06
2
[LLVMdev] [RFC][PATCH] Adding absd/hadd/sad intrinsics
> For the time being, if you can get away with heuristics, and that fills your
> allocated time for this task, then it's the best way forward for now.
Sorry, I could not understand exactly what you mean by "heuristics".
Is it the "intrinsics approach" itself or something else?
BTW, now my plan is to just add the two intrinsics for 'absolute difference'
and
2014 Mar 13
3
[LLVMdev] Possible bug in getCallPreservedMask for CallingConv::Intel_OCL_BI
Not sure who owns this bit of code, so sending this to the general list.
It looks like there may be an unintentional fall through happening in
the X86RegisterInfo::getCallPreservedMask function.
http://llvm.org/docs/doxygen/html/X86RegisterInfo_8cpp_source.html
case CallingConv::Intel_OCL_BI
2014 Sep 18
3
[LLVMdev] predicates vs. requirements [TableGen, X86InstrInfo.td]
I tried to add an 'OptForSize' requirement to a pattern in X86InstrSSE.td,
but it appears to be ignored. However, the condition was detected when
specified as a predicate.
So this doesn't work:
def : Pat<(v2f64 (X86VBroadcast (loadf64 addr:$src))), (VMOVDDUPrm addr:$src)>,
Requires<[OptForSize]>;
But this does:
let Predicates = [OptForSize]
2016 Apr 04
7
sum elements in the vector
My target has an instruction that adds up all elements in the vector and
stores the result in a register. I'm trying to implement it in my compiler
but I'm not sure even where to start.
I did look at other targets, but they don't seem to have anything like it
(I could be wrong. My experience with LLVM is limited, so if I missed it,
I'd appreciate it if someone could point it out).
2016 May 28
4
sum elements in the vector
Hi Rail,
The two revisions below, which detect SAD patterns and emit psadbw
instructions on X86, might be of interest:
http://reviews.llvm.org/D14840
http://reviews.llvm.org/D14897
Revisions related to the absdiff intrinsics:
http://reviews.llvm.org/D10867
http://reviews.llvm.org/D11678
Hope this helps.
Regards,
Suyog
On Sat, May 28, 2016 at 4:20 AM, Rail Shafigulin via llvm-dev <
llvm-dev at
2012 Jul 26
2
[LLVMdev] X86 sub_ss and sub_sd sub-register indexes
All,
I've been trying to simplify the way LLVM models sub-register relationships a bit, and the X86 sub_ss and sub_sd sub-register indices are getting in the way. I want to get rid of them.
These sub-registers are special; they are only mentioned here:
let CompositeIndices = [(sub_ss), (sub_sd)] in {
def XMM0: Register<"xmm0">, DwarfRegNum<[17, 21, 21]>;
def