search for: haddp

Displaying 7 results from an estimated 7 matches for "haddp".

2020 Aug 20
2
Question about llvm vectors
...ig, Thank you very much for your answer. I did not want to discuss the exact semantics and name of one operation but instead raise the question "would it be beneficial to have more vector builtins?". You wrote that the compiler will recognize a pattern and replace it by __builtin_ia32_haddps when possible, but how can I be sure of that? I would have to disassemble the generated code, right? That is very impractical, isn't it? And it leads me to understand that each CPU target has a bank of patterns which it can recognize, but wouldn't it be very similar to have advanced generic vecto...
2011 Sep 21
2
[LLVMdev] Patch to synthesize x86 hadd instructions; need help with the tablegen bits
This patch synthesizes haddps/haddpd/hsubps/hsubpd instructions from floating point additions and subtractions of appropriate vector shuffles. To do this I introduced new x86 FHADD and FHSUB opcodes. These need to be wired up somehow in the .td file to the appropriate instructions. Since I have no idea how tablegen works I...
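As an illustration of the pattern this snippet describes, the following is a minimal sketch in plain Python lists (not LLVM code). It models the 128-bit HADDPS result and shows that the same values come out of an elementwise add of two shuffles. The helper names `shuffle`, `hadd_ps`, and `hadd_via_shuffles` are hypothetical and chosen only for this example; the shuffle masks are one choice that works, not necessarily the exact form the patch matches.

```python
def shuffle(a, b, mask):
    """Select elements from the concatenation a + b by index,
    in the spirit of a two-input vector shuffle."""
    src = list(a) + list(b)
    return [src[i] for i in mask]


def hadd_ps(a, b):
    """Reference model of 128-bit HADDPS on two 4-float vectors:
    adjacent pairs of a fill the low half, adjacent pairs of b the high half."""
    return [a[0] + a[1], a[2] + a[3], b[0] + b[1], b[2] + b[3]]


def hadd_via_shuffles(a, b):
    """The same result written as an ordinary elementwise add of two shuffles."""
    even = shuffle(a, b, [0, 2, 4, 6])  # a0, a2, b0, b2
    odd = shuffle(a, b, [1, 3, 5, 7])   # a1, a3, b1, b3
    return [x + y for x, y in zip(even, odd)]


a = [1.0, 2.0, 3.0, 4.0]
b = [10.0, 20.0, 30.0, 40.0]
print(hadd_ps(a, b))
print(hadd_via_shuffles(a, b))
```

Both calls print the same list, which is the property a backend can rely on when it replaces the shuffle-plus-add form with a single horizontal add.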
2011 Sep 21
0
[LLVMdev] Patch to synthesize x86 hadd instructions; need help with the tablegen bits
Hi Duncan, On Wed, Sep 21, 2011 at 1:24 PM, Duncan Sands <baldrick at free.fr> wrote: > This patch synthesizes haddps/haddpd/hsubps/hsubpd instructions from floating point additions and subtractions of appropriate vector shuffles. To do this I introduced new x86 FHADD and FHSUB opcodes. These need to be wired up somehow in the .td file to the appropriate instructions. Since I have...
2011 Sep 22
3
[LLVMdev] Patch to synthesize x86 hadd instructions; need help with the tablegen bits
...; and check it for the first condition, because when AVX is on, the SSE levels are all turned off (as to consider AVX a reimplementation of all SSE levels). > For the second condition: Does this logic work for 256-bit vectors? I'm asking that because although the 128-bit HADDPS and the 256-bit HADDPD have the same number of elements, their horizontal operation behavior is different (look at AVX manual for details)! If it doesn't, just remove the 256-bit handling for now. it's not clear whether it is correct for 256 bit operations. The AVX docs on...
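To make the difference raised in this snippet concrete, here is a small sketch in plain Python lists that models the two instructions by their documented element order. It is an illustration only, not code from the thread; the function names `hadd_ps_128`, `hadd_pd_128`, and `hadd_pd_256` are hypothetical. The 256-bit form is modeled as the 128-bit operation applied independently to each 128-bit lane.

```python
def hadd_ps_128(a, b):
    """128-bit HADDPS: 4 floats per operand, a's pair sums first, then b's."""
    return [a[0] + a[1], a[2] + a[3], b[0] + b[1], b[2] + b[3]]


def hadd_pd_128(a, b):
    """128-bit HADDPD: 2 doubles per operand."""
    return [a[0] + a[1], b[0] + b[1]]


def hadd_pd_256(a, b):
    """256-bit HADDPD modeled lane by lane: each 128-bit half of the
    result depends only on the matching halves of a and b."""
    low = hadd_pd_128(a[0:2], b[0:2])
    high = hadd_pd_128(a[2:4], b[2:4])
    return low + high


a = [1.0, 2.0, 3.0, 4.0]
b = [10.0, 20.0, 30.0, 40.0]
print(hadd_ps_128(a, b))  # a's sums, then b's sums
print(hadd_pd_256(a, b))  # sums from a and b interleaved per lane
```

Both operations take four elements per operand, yet the result order differs, so a shuffle pattern that matches the 128-bit case cannot be reused unchanged for the 256-bit one.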
2011 Sep 22
0
[LLVMdev] Patch to synthesize x86 hadd instructions; need help with the tablegen bits
...and check it for the first condition, because when AVX is on, the SSE levels are all turned off (as to consider AVX a reimplementation of all SSE levels). > For the second condition: Does this logic work for 256-bit vectors? I'm asking that because although the 128-bit HADDPS and the 256-bit HADDPD have the same number of elements, their horizontal operation behavior is different (look at AVX manual for details)! If it doesn't, just remove the 256-bit handling for now. it's not clear whether it is correct for 256 bit operations. The AVX docs...
2020 Aug 19
2
Question about llvm vectors
...der why some advanced vector operations are specific to some CPU targets? Let me take an example: /// Horizontally adds the adjacent pairs of values contained in two /// 128-bit vectors of [4 x float]. /// /// \headerfile <x86intrin.h> /// /// This intrinsic corresponds to the <c> VHADDPS </c> instruction. /// /// \param __a /// A 128-bit vector of [4 x float] containing one of the source operands. /// The horizontal sums of the values are stored in the lower bits of the /// destination. /// \param __b /// A 128-bit vector of [4 x float] containing one of the sour...
2012 Jun 29
2
[LLVMdev] Removing the separation between opt and codegen?
Hello, One important next step in turning LLVM into a first-class autovectorizing compiler will be to incorporate target information into the vectorization logic. To really make good decisions regarding what is profitable to vectorize, and how that vectorization should be done, it will be important for the vectorization pass(es) to understand the underlying target capabilities. The same will hold