search for: haddsub

Displaying 4 results from an estimated 4 matches for "haddsub".

2011 Sep 22
3
[LLVMdev] Patch to synthesize x86 hadd instructions; need help with the tablegen bits
...nes work differently. If someone has a machine with AVX to test on, I've attached avx-hadd.s. It should be possible to do: gcc -o avx-hadd avx-hadd.s ./avx-hadd and the result should make it clear. In the meantime I've removed the 256-bit logic. > 2) Rename horizontal.ll to sse3-haddsub.ll Done! > 3) Can you duplicate the testcase file to something like > avx-haddsub.ll, and check for the AVX 128-bit versions too? I added the avx checks to the same file (in which case calling it sse3-haddsub.ll is not so great). > 4) Your tablegen modifications are totally fine, for t...
2011 Sep 22
0
[LLVMdev] Patch to synthesize x86 hadd instructions; need help with the tablegen bits
...nes work differently. If someone has a machine with AVX to test on, I've attached avx-hadd.s. It should be possible to do: gcc -o avx-hadd avx-hadd.s ./avx-hadd and the result should make it clear. In the meantime I've removed the 256-bit logic. > 2) Rename horizontal.ll to sse3-haddsub.ll Done! > 3) Can you duplicate the testcase file to something like > avx-haddsub.ll, and check for the AVX 128-bit versions too? I added the avx checks to the same file (in which case calling it sse3-haddsub.ll is not so great). > 4) Your tablegen modifications are totally fine, for...
2011 Sep 21
0
[LLVMdev] Patch to synthesize x86 hadd instructions; need help with the tablegen bits
...vectors? I'm asking that because although the 128-bit HADDPS and the 256-bit HADDPD have the same number of elements, their horizontal operation behavior is different (look at AVX manual for details)! If it doesn't, just remove the 256-bit handling for now. 2) Rename horizontal.ll to sse3-haddsub.ll 3) Can you duplicate the testcase file to something like avx-haddsub.ll, and check for the AVX 128-bit versions too? 4) Your tablegen modifications are totally fine, for the intrinsics just do: let Predicates = [HasSSE3] in { def : Pat<(int_x86_sse3_hadd_ps (v4f32 VR128:$src1), VR128:$src2),...
2011 Sep 21
2
[LLVMdev] Patch to synthesize x86 hadd instructions; need help with the tablegen bits
This patch synthesizes haddps/haddpd/hsubps/hsubpd instructions from floating point additions and subtractions of appropriate vector shuffles. To do this I introduced new x86 FHADD and FHSUB opcodes. These need to be wired up somehow in the .td file to the appropriate instructions. Since I have no idea how tablegen works I just hacked it in horribly. It works, but breaks support for the hadd
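The idea in the snippet above, a horizontal add expressed as a floating-point add of two vector shuffles, can be sketched with a small scalar model. This is only an illustration of the 128-bit haddps/hsubps semantics, not code from the patch; the function names (shuffle, haddps, hsubps, hadd_from_shuffles, hsub_from_shuffles) are hypothetical, and the shuffle masks are the ones one would expect for the 4 x float case.

```python
# Scalar model of 128-bit haddps/hsubps on 4-element float vectors.
# All names here are hypothetical and chosen for illustration only.

def shuffle(a, b, mask):
    """Pick elements from the concatenation of a and b, shufflevector-style."""
    joined = list(a) + list(b)
    return [joined[i] for i in mask]

def haddps(a, b):
    """Reference semantics: add adjacent pairs of a, then adjacent pairs of b."""
    return [a[0] + a[1], a[2] + a[3], b[0] + b[1], b[2] + b[3]]

def hsubps(a, b):
    """Reference semantics: subtract adjacent pairs (even minus odd element)."""
    return [a[0] - a[1], a[2] - a[3], b[0] - b[1], b[2] - b[3]]

def hadd_from_shuffles(a, b):
    """The same result written as an elementwise add of two shuffles."""
    even = shuffle(a, b, [0, 2, 4, 6])  # even-indexed elements of a then b
    odd = shuffle(a, b, [1, 3, 5, 7])   # odd-indexed elements of a then b
    return [x + y for x, y in zip(even, odd)]

def hsub_from_shuffles(a, b):
    """The same result written as an elementwise subtract of two shuffles."""
    even = shuffle(a, b, [0, 2, 4, 6])
    odd = shuffle(a, b, [1, 3, 5, 7])
    return [x - y for x, y in zip(even, odd)]

if __name__ == "__main__":
    a = [1.0, 2.0, 3.0, 4.0]
    b = [10.0, 20.0, 30.0, 40.0]
    print(hadd_from_shuffles(a, b) == haddps(a, b))
    print(hsub_from_shuffles(a, b) == hsubps(a, b))
```

A pattern matcher of the kind the post describes would look for an add or subtract whose two operands are shuffles with these even/odd masks; the model above only checks that the two forms agree for the 128-bit single-precision case.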