Displaying 9 results from an estimated 9 matches for "mssse3".
2009 Nov 13
0
Problem building R 2.10 release
...x86_64
uname -r = 2.6.31-ARCH
uname -s = Linux
gcc (GCC) 4.4.1
icc (ICC) 11.0 20081105
The first problem I encounter seems to be with icc and wctype.h during
./configure.
export CC="icc -std=c99"
export CXX=icpc
export OBJC="icc -std=c99"
export FC=ifort
export CFLAGS="-mssse3 -g -O3 -wd188 -ip"
export CXXFLAGS="-mssse3 -g -O3 -no-gcc"
export OBJCFLAGS="-mssse3 -g -O3"
export FCFLAGS="-mssse3 -g -O3 -mp"
./configure --with-system-zlib --with-system-bzlib --with-system-pcre
--with-blas=/usr/local/lib/libgoto_penrynp-r1.26.so
--with-lap...
2015 Feb 14
2
[LLVMdev] trunk's optimizer generates slower code than 3.5
....6 with this level of performance regression?
Jack
On Fri, Feb 13, 2015 at 2:47 PM, Jack Howarth
<howarth.mailing.lists at gmail.com> wrote:
> Also confirmed with the llvm 3.5.1 release and the llvm 3.6 release
> branch on x86_64-apple-darwin14...
>
> % clang-3.5 -O3 -mssse3 -fomit-frame-pointer -fno-stack-protector
> -fno-exceptions -o 8 8.c
> % time ./8 9
> 352 solutions
> 3.603u 0.002s 0:03.60 100.0% 0+0k 0+0io 2pf+0w
> % time ./8 10
> 724 solutions
> 104.217u 0.059s 1:44.30 99.9% 0+0k 0+0io 2pf+0w
>
> % clang-3.6 -O3 -mssse3 -fomit-frame-...
2015 Feb 14
2
[LLVMdev] trunk's optimizer generates slower code than 3.5
...Jack
>>
>> On Fri, Feb 13, 2015 at 2:47 PM, Jack Howarth
>> <howarth.mailing.lists at gmail.com> wrote:
>>> Also confirmed with the llvm 3.5.1 release and the llvm 3.6 release
>>> branch on x86_64-apple-darwin14...
>>>
>>> % clang-3.5 -O3 -mssse3 -fomit-frame-pointer -fno-stack-protector
>>> -fno-exceptions -o 8 8.c
>>> % time ./8 9
>>> 352 solutions
>>> 3.603u 0.002s 0:03.60 100.0% 0+0k 0+0io 2pf+0w
>>> % time ./8 10
>>> 724 solutions
>>> 104.217u 0.059s 1:44.30 99.9% 0+0k 0+0i...
2015 Feb 13
2
[LLVMdev] trunk's optimizer generates slower code than 3.5
...as an educational example. As
compiled by both clang 3.5 and 3.7, it gave the correct answer, but
clang 3.5 generates code which runs 20% faster than 3.6/3.7.
##########################################
# clang 3.5 which comes with Xcode 6.1.1
##########################################
$ clang -O3 -mssse3 -fomit-frame-pointer -fno-stack-protector
-fno-exceptions -o 8 8.c
$ time ./8 9 # 9 queens
352 solutions
./8 9 1.63s user 0.00s system 99% cpu 1.632 total
$ time ./8 10 # 10 queens
724 solutions
./8 10 45.11s user 0.01s system 99% cpu 45.121 total
##########################################...
2020 May 18
3
[PATCH] SSE2/SSSE3 optimized version of get_checksum1() for x86-64
What do you base this on?
Per https://gcc.gnu.org/onlinedocs/gcc/x86-Options.html :
"For the x86-32 compiler, you must use -march=cpu-type, -msse or
-msse2 switches to enable SSE extensions and make this option
effective. For the x86-64 compiler, these extensions are enabled by
default."
That reads to me like we're fine for SSE2. As stated in my comments,
SSSE3 support must be
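
To illustrate the point about which extensions are on by default, here is a minimal sketch that reports which SIMD level the compiler was told to target. It relies only on the predefined macros __SSE2__ and __SSSE3__ that the patch itself tests; the function name compiled_simd_level() is hypothetical and not part of rsync.

```c
#include <stdio.h>

/* Hypothetical helper (not from the patch): names the highest SIMD level
 * enabled at compile time, judged by the compiler's predefined macros. */
const char *compiled_simd_level(void)
{
#if defined(__SSSE3__)
    return "SSSE3";   /* e.g. built with -mssse3 or a -march that implies it */
#elif defined(__SSE2__)
    return "SSE2";    /* the default on x86-64 per the GCC documentation */
#else
    return "none";    /* non-x86 target, or x86-32 without -msse2 */
#endif
}

/* Prints the result so the effect of different CFLAGS can be compared. */
void print_compiled_simd_level(void)
{
    printf("compiled for: %s\n", compiled_simd_level());
}
```

Building this once with plain CFLAGS and once with CFLAGS="-mssse3 -O2" on an x86-64 machine should show whether the SSSE3 code path would be selected by the preprocessor.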
2020 May 18
6
[PATCH] SSE2/SSSE3 optimized version of get_checksum1() for x86-64
...lable for testing. As 32-bit CPUs only have half the
+ * available xmm registers, this optimized version may not be faster than the
+ * pure C version anyway.
+ *
+ * GCC automatically enables SSE2 support on x86-64 builds. The SSSE3 code
+ * path must be enabled manually: ./configure CFLAGS="-mssse3 -O2"
+ */
+
+#ifdef __x86_64__
+#ifdef __SSE2__
+
+#include "rsync.h"
+
+#ifdef __SSSE3__
+#include <immintrin.h>
+#else
+#include <tmmintrin.h>
+#endif
+
+/* Compatibility functions to let our SSSE3 algorithm run on SSE2 */
+
+static inline __m128i sse_load_si128(void co...
2015 Dec 20
10
[Bug 93454] New: Can't build with LLVM/clang 3.7.0
...til/u_sse.h:140:
/usr/bin/../lib64/clang/3.7.0/include/tmmintrin.h:28:2: error: "SSSE3
instruction set not enabled"
#error "SSSE3 instruction set not enabled"
^
1 error generated.
make[3]: *** [nv50/nv84_video_vp.lo] Error 1
make[3]: *** Waiting for unfinished jobs....
Adding -mssse3 to CFLAGS will not work for all hardware.
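
Because a binary built with -mssse3 may execute SSSE3 instructions on CPUs that lack them, one common alternative is to decide at run time. The following is only a sketch under stated assumptions: it uses the GCC/clang builtins __builtin_cpu_init() and __builtin_cpu_supports(), which are available on x86 targets, and the names cpu_has_ssse3() and pick_code_path() are hypothetical and do not come from the Mesa or nouveau sources.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: true when the running CPU reports SSSE3.
 * Falls back to false on non-x86 targets or other compilers. */
bool cpu_has_ssse3(void)
{
#if (defined(__x86_64__) || defined(__i386__)) && defined(__GNUC__)
    __builtin_cpu_init();                       /* harmless if already initialised */
    return __builtin_cpu_supports("ssse3") != 0;
#else
    return false;
#endif
}

/* Hypothetical dispatcher: returns the name of the code path a caller
 * would select; a real program would return a function pointer instead. */
const char *pick_code_path(void)
{
    return cpu_has_ssse3() ? "ssse3" : "generic";
}
```

In such a scheme the SSSE3 routine itself would still have to be compiled with SSSE3 enabled, for example in a separate source file built with -mssse3, while the rest of the program keeps the baseline flags.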
--
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.freedesktop.org/archives/nouveau/attachments/201...
2020 May 18
0
[PATCH] SSE2/SSSE3 optimized version of get_checksum1() for x86-64
...nly have half
> the
> + * available xmm registers, this optimized version may not be faster than
> the
> + * pure C version anyway.
> + *
> + * GCC automatically enables SSE2 support on x86-64 builds. The SSSE3 code
> + * path must be enabled manually: ./configure CFLAGS="-mssse3 -O2"
> + */
> +
> +#ifdef __x86_64__
> +#ifdef __SSE2__
> +
> +#include "rsync.h"
> +
> +#ifdef __SSSE3__
> +#include <immintrin.h>
> +#else
> +#include <tmmintrin.h>
> +#endif
> +
> +/* Compatibility functions to let our SSSE3 al...
2020 May 18
2
[PATCH] SSE2/SSSE3 optimized version of get_checksum1() for x86-64
...alf the
>> + * available xmm registers, this optimized version may not be faster than the
>> + * pure C version anyway.
>> + *
>> + * GCC automatically enables SSE2 support on x86-64 builds. The SSSE3 code
>> + * path must be enabled manually: ./configure CFLAGS="-mssse3 -O2"
>> + */
>> +
>> +#ifdef __x86_64__
>> +#ifdef __SSE2__
>> +
>> +#include "rsync.h"
>> +
>> +#ifdef __SSSE3__
>> +#include <immintrin.h>
>> +#else
>> +#include <tmmintrin.h>
>> +#endif
>> +...