thr3ads.net - search: "fslp"

2016 Mar 12

4

clang triple and clang target

...ed to see how they look, whether it is ARM, Mips or X86. However for some reason clang would generate an error saying that a given target does not exist. Here is the command line I used: clang -S test.c -o test.sse2.x86-64.s --target=x86-unknown-linux-eabi -mfloat-abi=hard -mcpu=x86-64 -mfpu=SSE2 -fslp-vectorize-aggressive -fslp-vectorize-aggressive -fslp-vectorize -fvectorize -fno-lax-vector-conversions Here is the response I got: clang: warning: argument unused during compilation: '-mfloat-abi=hard' clang: warning: argument unused during compilation: '-mcpu=x86-64' clang: warni...

2016 Mar 16

2

generate vectorized code

...c[N]; for (int i = 0; i < N; ++i) c[i] = a[i] + b[i]; int sum=0; for (int i = 0; i < N; ++i) sum += c[i]; return sum; } Here are the command lines I tried: clang -S test.c --target=esencia -fvectorize -o test.esencia.s clang -S test.c --target=esencia -fvectorize -fslp-vectorize-aggressive -o test.esencia.s -fslp-vectorize clang -S test.c --target=esencia -fvectorize -fslp-vectorize-aggressive -o test.esencia.s -fslp-vectorize -fno-lax-vector-conversions Unfortunately nothing worked. Can someone help me out? I can't really figure out why this is not working...

2016 Mar 16

3

vectorization for X86

...options (such as -mtune). Below are the command and the code that I'm trying to vectorize. The code compiles but I don't see any vectors. What am I doing wrong? Any help is appreciated. clang -S x.c --target=x86_64-pc-gnu -mtune=core-avx2 -o x.x86.s -fvectorize -fno-lax-vector-conversions -fslp-vectorize-aggressive -fslp-vectorize #define N 32 int main () { int a[N], b[N]; int c[N]; for (int i = 0; i < N; ++i) c[i] = a[i] + b[i]; int sum=0; for (int i = 0; i < N; ++i) sum += c[i]; return sum; } -- Rail Shafigulin Software Engineer Esencia Technol...
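A minimal sketch of how the loop from this post might be restructured to give the loop vectorizer a chance. The command in the post passes no optimization level, and clang's vectorizers do not run at -O0, so adding -O2 or -O3 is the first thing to try. The function name add_and_sum, the restrict qualifiers, and passing the arrays as parameters are assumptions made for illustration; the original reads uninitialized locals inside main. Whether vector instructions actually appear depends on the clang version and its cost model.

```c
#include <assert.h>

#define N 32

/* Possible build line (an assumption, adjust to your toolchain):
 *   clang -O3 -mavx2 -Rpass=loop-vectorize -S x.c -o x.x86.s
 * -Rpass=loop-vectorize asks clang to report loops it vectorized. */

/* Element-wise add followed by a reduction, as in the post.
 * The arrays come from the caller so the inputs are defined values
 * and the optimizer cannot fold the whole computation away. */
int add_and_sum(const int *restrict a, const int *restrict b,
                int *restrict c)
{
    assert(a && b && c);

    for (int i = 0; i < N; ++i)
        c[i] = a[i] + b[i];

    int sum = 0;
    for (int i = 0; i < N; ++i)
        sum += c[i];

    return sum;
}
```

A caller would fill two int[N] arrays, pass a third as the output buffer, and then inspect the generated assembly for vector registers.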

2017 Feb 03

3

Clang 5.0 support for armv8 64 bit with neon and auto vectorization

...product developer at Robert Bosch, Germany. We are using armv8 64-bit targets for our development and need to cross-compile for our target on Windows. I have compiled clang 5.0 from the git repository and tried compiling the code with the following options set: clang.exe -target armv8 -fslp-vectorize-aggressive -mfpu=neon -mfloat-abi=hard -c test.cpp As you can see from the options, we require NEON as well as auto-vectorization on armv8 (64-bit). Can you tell me whether clang supports NEON and auto-vectorization on the 64-bit ARMv8 architecture? This setting says...

2016 Mar 14

3

clang triple and clang target

...suggestion but it didn't work. Clearly I'm doing something wrong, but I don't know what exactly. I would really appreciate your help in this. Here is the command line I'm using clang -S test.c -o test.sse2.x86_64.s --target=x86_64-pc-linux-gnu -mfloat-abi=hard -mcpu=k8 -mfpu=SSE2 -fslp-vectorize-aggressive -fslp-vectorize-aggressive -fslp-vectorize -fvectorize -fno-lax-vector-conversions -O3 Here is the response that I get: clang-3.5: warning: argument unused during compilation: '-mfloat-abi=hard' clang-3.5: warning: argument unused during compilation: '-mcpu=k8'...

2014 Apr 24

2

[LLVMdev] How to get debug dump of candidate pairs selected in BBVectorizer?

Hi All, I'm trying to understand the BB Vectorizer and have gone through http://llvm.org/devmtg/2012-04-12/Slides/Hal_Finkel.pdf and wanted to know how to use the bb-vectorize-debug-candidate-selection and bb-vectorize-debug-pair-selection command arguments. I tried the following command with a debug build of clang: clang -O2 test.c -mllvm -vectorize \ -mllvm -debug-only=bb-vectorize \ -mllvm

2016 Mar 11

2

clang triple and clang target

Can someone explain what exactly a clang triple is (--triple option) and what the connection is between a triple and a target? I know there is an article ( http://clang.llvm.org/docs/CrossCompilation.html) that shows how to cross compile code, but what I'm not clear about is why I need to specify a triple, and why I can't just say compile for a given target. Any help is appreciated. -- Rail
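As a rough illustration of the question, the cross-compilation article linked in the post describes a triple as having the general form <arch><sub>-<vendor>-<sys>-<env>, and the triple is how the target is named on the command line. The sketch below splits such a string purely by position. parse_triple and struct triple are hypothetical names, not part of clang, and clang's own handling differs (for example, it accepts shorter forms with components omitted).

```c
#include <assert.h>
#include <string.h>

#define FIELD_MAX 32

/* Hypothetical container for the four positional components. */
struct triple {
    char arch[FIELD_MAX];
    char vendor[FIELD_MAX];
    char os[FIELD_MAX];
    char env[FIELD_MAX];
};

/* Splits "arch-vendor-os-env" on '-' by position only.
 * Returns the number of components found (1 to 4), or -1 if a
 * component is empty, too long, or there are more than four. */
int parse_triple(const char *s, struct triple *t)
{
    assert(s && t);

    char *fields[4];
    int n = 0;

    memset(t, 0, sizeof *t);          /* zeroed, so copies stay NUL-terminated */
    fields[0] = t->arch;
    fields[1] = t->vendor;
    fields[2] = t->os;
    fields[3] = t->env;

    for (;;) {
        size_t len = strcspn(s, "-"); /* length up to next '-' or end */
        if (len == 0 || len >= FIELD_MAX || n == 4)
            return -1;
        memcpy(fields[n++], s, len);
        s += len;
        if (*s == '\0')
            return n;
        ++s;                          /* skip the '-' separator */
    }
}
```

For a string such as x86_64-pc-linux-gnu this yields arch x86_64, vendor pc, os linux and env gnu; a three-part string simply leaves env empty.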

2013 Jun 26

5

[LLVMdev] [icFuzz] Help needed with analyzing randomly generated tests that fail on clang 3.4 trunk

...he problem is that icFuzz is not > currently open sourced. I would be happy to see this open sourced, but I think that we can work something out regardless. Also, once we get the current set of things resolved, I think it would be useful to test running with: -- -O3, LTO (-O4 or -flto), -- -fslp-vectorize, -fslp-vectorize-aggressive (which are actually separate optimizations) -- -ffast-math (if you can do floating point with tolerances, or at least -ffinite-math-only), -fno-math-errno (and there are obviously a whole bunch of non-default code-generation and target options). Is it feasib...

2014 Mar 12

2

[LLVMdev] Autovectorization questions

...t *A, int *B, int n, int k) { for (int i = 0; i < n; ++i) A[i*7] += B[i*k]; } I replaced "int *A"/"int *B" with "double *A"/"double *B" and then compiled the sample with $> ./clang -Ofast -ffast-math test.c -std=c99 -march=core-avx2 -S -o bb.S -fslp-vectorize-aggressive and the loop body looks like: .LBB1_2: # %for.body # =>This Inner Loop Header: Depth=1 cltq vmovsd (%rsi,%rax,8), %xmm0 movq %r9, %r10 sarq $32, %r10 vaddsd (...

2013 Jul 26

0

[LLVMdev] [icFuzz] Help needed with analyzing randomly generated tests that fail on clang 3.4 trunk

...ly open sourced. > > I would be happy to see this open sourced, but I think that we can > work something out regardless. > > Also, once we get the current set of things resolved, I think it > would be useful to test running with: > > -- -O3, LTO (-O4 or -flto), > -- -fslp-vectorize, -fslp-vectorize-aggressive (which are actually > separate optimizations) > -- -ffast-math (if you can do floating point with tolerances, or at > least -ffinite-math-only), -fno-math-errno > (and there are obviously a whole bunch of non-default > code-generation and ta...

2013 Jul 27

2

[LLVMdev] [icFuzz] Help needed with analyzing randomly generated tests that fail on clang 3.4 trunk

...ly open sourced. > > I would be happy to see this open sourced, but I think that we can > work something out regardless. > > Also, once we get the current set of things resolved, I think it > would be useful to test running with: > > -- -O3, LTO (-O4 or -flto), > -- -fslp-vectorize, -fslp-vectorize-aggressive (which are actually > separate optimizations) > -- -ffast-math (if you can do floating point with tolerances, or at > least -ffinite-math-only), -fno-math-errno > (and there are obviously a whole bunch of non-default > code-generation and ta...

2013 Jul 14

6

[LLVMdev] Enabling the SLP vectorizer by default for -O3

Hi, LLVM’s SLP-vectorizer is a new pass that combines similar independent instructions in straight-line code. It is currently not enabled by default, and people who want to experiment with it can use the clang command line flag “-fslp-vectorize”. I ran LLVM’s test suite with and without the SLP vectorizer on a Sandy Bridge Mac (using SSE4, w/o AVX). Based on my performance measurements (below) I would like to enable the SLP-vectorizer by default at -O3. I would like to hear what others in the community think about this and giv...
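To make "similar independent instructions in straight-line code" concrete, here is a small sketch of the kind of code such a pass targets: four scalar additions with no loop and no dependence between them. The function name add4 and the restrict qualifiers are assumptions for illustration, and whether a given clang version actually merges these into one vector add depends on the target and cost model.

```c
#include <assert.h>

/* Possible build line (an assumption, adjust to your toolchain):
 *   clang -O3 -Rpass=slp-vectorizer -S slp.c -o slp.s
 * The remark flag asks clang to report what the SLP pass vectorized. */

/* Four isomorphic, independent scalar adds on adjacent elements.
 * There is no loop here, so the loop vectorizer has nothing to do;
 * an SLP pass may instead pack the four adds into a single
 * 4-wide vector add. */
void add4(float *restrict out, const float *restrict x,
          const float *restrict y)
{
    assert(out && x && y);

    out[0] = x[0] + y[0];
    out[1] = x[1] + y[1];
    out[2] = x[2] + y[2];
    out[3] = x[3] + y[3];
}
```

The restrict qualifiers tell the compiler the three buffers do not overlap, which removes one common reason for such code to stay scalar.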

2013 Jul 29

2

[LLVMdev] [icFuzz] Help needed with analyzing randomly generated tests that fail on clang 3.4 trunk

...to see this open sourced, but I think that we can > > work something out regardless. > > > > Also, once we get the current set of things resolved, I think it > > would > > be useful to test running with: > > > > -- -O3, LTO (-O4 or -flto), > > -- -fslp-vectorize, -fslp-vectorize-aggressive (which are actually > > separate optimizations) > > -- -ffast-math (if you can do floating point with tolerances, or > > at > > least -ffinite-math-only), -fno-math-errno (and there are > > obviously a > > whole bunch of n...

2013 Jul 29

0

[LLVMdev] [icFuzz] Help needed with analyzing randomly generated tests that fail on clang 3.4 trunk

...open sourced. > > I would be happy to see this open sourced, but I think that we can > work something out regardless. > > Also, once we get the current set of things resolved, I think it would > be useful to test running with: > > -- -O3, LTO (-O4 or -flto), > -- -fslp-vectorize, -fslp-vectorize-aggressive (which are actually > separate optimizations) > -- -ffast-math (if you can do floating point with tolerances, or at > least -ffinite-math-only), -fno-math-errno (and there are obviously a > whole bunch of non-default code-generation and targ...

2013 Dec 11

2

[LLVMdev] AVX code gen

...ent where icc generates these instructions (vmulps %ymm1, %ymm3, %ymm2 for example) but I can not get clang/llvm to generate such instructions (using the 3.3 release or either 3.4 rc1 or 3.4 rc2). I am new to clang / llvm so I may not be invoking the tools correctly but given that -fvectorize and -fslp-vectorize are on by default at 3.4 I would have thought that if the code is AVX-able by icc that clang / llvm would be able to do the same… The code is basic matrix multiplication written a number of ways (with and without transposition and such) as a performance measurement exercise. The environ...

2014 Jan 17

2

[LLVMdev] [icFuzz] Help needed with analyzing randomly generated tests that fail on clang 3.4 trunk

...> can > > > work something out regardless. > > > > > > Also, once we get the current set of things resolved, I think it > > > would > > > be useful to test running with: > > > > > > -- -O3, LTO (-O4 or -flto), > > > -- -fslp-vectorize, -fslp-vectorize-aggressive (which are > > > actually > > > separate optimizations) > > > -- -ffast-math (if you can do floating point with tolerances, or > > > at > > > least -ffinite-math-only), -fno-math-errno (and there are > > >...

2014 Mar 12

4

[LLVMdev] Autovectorization questions

...n; ++i) >> A[i*7] += B[i*k]; >> } >> >> I replaced "int *A"/"int *B" into "double *A"/"double *B" and then compiled the sample with >> >> $> ./clang -Ofast -ffast-math test.c -std=c99 -march=core-avx2 -S -o bb.S -fslp-vectorize-aggressive >> >> and loop body looks like: >> >> .LBB1_2: # %for.body >> # =>This Inner Loop Header: Depth=1 >> cltq >> vmovsd (%rsi,%rax,8), %xmm0 >>...

2014 Sep 02

3

[LLVMdev] LICM promoting memory to scalar

..., [x6,#:lo12:globalvar] <== sink store of globalvar .L1: ret .cfi_endproc .LFE0: .size _Z3fooii, .-_Z3fooii .ident "GCC: (crosstool-NG linaro-1.13.1-4.8-2014.01 - Linaro GCC 2013.11) 4.9.0" LLVM output: $ clang-aarch64-x++ -S -o - -O3 -ffast-math -fslp-vectorize test.cpp .text .file "test.cpp" .globl _Z3fooii .align 2 .type _Z3fooii, @function _Z3fooii: // @_Z3fooii // BB#0: // %entry cbz w0, .LBB0_5 // BB#1:...

2013 Dec 12

0

[LLVMdev] AVX code gen

...ent where icc generates these instructions (vmulps %ymm1, %ymm3, %ymm2 for example) but I can not get clang/llvm to generate such instructions (using the 3.3 release or either 3.4 rc1 or 3.4 rc2). I am new to clang / llvm so I may not be invoking the tools correctly but given that -fvectorize and -fslp-vectorize are on by default at 3.4 I would have thought that if the code is AVX-able by icc that clang / llvm would be able to do the same… The code is basic matrix multiplication written a number of ways (with and without transposition and such) as a performance measurement exercise. > > T...

2013 Jul 15

0

[LLVMdev] Enabling the SLP vectorizer by default for -O3

..., Nadav Rotem <nrotem at apple.com> wrote: > Hi, > > LLVM’s SLP-vectorizer is a new pass that combines similar independent instructions in straight-line code. It is currently not enabled by default, and people who want to experiment with it can use the clang command line flag “-fslp-vectorize”. I ran LLVM’s test suite with and without the SLP vectorizer on a Sandy Bridge Mac (using SSE4, w/o AVX). Based on my performance measurements (below) I would like to enable the SLP-vectorizer by default at -O3. I would like to hear what others in the community think about this and giv...
