thr3ads.net - search: "v1024"

[LLVMdev] [Patch][RFC] Change R600 data layout

2013 Dec 31

4

Hi, I've prepared patches for both LLVM and Clang to change the datalayout for R600. This may seem like a bold move, but I think it is warranted. R600/SI is a strange architecture in that it uses 64-bit pointers but does not support 64-bit arithmetic except for load/store operations that roughly map onto getelementptr. The current datalayout for r600 includes n32:64, which is odd

Vectorizing remainder loop

2018 Jul 29

2

...m working on hardware with very large vector widths, up to v2048. When I vectorize with LLVM's default vectorizer, up to 2047 iterations are left in the scalar remainder loop. LLVM does not vectorize these, which increases the cost. However, they should be vectorized using the next available vector widths, i.e. v1024, v512, v256, v128, v64, v32, v16, v8, v4..... The scalar remainder loop issue has long existed in LLVM, but with large vector widths it is amplified and can't be ignored. It is very important to solve this issue. Please help. I'm trying to look at loopvectorizer.cpp but un...
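The arithmetic behind this request can be sketched in a few lines of Python. This is only an illustration of the descending-width split the poster describes, not how LLVM's loop vectorizer works; the function name `remainder_schedule` and the `max_width`/`min_width` parameters are hypothetical, with defaults taken from the widths named in the post (v2048 down to v4).

```python
def remainder_schedule(trip_count, max_width=2048, min_width=4):
    """Split trip_count iterations into chunks of descending power-of-two widths.

    Returns (schedule, scalar_tail), where schedule is a list of
    (vector_width, number_of_vector_iterations) pairs and scalar_tail is the
    number of iterations left for a scalar loop. Hypothetical helper for
    illustration only; it does not model any LLVM pass.
    """
    schedule = []
    remaining = trip_count
    width = max_width
    while width >= min_width:
        # How many full vectors of this width fit into what is left?
        count, remaining = divmod(remaining, width)
        if count:
            schedule.append((width, count))
        width //= 2
    return schedule, remaining


# Worst case from the post: 2047 iterations left over after a v2048 main loop.
schedule, scalar_tail = remainder_schedule(2047)
print(schedule)
print(scalar_tail)
```

Under these assumptions the 2047 leftover iterations become one pass at each width from v1024 down to v4, leaving 3 scalar iterations instead of 2047.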

[LLVMdev] alloca scalarization with dynamic indexing into vectors

2013 Feb 07

1

...nt and could be set to any value. (scalar_repl_store_delete.ll): target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-f80:128:128-v16:16:16-v24:32:32-v32:32:32-v48:64:64-v64:64:64-v96:128:128-v128:128:128-v192:256:256-v256:256:256-v512:512:512-v1024:1024:1024--a64:64:64-f80:128:128-n8:16:32:64" define void @test_fn(<2 x i32>* %src, <2 x i32>* %results, i32 %alignmentOffsets) nounwind alwaysinline { entry: %sPrivateStorage = alloca [3 x <2 x i32>], align 8 %0 = load <2 x i32>* %src, align 8, !tbaa !9 %a...

Vectorizing remainder loop

2018 Aug 02

2

...m working on hardware with very large vector widths, up to v2048. When I vectorize with LLVM's default vectorizer, up to 2047 iterations are left in the scalar remainder loop. LLVM does not vectorize these, which increases the cost. However, they should be vectorized using the next available vector widths, i.e. v1024, v512, v256, v128, v64, v32, v16, v8, v4..... The scalar remainder loop issue has long existed in LLVM, but with large vector widths it is amplified and can't be ignored. It is very important to solve this issue. Please help. I'm trying to look at loopvectorizer.cpp but un...

Vectorizing remainder loop

2018 Aug 03

2

...m working on hardware with very large vector widths, up to v2048. When I vectorize with LLVM's default vectorizer, up to 2047 iterations are left in the scalar remainder loop. LLVM does not vectorize these, which increases the cost. However, they should be vectorized using the next available vector widths, i.e. v1024, v512, v256, v128, v64, v32, v16, v8, v4..... The scalar remainder loop issue has long existed in LLVM, but with large vector widths it is amplified and can't be ignored. It is very important to solve this issue. Please help. I'm trying to look at loopvectorizer.cpp but un...

[LLVMdev] Vectorizing alloca instructions

2013 Oct 24

0

Hi Tom, Thanks for working on this. The SLP-vectorizer thinks that %X %Y %Z and %W alias, so it tries to perform 4 scalar store operations (which is a bad idea). We need to figure out why AA thinks that X and Y may alias. Maybe there is a problem with the code that uses AA. Thanks, Nadav On Oct 24, 2013, at 2:04 PM, Tom Stellard <tom at stellard.net> wrote: > Hi, > >

Structurizing multi-exit regions

2017 Mar 02

5

...I don't think this helps any here. -Matt -------------- next part -------------- ; RUN: opt -S -structurizecfg -si-annotate-control-flow %s target datalayout = "e-p:32:32-p1:64:64-p2:64:64-p3:32:32-p4:64:64-p5:32:32-i64:64-v16:16-v24:32-v32:32-v48:64-v96:128-v192:256-v256:256-v512:512-v1024:1024-v2048:2048-n32:64" target triple = "amdgcn-amd-amdhsa-opencl" ; Function Attrs: nounwind define amdgpu_kernel void @multi_divergent_region_exit(i32 addrspace(1)* nocapture %arg0, i32 addrspace(1)* nocapture %arg1, i32 addrspace(1)* nocapture %arg2) #0 { entry: %tmp = tail cal...

[LLVMdev] Address space extension

2013 Aug 11

0

...tput[x] = input[x] + mask[get_local_id(0)]; } The IR for R600 now is: /// test.r600.ll /// target datalayout = "e-p:32:32:32-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-v16:16:16-v24:32:32-v32:32:32-v48:64:64-v64:64:64-v96:128:128-v128:128:128-v192:256:256-v256:256:256-v512:512:512-v1024:1024:1024-v2048:2048:2048-n32:64" target triple = "r600-none-none" ; Function Attrs: nounwind define void @convolve(i32 addrspace(1)* nocapture readonly %input, i32 addrspace(2)* nocapture readonly %mask, i32 addrspace(1)* nocapture %output) #0 { entry: %call = tail call i32 @get_...

[LLVMdev] Vectorizing alloca instructions

2013 Oct 24

4

Hi, I've been playing around with the SLPVectorizer trying to get it to vectorize this simple program: define void @vector(i32 addrspace(1)* %out, i32 %index) { entry: %0 = alloca [4 x i32] %x = getelementptr [4 x i32]* %0, i32 0, i32 0 %y = getelementptr [4 x i32]* %0, i32 0, i32 1 %z = getelementptr [4 x i32]* %0, i32 0, i32 2 %w = getelementptr [4 x i32]* %0, i32 0, i32 3

[PATCH] D41675: Remove alignment argument from memcpy/memmove/memset in favour of alignment attributes (Step 1)

2018 Jan 25

2

...presence of "align 1". I'm not sure which side is correct, isn't it equivalent (that is, this is the natural ABI alignment of that type)? Here is my datalayout: target datalayout = "e-m:e-i64:64-n8:16:32:64-S128-v16:16-v24:32-v32:32-v48:64-v96:128-v192:256-v256:256-v512:512-v1024:1024" 2018-01-23 20:14 GMT-08:00 Daniel Neilson <dneilson at azul.com>: > Hi Alexandre, > The script uses extended-sed syntax, so you need to run sed with the -E > option. > > For example, when preparing the patch I created a file ( script.sed ) > containing all of...
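For readers unfamiliar with the `v1024:1024` entries these results match on: in an LLVM datalayout string, `v<size>:<abi>[:<pref>]` gives the ABI (and optionally preferred) alignment, in bits, for vectors of the given bit width. The Python sketch below pulls those entries out of the string quoted above. The helper name `vector_alignments` is hypothetical, and the sketch assumes that a missing preferred alignment defaults to the ABI value; it is not a full datalayout parser.

```python
def vector_alignments(datalayout):
    """Return {size_bits: (abi_bits, pref_bits)} for each v<size>:<abi>[:<pref>] entry.

    Hypothetical helper for illustration; all other specs (pointers,
    integers, native widths, ...) are ignored.
    """
    result = {}
    for spec in datalayout.split("-"):
        if not spec.startswith("v"):
            continue  # not a vector alignment spec (also skips empty specs)
        fields = spec[1:].split(":")
        if len(fields) < 2 or not all(f.isdigit() for f in fields):
            continue  # malformed entry; skip rather than guess
        size, abi = int(fields[0]), int(fields[1])
        # Assumption: preferred alignment falls back to the ABI alignment.
        pref = int(fields[2]) if len(fields) > 2 else abi
        result[size] = (abi, pref)
    return result


# Datalayout string copied from the message above.
layout = ("e-m:e-i64:64-n8:16:32:64-S128-v16:16-v24:32-v32:32-v48:64-v96:128"
          "-v192:256-v256:256-v512:512-v1024:1024")
aligns = vector_alignments(layout)
abi_bits, pref_bits = aligns[1024]
print(abi_bits // 8)  # ABI alignment of a 1024-bit vector, in bytes
```

With this layout, a 1024-bit vector is aligned to 1024 bits, i.e. 128 bytes.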

[PATCH] D41675: Remove alignment argument from memcpy/memmove/memset in favour of alignment attributes (Step 1)

2018 Jan 25

3

...not sure which side is correct, > isn't it equivalent (that is, this is the natural ABI alignment of that > type)? > > Here is my datalayout: > > target datalayout = "e-m:e-i64:64-n8:16:32:64- > S128-v16:16-v24:32-v32:32-v48:64-v96:128-v192:256-v256:256- > v512:512-v1024:1024" > > > 2018-01-23 20:14 GMT-08:00 Daniel Neilson <dneilson at azul.com>: > >> Hi Alexandre, >> The script uses extended-sed syntax, so you need to run sed with the -E >> option. >> >> For example, when preparing the patch I created a file...

[LLVMdev] Address space extension

2013 Aug 10

2

> -----Original Message----- > From: Michele Scandale [mailto:michele.scandale at gmail.com] > Sent: Saturday, August 10, 2013 6:29 AM > To: Micah Villmow > Cc: LLVM Developers Mailing List > Subject: Re: [LLVMdev] Address space extension > > On 08/10/2013 02:47 PM, Micah Villmow wrote: > > Michele, > > The information you are trying to gather is fundamentally

[PATCH] D41675: Remove alignment argument from memcpy/memmove/memset in favour of alignment attributes (Step 1)

2018 Jan 25

0

...presence of "align 1". I'm not sure which side is correct, isn't it equivalent (that is, this is the natural ABI alignment of that type)? Here is my datalayout: target datalayout = "e-m:e-i64:64-n8:16:32:64-S128-v16:16-v24:32-v32:32-v48:64-v96:128-v192:256-v256:256-v512:512-v1024:1024" 2018-01-23 20:14 GMT-08:00 Daniel Neilson <dneilson at azul.com<mailto:dneilson at azul.com>>: Hi Alexandre, The script uses extended-sed syntax, so you need to run sed with the -E option. For example, when preparing the patch I created a file ( script.sed ) containing a...

[PATCH] D41675: Remove alignment argument from memcpy/memmove/memset in favour of alignment attributes (Step 1)

2018 Jan 24

0

Hi Alexandre, The script uses extended-sed syntax, so you need to run sed with the -E option. For example, when preparing the patch I created a file ( script.sed ) containing all of the lines that I copied into the commit message. Then, I ran this bash one-liner from the test directory: for f in $(find . -name '*.ll'); do sed -E -i '.sedbak' -f script.sed $f; done When I was happy

[PATCH] D41675: Remove alignment argument from memcpy/memmove/memset in favour of alignment attributes (Step 1)

2018 Jan 24

2

Hello, Is there a script to update those test cases? I see mention of a sed script in the commit message but when I try it (see attached) on sed I get the following error: sed: file script line 2: invalid reference \3 on `s' command's RHS Did I lose something in a copy-paste? Is it not really a sed script? How do I run it? On Fri, Jan 19, 2018 at 9:15 AM, Daniel Neilson via
