thr3ads.net - similar to: "[LLVMdev] [Patch][RFC] Change R600 data layout"

Displaying 20 results from an estimated 1000 matches similar to: "[LLVMdev] [Patch][RFC] Change R600 data layout"

[LLVMdev] alloca scalarization with dynamic indexing into vectors

2013 Feb 07

Hi all, I have a question regarding dynamic indexing into a vector with GEP. I see that in the ScalarReplAggregates pass in the LLVM 3.2 release the call SROA::isSafeGEP() will now allow alloca scalarization in the case where a GEP index into a vector isn’t a constant. My question is: what is the expected behavior when the index is out of bounds of the vector? Is it undefined? I have an

[LLVMdev] Vectorizing alloca instructions

2013 Oct 24

Hi Tom, Thanks for working on this. The SLP-vectorizer thinks that %X %Y %Z and %W alias, so it tries to perform 4 scalar store operations (which is a bad idea). We need to figure out why AA thinks that X and Y may alias. Maybe there is a problem with the code that uses AA. Thanks, Nadav On Oct 24, 2013, at 2:04 PM, Tom Stellard <tom at stellard.net> wrote: > Hi, > >

Vectorizing remainder loop

2018 Jul 29

Hello, I'm working on hardware with very large vector widths, up to v2048. When I vectorize using the default LLVM vectorizer, up to 2047 iterations are left in the scalar remainder loop. LLVM does not vectorize these, which increases the cost. However, they should be vectorized using the next available vector width, i.e. v1024, v512, v256, v128, v64, v32, v16, v8, v4..... The issue of scalar remainder loop has
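The "maximum 2047 iterations" figure above follows from simple integer arithmetic: with a vector factor of 2048, the scalar remainder is the trip count modulo 2048. A minimal sketch of that arithmetic, assuming a plain counted loop; the function name `split_trip_count` is hypothetical and is not part of LLVM:

```python
def split_trip_count(trip_count: int, vf: int) -> tuple[int, int]:
    """Return (vector_iterations, scalar_remainder) for a loop vectorized at width vf."""
    if vf <= 0:
        raise ValueError("vector factor must be positive")
    # Each vector iteration covers vf scalar iterations; the rest run in the scalar loop.
    return trip_count // vf, trip_count % vf


# Example: a trip count of 4095 at VF=2048 leaves the worst-case remainder.
vec_iters, remainder = split_trip_count(4095, 2048)
print(vec_iters, remainder)
```

The remainder can never reach the vector factor itself, so 2047 is the worst case for VF=2048.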

[LLVMdev] Vectorizing alloca instructions

2013 Oct 24

Hi, I've been playing around with the SLPVectorizer trying to get it to vectorize this simple program:

define void @vector(i32 addrspace(1)* %out, i32 %index) {
entry:
  %0 = alloca [4 x i32]
  %x = getelementptr [4 x i32]* %0, i32 0, i32 0
  %y = getelementptr [4 x i32]* %0, i32 0, i32 1
  %z = getelementptr [4 x i32]* %0, i32 0, i32 2
  %w = getelementptr [4 x i32]* %0, i32 0, i32 3

[LLVMdev] Address space extension

2013 Aug 10

> -----Original Message----- > From: Michele Scandale [mailto:michele.scandale at gmail.com] > Sent: Saturday, August 10, 2013 6:29 AM > To: Micah Villmow > Cc: LLVM Developers Mailing List > Subject: Re: [LLVMdev] Address space extension > > On 08/10/2013 02:47 PM, Micah Villmow wrote: > > Michele, > > The information you are trying to gather is fundamentally

long data frame selection error

2008 Jul 14

Hello, I am trying to select the following headers from a data frame, but when I run the command it executes halfway through and gives me an error at V188 and V359. Temp <- data.frame(V4, V5, V6, V7, V8, V9, V10, V11, V12, V13, V14, V15, V16, V17, V18, V19, V20, V21, V22, V23, V24, V25, V26, V27, V28, V29, V30, V31, V32, V33, V34, V35, V36, V37, V38, V39, V40, V41, V42, V43, V44, V45,

[LLVMdev] [PATCH] R600 - Fix zero extend of i1

2013 Dec 31

Hi, When trying to compile a trivial OpenCL kernel such as:

__kernel void if_eq(__global int * out, int arg0, int arg1){
  out[0] = arg0==arg1?0:1;
}

Clang generates IR like:

  %1 = icmp eq i32 %arg0, %arg1
  %. = zext i1 %1 to i32

This eventually crashes ISel on R600. The attached patch adds a selector so it will compile. Regards, Jon Pry jonpry at gmail.com

Vectorizing remainder loop

2018 Aug 02

Hi Hameeza, Aside from Ashutosh's patch..... When the vector width is that large, we can't keep vectorizing the remainder like below. It'll be a huge code size if nothing else; hitting an ITLB miss because of this is very bad, for example.

VF=2048 // main vector loop
VF=1024 // vectorized remainder 1
VF=512 // vectorized remainder 2
...
Vectorize remainder until trip count is
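To make the code-size concern concrete, the cascade of halving vector factors listed above can be modelled as a greedy decomposition of the trip count. This is only an illustrative sketch, not how LLVM's loop vectorizer is implemented; the name `cascade_remainder` and the lower bound `min_vf=4` (taken from the "v4" endpoint mentioned in the original question) are assumptions:

```python
def cascade_remainder(trip_count: int, max_vf: int = 2048, min_vf: int = 4):
    """Greedily cover trip_count with loops of halving vector factors.

    Returns (plan, scalar_left), where plan is a list of (vf, iterations)
    pairs and scalar_left is what remains for a final scalar loop.
    """
    plan = []
    remaining = trip_count
    vf = max_vf
    while vf >= min_vf:
        iterations = remaining // vf
        if iterations:
            plan.append((vf, iterations))
            remaining -= iterations * vf
        vf //= 2  # next smaller vector width
    return plan, remaining


plan, scalar_left = cascade_remainder(4095)
for vf, iterations in plan:
    print(f"VF={vf}: {iterations} iteration(s)")
print("scalar iterations left:", scalar_left)
```

In the worst case every width from 2048 down to 4 contributes its own loop body, which is the code growth the reply warns about.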

sort columns

2006 Feb 01

Hi. I have a simple (I think) question. My dataset has these variables: names(data) [1] "v1" "v2" "v3" "v4" "v5" "v6" "v7" "v8" "v9" "v10" "v11" "v12" "v13" "v14" "v15" "v16" "v17"

Vectorizing remainder loop

2018 Aug 03

>it cannot afford large size masks for large vectors So, even a standard way of vectorizing remainder in masked or unmasked fashion wouldn’t work, I suppose. Ouch. I suppose VPlan should be able to model this kind of gigantic remainder vector code (when the time comes). Not pretty at all, though. Now, be fully aware that Direction #2 is really a poor (or rather extremely poor) person’s

[PATCH] D41675: Remove alignment argument from memcpy/memmove/memset in favour of alignment attributes (Step 1)

2018 Jan 25

Thanks, that worked like a charm except for the following: LLVM generates: call void @llvm.memcpy.p3i8.p1i8.i64(i8 addrspace(3)* align 1 bitcast ([512 x float] addrspace(3)* @a_scratchpad to i8 addrspace(3)*), i8 addrspace(1)* align 1 %0, i64 2048, i1 false) And we expected: call void @llvm.memcpy.p3i8.p1i8.i64(i8 addrspace(3)* bitcast ([512 x float] addrspace(3)* [[SPM0]] to i8

[PATCH] D41675: Remove alignment argument from memcpy/memmove/memset in favour of alignment attributes (Step 1)

2018 Jan 25

Yes, all that is correct. My question is more of a long-term one: why does the .ll printer specify the alignment if it is equivalent to the default one? That is, it seems the sed script expects the printer not to specify it (this would match the load/store behavior), but the .ll printer does specify it, which either means the printer is not ideal in this case and I should fix it, or in this case

[PATCH] D41675: Remove alignment argument from memcpy/memmove/memset in favour of alignment attributes (Step 1)

2018 Jan 25

Good question. AFAIK, the IR-printer doesn’t understand the semantics of parameter attributes. In this case, it only knows that there is an attribute on the parameter that is integer valued (with value 1) and that has the name “align”, so it prints it out. If we don’t want it printing out ‘align 1’ then it’s up to us to not set the alignment parameter attribute to a value if that value would be 1.
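The division of responsibility described above (a printer that emits whatever attributes it is handed, and a caller that decides whether to attach `align` at all) can be sketched with a toy model. This is not LLVM's API; `print_param` and `make_align_attrs` are hypothetical names used purely for illustration:

```python
def print_param(type_str: str, value: str, attrs: dict) -> str:
    """Toy printer: emits every attribute it is given without interpreting it."""
    parts = [type_str]
    for name, attr_value in attrs.items():
        parts.append(f"{name} {attr_value}")
    parts.append(value)
    return " ".join(parts)


def make_align_attrs(alignment: int) -> dict:
    """Caller-side policy: only attach 'align' when it differs from 1."""
    return {} if alignment == 1 else {"align": alignment}


# The printer alone prints 'align 1' because it does not know 1 is redundant.
print(print_param("i8 addrspace(1)*", "%0", {"align": 1}))
# With the caller-side policy, the redundant attribute is never set.
print(print_param("i8 addrspace(1)*", "%0", make_align_attrs(1)))
print(print_param("i8 addrspace(1)*", "%0", make_align_attrs(4)))
```

Keeping the printer free of attribute semantics means the policy lives in one place, at the point where the attribute is created.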

[LLVMdev] Address space extension

2013 Aug 11

Hello Micah, I first apologize for the mail length, but I think that using an example would be better to clarify the case and the objections. > [Micah Villmow] In the case of OpenCL, you can't correctly use the standard C calling convention and still be OpenCL compliant, the C calling convention is too permissive. The second you use OpenCL, you are using an OpenCL specific calling

sprintf error: "only 100 arguments allowed"

2015 Aug 22

I'm trying to apply a function defined in the VW R docs that attempts to convert a data.table object to Vowpal Wabbit format. In the process I'm getting the error in printf mentioned in the subject. The original function is here: https://github.com/JohnLangford/vowpal_wabbit/blob/master/R/dt2vw.R Below is a small example that reproduces the error. The function works great with

[PATCH] D41675: Remove alignment argument from memcpy/memmove/memset in favour of alignment attributes (Step 1)

2018 Jan 25

Hi Alexandre, Before the change you would have been expecting one of the following, correct? a) call void @llvm.memcpy.p3i8.p1i8.i64(i8 addrspace(3)* bitcast ([512 x float] addrspace(3)* [[SPM0]] to i8 addrspace(3)*), i8 addrspace(1)* [[APTR]], i64 2048, i32 0, i1 false) b) call void @llvm.memcpy.p3i8.p1i8.i64(i8 addrspace(3)* bitcast ([512 x float] addrspace(3)* [[SPM0]] to i8 addrspace(3)*), i8

sprintf error: "only 100 arguments allowed"

2015 Aug 26

Wouldn't it make sense to have this in the man page? The 8192-byte limitation for 'fmt' is mentioned but not this one. Thanks, H. On 08/25/2015 02:08 AM, Prof Brian Ripley wrote: > From the sources: > > #define MAXNARGS 100 > /* ^^^ not entirely arbitrary, but strongly linked to > allowing %$1 to %$99 !*/ > > > > On 22/08/2015 04:21, Martin

Structurizing multi-exit regions

2017 Mar 02

Hi, I'm trying to solve a problem from StructurizeCFG not actually handling regions with multiple exits. Sample IR attached. StructurizeCFG doesn't touch this function, exiting early on the isTopLevelRegion check. SIAnnotateControlFlow then gets confused and ends up inserting an if into one of the blocks, and the matching end.cf into one of the return/unreachable blocks. The input to

[PATCH] D41675: Remove alignment argument from memcpy/memmove/memset in favour of alignment attributes (Step 1)

2018 Jan 24

Hi Alexandre, The script uses extended-sed syntax, so you need to run sed with the -E option. For example, when preparing the patch I created a file ( script.sed ) containing all of the lines that I copied into the commit message. Then, I ran this bash one-liner from the test directory: for f in $(find . -name '*.ll'); do sed -E -i '.sedbak' -f script.sed $f; done When I was happy

[LLVMdev] [PATCH] R600 - Fix zero extend of i1

2014 Jan 02

> This patch looks good, but you need to add a test case. You can add it
> to the file test/CodeGen/R600/zero_extend.ll

Version 2 of the patch is attached, which includes a test case.

-Jon
