similar to: [LLVMdev] [PATCH] parallel loop awareness to the LoopVectorizer

Displaying 20 results from an estimated 6000 matches similar to: "[LLVMdev] [PATCH] parallel loop awareness to the LoopVectorizer"

2013 Jan 29
0
[LLVMdev] [PATCH] parallel loop awareness to the LoopVectorizer
On 01/28/2013 12:58 PM, Pekka Jääskeläinen wrote: > Hi, > > Attached is a patch which uses a simple "parallel_loop" metadata attached > to the loop branch instruction in the loop latch for skipping > cross-iteration > memory dependency checking in the LoopVectorizer. This was briefly > discussed > in the email thread "LoopVectorizer in OpenCL C work group
2013 Jan 29
3
[LLVMdev] [PATCH] parallel loop awareness to the LoopVectorizer
On Jan 29, 2013, at 12:51 AM, Tobias Grosser <tobias at grosser.es> wrote: > > # ignore assumed dependences. > for (i = 0; i < 4; i++) { > tmp1 = A[3i+1]; > tmp2 = A[3i+2]; > tmp3 = tmp1 + tmp2; > A[3i] = tmp3; > } > > Now I apply for whatever reason a partial reg2mem transformation. > > float tmp3[1]; > > # ignore assumed
2013 Jan 30
0
[LLVMdev] [PATCH] parallel loop awareness to the LoopVectorizer
On 01/29/2013 07:58 PM, Nadav Rotem wrote: > > On Jan 29, 2013, at 12:51 AM, Tobias Grosser <tobias at grosser.es > <mailto:tobias at grosser.es>> wrote: > >> >> # ignore assumed dependences. >> for (i = 0; i < 4; i++) { >> tmp1 = A[3i+1]; >> tmp2 = A[3i+2]; >> tmp3 = tmp1 + tmp2; >> A[3i] = tmp3; >> } >>
2013 Jan 29
1
[LLVMdev] [PATCH] parallel loop awareness to the LoopVectorizer
Hi Tobias, On 01/29/2013 10:51 AM, Tobias Grosser wrote: > Is the meta data now still valid or how do we ensure the invalid meta data is > removed? It seems it's not valid anymore. Good catch. I was requesting for these transformation cases earlier. Probably there are more not thought of yet. > I have the feeling it may be necessary to link the loop as well as the accesses > for
2013 Jan 28
2
[LLVMdev] [PATCH] parallel loop awareness to the LoopVectorizer
On 01/28/2013 09:23 PM, Redmond, Paul wrote: > If ivdep are the semantics you're going for I'd use that. Fine, except I prefer not to include 'v' in it. Vectorization is merely one way to parallelize the loop. How does llvm.loop.ignore_assumed_deps sound? -- --Pekka
2007 Jan 17
1
tapply, data.frame problem
Hi R-users, I'm quite new to R and trying to learn the basics. I have the following problem concerning the conversion of an array object into a data frame. I have made the following data sets tmp1 <- rnorm(100) tmp2 <- gl(10,2,length=100) tmp3 <- as.data.frame(cbind(tmp1,tmp2)) tmp3.sum <- tapply(tmp3$tmp1,tmp3$tmp2,sum) tmp3.sum <- as.data.frame(tapply(tmp1,tmp2,sum)) and I want the
2012 Sep 21
5
[LLVMdev] Question about LLVM NEON intrinsics
Hi all, I would like to know if LLVM Neon intrinsics are designed to support only 'Legal' types for NEON units. Using llc -march=arm -mcpu=cortex-a9 vmax4.ll -o vmax4.s on following ll code: ; ModuleID = 'vmax.ll' target datalayout = "e-p:32:32:32-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-n32" target triple =
2008 Jul 18
2
[LLVMdev] Alignment of vectors
Consider the following C code: typedef __attribute__(( ext_vector_type(2) )) float float2; typedef __attribute__(( ext_vector_type(2) )) __attribute__(( aligned(4) )) float float2_align2; void foo(void) { const float * p; size_t offset; float2 tmp = *((float2_align2 *)(p+offset)); } When compiled with clang -emit-llvm I get: define void @foo() { entry: %p = alloca float*, align 4
2017 Oct 30
1
An iterative function
Dear all, The function f() below is a function of m1 and m2, both of which are matrices with 3 rows. The function works sequentially one row after another. So altogether there are three stages. I am trying to update the coding to write a generic function that will work for arbitrary k stages. I am hoping to get some suggestion and help. Thanks so much! Hanna ##x, y are two
2010 Jul 16
3
how to skip a specific value when using apply() function to a matrix?
Hello R experts, I'd like to studentize a matrix (tmp1) by column using apply() function and skip some specific values such as zeros in the example below to tmp2 but not tmp3. I used the script below and only can get a matrix tmp3. Could you please help me to studentize the matrix (tmp1) without changing the zeros and generate a new matrix tmp2? Thanks, Joshua tmp1 [,1] [,2] [,3] [,4]
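One way to read the request above is sketched below. It is written in Python rather than R and rests on assumptions not stated in the post: the skipped entries are also left out of the column mean and standard deviation, and the function name `studentize_skip` and the sample matrix are made up for illustration.

```python
from statistics import mean, stdev


def studentize_skip(matrix, skip=0):
    """Studentize each column, leaving entries equal to `skip` untouched.

    Assumption: skipped entries are excluded from the mean and the
    sample standard deviation of their column.
    """
    rows, cols = len(matrix), len(matrix[0])
    out = [row[:] for row in matrix]  # copy, so the input stays unchanged
    for j in range(cols):
        kept = [matrix[i][j] for i in range(rows) if matrix[i][j] != skip]
        if len(kept) < 2:
            continue  # a sample standard deviation needs two values
        m, s = mean(kept), stdev(kept)
        if s == 0:
            continue  # constant column: leave as-is to avoid dividing by zero
        for i in range(rows):
            if matrix[i][j] != skip:
                out[i][j] = (matrix[i][j] - m) / s
    return out


# Made-up sample data; the zero in the second column is preserved.
tmp1 = [[1.0, 0.0],
        [2.0, 4.0],
        [3.0, 6.0]]
tmp2 = studentize_skip(tmp1)
print(tmp2)
```

If the zeros should instead count toward the column statistics, only the construction of `kept` would need to change.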
2009 Mar 19
3
[LLVMdev] Proposal to disable some of DAG combine optimizations
Some of the optimizations that the first DAG combine performs are counterproductive for our 8-bit target. For example in: // I dropped the types because they are irrelevant. // Excuse me for changing the syntax... store %tmp1, %var %tmp2 = load %var %tmp4 = add %tmp3, %tmp2 Since load is the only user of var and since var has just been stored to, it assumes that %tmp1 is alive and it goes ahead
2012 Sep 21
0
[LLVMdev] Question about LLVM NEON intrinsics
On Fri, Sep 21, 2012 at 1:28 AM, Sebastien DELDON-GNB <sebastien.deldon at st.com> wrote: > Hi all, > > I would like to know if LLVM Neon intrinsics are designed to support only 'Legal' types for NEON units. > Using llc -march=arm -mcpu=cortex-a9 vmax4.ll -o vmax4.s on following ll code: > > > ; ModuleID = 'vmax.ll' > target datalayout =
2006 Jul 23
3
RfW 2.3.1: regular expressions to detect pairs of identical word-final character sequences
Dear all I use R for Windows 2.3.1 on a fully updated Windows XP Home SP2 machine and I have two related regular expression problems. platform i386-pc-mingw32 arch i386 os mingw32 system i386, mingw32 status major 2 minor
2012 Sep 21
2
[LLVMdev] RE : Question about LLVM NEON intrinsics
Hi Eli, Thanks for the answer, it clarifies the situation for me. Do you know if there is a pass in LLVM that could be adapted to 'legalize' intrinsic calls? Or shall I define my own intrinsics for unsupported types? Best Regards Seb ________________________________________ From: Eli Friedman [eli.friedman at gmail.com] Sent: Friday, September 21, 2012 11:54 To: Sebastien
2008 Apr 01
2
Wrong UIDs returned from mailbox_transaction_commit_get_uids()
Hi, Wrong UIDs are returned from mailbox_transaction_commit_get_uids() in dovecot-1.1.rc3. The problem is in: int mailbox_transaction_commit(struct mailbox_transaction_context **t) { uint32_t tmp; return mailbox_transaction_commit_get_uids(t, &tmp, &tmp, &tmp); } It should be: int mailbox_transaction_commit(struct mailbox_transaction_context **t) { uint32_t tmp1,
2009 May 21
0
[LLVMdev] [PATCH] Add new phase to legalization to handle vector operations
On Wed, May 20, 2009 at 4:55 PM, Dan Gohman <gohman at apple.com> wrote: > Can you explain why you chose the approach of using a new pass? > I pictured removing LegalizeDAG's type legalization code would > mostly consist of finding all the places that use TLI.getTypeAction > and just deleting code for handling its Expand and Promote. Are you > anticipating something more
2009 May 20
2
[LLVMdev] [PATCH] Add new phase to legalization to handle vector operations
On May 20, 2009, at 1:34 PM, Eli Friedman wrote: > On Wed, May 20, 2009 at 1:19 PM, Eli Friedman > <eli.friedman at gmail.com> wrote: > >> Per subject, this patch adding an additional pass to handle vector >> >> operations; the idea is that this allows removing the code from >> >> LegalizeDAG that handles illegal types, which should be a significant
2009 Mar 23
3
[LLVMdev] Proposal to disable some of DAG combine optimizations
I can't think of any workaround. This optimization eliminates so much information that if we want to retrieve it, it will take a lot of processing and may not necessarily be able to retrieve the lost information for all cases. Besides, why does the generic part of llvm have to force an optimization that is counterproductive to some targets? If there are other phases that do the same
2013 Jan 28
2
[LLVMdev] [PATCH] parallel loop awareness to the LoopVectorizer
On 01/28/2013 06:45 PM, Nadav Rotem wrote: > I am okay with this patch, assuming that you follow the review of Tobias > and Renato and provide a separate patch for the min-iter-count and a few > test cases. OK. Any opinions on the location of the isParallelLoop() check? Shall I put it to Loop so it is more widely accessible? I.e. Loop->isParallel(). -- Pekka
2013 Jan 28
0
[LLVMdev] [PATCH] parallel loop awareness to the LoopVectorizer
It sounds like a good idea to move the method in to Loop. Is there a naming scheme for metadata? I think llvm.loop.* would be helpful for loop-specific metadata. As for parallel I think it is a little too generic. If ivdep are the semantics you're going for I'd use that. paul On 2013-01-28, at 12:03 PM, Pekka Jääskeläinen wrote: > On 01/28/2013 06:45 PM, Nadav Rotem wrote: >>