thr3ads.net - similar to: "[LLVMdev] [PATCH] parallel loop awareness to the LoopVectorizer"

Displaying 20 results from an estimated 6000 matches similar to: "[LLVMdev] [PATCH] parallel loop awareness to the LoopVectorizer"

[LLVMdev] [PATCH] parallel loop awareness to the LoopVectorizer

2013 Jan 29

On 01/28/2013 12:58 PM, Pekka Jääskeläinen wrote:
> Hi,
>
> Attached is a patch which uses a simple "parallel_loop" metadata attached
> to the loop branch instruction in the loop latch for skipping cross-iteration
> memory dependency checking in the LoopVectorizer. This was briefly discussed
> in the email thread "LoopVectorizer in OpenCL C work group

[LLVMdev] [PATCH] parallel loop awareness to the LoopVectorizer

2013 Jan 29

On Jan 29, 2013, at 12:51 AM, Tobias Grosser <tobias at grosser.es> wrote:
>
> # ignore assumed dependences.
> for (i = 0; i < 4; i++) {
>   tmp1 = A[3i+1];
>   tmp2 = A[3i+2];
>   tmp3 = tmp1 + tmp2;
>   A[3i] = tmp3;
> }
>
> Now I apply for whatever reason a partial reg2mem transformation.
>
> float tmp3[1];
>
> # ignore assumed

[LLVMdev] [PATCH] parallel loop awareness to the LoopVectorizer

2013 Jan 30

On 01/29/2013 07:58 PM, Nadav Rotem wrote:
>
> On Jan 29, 2013, at 12:51 AM, Tobias Grosser <tobias at grosser.es> wrote:
>
>>
>> # ignore assumed dependences.
>> for (i = 0; i < 4; i++) {
>>   tmp1 = A[3i+1];
>>   tmp2 = A[3i+2];
>>   tmp3 = tmp1 + tmp2;
>>   A[3i] = tmp3;
>> }
>>
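In the loop quoted above, iteration i reads A[3i+1] and A[3i+2] and writes A[3i], so different iterations touch disjoint elements and can run in any order. A minimal C sketch of that observation follows; the function names (loop_body, run_forward, run_backward, orders_agree) and the 12-element sample array are assumptions made for illustration and do not come from the thread or from LLVM.

```c
#include <assert.h>
#include <stddef.h>

/* One iteration of the quoted loop: reads A[3i+1] and A[3i+2], writes A[3i].
   Hypothetical helper, written out with explicit multiplication. */
static void loop_body(float *A, size_t i)
{
    assert(A != NULL);
    float tmp1 = A[3 * i + 1];
    float tmp2 = A[3 * i + 2];
    float tmp3 = tmp1 + tmp2;
    A[3 * i] = tmp3;
}

/* Run the four iterations in ascending order (i = 0, 1, 2, 3). */
void run_forward(float *A)
{
    for (size_t i = 0; i < 4; i++)
        loop_body(A, i);
}

/* Run the same four iterations in descending order (i = 3, 2, 1, 0). */
void run_backward(float *A)
{
    for (size_t i = 4; i-- > 0;)
        loop_body(A, i);
}

/* Returns 1 if both orders leave identical arrays for a sample input. */
int orders_agree(void)
{
    float a[12], b[12];
    for (size_t k = 0; k < 12; k++)
        a[k] = b[k] = (float)k;
    run_forward(a);
    run_backward(b);
    for (size_t k = 0; k < 12; k++)
        if (a[k] != b[k])
            return 0;
    return 1;
}
```

Because each iteration only touches its own triple of elements, the two orders agree for any input, which is the property a "no cross-iteration dependences" annotation asserts on the programmer's behalf.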

[LLVMdev] [PATCH] parallel loop awareness to the LoopVectorizer

2013 Jan 29

Hi Tobias,

On 01/29/2013 10:51 AM, Tobias Grosser wrote:
> Is the metadata now still valid or how do we ensure the invalid metadata is
> removed?

It seems it's not valid anymore. Good catch. I was asking for these transformation cases earlier. Probably there are more that have not been thought of yet.

> I have the feeling it may be necessary to link the loop as well as the accesses
> for

[LLVMdev] [PATCH] parallel loop awareness to the LoopVectorizer

2013 Jan 28

On 01/28/2013 09:23 PM, Redmond, Paul wrote:
> If ivdep are the semantics you're going for I'd use that.

Fine, except I prefer not to include 'v' in it. Vectorization is merely one way to parallelize the loop. How does llvm.loop.ignore_assumed_deps sound?

--
--Pekka

tapply, data.frame problem

2007 Jan 17

Hi R-users,

I'm quite new to R and trying to learn the basics. I have the following problem concerning the conversion of an array object into a data frame. I have made the following data sets

tmp1 <- rnorm(100)
tmp2 <- gl(10,2,length=100)
tmp3 <- as.data.frame(cbind(tmp1,tmp2))
tmp3.sum <- tapply(tmp3$tmp1,tmp3$tmp2,sum)
tmp3.sum <- as.data.frame(tapply(tmp1,tmp2,sum))

and I want the

[LLVMdev] Question about LLVM NEON intrinsics

2012 Sep 21

Hi all,

I would like to know if LLVM Neon intrinsics are designed to support only 'Legal' types for NEON units. Using llc -march=arm -mcpu=cortex-a9 vmax4.ll -o vmax4.s on the following ll code:

; ModuleID = 'vmax.ll'
target datalayout = "e-p:32:32:32-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-n32"
target triple =

[LLVMdev] Alignment of vectors

2008 Jul 18

Consider the following C code:

typedef __attribute__(( ext_vector_type(2) )) float float2;
typedef __attribute__(( ext_vector_type(2) )) __attribute__(( aligned(4) )) float float2_align2;

void foo(void)
{
  const float * p;
  size_t offset;
  float2 tmp = *((float2_align2 *)(p+offset));
}

When compiled with clang -emit-llvm I get:

define void @foo() {
entry:
  %p = alloca float*, align 4

An iterative function

2017 Oct 30

Dear all,

The function f() below is a function of m1 and m2, both of which are matrices with 3 rows. The function works sequentially, one row after another, so altogether there are three stages. I am trying to update the code into a generic function that will work for an arbitrary number of stages k. I am hoping to get some suggestions and help. Thanks so much!

Hanna

##x, y are two

how to skip a specific value when using apply() function to a matrix?

2010 Jul 16

Hello R experts,

I'd like to studentize a matrix (tmp1) by column using the apply() function and skip some specific values, such as the zeros in the example below, to get tmp2 rather than tmp3. I used the script below and can only get the matrix tmp3. Could you please help me to studentize the matrix (tmp1) without changing the zeros and generate a new matrix tmp2?

Thanks,
Joshua

tmp1
     [,1] [,2] [,3] [,4]

[LLVMdev] Proposal to disable some of DAG combine optimizations

2009 Mar 19

Some of the optimizations that the first DAG combine performs are counterproductive for our 8-bit target. For example in:

// I dropped the types because they are irrelevant.
// Excuse me for changing the syntax...
store %tmp1, %var
%tmp2 = load %var
%tmp4 = add %tmp3, %tmp2

Since load is the only user of var and since var has just been stored to, it assumes that %tmp1 is alive and it goes ahead

[LLVMdev] Question about LLVM NEON intrinsics

2012 Sep 21

On Fri, Sep 21, 2012 at 1:28 AM, Sebastien DELDON-GNB <sebastien.deldon at st.com> wrote:
> Hi all,
>
> I would like to know if LLVM Neon intrinsics are designed to support only 'Legal' types for NEON units.
> Using llc -march=arm -mcpu=cortex-a9 vmax4.ll -o vmax4.s on following ll code:
>
> ; ModuleID = 'vmax.ll'
> target datalayout =

RfW 2.3.1: regular expressions to detect pairs of identical word-final character sequences

2006 Jul 23

Dear all

I use R for Windows 2.3.1 on a fully updated Windows XP Home SP2 machine and I have two related regular expression problems.

platform i386-pc-mingw32
arch     i386
os       mingw32
system   i386, mingw32
status
major    2
minor

[LLVMdev] RE : Question about LLVM NEON intrinsics

2012 Sep 21

Hi Eli,

Thanks for the answer, it clarifies the situation for me. Do you know if there is a pass in LLVM that could be adapted to 'legalize' intrinsic calls? Or shall I define my own intrinsics for unsupported types?

Best Regards
Seb
________________________________________
From: Eli Friedman [eli.friedman at gmail.com]
Sent: Friday, 21 September 2012 11:54
To: Sebastien

Wrong UIDs returned from mailbox_transaction_commit_get_uids()

2008 Apr 01

Hi,

Wrong UIDs are returned from mailbox_transaction_commit_get_uids() in dovecot-1.1.rc3. The problem is in:

int mailbox_transaction_commit(struct mailbox_transaction_context **t)
{
	uint32_t tmp;

	return mailbox_transaction_commit_get_uids(t, &tmp, &tmp, &tmp);
}

It should be:

int mailbox_transaction_commit(struct mailbox_transaction_context **t)
{
	uint32_t tmp1,
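The report above turns on passing the same address for three separate output parameters. A minimal C sketch of why that can matter follows. It is not dovecot's code: fill_uids_sketch, call_aliased, call_distinct, and the values 100, 7, and 9 are all hypothetical, and the sketch only assumes a callee that reads one of its output parameters back after writing all of them.

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>

/* Hypothetical stand-in for a function with three output parameters.
   It writes all three, then reads the second one back. */
static uint32_t fill_uids_sketch(uint32_t *validity_r, uint32_t *first_r,
                                 uint32_t *last_r)
{
    assert(validity_r != NULL && first_r != NULL && last_r != NULL);
    *validity_r = 100;
    *first_r = 7;
    *last_r = 9;
    /* If the pointers alias, this no longer reads the value stored
       through first_r; it reads whatever was stored last. */
    return *first_r;
}

/* All three outputs share one variable, as in the reported wrapper. */
uint32_t call_aliased(void)
{
    uint32_t tmp;
    return fill_uids_sketch(&tmp, &tmp, &tmp);
}

/* Each output gets its own variable, as in the proposed fix. */
uint32_t call_distinct(void)
{
    uint32_t tmp1, tmp2, tmp3;
    return fill_uids_sketch(&tmp1, &tmp2, &tmp3);
}
```

In this sketch call_distinct() returns the value written through first_r, while call_aliased() returns the value written last, because every store lands in the same variable. Whether dovecot's function reads its outputs back in this way is not stated in the snippet.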

[LLVMdev] [PATCH] Add new phase to legalization to handle vector operations

2009 May 21

On Wed, May 20, 2009 at 4:55 PM, Dan Gohman <gohman at apple.com> wrote:
> Can you explain why you chose the approach of using a new pass?
> I pictured removing LegalizeDAG's type legalization code would
> mostly consist of finding all the places that use TLI.getTypeAction
> and just deleting code for handling its Expand and Promote. Are you
> anticipating something more

[LLVMdev] [PATCH] Add new phase to legalization to handle vector operations

2009 May 20

On May 20, 2009, at 1:34 PM, Eli Friedman wrote:
> On Wed, May 20, 2009 at 1:19 PM, Eli Friedman
> <eli.friedman at gmail.com> wrote:
>> Per subject, this patch adds an additional pass to handle vector
>> operations; the idea is that this allows removing the code from
>> LegalizeDAG that handles illegal types, which should be a significant

[LLVMdev] [PATCH] parallel loop awareness to the LoopVectorizer

2013 Jan 28

On 01/28/2013 06:45 PM, Nadav Rotem wrote:
> I am okay with this patch, assuming that you follow the review of Tobias
> and Renato and provide a separate patch for the min-iter-count and a few
> test cases.

OK. Any opinions on the location of the isParallelLoop() check? Shall I put it in Loop so it is more widely accessible? I.e. Loop->isParallel().

--
Pekka

[LLVMdev] [PATCH] parallel loop awareness to the LoopVectorizer

2013 Jan 28

It sounds like a good idea to move the method into Loop. Is there a naming scheme for metadata? I think llvm.loop.* would be helpful for loop-specific metadata. As for parallel, I think it is a little too generic. If ivdep are the semantics you're going for I'd use that.

paul

On 2013-01-28, at 12:03 PM, Pekka Jääskeläinen wrote:
> On 01/28/2013 06:45 PM, Nadav Rotem wrote:
>>

[LLVMdev] Proposal to disable some of DAG combine optimizations

2009 Mar 23

I can't think of any workaround. This optimization eliminates so much information that, if we want to retrieve it, it will take a lot of processing and we may not be able to recover the lost information in all cases. Besides, why does the generic part of llvm have to force an optimization that is counterproductive for some targets? If there are other phases that do the same
