thr3ads.net - search: "vectorizeable"

Displaying 20 results from an estimated 202 matches for "vectorizeable".


2011 May 22

[LLVMdev] No SSE instructions

On Sun, May 22, 2011 at 1:07 PM, Serg Anohovsky <serg.anohovsky at gmail.com> wrote: > Hello. > I have compiled the simple program: > > #include <stdio.h> > #include <stdlib.h> > > int v1[10000]; > > int main() > { > int i; > > for (i = 0; i < 10000; i++) { > v1[i] = i; > } > > This loop

[LLVMdev] Loop vectorizer

2012 Oct 16

[LLVMdev] Loop vectorizer

...e best vectorization factor
> (could be 1).
>
> 3. Legality check - This unit checks if it is *legal* (from a
> correctness point of view) to vectorize the program. This is target
> independent. Also, this unit needs to describe which transformations
> are needed to make this loop vectorizeable. For example: if-conversion
> is required if the control flow is not uniform for all iterations of
> the loop.
>
> 4. Vectorization - This is where the actual widening of the
> instructions happens. Every time we improve #3 by detecting more
> vectorizeable loops, we need to add t...
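The if-conversion mentioned in the legality step can be illustrated at the source level. The sketch below is not taken from the thread: the function names `clamp_branchy` and `clamp_if_converted` are hypothetical, and a real vectorizer performs this rewrite on its intermediate representation rather than on C source. It only shows the idea of turning per-iteration control flow into a data select so that every iteration runs the same straight-line code.

```c
#include <assert.h>
#include <stddef.h>

/* Branchy form: whether the store executes depends on the data,
   so control flow is not uniform across iterations. */
void clamp_branchy(int *a, size_t n, int limit) {
    assert(a != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (a[i] > limit)
            a[i] = limit;
    }
}

/* If-converted form: the condition becomes a select and the store is
   unconditional. This assumes writing back an unchanged value to a[i]
   is acceptable, which is itself part of the legality question. */
void clamp_if_converted(int *a, size_t n, int limit) {
    assert(a != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        int v = a[i];
        a[i] = (v > limit) ? limit : v;
    }
}
```

Both functions leave the array in the same final state; the second form is the shape that maps more directly onto lane-wise compare and select operations.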

[LLVMdev] Loop vectorizer

2012 Oct 16

[LLVMdev] Loop vectorizer

Nadav Rotem <nrotem at apple.com> wrote: > I sent a patch to llvm-commit with a new loop vectorizer. > This is a very simple loop vectorizer, but we have to start somewhere. > With this new loop vectorizer we can already vectorize a good number of loops. > I know that we can improve the new loop vectorizer in a number of ways. > We can implement a precise dependence test, >

[LLVMdev] Loop vectorizer

2012 Oct 16

[LLVMdev] Loop vectorizer

...del - This unit decides on the best vectorization factor (could be 1).
3. Legality check - This unit checks if it is *legal* (from a correctness point of view) to vectorize the program. This is target independent. Also, this unit needs to describe which transformations are needed to make this loop vectorizeable. For example: if-conversion is required if the control flow is not uniform for all iterations of the loop.
4. Vectorization - This is where the actual widening of the instructions happens. Every time we improve #3 by detecting more vectorizeable loops, we need to add the mechanism for actually ge...

[LLVMdev] Improving loop vectorizer support for loops with a volatile iteration variable

2015 Aug 13

[LLVMdev] Improving loop vectorizer support for loops with a volatile iteration variable

Hi Gerolf,

I think we have several (perhaps separable) issues here:

1. Do we have a canonical form for loops, preserved through the optimizer, that allows naturally-constructed loop nests to remain separable?
2. Do we forbid non-lowering transformations that turn vectorizable loops into non-vectorizable loops?
3. How do we detect cases where transformations cause a negative answer to either

[LLVMdev] No SSE instructions

2011 May 22

[LLVMdev] No SSE instructions

Hello. I have compiled the simple program:

#include <stdio.h>
#include <stdlib.h>

int v1[10000];

int main()
{
    int i;

    for (i = 0; i < 10000; i++) {
        v1[i] = i;
    }

    for (i = 0; i < 10000; i++) {
        printf("%d ", v1[i]);
    }

    return 0;
}

Next, I disassemble the executable file and have not found

[Proposal][RFC] Epilog loop vectorization

2017 Feb 27

[Proposal][RFC] Epilog loop vectorization

On 02/27/2017 12:41 PM, Michael Kuperstein wrote: There's another issue with re-running the vectorizer (which I support, btw - I'm just saying there are more problems to solve on the way :-) ) Historically, we haven't even tried to evaluate the cost of the "constant" (not per-iteration) vectorization overhead - things like alias checks. Instead, we have hard bounds - we

[LLVMdev] Loop vectorizer

2012 Oct 17

[LLVMdev] Loop vectorizer

2012 Oct 17

[LLVMdev] Loop vectorizer

Hi everybody, On 10/17/12 12:32 AM, Hal Finkel wrote: >>> Do you have a plan for xforms to increase the amount of >>> vectorization? >> >> Yes. We will need to implement a predication phase and to design the >> interaction with other loop transformations. Also, this will have to >> work well with the cost model. We also need to think of a good way to

RE: [R] when can we expect Prof Tierney's compiled R?

2005 Apr 22

RE: [R] when can we expect Prof Tierney's compiled R?

If we are on the subject of byte compilation, let me bring up a couple of examples which have been puzzling me for some time. I'd like to know a) whether compilation is likely to improve the performance for this type of computation, and b) at least roughly understand the reasons for the observed numbers, specifically why x[i]<- assignment is so much slower than x[i] extraction. The loops

[LLVMdev] Enabling the vectorizer for -Os

2013 Jun 06

[LLVMdev] Enabling the vectorizer for -Os

On Wed, Jun 5, 2013 at 5:51 PM, Nadav Rotem <nrotem at apple.com> wrote: > Hi, > > Thanks for the feedback. I think that we agree that vectorization on -Os > can benefit many programs. Regarding -O2 vs -O3, maybe we should set a > higher cost threshold for O2 to increase the likelihood of improving the > performance? We have very few regressions on -O3 as is and with

[LLVMdev] Enabling the vectorizer for -Os

2013 Jun 06

[LLVMdev] Enabling the vectorizer for -Os

Hi, Thanks for the feedback. I think that we agree that vectorization on -Os can benefit many programs. Regarding -O2 vs -O3, maybe we should set a higher cost threshold for O2 to increase the likelihood of improving the performance? We have very few regressions on -O3 as is and with better cost models I believe that we can bring them close to zero, so I am not sure if it can help that much.

[LLVMdev] Alias-based Loop Versioning

2015 May 21

[LLVMdev] Alias-based Loop Versioning

There is work taking place by multiple people in this area, more is expected to happen, and I’d like to make sure we’re working toward a common end goal. I tried to collect the use-cases for run-time memory checks and the specific memchecks required for each:

1. Loop Vectorizer: each memory access is checked against all other memory accesses in the loop (except read vs read)
2. Loop
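For the Loop Vectorizer case in item 1, a run-time memory check can be pictured as a pairwise range-overlap test that selects between two versions of the loop. The following is a hand-written sketch under stated assumptions, not LLVM's implementation: `ranges_overlap`, `add_noalias`, `add_scalar`, and `add_arrays` are hypothetical names, and the return value exists only so the chosen path can be observed. The write to `dst` is checked against each read, while the read-vs-read pair (`x` against `y`) is skipped, matching the exception in the list.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* True when byte ranges [a, a+len_a) and [b, b+len_b) share a byte. */
static bool ranges_overlap(const void *a, size_t len_a,
                           const void *b, size_t len_b) {
    uintptr_t a0 = (uintptr_t)a;
    uintptr_t b0 = (uintptr_t)b;
    return a0 < b0 + len_b && b0 < a0 + len_a;
}

/* Version guarded by the check: restrict promises dst does not alias
   the inputs, so a compiler is free to vectorize this loop. */
static void add_noalias(int *restrict dst, const int *restrict x,
                        const int *restrict y, size_t n) {
    for (size_t i = 0; i < n; i++)
        dst[i] = x[i] + y[i];
}

/* Fallback: original loop with no aliasing assumptions. */
static void add_scalar(int *dst, const int *x, const int *y, size_t n) {
    for (size_t i = 0; i < n; i++)
        dst[i] = x[i] + y[i];
}

/* Returns true when the no-alias version was selected. */
bool add_arrays(int *dst, const int *x, const int *y, size_t n) {
    assert(dst != NULL && x != NULL && y != NULL);
    size_t bytes = n * sizeof(int);
    /* Only write-vs-read pairs are tested; x vs y is read-vs-read. */
    bool overlap = ranges_overlap(dst, bytes, x, bytes) ||
                   ranges_overlap(dst, bytes, y, bytes);
    if (overlap) {
        add_scalar(dst, x, y, n);
        return false;
    }
    add_noalias(dst, x, y, n);
    return true;
}
```

With one store and two loads there are only two checks, but the number of pairs grows quickly as a loop touches more arrays, which is one reason the cost of these checks comes up in the discussion.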

Vectorizing multiple exit loops

2019 Sep 09

Vectorizing multiple exit loops

I've recently mentioned in a few places that I'm interested in enhancing the loop vectorizer to handle multiple exit loops, and have been asked to share plans. This email is intended to a) share my current thinking and b) help spark discussion among interested parties. I do need to warn that my near term plans for this have been delayed; I got pulled into an internal project

[LLVMdev] Fwd: No SSE instructions

2011 May 22

[LLVMdev] Fwd: No SSE instructions

---------- Forwarded message ---------- From: Serg Anohovsky <serg.anohovsky at gmail.com> Date: 2011/5/22 Subject: Re: [LLVMdev] No SSE instructions To: Chris Lattner <clattner at apple.com> 2011/5/22 Chris Lattner <clattner at apple.com> > > On May 22, 2011, at 10:47 AM, Justin Holewinski wrote: > > On Sun, May 22, 2011 at 1:07 PM, Serg Anohovsky

[LLVMdev] RFC: Loop distribution/Partial vectorization

2015 Jan 12

[LLVMdev] RFC: Loop distribution/Partial vectorization

Hi, We'd like to propose a new Loop Distribution pass. The main motivation is to allow partial vectorization of loops. One such example is the main loop of 456.hmmer in SpecINT_2006. The current version of the patch improves hmmer by 24% on ARM64 and 18% on X86. The goal of the pass is to distribute a loop that can't be vectorized because of memory dependence cycles. The pass splits
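As a rough source-level picture of what distributing a loop means, consider a loop that mixes a recurrence with an independent computation. This sketch is not the hmmer loop and not the pass itself; `fused` and `distributed` are hypothetical names, and the arrays are assumed not to overlap. The recurrence on `a` forms a dependence cycle, while the computation of `c` only consumes values of `a` and can move into its own loop.

```c
#include <assert.h>
#include <stddef.h>

/* Original loop: the a[i-1] -> a[i] recurrence is a dependence cycle
   that blocks vectorization of the whole body. */
void fused(int *a, const int *b, int *c, const int *d, size_t n) {
    assert(a != NULL && b != NULL && c != NULL && d != NULL);
    for (size_t i = 1; i < n; i++) {
        a[i] = a[i - 1] + b[i];
        c[i] = a[i] * d[i];
    }
}

/* Distributed form (assumes a, b, c, d do not overlap): the cycle is
   isolated in the first loop and stays scalar; the second loop has no
   loop-carried dependence and is a candidate for vectorization. */
void distributed(int *a, const int *b, int *c, const int *d, size_t n) {
    assert(a != NULL && b != NULL && c != NULL && d != NULL);
    for (size_t i = 1; i < n; i++)
        a[i] = a[i - 1] + b[i];
    for (size_t i = 1; i < n; i++)
        c[i] = a[i] * d[i];
}
```

The split is only valid when the second statement never feeds back into the first, which is why the non-overlap assumption matters here.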

[LLVMdev] Enabling the vectorizer for -Os

2013 Jun 05

[LLVMdev] Enabling the vectorizer for -Os

Hi, I would like to start a discussion about enabling the loop vectorizer by default for -Os. The loop vectorizer can accelerate many workloads and enabling it for -Os and -O2 has obvious performance benefits. At the same time the loop vectorizer can increase the code size for two reasons. First, to vectorize some loops we have to keep the original loop around in order to handle the last
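The sentence above is cut off, but one common reason a scalar copy of a loop is kept is to process the iterations left over when the trip count is not a multiple of the vector width. Assuming that is the case being described, the sketch below shows the shape of the resulting code for the `v1[i] = i` loop quoted elsewhere in these results. The name `fill_index` and the width `VF` of 4 are assumptions for illustration, and the unrolled body merely stands in for a single vector store.

```c
#include <assert.h>
#include <stddef.h>

#define VF 4 /* assumed vectorization factor, chosen for illustration */

void fill_index(int *v, size_t n) {
    assert(v != NULL || n == 0);
    size_t i = 0;
    size_t main_end = n - (n % VF); /* largest multiple of VF <= n */

    /* Main loop: each trip handles VF elements, standing in for one
       vector operation. */
    for (; i < main_end; i += VF) {
        v[i]     = (int)i;
        v[i + 1] = (int)(i + 1);
        v[i + 2] = (int)(i + 2);
        v[i + 3] = (int)(i + 3);
    }

    /* Remainder: the original scalar loop, kept for the last n % VF
       iterations. This second copy is the code-size cost. */
    for (; i < n; i++)
        v[i] = (int)i;
}
```

For the 10000-element array in the quoted program the remainder loop would never execute, but it still has to be emitted whenever the trip count is not known to be a multiple of the width.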

Why are big data.frames slow? What can I do to get it faster?

2002 Oct 07

Why are big data.frames slow? What can I do to get it faster?

Extracting from a data frame one element at a time the way you did is expensive. I.e., test[i, 6] is slower than test$whatever[i]. As an example:

> dat <- data.frame(a = sample(LETTERS, 1e6, replace=TRUE), b=1:1e6,
+                   c=rep("A", 1e6))
> dat$a <- as.character(dat$a)
> dat$c <- as.character(dat$c)
>
> system.time(
+   for(i in 1:10) {
+     dat[i, 3]

[LLVMdev] Improving loop vectorizer support for loops with a volatile iteration variable

2015 Jul 16

[LLVMdev] Improving loop vectorizer support for loops with a volatile iteration variable

----- Original Message ----- > From: "Hal Finkel" <hfinkel at anl.gov> > To: "Chandler Carruth" <chandlerc at google.com> > Cc: llvmdev at cs.uiuc.edu > Sent: Thursday, July 16, 2015 1:58:02 AM > Subject: Re: [LLVMdev] Improving loop vectorizer support for loops > with a volatile iteration variable > ----- Original Message ----- > >

RE: [R] when can we expect Prof Tierney's compiled R?

2005 Apr 27

RE: [R] when can we expect Prof Tierney's compiled R?

Luke, Thank you for sharing the benchmark results. The improvement is very substantial; I am looking forward to the release of the byte compiler! The arithmetic shows that x[i]<- is still the bottleneck. I suspect that this is due to a very involved dispatching/search for the appropriate function on the C level. There might be significant gain if loops somehow cached the result of the initial
