thr3ads.net - similar to: "Expedite scalar f(x) evaluation over vectors"

Clarification: Expedite scalar f(x) evaluation over vectors

2007 Aug 23

Please note clarifications in <<>> below. My apologies for any confusion.
Thanks again,
Scott
---------- Forwarded message ----------
From: Scott Stark <stark.sc@gmail.com>
Date: Aug 23, 2007 1:03 PM
Subject: Expedite scalar f(x) evaluation over vectors
To: r-help@lists.r-project.org
Dear R community,
I am trying to code a fairly complex equation for optim(). My current

[LLVMdev] RegisterCoalescing Pass seems to ignore part of CFG.

2012 Oct 25

> > PHIElim and TwoAddress passes leave SSA form. > May be a missed something in your code but %vreg48 seems to be there > after PHI elimination. PHIElim tags those kind of registers as being > PHIJoin regs, updating LiveVariables pass, so the regcoalescer is aware > of them (some SSA info is still alive but the reg coalescer will > invalidate that information after

[LLVMdev] RegisterCoalescing Pass seems to ignore part of CFG.

2012 Oct 24

Hi,
I don't know if my LLVM IR code is faulty, or if I spotted a bug in the RegisterCoalescing Pass, so I'm posting my issue on the ML. Shader and print-before-all dump are given below.
The interesting part is the vreg6/vreg48 reduction: before RegCoalescing, the machine code is:
// BEFORE LOOP
... Some COPYs....
400B %vreg47<def> = COPY %vreg2<kill>; R600_Reg32:%vreg47,%vreg2

[LLVMdev] RegisterCoalescing Pass seems to ignore part of CFG.

2012 Oct 25

Hi Vincent,
On 24/10/2012 23:26, Vincent Lejeune wrote:
> Hi,
>
> I don't know if my LLVM IR code is faulty, or if I spotted a bug in the RegisterCoalescing Pass, so I'm posting my issue on the ML. Shader and print-before-all dump are given below.
>
> The interesting part is the vreg6/vreg48 reduction: before RegCoalescing, the machine code is:
>
> // BEFORE LOOP
>

[LLVMdev] [LLVMDev] [Fishing expedition] Virtual Machines and LLVM

2010 Sep 21

When attempting to compile a dynamic language like python/java does LLVM allow a function to compile itself one at a time? In other words, can I parse a function, then gain the machine bit code, then execute without parsing the other functions related to the compiled function? Thanks, Jeff Kunkel

[LLVMdev] RegisterCoalescing Pass seems to ignore part of CFG.

2012 Oct 25

Thanks for your help. You're right, merging vreg32 and vreg48 is perfectly fine; sorry I missed that. I "brute force" debugged by adding a MachineFunction dump after each join, and I think I found the issue: it's when vreg32 and vreg10 are merged. vreg10 only appears in BB#3, and the join apparently only occurs in BB#3, even though vreg32 lives in the 4 machine blocks. After joining, there

[LLVMdev] RegisterCoalescing Pass seems to ignore part of CFG.

2012 Oct 26

Vincent,
File a bug report so you can get a fix for it.
Ivan
On 25/10/2012 23:01, Vincent Lejeune wrote:
> Thanks for your help. You're right, merging vreg32 and vreg48 is perfectly fine; sorry I missed that.
> I "brute force" debugged by adding a MachineFunction dump after each join, and I think I found the issue: it's when vreg32 and vreg10 are merged.
> vreg10 only

[LLVMdev] RegisterCoalescing Pass seems to ignore part of CFG.

2012 Oct 25

When examining the debug output of regalloc, it seems that joining a 32-bit reg also joins its 128-bit parent reg. If I look at the instruction:
%vreg34<def> = COPY %vreg6:sel_y; R600_Reg32:%vreg34 R600_Reg128:%vreg6
it gets joined to:
928B %vreg34<def> = COPY %vreg48:sel_y;
when vreg6 and vreg48 are joined. That is correct. But joining the following copy

[GlobalISel] Legalize generic instructions that also depend on type of scalar, not only scalar size

2018 Sep 21

Hi, Mips32 has 64 bit floating point instructions, while i64 instructions have to be emulated with i32 instructions. This means that G_LOAD should be custom legalized for s64 integer value, and be legal for s64 floating point value. There are also other generic instructions with the same problem: G_STORE, G_SELECT, G_EXTRACT, and G_INSERT. There are also other configurations where integer

Using optim() with a function which returns more than a scalar - alternatives?

2006 Feb 14

I want to numerically maximize a function with optim() (maximization over several arguments). optim() needs a function which returns a scalar only. However, it would be nice to be able to "take other things out" from the function as well. I tried to attach an attribute to the scalar holding what I want to take out, but that attribute disappears in optim(). I looked into the code to see if

[LLVMdev] Patch for scoping problem in lib/Transforms/Scalar/LowerGC.cpp

2004 Oct 18

This isn't really a bug, but it's generally bad style to declare variables with the same name in almost the same scope... The Visual C compiler can't compile things like this, so here is a patch. m.

[LLVMdev] Does spu backend works with scalar variable?

2008 Feb 24

I compiled the following code with llvm-gcc (4.2.1) and llc (2.3svn) for the SPU of the Cell Broadband Engine processor.
> cat add.c
float add (float a, float b) { return a + b; }
> llvm-gcc add.c --emit-llvm -c -o add.bc
> llc -march=cellspu add.bc
Cannot yet select: 0x867c700: v4f32 = SPUISD::INSERT_MASK 0x8670800
Abort (core dumped)
But llc returned the above error. If I

[LLVMdev] SSE Scalar Convert Intrinsics

2009 Jun 05

On Fri, Jun 5, 2009 at 8:51 AM, David Greene<dag at cray.com> wrote:
> def Int_CVTSD2SIrm : SDI<0x2D, MRMSrcMem, (outs GR32:$dst), (ins f128mem:
> $src),
> "cvtsd2si\t{$src, $dst|$dst, $src}",
> [(set GR32:$dst, (int_x86_sse2_cvtsd2si
> (load addr:$src)))]>;
>
> Er,

[LLVMdev] SSE Scalar Convert Intrinsics

2009 Jun 05

On Friday 05 June 2009 15:19, Dan Gohman wrote:
> > Do we need two intrinsics for these scalar converts, one to satisfy the
> > (arguably broken) GCC interface and one to really reflect the operation
> > as specified by the ISA?
>
> That's what's done for most other instructions, unfortunately.
> For cvtsd2si, there's currently no

[LLVMdev] SSE Scalar Convert Intrinsics

2009 Jun 05

On Friday 05 June 2009 15:22, Eli Friedman wrote:
> > Do we need two intrinsics for these scalar converts, one to satisfy the
> > (arguably broken) GCC interface and one to really reflect the operation
> > as specified by the ISA?
>
> We really need zero intrinsics... it's quite easy to map onto existing
> LLVM instructions. See the definition of CVTSD2SIrm.
In

[LLVMdev] SSE Scalar Convert Intrinsics

2009 Jun 05

On Jun 5, 2009, at 3:16 PM, David Greene wrote:
> On Friday 05 June 2009 15:19, Dan Gohman wrote:
>
>> One thing we'd like to do at some point is have front-ends lower
>> intrinsics for scalar instructions into
>> extractelement+op+insertelement, so that we don't need two
>> versions of each of the instructions. Doing this for everything
>> will

[LLVMdev] SSE Scalar Convert Intrinsics

2009 Jun 05

On Fri, Jun 5, 2009 at 3:19 PM, David Greene<dag at cray.com> wrote:
> On Friday 05 June 2009 15:22, Eli Friedman wrote:
>
>> > Do we need two intrinsics for these scalar converts, one to satisfy the
>> > (arguably broken) GCC interface and one to really reflect the operation
>> > as specified by the ISA?
>>
>> We really need zero intrinsics...

[LLVMdev] Rework of Vector/Scalar Classification

2009 Dec 07

On Friday 04 December 2009 16:44, David Greene wrote:
> Here's a reworked patch to mark instructions and operands as vector or
> scalar. It uses TableGen to infer the flags from types, allowing the user
> to override with a "let isVector = 0" clause.
>
> I decided to forego classifying MachineMemOperands for now in the interests
> of getting this piece in. I still

[LLVMdev] Scalar Evolution not canonalizing division?

2010 Oct 28

On 27 October 2010 14:20, Tobias Grosser <grosser at fim.uni-passau.de> wrote:
> Hi,
>
> I am just found a scalar evolution function that does not seem canonical to
> me.
>
> The C code I used to produce it is:
>
> long foo (long n, long m) {
> long i, j;
> long A[n][m];
>
> for (i = 0; i < n; ++i)
> for (j = 0; j < m; ++j)
>

[LLVMdev] scalar evolution to determine access functions in arays

2011 Aug 03

On 08/03/2011 08:35 AM, Jimborean Alexandra wrote:
> Hello Tobi,
>
> You are right, we need to run some other passes before running the
> scalar evolution pass. The sequence that I run for this example is -O3
> -loop-simplify -reg2mem. This is why I did not obtain the expressions
> depending on the loop indices. So I removed the reg2mem pass and scalar
> evolution computes the
