search for: iadds

Displaying 20 results from an estimated 29 matches for "iadds".

2014 Oct 24
3
[LLVMdev] IndVar widening in IndVarSimplify causing performance regression on GPU programs
Hi, I noticed a significant performance regression (up to 40%) on some internal CUDA benchmarks (a reduced example presented below). The root cause of this regression seems to be that IndVarSimplify widens induction variables assuming arithmetic on wider integer types is as cheap as on narrower ones. However, this assumption is wrong at least for the NVPTX64 target. Although the NVPTX64 target
2015 Feb 05
8
[LLVMdev] type legalization/operation action
Dear there, I have a target which supports 32-bit operations natively. Right now, I want to make it support 16-bit operations as well. My initial thought is: (1) I can add something like “CCIfType<[i16], CCPromoteToType<i32>>” to the CallingConv.td, then “all” the 16-bit operands will be automatically promoted to 32 bits, and it will be all set. But it looks like it is not
2015 Feb 17
2
[LLVMdev] why llvm does not have uadd, iadd node
So if the overflow happens in either one of the cases, the return value will be implementation dependent? best kevin On Feb 17, 2015, at 2:01 PM, Tim Northover <t.p.northover at gmail.com> wrote: > Hi Kevin, > > On 17 February 2015 at 10:41, kewuzhang <kewu.zhang at amd.com> wrote: >> I just noticed that the LLVM has some node for signed/unsigned type( like udiv,
2015 Feb 17
5
[LLVMdev] why llvm does not have uadd, iadd node
Hi guys, I just noticed that LLVM has some nodes for signed/unsigned types (like udiv, sdiv), but why do ADD and SUB not have the counterparts sadd, uadd? best kevin
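The replies to this question are truncated in the snippets here, so the following is not taken from the thread. It is a minimal sketch of one commonly cited reason: in two's complement, addition produces the same result bits whether the operands are read as signed or unsigned, while division does not. The function names are hypothetical and chosen only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Add two 32-bit values read as signed and return the raw result bits.
 * Assumption: the inputs do not overflow, since signed overflow is
 * undefined behavior in C. */
uint32_t add_as_signed(int32_t a, int32_t b) {
    return (uint32_t)(a + b);
}

/* Add the same bit patterns read as unsigned (wraps modulo 2^32). */
uint32_t add_as_unsigned(uint32_t a, uint32_t b) {
    return a + b;
}

/* Divide two 32-bit values read as signed and return the raw result bits. */
uint32_t div_as_signed(int32_t a, int32_t b) {
    return (uint32_t)(a / b);
}

/* Divide the same bit patterns read as unsigned. */
uint32_t div_as_unsigned(uint32_t a, uint32_t b) {
    return a / b;
}
```

With a = -7 and b = 3, both add functions return the bit pattern 0xFFFFFFFC, whereas the signed division returns 0xFFFFFFFE (that is, -2) and the unsigned division returns 0x55555553. One add operation can therefore serve both interpretations, but division needs separate signed and unsigned forms.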
2008 Oct 30
1
[LLVMdev] Using patterns inside patterns
I do not have access to a subtraction routine, as it is considered add with negation on the second parameter, so I have this pattern: // integer subtraction // a - b ==> a + (-b) def ISUB : Pat<(sub GPRI32:$src0, GPRI32:$src1), (IADD GPRI32:$src0, (INEGATE GPRI32:$src1))>; I am attempting to do 64 bit integer shifts and using the following pattern: def LSHL :
2008 Oct 30
0
[LLVMdev] Using patterns inside patterns
I am not sure what you are looking to do. Please provide a mark up example. Evan On Oct 28, 2008, at 11:00 AM, Villmow, Micah wrote: > Is there currently a way to use a pattern inside of another pattern? > > Micah Villmow > Systems Engineer > Advanced Technology & Performance > Advanced Micro Devices Inc. > 4555 Great America Pkwy, > Santa Clara, CA. 95054 > P:
2009 Feb 16
0
[LLVMdev] Modeling GPU vector registers, again (with my implementation)
Alex, From my experience in working with GPU vector registers, there is no support for swizzles in the manner that you would normally code them, and in my case I have 6^4 permutations on src registers and 24 combinations in the dst registers. The way that I ended up handling this was to have different register classes for 1, 2, 3 and 4 component vectors. This made the generic cases very simple
2009 Feb 16
2
[LLVMdev] Modeling GPU vector registers, again (with my implementation)
Evan Cheng-2 wrote: > > Well, how many possible permutations are there? Is it possible to > model each case as a separate physical register? > > Evan > I don't think so. There are 4x4x4x4 = 256 permutations. For example: * xyzw: default * zxyw * yyyy: splat Even if I can model each of these 256 cases as a separate physical register, how can I model the use of r0.xyzw in
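The 4x4x4x4 = 256 count above follows from each of the four lanes independently selecting one of four source components. As a hedged illustration only (this encoding is an assumption, not something proposed in the thread), a swizzle can be packed into one byte with two bits per lane, so every swizzle maps to a distinct value in 0..255. The helper names are hypothetical.

```c
/* Map a component letter to its 2-bit index, or -1 if it is not one of xyzw. */
static int component_index(char c) {
    switch (c) {
    case 'x': return 0;
    case 'y': return 1;
    case 'z': return 2;
    case 'w': return 3;
    default:  return -1;
    }
}

/* Pack a four-letter swizzle such as "zxyw" into one byte, two bits per
 * lane with lane 0 in the low bits. Returns -1 on malformed input. */
int swizzle_encode(const char *s) {
    int id = 0;
    for (int lane = 0; lane < 4; ++lane) {
        /* A premature '\0' is rejected here, so s is never read past its end. */
        int idx = component_index(s[lane]);
        if (idx < 0) return -1;
        id |= idx << (2 * lane);
    }
    /* Reject strings longer than four components. */
    return (s[4] == '\0') ? id : -1;
}

/* Recover the source component index selected for a given lane. */
int swizzle_lane(int id, int lane) {
    return (id >> (2 * lane)) & 3;
}
```

Under this encoding the three examples from the post come out as 228 for xyzw, 210 for zxyw, and 85 for yyyy. Treating the swizzle as an 8-bit operand modifier like this is one alternative to modeling 256 separate physical registers.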
2008 Oct 28
4
[LLVMdev] Using patterns inside patterns
Is there currently a way to use a pattern inside of another pattern? Micah Villmow Systems Engineer Advanced Technology & Performance Advanced Micro Devices Inc. 4555 Great America Pkwy, Santa Clara, CA. 95054 P: 408-572-6219 F: 408-572-6596 -------------- next part -------------- An HTML attachment was scrubbed... URL:
2010 Jun 09
2
Help with simple dll wrapper around linux so
I've recently got MetaTrader to work on Linux under Wine and would now like to see if I can import a DLL wrapper so I can use some code I wrote in Linux. I'm trying something like this (based on http://www.winehq.org/docs/winelib-guide/bindlls): add.c: Code: int add(int a,int b) { return a+b; } add.h: > int add(int,int); WinAdd.c: Code: #include <windef.h> #include
2017 Jun 13
1
[Mesa-dev] [RFC 0/9] Add precise/invariant semantics to TGSI
Am 13.06.2017 um 02:05 schrieb Ilia Mirkin: > On Mon, Jun 12, 2017 at 7:57 PM, Roland Scheidegger <sroland at vmware.com> wrote: >> FWIW surely on nv50 you could keep a single mad instruction for umad >> (sad maybe too?). (I'm actually wondering if the hw really can't do >> unfused float multiply+add as a single instruction but I know next to >> nothing
2012 Feb 15
2
[LLVMdev] Performance problems with FORTRAN allocatable arrays
I've noticed that LLVM does a bad job of optimizing array indexing code for FORTRAN arrays declared using the ALLOCATABLE keyword. For example if you have something like the following: DOUBLE PRECISION,ALLOCATABLE,DIMENSION(:,:,:,:) :: QAV ... ALLOCATE( QAV( -2:IMAX+2,-2:JMAX+2,-2:KMAX+2,ND) ) ... DO L = 1, 5 DO K = K1, K2 DO J = J1, J2 DO I = I1, I2 II = I +
2012 Feb 15
0
[LLVMdev] Performance problems with FORTRAN allocatable arrays
Hi Wonsun, can you please provide a testcase. Best wishes, Duncan. > I've noticed that LLVM does a bad job of optimizing array indexing > code for FORTRAN arrays declared using the ALLOCATABLE keyword. > > For example if you have something like the following: > > DOUBLE PRECISION,ALLOCATABLE,DIMENSION(:,:,:,:) :: QAV > ... > ALLOCATE( QAV(
2017 Jun 15
2
Implementing cross-thread reduction in the AMDGPU backend
On 06/14/2017 05:05 PM, Connor Abbott wrote: > On Tue, Jun 13, 2017 at 6:13 PM, Tom Stellard <tstellar at redhat.com> wrote: >> On 06/13/2017 07:33 PM, Matt Arsenault wrote: >>> >>>> On Jun 12, 2017, at 17:23, Tom Stellard <tstellar at redhat.com <mailto:tstellar at redhat.com>> wrote: >>>> >>>> On 06/12/2017 08:03 PM, Connor
2017 Jun 15
1
Implementing cross-thread reduction in the AMDGPU backend
I'm wondering about the focus on bound_cntl. Any cleared bit in the row_mask or bank_mask will also disable updating the result. Brian -----Original Message----- From: Connor Abbott [mailto:cwabbott0 at gmail.com] Sent: Wednesday, June 14, 2017 6:13 PM To: tstellar at redhat.com Cc: Matt Arsenault; llvm-dev at lists.llvm.org; Kolton, Sam; Sumner, Brian; Pykhtin, Valery Subject: Re:
2017 Jun 12
3
[Mesa-dev] [RFC 0/9] Add precise/invariant semantics to TGSI
This looks like the right idea to me too. It may sound a bit weird to do that per instruction, but d3d11 does that as well. (Some d3d versions just have a global flag basically forbidding or allowing any such fast math optimizations in the assembly, but I'm not actually sure everybody honors that without tessellation...) For 1/9: Reviewed-by: Roland Scheidegger <sroland at vmware.com>
2009 May 08
0
[LLVMdev] Question on tablegen
Manjunath, I had a very similar problem and I solved it using a custom vector shuffle and addition instead of mov. For example, Vector_shuffle s1, s2, <0,3> is mapped to a custom instruction where I transform the swizzle to a 32bit integer mask and an inverted mask. So I have dst, src0, src1, imm1, imm2 And I have my asm look similar to: Add dst, src0.imm1, src1.imm2 and then in the asm
2017 Jun 13
0
[Mesa-dev] [RFC 0/9] Add precise/invariant semantics to TGSI
On Mon, Jun 12, 2017 at 7:57 PM, Roland Scheidegger <sroland at vmware.com> wrote: > FWIW surely on nv50 you could keep a single mad instruction for umad > (sad maybe too?). (I'm actually wondering if the hw really can't do > unfused float multiply+add as a single instruction but I know next to > nothing about nvidia hw...) The compiler should reassociate a mul + add
2009 May 08
2
[LLVMdev] Question on tablegen
Dan, Thanks a lot. Using a modifier in the assembly string works for this case. I am trying to solve a related problem. I am trying to print out a set of "mov" ops for the vector_shuffle node. Since the source of the "mov" is from one of the sources to vector_shuffle, depending on the mask, I am not sure what assembly string to emit. For example, if I have d <-
2008 Jun 11
3
[LLVMdev] Possible miscompilation?
Hi all, I'm trying to figure out a weird bug I'm seeing. I'm hoping it's something simple in my IR but I can't see anything wrong so I'm hoping someone here can see something. I'm using LLVM to compile Java bytecode into native functions. My code keeps track of the Java local variables in an array of llvm::Value pointers which get phi'd up at various points. The