thr3ads.net - similar to: "[LLVMdev] RE: LLVM extension v.s. DirectX Shaders"

[LLVMdev] Vector LLVM extension v.s. DirectX Shaders

2005 Dec 15

Dear all: To write a compiler for Microsoft Direct3D shaders from our hardware, I have a program which translates the Direct3D shader assembly to LLVM assembly. I added several intrinsics for this purpose. It's a vector ISA and has some special instructions like:
* rcp (reciprocal)
* frc (the fractional portion of each input component)
* dp4 (dot product)
* exp (exponential)
* max, min
These

[LLVMdev] Vector LLVM extension v.s. DirectX Shaders

2005 Dec 15

On Thu, 15 Dec 2005, Tzu-Chien Chiu wrote:
> To write a compiler for Microsoft Direct3D shaders from our hardware,
> I have a program which translates the Direct3D shader assembly to LLVM
> assembly. I added several intrinsics for this purpose.
> It's a vector ISA and has some special instructions like:
> * rcp (reciprocal)
> * frc (the fractional portion of each input

[LLVMdev] ADDE to use branch registers

2013 Oct 04

Hi, I am working on an LLVM backend that has eight different branch registers. I am having a lot of trouble implementing the following instruction:
addcg $r0.1, $b0.0 = $r0.1, $r0.1, $b0.0
(r is a general-purpose register and b is a 1-bit branch register.) The branch register is used for carry in and carry out. I have noticed that this instruction is very closely related to the ADDE

[LLVMdev] Modeling GPU vector registers, again (with my implementation)

2009 Feb 13

On Feb 13, 2009, at 9:47 AM, Alex wrote:
> It seems to me that LLVM sub-register is not for the following
> hardware architecture.
>
> All instructions of the hardware are vector instructions. All
> registers contain 4 32-bit FP sub-registers. They are called
> r0.x, r0.y, r0.z, r0.w.
>
> Most instructions write more than one element in this way:
>
> mul

[LLVMdev] Modeling GPU vector registers, again (with my implementation)

2009 Feb 13

It seems to me that LLVM sub-register is not for the following hardware architecture. All instructions of the hardware are vector instructions. All registers contain four 32-bit FP sub-registers. They are called r0.x, r0.y, r0.z, r0.w. Most instructions write more than one element in this way:
mul r0.xyw, r1, r2
add r0.z, r3, r4
sub r5, r0, r1
Notice that the four elements of r0 are written

Patch: update zlib/* to 1.1.3

2002 Jan 30

This patch (apologies for the size) updates zlib/* to the files that ship with zlib 1.1.3.
Index: zlib/ChangeLog
===================================================================
RCS file: /cvsroot/rsync/zlib/ChangeLog,v
retrieving revision 1.1
diff -u -r1.1 ChangeLog
--- zlib/ChangeLog 7 May 1998 06:19:41 -0000 1.1
+++ zlib/ChangeLog 30 Jan 2002 01:12:41 -0000
@@ -1,6 +1,54 @@
 ChangeLog

IfConversion and representation of predicates

2016 Mar 29

Hello, I have a few questions about applying the IfConversion pass to my out-of-tree target. (1) Is it true that the IfConversion pass may only run after register allocation? I often encounter this bad scenario, and I think it could be entirely avoided if IfConversion ran before register allocation: the block-to-be-predicated contains load-immediate (LI) instructions. The LI instructions

PATCH: Fix IRIX 6 testsuite failures

2002 Aug 29

Having built rsync 2.5.5 on IRIX 6.2 with gcc 3.1, I ran into two failures when running the testsuite with make check: both the chgrp and hardlinks tests fail. The failure in the chgrp test occurs here:
+ rsync -rtgvvv /amnt/callisto/volumes/obj-irix5/local/obj.irix5/rsync-2.5.5/testtmp.chgrp/from/ /amnt/callisto/volumes/obj-irix5/local/obj.irix5/rsync-2.5.5/testtmp.chgrp/to/
rsync: opendir

[PATCH] Regression test portabilization.

2003 Jun 20

Hi All. Attached is a patch (against OpenSSH Portable -current) to portablize the regression tests. It will also apply to OpenBSD's (with a couple of rejects). They are based on work by Roumen Petrov and myself, with contributions from Corinna Vinschen and David M Williams. My goal is to have the tests work out of the box on as many of our supported platforms as possible so running the

proofreading corrections (cvs) (PR#5730)

2003 Dec 12

Here is a patch of changes from the proof-reading of the R reference manual, made against the current cvs.
regards
--
Brian Gough
Network Theory Ltd -- Publishing Free Software Manuals
15 Royal Park Bristol BS8 3AL United Kingdom
Tel: +44 (0)117 3179309
Fax: +44 (0)117 9048108
Web: http://www.network-theory.co.uk/
Index: src/library/base/man/Arithmetic.Rd

InstructionSimplify: adding a hook for shufflevector instructions

2017 Mar 30

As Sanjay noted in D31426<https://reviews.llvm.org/D31426#712701>, InstructionSimplify is missing the following simplification: This function:
define <4 x i32> @splat_operand(<4 x i32> %x) {
  %splat = shufflevector <4 x i32> %x, <4 x i32> undef, <4 x i32> zeroinitializer
  %shuf = shufflevector <4 x i32> %splat, <4 x i32> undef, <4 x i32>

[LLVMdev] Generalizing shuffle vector

2008 Sep 30

Hi Mon Ping, Generalizing shufflevector would be great. I have an additional suggestion below.
On 29-Sep-08, at 11:11 PM, Mon Ping Wang wrote:
> I am proposing to extend the shuffle vector definition to be
> <result> = shufflevector <n x <ty>> <v1>, <n x <ty>> <v2>, <m x i32>
> <mask> ; yields <m x <ty>>
>
> The

LangRef semantics for shufflevector with undef mask is incorrect

2019 Nov 27

Ok, makes sense. My suggestion is that we patch the IR Verifier to ensure that the mask is indeed a vector of constants and/or undefs. Right now it only runs the standard checks for instructions. We will also run Alive2 on the test suite to make sure undef is never replaced in practice. Thanks, Nuno
-----Original Message-----
From: Eli Friedman <efriedma at quicinc.com>
Sent: 27 de

llvm-stress crash

2017 Mar 14

Hi, Using llvm-stress, I got a crash after Post-RA pseudo expansion, with machine verifier. A 128 bit register
%vreg233:subreg_l32<def,read-undef> = LLCRMux %vreg119; GR128Bit:%vreg233 GRX32Bit:%vreg119
gets spilled:
%vreg265:subreg_l32<def,read-undef> = LLCRMux %vreg119; GR128Bit:%vreg265 GRX32Bit:%vreg119
ST128 %vreg265, <fi#10>, 0, %noreg;

[PATCH] D70246: [InstCombine] remove identity shuffle simplification for mask with undefs

2019 Dec 09

Sanjay, I'm looking at some missed optimizations caused by D70246. Here's a test case:
define <4 x float> @f(i32 %t32, <4 x float>* %t24) {
.entry:
  %t43 = insertelement <3 x i32> undef, i32 %t32, i32 2
  %t44 = bitcast <3 x i32> %t43 to <3 x float>
  %t45 = shufflevector <3 x float> %t44, <3 x float> undef, <4 x i32> <i32 0, i32 undef,

[LLVMdev] Generalizing shuffle vector

2008 Sep 30

Hi, I agree that the more general shufflevector is more useful. I narrowed the original proposal a little out of concern for the implementation cost. However, the slightly narrowed definition will probably require falling back to generating inserts and extracts for complex masks, so it is possible that there will be no extra cost in supporting the more general definition.

[LLVMdev] Generalizing shuffle vector

2008 Sep 30

I agree further generalization seems like a very good idea. But I'd like to see what Mon Ping proposed implemented first so we have a better idea of the implementation cost. Thanks, Evan
On Sep 30, 2008, at 6:44 AM, Stefanus Du Toit wrote:
> Hi Mon Ping,
>
> Generalizing shufflevector would be great. I have an additional
> suggestion below.
>
> On 29-Sep-08, at 11:11

LangRef semantics for shufflevector with undef mask is incorrect

2019 Nov 27

On 11/27/19 2:10 AM, Eli Friedman via llvm-dev wrote: The shuffle mask of a shufflevector is special: it's required to be a constant in a specific form. From LangRef: "The shuffle mask operand is required to be a constant vector with either constant integer or undef values." So really, we can resolve this any way we want; "undef" in this context doesn't have to mean

documentation fixes (cvs) (PR#5632)

2003 Dec 09

The patch below attempts to correct some unclear sentences in the R documentation. In the case of coplot.Rd it wasn't clear whether "shingle" bar had a special meaning or was a typo for "single". I've just put a comment in that case.
regards
--
Brian Gough
Network Theory Ltd -- Publishing Free Software Manuals
15 Royal Park Bristol BS8 3AL United Kingdom
Tel: +44

[LLVMdev] Modeling GPU vector registers, again (with my implementation)

2009 Feb 16

Alex, From my experience in working with GPU vector registers, there is no support for swizzles in the manner that you would normally code them, and in my case I have 6^4 permutations on src registers and 24 combinations in the dst registers. The way that I ended up handling this was to have different register classes for 1, 2, 3 and 4 component vectors. This made the generic cases very simple
