thr3ads.net - similar to: "[LLVMdev] Subword register allocation"

Displaying 20 results from an estimated 10000 matches similar to: "[LLVMdev] Subword register allocation"

[LLVMdev] Modeling GPU vector registers, again (with my implementation)

2009 Feb 13

On Feb 13, 2009, at 9:47 AM, Alex wrote:
> It seems to me that LLVM's sub-register support does not fit the following
> hardware architecture.
>
> All instructions of the hardware are vector instructions. All
> registers contain
> four 32-bit FP sub-registers. They are called r0.x, r0.y, r0.z, r0.w.
>
> Most instructions write more than one element, in this way:
>
> mul

[LLVMdev] Modeling GPU vector registers, again (with my implementation)

2009 Feb 13

It seems to me that LLVM's sub-register support does not fit the following hardware architecture. All instructions of the hardware are vector instructions. All registers contain four 32-bit FP sub-registers, called r0.x, r0.y, r0.z, and r0.w. Most instructions write more than one element, in this way:

mul r0.xyw, r1, r2
add r0.z, r3, r4
sub r5, r0, r1

Notice that the four elements of r0 are written

[LLVMdev] avoid live range overlap of "vector" registers

2005 May 06

A "vector" register r0 is composed of four 32-bit floating-point scalar registers: r0.x, r0.y, r0.z, r0.w. Each scalar register can be assigned individually, e.g.

mov r0.x, r1.y
add r0.y, r1.x, r2.z

or assigned simultaneously with vector instructions, e.g.

add r0.xyzw, r1.xzyw, r2.xyzw

My question is how to define the register in the .td file to keep the code generator from overlapping the

[LLVMdev] Vector LLVM extension v.s. DirectX Shaders

2005 Dec 15

Dear all:

To write a compiler for Microsoft Direct3D shaders from our hardware, I have a program which translates the Direct3D shader assembly to LLVM assembly. I added several intrinsics for this purpose. It's a vector ISA and has some special instructions like:

* rcp (reciprocal)
* frc (the fractional portion of each input component)
* dp4 (dot product)
* exp (exponential)
* max, min

These

[LLVMdev] avoid live range overlap of "vector" registers

2005 May 10

On Fri, 6 May 2005, Tzu-Chien Chiu wrote:
> a "vector" register r0 is composed of four 32-bit floating scalar
> registers, r0.x, r0.y, r0.z, r0.w.
>
> each scalar reg can be assigned individually, e.g.
>
> mov r0.x, r1.y
> add r0.y, r1.x, r2.z
>
> or assigned simultaneously with vector instructions, e.g.
>
> add r0.xyzw, r1.xzyw, r2.xyzw
>
> My

[LLVMdev] [RFC] LegalizeDAG support for targets without subword load/store instructions

2011 Jul 16

Hi All, Some targets don't provide subword (e.g., i8 and i16 for a 32-bit machine) load and store instructions, so currently we have to custom-lower Load- and StoreSDNodes in our backends. For examples, see LowerLOAD() and LowerSTORE() in {XCore,CellSPU}ISelLowering.cpp. I believe it's possible to support this lowering in a target-agnostic fashion in LegalizeDAG.cpp, similar to
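
The arithmetic behind such a lowering can be sketched as follows. This is only an illustration of the general shift-and-mask technique, not the code proposed in the thread or anything taken from LegalizeDAG.cpp. The function names `load_subword` and `store_subword`, the list-of-words memory model, and the little-endian byte order are all assumptions made for the example.

```python
# Hypothetical sketch: emulate i8/i16 loads and stores on a memory that only
# supports aligned 32-bit word accesses. Little-endian byte order is assumed.

WORD_BYTES = 4
WORD_MASK = 0xFFFFFFFF


def load_subword(words, byte_addr, size):
    """Load `size` bytes (1 or 2) at `byte_addr` using one aligned word load."""
    offset = byte_addr % WORD_BYTES
    assert size in (1, 2) and offset + size <= WORD_BYTES, "must not straddle a word"
    word = words[byte_addr // WORD_BYTES]      # aligned 32-bit load
    shift = 8 * offset                         # bit position of the subword
    mask = (1 << (8 * size)) - 1
    return (word >> shift) & mask              # zero-extended result


def store_subword(words, byte_addr, size, value):
    """Store `size` bytes via read-modify-write of the containing word."""
    offset = byte_addr % WORD_BYTES
    assert size in (1, 2) and offset + size <= WORD_BYTES, "must not straddle a word"
    index = byte_addr // WORD_BYTES
    shift = 8 * offset
    mask = ((1 << (8 * size)) - 1) << shift
    old = words[index]                         # aligned 32-bit load
    new = (old & ~mask & WORD_MASK) | ((value << shift) & mask)
    words[index] = new                         # aligned 32-bit store


memory = [0x44332211, 0x88776655]
byte_value = load_subword(memory, 1, 1)        # second byte of the first word
store_subword(memory, 2, 1, 0xAB)              # overwrite the third byte
```

A subword store becomes a load, a mask, an OR, and a store, which is why doing it once in a shared place rather than in every backend is attractive.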

[LLVMdev] Modeling GPU vector registers, again (with my implementation)

2009 Feb 16

Evan Cheng-2 wrote:
>
> Well, how many possible permutations are there? Is it possible to
> model each case as a separate physical register?
>
> Evan
>

I don't think so. There are 4x4x4x4 = 256 permutations. For example:

* xyzw: default
* zxyw
* yyyy: splat

Even if I can model each of these 256 cases as a separate physical register, how can I model the use of r0.xyzw in
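
The count quoted above can be checked with a short enumeration. This is a minimal sketch, assuming each of the four lanes independently selects one of x, y, z, w with repetition allowed (which is what makes a splat such as yyyy possible). The helper `apply_swizzle` is a hypothetical name introduced only for this example.

```python
from itertools import permutations, product

COMPONENTS = "xyzw"

# Each of the four lanes picks one of four components, with repetition
# allowed, so there are 4**4 possible selections.
swizzles = ["".join(s) for s in product(COMPONENTS, repeat=4)]

# Strict permutations (each component used exactly once) are a small subset.
strict = ["".join(p) for p in permutations(COMPONENTS)]


def apply_swizzle(reg, swizzle):
    """Return the 4-tuple read from `reg` (a 4-tuple) through `swizzle`."""
    return tuple(reg[COMPONENTS.index(c)] for c in swizzle)


r0 = (1.0, 2.0, 3.0, 4.0)
reordered = apply_swizzle(r0, "zxyw")
splat = apply_swizzle(r0, "yyyy")
```

Only 24 of the 256 selections are permutations in the strict sense. The rest repeat at least one component.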

runes of Magic doesn't display login

2010 May 18

I get the launcher and click Start Game. The screen pops up with the background, but the login never appears. Please help. I'm running Ubuntu 10.04 with the latest version of Wine .44. I have winetricks installed, with everything required according to AppDB. Thanks in advance.

Code:
fixme:font:WineEngAddFontResourceEx Ignoring flags 10
fixme:font:WineEngAddFontResourceEx Ignoring flags 10

[LLVMdev] [RFC] LegalizeDAG support for targets without subword load/store instructions

2011 Jul 16

On 16 Jul 2011, at 03:34, Matt Johnson wrote:
> Hi All,
> Some targets don't provide subword (e.g., i8 and i16 for a 32-bit
> machine) load and store instructions, so currently we have to
> custom-lower Load- and StoreSDNodes in our backends. For examples, see
> LowerLOAD() and LowerSTORE() in {XCore,CellSPU}ISelLowering.cpp. I
> believe it's possible to

[LLVMdev] Vector LLVM extension v.s. DirectX Shaders

2005 Dec 15

On Thu, 15 Dec 2005, Tzu-Chien Chiu wrote:
> To write a compiler for Microsoft Direct3D shaders from our hardware,
> I have a program which translates the Direct3D shader assembly to LLVM
> assembly. I added several intrinsics for this purpose.
> It's a vector ISA and has some special instructions like:
> * rcp (reciprocal)
> * frc (the fractional portion of each input

[LLVMdev] avoid live range overlap of "vector" registers

2005 May 11

Chris Lattner wrote:
> None, that documentation is out of date and doesn't make a ton of sense
> for your application. I would suggest that you implement it in the
> context of the SelectionDAG framework that all of the code generators
> either currently use or are moving to. I updated the documentation
> here:

[LLVMdev] Weird problems with cos (was Re: [PATCH v3 2/3] R600: Add carry and borrow instructions. Use them to implement UADDO/USUBO)

2014 Oct 03

Hi Tom, Matt,

I'm running into strange issues with the cos test (piglit generated_tests/cl/builtin/math/builtin-float-cos-1.0.generated.c). I have been seeing random failures (incorrect results) for some time and tried to investigate. The weird part is that the failures are not 100% reproducible: sometimes the tests pass, or partly pass (it's usually float8 and float16 subtests that

[Mesa-dev] llvm TGSI backend (WIP) questions

2015 Nov 18

Hi,

On 13-11-15 19:51, Tom Stellard wrote:
> On Fri, Nov 13, 2015 at 02:46:52PM +0100, Hans de Goede wrote:
>> Hi All,
>>
>> So as discussed I've started working on a TGSI backend for
>> llvm to use as a way to get compute going on nouveau (and other gpu-s).
>>
>> I'm still learning all the ins and outs of llvm so I do not have
>> much to show

[LLVMdev] (no subject)

2011 Jul 01

I'm trying to debug a problem in our custom backend, which uses a tiered register allocation setup. Just a little background: my target uses vec4 32-bit registers, and I want to set up three levels of registers. Each vec4 register can have two sub-regs of size vec2 32-bit, and each of those sub-regs has its own two 32-bit sub-regs. So it looks like this: xyzw -> {xy, zw} -> {x, y, z,
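
The three tiers described above can be modelled as lane masks. This is a hypothetical sketch, not the poster's backend or LLVM's register description format. The excerpt is cut off after "z,", so the fourth scalar is assumed to be w, and the names `HIERARCHY`, `lane_mask`, `is_subreg`, and `overlaps` are introduced only for illustration.

```python
# Hypothetical model: each register name maps to a 4-bit lane mask
# (bit 0 = x, bit 1 = y, bit 2 = z, bit 3 = w).
LANES = {"x": 0b0001, "y": 0b0010, "z": 0b0100, "w": 0b1000}

# Three tiers: vec4 -> two vec2 halves -> four scalars (w is an assumption).
HIERARCHY = {"xyzw": ["xy", "zw"], "xy": ["x", "y"], "zw": ["z", "w"]}


def lane_mask(name):
    """Combine the lane bits of every component named in `name`."""
    mask = 0
    for component in name:
        mask |= LANES[component]
    return mask


def is_subreg(sub, parent):
    """True if `sub` is reachable from `parent` by walking HIERARCHY."""
    children = HIERARCHY.get(parent, [])
    return sub in children or any(is_subreg(sub, child) for child in children)


def overlaps(a, b):
    """Two registers alias when they share at least one lane."""
    return (lane_mask(a) & lane_mask(b)) != 0
```

With this model, `overlaps("xy", "y")` is true while `overlaps("xy", "zw")` is false, the kind of aliasing information a register allocator needs about a tiered register file.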

Translating tests/trivial/compute.c gallium tests to opencl (input / help wanted)

2015 Dec 22

Hi All, I've been working on translating the tests/trivial/compute.c tests to opencl (for the buffer setup and kernel launch, I'm keeping the compute kernels in tgsi as an intermediate step). I've got the test_input_global() test working, see: https://fedorapeople.org/~jwrdegoede/compute-opencl-tgsi.c Next I wanted to convert the test_system_values() test and there I've gotten

[LLVMdev] Please benchmark new x86 vector shuffle lowering, planning to make it the default very soon!

2014 Sep 05

Hi Chandler,

While doing the performance measurement on an Ivy Bridge, I ran into compile-time errors. I saw a bunch of "cannot select" in the LLVM test suite with -march=core-avx-i. E.g., SingleSource/UnitTests/Vector/SSE/sse.isamax.c is failing at O3 -march=core-avx-i with:

fatal error: error in backend: Cannot select: 0x7f91b99a6420: v4i32 = bitcast 0x7f91b99b0e10 [ORD=3] [ID=27]

[LLVMdev] (no subject)

2011 Jul 01

On Jul 1, 2011, at 12:16 PM, Villmow, Micah wrote:
> I'm trying to debug a problem with our custom backend with using a tiered register allocation setup.
>
> Just a little background. My target uses vec4 32bit registers and I want to have three levels of registers setup.
> Each vec4 register can have two sub-regs of size vec2 32bit, and each sub-reg, has its own two sub-regs of

[LLVMdev] Please benchmark new x86 vector shuffle lowering, planning to make it the default very soon!

2014 Sep 06

I've run the SingleSource test suite for core-avx-i and have no failures here so a preprocessed file + commandline would be very useful if this reproduces for you still.

On Sat, Sep 6, 2014 at 4:07 PM, Chandler Carruth <chandlerc at gmail.com> wrote:
> I'm having trouble reproducing this. I'm trying to get LNT to actually
> run, but manually compiling the given source

[LLVMdev] (no subject)

2011 Jul 01

From: Jakob Stoklund Olesen [mailto:stoklund at 2pi.dk]
Sent: Friday, July 01, 2011 2:56 PM
To: Villmow, Micah
Cc: llvmdev at cs.uiuc.edu
Subject: Re: [LLVMdev] (no subject)

On Jul 1, 2011, at 12:16 PM, Villmow, Micah wrote:

I'm trying to debug a problem with our custom backend with using a tiered register allocation setup. Just a little background. My target uses vec4 32bit registers and

[LLVMdev] Please benchmark new x86 vector shuffle lowering, planning to make it the default very soon!

2014 Sep 08

> On Sep 7, 2014, at 8:49 PM, Quentin Colombet <qcolombet at apple.com> wrote:
>
> Sure,
>
> Here is the command line:
> clang -cc1 -triple x86_64-apple-macosx -S -disable-free -disable-llvm-verifier -main-file-name tmp.i -mrelocation-model pic -pic-level 2 -mdisable-fp-elim -masm-verbose -munwind-tables -target-cpu core-avx-i -O3 -ferror-limit 19 -fmessage-length 114
