thr3ads.net - similar to: "[LLVMdev] avoid live range overlap of "vector" registers"

Displaying 20 results from an estimated 200 matches similar to: "[LLVMdev] avoid live range overlap of "vector" registers"

[LLVMdev] avoid live range overlap of "vector" registers

2005 May 10

On Fri, 6 May 2005, Tzu-Chien Chiu wrote: > a "vector" register r0 is composed of four 32-bit floating scalar > registers, r0.x, r0.y, r0.z, r0.w. > > each scalar reg can be assigned individually, e.g. > > mov r0.x, r1.y > add r0.y, r1.x, r2.z > > or assigned simultaneously with vector instructions, e.g. > > add r0.xyzw, r1.xzyw, r2.xyzw > > My

[LLVMdev] Subword register allocation

2005 Sep 17

Hi, I have a question about implementing subword register allocation problems (see the REFERENCES in the end of this message) on LLVM. I have algorithms, but don't know the best way to implement them in LLVM. I asked similar question before: http://lists.cs.uiuc.edu/pipermail/llvmdev/2005-May/004001.html Because I still don't have a satisfying solution now, I try to elaborate it

[LLVMdev] avoid live range overlap of "vector" registers

2005 May 10

Chris Lattner wrote: > On Fri, 6 May 2005, Tzu-Chien Chiu wrote: > >> a "vector" register r0 is composed of four 32-bit floating scalar >> registers, r0.x, r0.y, r0.z, r0.w. >> >> each scalar reg can be assigned individually, e.g. >> >> mov r0.x, r1.y >> add r0.y, r1.x, r2.z >> >> or assigned simultaneously with vector

[LLVMdev] adding new instructions to support "swizzle" and "writemask"

2005 Apr 20

Hello, everyone: I am writing a compiler for a programmable graphics hardware. Each registers of the hardware has four channels, namely 'r', 'b', 'g', 'a', and each channel is a 32-bit floating point. It's similar to the high and low 8-bit of an x86 16-bit general purpose register "AX" can be individually referenced as "AH" and

[LLVMdev] avoid live range overlap of "vector" registers

2005 May 11

On Tue May 10 2005, Chris Lattner wrote: >On Tue, 10 May 2005, Morten Ofstad wrote: >> Actually, I think it would be better to define the registers as a machine >> value type for packed float x4, and providing some 'extract' and 'inject' >> instructions to access individual components... There should also be a >> 'shuffle' instruction

[LLVMdev] Target.td:Register changes

2004 Nov 16

Hi, looking at the fresh CVS state I see: class Register<string n> : RegisterBase<n> { list<RegisterBase> Aliases = []; } while previously the Register class did not require any parameters. The change log is just: * Target.td: Revamp the Register class, and allow the use of the RegisterGroup class to specify aliases directly in register definitions. and I

[LLVMdev] Target.td:Register changes

2004 Nov 16

On Tue, 16 Nov 2004, Vladimir Prus wrote: > and I could not find any discussions in the archives. > > Why the change was necessary? Writing: > > def gr0 : Register<"gr0">; > def gr1 : Register<"gr1">; > def gr2 : Register<"gr2">; > def gr3 : Register<"gr3">; > def gr4 :

[LLVMdev] Weird problems with cos (was Re: [PATCH v3 2/3] R600: Add carry and borrow instructions. Use them to implement UADDO/USUBO)

2014 Oct 03

Hi Tom, Matt, I'm running into strange issues with the cos test (piglit generated_tests/cl/builtin/math/builtin-float-cos-1.0.generated.c) I have been seeing random failures (incorrect results) for some time and tried to investigate. the weird part is that the failures are not 100% reproducible, sometimes the tests pass, or partly pass (it's usually float8 and float16 subtests that

[LLVMdev] avoid live range overlap of "vector" registers

2005 May 11

On Wed, 11 May 2005, Tzu-Chien Chiu wrote: > On Tue May 10 2005, Chris Lattner wrote: >> On Tue, 10 May 2005, Morten Ofstad wrote: >>> Actually, I think it would be better to define the registers as a machine >>> value type for packed float x4, and providing some 'extract' and 'inject' >>> instructions to access individual components... There

[Mesa-dev] llvm TGSI backend (WIP) questions

2015 Nov 18

Hi, On 13-11-15 19:51, Tom Stellard wrote: > On Fri, Nov 13, 2015 at 02:46:52PM +0100, Hans de Goede wrote: >> Hi All, >> >> So as discussed I've started working on a TGSI backend for >> llvm to use as a way to get compute going on nouveau (and other gpu-s). >> >> I'm still learning all the ins and outs of llvm so I do not have >> much to show

[LLVMdev] Please benchmark new x86 vector shuffle lowering, planning to make it the default very soon!

2014 Sep 05

On Fri, Sep 5, 2014 at 9:32 AM, Robert Lougher <rob.lougher at gmail.com> wrote: > Unfortunately, another team, while doing internal testing has seen the > new path generating illegal insertps masks. A sample here: > > vinsertps $256, %xmm0, %xmm13, %xmm4 # xmm4 = xmm0[0],xmm13[1,2,3] > vinsertps $256, %xmm1, %xmm0, %xmm6 # xmm6 = xmm1[0],xmm0[1,2,3] >

[LLVMdev] TableGen target description file change

2004 Sep 14

This is just a note for people who have targets that are not in the main LLVM tree. I just checked in a patch (contributed by Jason Eckhardt) that makes the following changes: 1. The 'Register' tablegen class now requires a register name to be specified as an argument for the register. If you had this: def FP0 : Register; before, change it to: def FP0 :

[LLVMdev] Please benchmark new x86 vector shuffle lowering, planning to make it the default very soon!

2014 Sep 05

Hi Chandler, While doing the performance measurement on a Ivy Bridge, I ran into compile time errors. I saw a bunch of "cannot select" in the LLVM test suite with -march=core-avx-i. E.g., SingleSource/UnitTests/Vector/SSE/sse.isamax.c is failing at O3 -march=core-avx-i with: fatal error: error in backend: Cannot select: 0x7f91b99a6420: v4i32 = bitcast 0x7f91b99b0e10 [ORD=3] [ID=27]

llvm TGSI backend (WIP) questions

2015 Nov 13

Hi All, So as discussed I've started working on a TGSI backend for llvm to use as a way to get compute going on nouveau (and other gpu-s). I'm still learning all the ins and outs of llvm so I do not have much to show yet. I've rebased Francisco's (curro's) latest version on top of llvm trunk, and added a commit on top to actual get it build with the latest trunk. So

[LLVMdev] Please benchmark new x86 vector shuffle lowering, planning to make it the default very soon!

2014 Sep 06

I've run the SingleSource test suite for core-avx-i and have no failures here so a preprocessed file + commandline would be very useful if this reproduces for you still. On Sat, Sep 6, 2014 at 4:07 PM, Chandler Carruth <chandlerc at gmail.com> wrote: > I'm having trouble reproducing this. I'm trying to get LNT to actually > run, but manually compiling the given source

[LLVMdev] Modeling GPU vector registers, again (with my implementation)

2009 Feb 13

It seems to me that LLVM sub-register is not for the following hardware architecture. All instructions of a hardware are vector instructions. All registers contains 4 32-bit FP sub-registers. They are called r0.x, r0.y, r0.z, r0.w. Most instructions write more than one elements in this way: mul r0.xyw, r1, r2 add r0.z, r3, r4 sub r5, r0, r1 Notice that the four elements of r0 are written

[LLVMdev] Please benchmark new x86 vector shuffle lowering, planning to make it the default very soon!

2014 Sep 08

> On Sep 7, 2014, at 8:49 PM, Quentin Colombet <qcolombet at apple.com> wrote: > > Sure, > > Here is the command line: > clang -cc1 -triple x86_64-apple-macosx -S -disable-free -disable-llvm-verifier -main-file-name tmp.i -mrelocation-model pic -pic-level 2 -mdisable-fp-elim -masm-verbose -munwind-tables -target-cpu core-avx-i -O3 -ferror-limit 19 -fmessage-length 114

[LLVMdev] Modeling GPU vector registers, again (with my implementation)

2009 Feb 16

Evan Cheng-2 wrote: > > Well, how many possible permutations are there? Is it possible to > model each case as a separate physical register? > > Evan > I don't think so. There are 4x4x4x4 = 256 permutations. For example: * xyzw: default * zxyw * yyyy: splat Even if can model each of these 256 cases as a separate physical register, how can I model the use of r0.xyzw in

win32-dir 0.1.0 compile problems

2005 May 01

I tried to download/compile/install win32-dir, but I couldn't get it to go. Over a private email Daniel Berger had me... "Curious. What platform are you on exactly? Try modifying the extconf.rb file. Add 'have_library("SHFolder")' above 'have_library("shell32")'. If that doesn't work, try uncommenting the other

[LLVMdev] Modeling GPU vector registers, again (with my implementation)

2009 Feb 13

On Feb 13, 2009, at 9:47 AM, Alex wrote: > It seems to me that LLVM sub-register is not for the following > hardware architecture. > > All instructions of a hardware are vector instructions. All > registers contains > 4 32-bit FP sub-registers. They are called r0.x, r0.y, r0.z, r0.w. > > Most instructions write more than one elements in this way: > > mul
