Displaying 20 results from an estimated 200 matches similar to: "[LLVMdev] avoid live range overlap of "vector" registers"
2005 May 10
0
[LLVMdev] avoid live range overlap of "vector" registers
On Fri, 6 May 2005, Tzu-Chien Chiu wrote:
> a "vector" register r0 is composed of four 32-bit floating scalar
> registers, r0.x, r0.y, r0.z, r0.w.
>
> each scalar reg can be assigned individually, e.g.
>
> mov r0.x, r1.y
> add r0.y, r1.x, r2.z
>
> or assigned simultaneously with vector instructions, e.g.
>
> add r0.xyzw, r1.xzyw, r2.xyzw
>
> My
2005 Sep 17
1
[LLVMdev] Subword register allocation
Hi,
I have a question about implementing subword register allocation
problems (see the REFERENCES in the end of this message) on LLVM. I
have algorithms, but don't know the best way to implement them in
LLVM.
I asked similar question before:
http://lists.cs.uiuc.edu/pipermail/llvmdev/2005-May/004001.html
Because I still don't have a satisfying solution now, I try to
elaborate it
2005 May 10
1
[LLVMdev] avoid live range overlap of "vector" registers
Chris Lattner wrote:
> On Fri, 6 May 2005, Tzu-Chien Chiu wrote:
>
>> a "vector" register r0 is composed of four 32-bit floating scalar
>> registers, r0.x, r0.y, r0.z, r0.w.
>>
>> each scalar reg can be assigned individually, e.g.
>>
>> mov r0.x, r1.y
>> add r0.y, r1.x, r2.z
>>
>> or assigned simultaneously with vector
2005 Apr 20
1
[LLVMdev] adding new instructions to support "swizzle" and "writemask"
Hello, everyone:
I am writing a compiler for programmable graphics hardware. Each
register of the hardware has four channels, namely 'r', 'b', 'g',
'a', and each channel is a 32-bit floating-point value. It's similar to how the
high and low 8 bits of the x86 16-bit general-purpose register "AX" can
be individually referenced as "AH" and
2005 May 11
2
[LLVMdev] avoid live range overlap of "vector" registers
On Tue May 10 2005, Chris Lattner wrote:
>On Tue, 10 May 2005, Morten Ofstad wrote:
>> Actually, I think it would be better to define the registers as a machine
>> value type for packed float x4, and providing some 'extract' and 'inject'
>> instructions to access individual components... There should also be a
>> 'shuffle' instruction
2004 Nov 16
2
[LLVMdev] Target.td:Register changes
Hi, looking at the fresh CVS state I see:
class Register<string n> : RegisterBase<n> {
list<RegisterBase> Aliases = [];
}
while previously the Register class did not require any parameters. The change
log is just:
* Target.td: Revamp the Register class, and allow the use of the
RegisterGroup class to specify aliases directly in register
definitions.
and I
2004 Nov 16
0
[LLVMdev] Target.td:Register changes
On Tue, 16 Nov 2004, Vladimir Prus wrote:
> and I could not find any discussions in the archives.
>
> Why was the change necessary? Writing:
>
> def gr0 : Register<"gr0">;
> def gr1 : Register<"gr1">;
> def gr2 : Register<"gr2">;
> def gr3 : Register<"gr3">;
> def gr4 :
2014 Oct 03
2
[LLVMdev] Weird problems with cos (was Re: [PATCH v3 2/3] R600: Add carry and borrow instructions. Use them to implement UADDO/USUBO)
Hi Tom, Matt,
I'm running into strange issues with the cos test (piglit
generated_tests/cl/builtin/math/builtin-float-cos-1.0.generated.c)
I have been seeing random failures (incorrect results) for some time and
tried to investigate. The weird part is that the failures are not 100%
reproducible, sometimes the tests pass, or partly pass
(it's usually float8 and float16 subtests that
2005 May 11
0
[LLVMdev] avoid live range overlap of "vector" registers
On Wed, 11 May 2005, Tzu-Chien Chiu wrote:
> On Tue May 10 2005, Chris Lattner wrote:
>> On Tue, 10 May 2005, Morten Ofstad wrote:
>>> Actually, I think it would be better to define the registers as a machine
>>> value type for packed float x4, and providing some 'extract' and 'inject'
>>> instructions to access individual components... There
2015 Nov 18
1
[Mesa-dev] llvm TGSI backend (WIP) questions
Hi,
On 13-11-15 19:51, Tom Stellard wrote:
> On Fri, Nov 13, 2015 at 02:46:52PM +0100, Hans de Goede wrote:
>> Hi All,
>>
>> So as discussed I've started working on a TGSI backend for
>> llvm to use as a way to get compute going on nouveau (and other gpu-s).
>>
>> I'm still learning all the ins and outs of llvm so I do not have
>> much to show
2014 Sep 05
3
[LLVMdev] Please benchmark new x86 vector shuffle lowering, planning to make it the default very soon!
On Fri, Sep 5, 2014 at 9:32 AM, Robert Lougher <rob.lougher at gmail.com>
wrote:
> Unfortunately, another team, while doing internal testing has seen the
> new path generating illegal insertps masks. A sample here:
>
> vinsertps $256, %xmm0, %xmm13, %xmm4 # xmm4 = xmm0[0],xmm13[1,2,3]
> vinsertps $256, %xmm1, %xmm0, %xmm6 # xmm6 = xmm1[0],xmm0[1,2,3]
>
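The `$256` immediates quoted above cannot be encoded because the INSERTPS immediate is a single byte. A minimal Python sketch of how that byte splits into fields follows. The function name `decode_insertps_imm` and the dictionary keys are hypothetical, chosen for illustration; the bit layout (bits 7:6 source element, bits 5:4 destination lane, bits 3:0 zero mask) is the documented SSE4.1 one for the register form.

```python
def decode_insertps_imm(imm):
    """Decode an 8-bit INSERTPS immediate into its three fields.

    Hypothetical helper for illustration; not part of LLVM.
    """
    if not 0 <= imm <= 0xFF:
        # Values such as 256 need nine bits and cannot be encoded.
        raise ValueError(f"insertps immediate {imm} does not fit in 8 bits")
    return {
        "src_elem": (imm >> 6) & 0x3,   # bits 7:6: source element to read
        "dst_elem": (imm >> 4) & 0x3,   # bits 5:4: destination lane to overwrite
        "zero_mask": imm & 0xF,         # bits 3:0: destination lanes forced to 0.0
    }


if __name__ == "__main__":
    # Immediate 0 copies source element 0 into destination lane 0,
    # which matches the "xmm0[0],xmm13[1,2,3]" comment in the sample.
    print(decode_insertps_imm(0x00))
    try:
        decode_insertps_imm(256)
    except ValueError as err:
        print("rejected:", err)
```

Under this reading, the asm comments in the sample describe the effect of immediate 0, so the out-of-range value looks like an encoding problem and not a different shuffle.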
2004 Sep 14
0
[LLVMdev] TableGen target description file change
This is just a note for people who have targets that are not in the main
LLVM tree. I just checked in a patch (contributed by Jason Eckhardt) that
makes the following changes:
1. The 'Register' tablegen class now requires a register name to be
specified as an argument for the register. If you had this:
def FP0 : Register;
before, change it to:
def FP0 :
2014 Sep 05
2
[LLVMdev] Please benchmark new x86 vector shuffle lowering, planning to make it the default very soon!
Hi Chandler,
While doing the performance measurement on an Ivy Bridge, I ran into compile-time errors.
I saw a bunch of "cannot select" in the LLVM test suite with -march=core-avx-i.
E.g., SingleSource/UnitTests/Vector/SSE/sse.isamax.c is failing at O3 -march=core-avx-i with:
fatal error: error in backend: Cannot select: 0x7f91b99a6420: v4i32 = bitcast 0x7f91b99b0e10 [ORD=3] [ID=27]
2015 Nov 13
6
llvm TGSI backend (WIP) questions
Hi All,
So as discussed I've started working on a TGSI backend for
llvm to use as a way to get compute going on nouveau (and other gpu-s).
I'm still learning all the ins and outs of llvm so I do not have
much to show yet.
I've rebased Francisco's (curro's) latest version on top of llvm
trunk, and added a commit on top to actually get it to build with the
latest trunk. So
2014 Sep 06
2
[LLVMdev] Please benchmark new x86 vector shuffle lowering, planning to make it the default very soon!
I've run the SingleSource test suite for core-avx-i and have no failures
here so a preprocessed file + commandline would be very useful if this
reproduces for you still.
On Sat, Sep 6, 2014 at 4:07 PM, Chandler Carruth <chandlerc at gmail.com>
wrote:
> I'm having trouble reproducing this. I'm trying to get LNT to actually
> run, but manually compiling the given source
2009 Feb 13
3
[LLVMdev] Modeling GPU vector registers, again (with my implementation)
It seems to me that LLVM's sub-register support does not fit the following
hardware architecture.
All instructions of the hardware are vector instructions. All registers
contain
four 32-bit FP sub-registers. They are called r0.x, r0.y, r0.z, r0.w.
Most instructions write more than one element in this way:
mul r0.xyw, r1, r2
add r0.z, r3, r4
sub r5, r0, r1
Notice that the four elements of r0 are written
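One way to picture the write masks in the three instructions above is as a 4-bit set per register. The Python sketch below is only an illustration under that assumption; `COMPONENT_BIT`, `writemask` and `defined_components` are hypothetical names and do not correspond to any LLVM API.

```python
# One bit per component of a four-wide register (illustrative encoding).
COMPONENT_BIT = {"x": 1, "y": 2, "z": 4, "w": 8}
FULL_MASK = 0xF


def writemask(components):
    """Turn a mask string such as "xyw" into a 4-bit integer."""
    mask = 0
    for c in components:
        mask |= COMPONENT_BIT[c]
    return mask


def defined_components(writes):
    """Accumulate, in program order, which components of each register
    have been written. `writes` is a list of (register, mask string)."""
    defined = {}
    for reg, comps in writes:
        defined[reg] = defined.get(reg, 0) | writemask(comps)
    return defined


if __name__ == "__main__":
    # mul r0.xyw, r1, r2  followed by  add r0.z, r3, r4
    state = defined_components([("r0", "xyw"), ("r0", "z")])
    print(state["r0"] == FULL_MASK)
```

With this encoding, the `mul` and `add` together cover all four components of r0 before the `sub` reads it, even though neither instruction writes the whole register.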
2014 Sep 08
2
[LLVMdev] Please benchmark new x86 vector shuffle lowering, planning to make it the default very soon!
> On Sep 7, 2014, at 8:49 PM, Quentin Colombet <qcolombet at apple.com> wrote:
>
> Sure,
>
> Here is the command line:
> clang -cc1 -triple x86_64-apple-macosx -S -disable-free -disable-llvm-verifier -main-file-name tmp.i -mrelocation-model pic -pic-level 2 -mdisable-fp-elim -masm-verbose -munwind-tables -target-cpu core-avx-i -O3 -ferror-limit 19 -fmessage-length 114
2009 Feb 16
2
[LLVMdev] Modeling GPU vector registers, again (with my implementation)
Evan Cheng-2 wrote:
>
> Well, how many possible permutations are there? Is it possible to
> model each case as a separate physical register?
>
> Evan
>
I don't think so. There are 4x4x4x4 = 256 permutations. For example:
* xyzw: default
* zxyw
* yyyy: splat
Even if I can model each of these 256 cases as a separate physical register,
how can I model the use of r0.xyzw in
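The 4x4x4x4 = 256 count above can be checked by enumerating every four-letter pattern over x, y, z, w. This is a small Python sketch; `all_swizzles` and `apply_swizzle` are hypothetical helper names, and representing a register as a 4-tuple is an assumption made for illustration.

```python
from itertools import product

COMPONENTS = "xyzw"


def all_swizzles():
    """Every 4-letter source pattern: each output lane picks any input lane."""
    return ["".join(p) for p in product(COMPONENTS, repeat=4)]


def apply_swizzle(reg, swizzle):
    """Reorder or replicate the lanes of a 4-tuple according to a swizzle."""
    return tuple(reg[COMPONENTS.index(c)] for c in swizzle)


if __name__ == "__main__":
    swizzles = all_swizzles()
    print(len(swizzles))                        # number of patterns
    # Patterns that use each lane exactly once are the strict permutations.
    print(sum(1 for s in swizzles if len(set(s)) == 4))
    print(apply_swizzle((1.0, 2.0, 3.0, 4.0), "yyyy"))   # splat of lane y
    print(apply_swizzle((1.0, 2.0, 3.0, 4.0), "zxyw"))
```

Only 24 of the 256 patterns are permutations in the strict sense; the rest, such as the `yyyy` splat, replicate at least one lane.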
2005 May 01
3
win32-dir 0.1.0 compile problems
I tried to download/compile/install win32-dir, but I couldn't get it to
go. Over a private email Daniel Berger had me...
"Curious. What platform are you on exactly? Try
modifying the extconf.rb file. Add
'have_library("SHFolder")' above
'have_library("shell32")'. If that doesn't work, try
uncommenting the other
2009 Feb 13
0
[LLVMdev] Modeling GPU vector registers, again (with my implementation)
On Feb 13, 2009, at 9:47 AM, Alex wrote:
> It seems to me that LLVM's sub-register support does not fit the following
> hardware architecture.
>
> All instructions of the hardware are vector instructions. All
> registers contain
> four 32-bit FP sub-registers. They are called r0.x, r0.y, r0.z, r0.w.
>
> Most instructions write more than one element in this way:
>
> mul