thr3ads.net - llvm dev - [LLVMdev] Does current LLVM target-independent code generator supports my strange chip? [Nov 2008]


Wei

2008-Nov-22 16:03 UTC

[LLVMdev] Does current LLVM target-independent code generator supports my strange chip?

I have 24-bit integer operations as well as 24-bit floating point
(s7.16) operations.

The H/W supports load/store instructions; however, they suggest that
we not use these load/store instructions except for debugging purposes.
That is to say, you can imagine we don't have load/store instructions
and we don't have memory; we just have registers.

I will run OpenGL Shading Language programs on this chip.

About your comments, I (a new LLVM user) have some more questions:

1) You mention "custom handle the conversion of the integer/float
constants that LLVM spits out". Does that mean
I have to register a callback function that runs when LLVM
wants to emit a constant value to memory? And what about a
non-constant value? For example:
int a;
where LLVM wants to put a into memory.
I also don't really know what "i32/f32 sounds a good place to
start" means...

2) I don't know why you mention "I'd assume you'd have intrinsics for
I/O."

3) I don't think I understand the following statements:
> If you want to support memory operations, your integers need to
> support the addressing range correctly - you effectively have 17 bits
> of mantissa - so it may be a tight squeeze without 24 bit integer ops
> (shifts and ands and stuff will also be a painful, but that's a more
> expansive topic).
Can you give an example?

Thank you very much for your comments.

Wei.

On Nov 20, 10:24 pm, Daniel M Gessel <ges... at apple.com> wrote:
> This is similar to ATI's R300/R420 pixel shaders. I'm familiar with
> this hardware, but not really an LLVM expert (working on a code
> generator myself, but learning as I go).
>
> Do you have 24-bit integer operations, or just floating point?
>
> What about load/store?
>
> Are you looking to run large C programs with complex data structures,
> or just comparatively simple math functions (i.e. a compute "kernel")?
>
> If you only want to support programs that can live entirely within
> registers, you can custom handle the conversion of the integer/float
> constants that LLVM spits out and i32/f32 sounds a good place to start
> - LLVM's mem2reg and inlining is very effective at getting rid of the
> majority of stack operations, and I'd assume you'd have intrinsics for
> I/O.
>
> If you want to support memory operations, your integers need to  
> support the addressing range correctly - you effectively have 17 bits  
> of mantissa - so it may be a tight squeeze without 24 bit integer ops  
> (shifts and ands and stuff will also be a painful, but that's a more  
> expansive topic).
>
> Dan
>
> On Nov 20, 2008, at 7:46 AM, Wei wrote:
>
>
>
> > Each channel contains 24 bits, so which
> > llvm::SimpleValueType should I use for each channel?
> > The current llvm::SimpleValueType contains i1, i8, i16, i32, i64, f32,
> > f64, f80; none of them fits one channel (24 bits).
>
> > I think I can use i32 or f32 to represent each 24-bit channel; if the
> > runtime result of some machine instruction exceeds 23 bits (1 bit is
> > for the sign), then it is an overflow.
> > Is it correct to claim that the programmer needs to revise the
> > program to fix this problem?
> > Am I right or wrong about this?
>
> > Suppose there is a chip whose registers are 24 bits long, and you have to
> > compile C/C++ programs for it.
> > How would you represent the following statement?
>
> > int a = 3;
> > (Programmers think sizeof(int) = 4)
>
> > Wei.
>
> > On Nov 19, 2:01 am, Evan Cheng <evan.ch... at apple.com> wrote:
> >> Why not model each channel as a separate physical register?
>
> >> Evan
>
> >> On Nov 17, 2008, at 6:36 AM, Wei wrote:
>
> >>> I have a very strange and complicated H/W platform.
> >>> It has many registers, all in one format.
> >>> The register format is:
>
> >>> +----------+----------+----------+----------+
> >>> |  24-bit  |  24-bit  |  24-bit  |  24-bit  |
> >>> +----------+----------+----------+----------+
> >>>      a          b          c          d
>
> >>> There are 4 channels in a register, and each channel contains 24
> >>> bits; hence, there are 96 bits in total in 'one' register.
> >>> You can store a 24-bit integer or an s7.16 floating-point value in
> >>> each channel.
> >>> You can name the channels 'a', 'b', 'c', 'd'.
>
> >>> Here is an example of an operation on this H/W platform:
>
> >>>             ADD      R3.ab, R1.abab, R2.bbaa
>
> >>> It means:
>
> >>>            Add the 'abab' channels of R1 and the 'bbaa' channels of R2,
> >>> and put the result into the 'ab' channels of R3.
>
> >>> It's complicated.
> >>> Imagine a non-existent temp register named 'Rt1', whose
> >>> 'a','b','c','d' channels are taken from the 'a','b','a','b' channels
> >>> of R1, and imagine another non-existent temp register named 'Rt2',
> >>> whose 'a','b','c','d' channels are taken from the 'b','b','a','a'
> >>> channels of R2.
> >>> Then add Rt1 & Rt2 and put the result into R3.
> >>> This means:
> >>> the 'a' channel of R3 will be equal to the 'a' channel of Rt1 plus
> >>> the 'a' channel of Rt2 (i.e. 'a' from R1 + 'b' from R2, because
> >>> R1.'a'bab and R2.'b'baa);
> >>> the 'b' channel of R3 will be equal to the 'b' channel of Rt1 plus
> >>> the 'b' channel of Rt2 (i.e. 'b' from R1 + 'b' from R2, because
> >>> R1.a'b'ab and R2.b'b'aa);
> >>> the 'c' channel of R3 will be untouched; the value of the 'c'
> >>> channel of Rt1 plus the 'c' channel of Rt2 (i.e. 'a' from R1 + 'a'
> >>> from R2, because R1.ab'a'b and R2.bb'a'a) will be lost;
> >>> the 'd' channel of R3 will be untouched, too. The value of the 'd'
> >>> channel of Rt1 plus the 'd' channel of Rt2 (i.e. 'b' from R1 + 'a'
> >>> from R2, because R1.aba'b' and R2.bba'a') will be lost, too.
>
> >>> I don't know whether I can set the 'type' of such a register using an
> >>> llvm::MVT::SimpleValueType.
> >>> According to the LLVM docs & LLVM source code, I think llvm::MVT::v8i8,
> >>> v2f32, etc. are used to represent registers for SIMD instructions.
> >>> I don't think the operations on my platform are SIMD instructions.
> >>> However, I cannot find any llvm::MVT::SimpleValueType which can
> >>> represent a 96-bit register.
>
> >>> Thus, my questions are:
>
> >>> 1) Does the current LLVM backend support this H/W?
> >>> 2) If yes, how can I write the type of the register class in my .td
> >>> file?
>
> >>> (Which value should I fill in for the following 'XXX'?)
> >>> def TempRegs : RegisterClass<"MFLEXG", [XXX], 32,
> >>>     [R0, R1, R2, R3, R4, R5, R6, R7, R8, R9,
> >>>      R10, R11, R12, R13, R14, R15, R16, R17, R18, R19,
> >>>      R20, R21, R22, R23, R24, R25, R26, R27, R28, R29,
> >>>      R30, R31]> {
> >>> }
>
> >>> 3) If not, does this mean I have to write the whole LLVM backend
> >>> based on the basic llvm::TargetMachine & llvm::TargetData, just like
> >>> what CBackend does?
>
> >>> --------------------------------------------------------
> >>> Wei Hu
> >>>http://www.csie.ntu.edu.tw/~r88052/
> >>>http://wei-hu-tw.blogspot.com/
>
> >>> _______________________________________________
> >>> LLVM Developers mailing list
> >>> LLVM... at cs.uiuc.edu        http://llvm.cs.uiuc.edu
> >>>http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
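
The swizzled ADD described in the original question can be modelled in a few lines of Python. This is only a behavioural sketch, not anything from LLVM or from the chip's documentation: the helper names (`swizzle`, `add_swizzled`) are made up for illustration, a register is represented as a list of four integers, and wrapping the sum to 24 bits is an assumption because the thread does not say how the hardware handles overflow.

```python
CHANNELS = "abcd"
MASK24 = (1 << 24) - 1  # each channel is 24 bits wide


def swizzle(reg, pattern):
    """Build the imaginary temp register: one source channel per pattern letter."""
    return [reg[CHANNELS.index(ch)] for ch in pattern]


def add_swizzled(dst, write_mask, src1, swz1, src2, swz2):
    """Model of: ADD dst.write_mask, src1.swz1, src2.swz2 (integer add)."""
    rt1 = swizzle(src1, swz1)  # the 'Rt1' of the description
    rt2 = swizzle(src2, swz2)  # the 'Rt2' of the description
    out = list(dst)
    for i, ch in enumerate(CHANNELS):
        if ch in write_mask:  # channels outside the mask keep their old value
            out[i] = (rt1[i] + rt2[i]) & MASK24  # 24-bit wraparound is assumed
    return out


r1 = [1, 2, 3, 4]
r2 = [10, 20, 30, 40]
r3 = [0, 0, 7, 8]
r3 = add_swizzled(r3, "ab", r1, "abab", r2, "bbaa")
print(r3)  # [21, 22, 7, 8]
```

Channels 'c' and 'd' of R3 keep their old values (7 and 8), which matches the "untouched" behaviour in the description: the sums for those lanes are computed in the model's temporaries and simply discarded.
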

Daniel M Gessel

2008-Nov-22 17:37 UTC

[LLVMdev] Does current LLVM target-independent code generator supports my strange chip?

On Nov 22, 2008, at 11:03 AM, Wei wrote:
> I have 24-bit integer operations as well as 24-bit floating point
> (s7.16) operations.
>
> The H/W supports load/store instructions; however, they suggest that
> we not use these load/store instructions except for debugging purposes.
> That is to say, you can imagine we don't have load/store instructions
> and we don't have memory; we just have registers.
>
> I will run OpenGL Shading Language programs on this chip.

GLSL doesn't have pointers, so there is no "generic" load + store,
which simplifies things.

Unextended GLSL only requires support for integers in the 16 bit  
range, and has no bitwise operations. It also doesn't specify integer  
overflow behavior in any way.

The machines I worked with didn't support any integer ops, but GLSL  
let us get by with "emulated" 16 bit integers (storing and operating  
on them as floating point; divides required truncation after the op -  
that sort of thing).
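
A minimal sketch of the "emulated integer" idea, assuming integers are kept as integer-valued floats and that division should truncate toward zero (the rounding direction is an assumption; GLSL itself is not being quoted here). The function name `emu_div` is invented for illustration. Python floats are 64-bit, so this hides the precision limits a real 24-bit float unit would have.

```python
import math


def emu_div(a: float, b: float) -> float:
    """Integer division on integer-valued floats: divide, then truncate."""
    return float(math.trunc(a / b))


# Add, subtract and multiply of integer-valued floats stay integer-valued
# (as long as the result is in range), so only division needs the fix-up.
print(emu_div(7.0, 2.0))   # 3.0
print(emu_div(-7.0, 2.0))  # -3.0
```
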

Since you have 24 bit integer operations, you're in better shape.

> About your comments, I (a new LLVM user) have some more questions:
>
> 1) You mention "custom handle the conversion of the integer/float
> constants that LLVM spits out", does it means:
> I have to register a callback function which will operate when LLVM
> wants to spits out a constant value to memory. But what about
> non-constant value?

What I mean is that you can probably get away with LLVM working with
float literals as f32, then converting them to your 24 bit format
during code gen. The specifics depend on how you want to handle
constants in your backend: literals in instructions or a constant pool
are the options I know of. For now, I'm using special "load literal"
instructions, but a constant pool may be more appropriate in the long
run. I'm still learning.

Integers too: let LLVM work with i32 internally, and convert literals  
during code gen.
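
The literal conversion could look roughly like the sketch below. Everything about the 24-bit layout is an assumption made for illustration: 1 sign bit, 7 exponent bits with a bias of 63 (by analogy with IEEE 754), 16 stored mantissa bits with an implicit leading 1, mantissa truncation instead of rounding, underflow flushed to zero and overflow clamped to the largest magnitude. The real chip's format may differ on any of these points, and the function names are hypothetical.

```python
import struct

EXP_BITS, MAN_BITS = 7, 16
BIAS = (1 << (EXP_BITS - 1)) - 1  # 63, assumed


def f32_to_s7_16(value: float) -> int:
    """Pack an f32 literal into the assumed 24-bit sign/exponent/mantissa word."""
    bits = struct.unpack("<I", struct.pack("<f", value))[0]
    sign = bits >> 31
    exp32 = (bits >> 23) & 0xFF
    man32 = bits & 0x7FFFFF
    if exp32 == 0:  # zero or f32 denormal: flush to signed zero
        return sign << 23
    exp24 = exp32 - 127 + BIAS  # re-bias the exponent
    if exp24 <= 0:  # too small for 7 exponent bits: flush to zero
        return sign << 23
    max_exp = (1 << EXP_BITS) - 1
    if exp24 > max_exp:  # too large (also catches inf/NaN): clamp
        return (sign << 23) | (max_exp << MAN_BITS) | ((1 << MAN_BITS) - 1)
    return (sign << 23) | (exp24 << MAN_BITS) | (man32 >> (23 - MAN_BITS))


def i32_to_i24(value: int) -> int:
    """Narrow an i32 literal to a 24-bit two's-complement word, or fail loudly."""
    if not -(1 << 23) <= value < (1 << 23):
        raise ValueError(f"{value} does not fit in 24 bits")
    return value & 0xFFFFFF


print(hex(f32_to_s7_16(1.0)))  # 0x3f0000
print(hex(i32_to_i24(-1)))     # 0xffffff
```

Rejecting out-of-range integer literals, rather than silently wrapping them, is a design choice of this sketch; a backend could equally decide to split or wrap them.
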

Since GLSL doesn't require load/store, and it sounds like your HW may
not be 100% reliable for these ops, you want to make sure your code stays
in registers.

I assume you'll be starting with the reference GLSL parser (from
3DLabs, IIRC - I don't even know if they still exist, actually) and
having it generate LLVM IR (has anybody done this before?). This will
give you much more control over the code - Clang is the front end for
the project I'm working on, and it generates stack based code; most of
the stack operations get optimized out by inlining and the mem2reg
pass, but not everything.

> ex:
> int a;
> and LLVM wants to put a into memory.
>
> and I don't really know what the "i32/f32 sounds a good place to
> start" means...
I mean that having your registers declared as i32 + f32 will probably  
work out well, especially since you don't have pointers in your  
language.

The issue would be that LLVM would want to store register values as 32  
bits - and do all the pointer math that way. Depending on how your HW  
works, this may or may not be okay. Even then, you might be able to  
patch it up if you really needed to store your registers 3 byte aligned.

Fortunately, this is not an issue with GLSL.

> 2) I don't know why you mention "I'd assume you'd have intrinsics for
> I/O."

For GLSL, you have to have some way of reading attributes and  
uniforms, exporting to/reading from varyings, etc.

Different GPUs do things differently of course: in some cases, it's a  
matter of certain GPRs being initialized by "fixed function" HW with  
input values at the start of the shader and certain GPRs being left  
with output values at the end of the shader. Other GPUs require  
explicit "export" instructions, perhaps just reads/writes to dedicated
I/O registers. Some have a mix (this is the case for HW I've worked  
with).

If you have export instructions, or even special I/O registers, I was
thinking that they could be represented or accessed by target-specific
ops (intrinsics). You'd have the GLSL front end generate these
intrinsic operations.

I haven't had to work with register constraints in LLVM, so I'm not
sure what would be the best approach if I/O is done through specific GPRs:
you don't want to reserve those registers for I/O only... it would
take some exploration.
>
> 3) I don't think I get you about the following statements:
>> If you want to support memory operations, your integers need to
>> support the addressing range correctly - you effectively have 17 bits
>> of mantissa - so it may be a tight squeeze without 24 bit integer ops
>> (shifts and ands and stuff will also be a painful, but that's a more
>> expansive topic).
> Can you give some example?

Sorry, I was "thinking out loud".

I made the assumption here that you didn't have 24 bit integer ops,  
and that you might try to represent pointers as integers in a single  
24 bit float value (maybe with a 1D texture as your addressable  
memory). In that case, you'd have a very limited range.

But GLSL doesn't have pointers, so this isn't an issue (and 24 bit
integers give you a decent addressing range for debugging).
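
To put rough numbers on the "17 bits of mantissa" remark: assuming the 16 stored mantissa bits of s7.16 come with an implicit leading 1, a float can hold every integer exactly only up to 2^17, while a signed 24-bit integer reaches 2^23 - 1. The helper below is hypothetical and models truncation to 17 significant bits.

```python
STORED_MANTISSA_BITS = 16                    # the "16" in s7.16
SIGNIFICANT_BITS = STORED_MANTISSA_BITS + 1  # assumed implicit leading 1


def as_float_integer(n: int, bits: int = SIGNIFICANT_BITS) -> int:
    """Value kept when a non-negative integer is truncated to `bits` significant bits."""
    excess = max(n.bit_length() - bits, 0)
    return (n >> excess) << excess


float_limit = 1 << SIGNIFICANT_BITS  # every integer up to here is exact
int24_limit = (1 << 23) - 1          # largest signed 24-bit integer

print(float_limit, int24_limit)           # 131072 8388607
print(as_float_integer(float_limit + 1))  # 131072: the address 131073 is lost
```
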

Dan


>
> Really really thanks about your comments.
>
> Wei.
>

Wei

2008-Nov-24 15:25 UTC

[LLVMdev] Does current LLVM target-independent code generator supports my strange chip?

> The machines I worked with didn't support any integer ops, but GLSL
> let us get by with "emulated" 16 bit integers (storing and operating
> on them as floating point; divides required truncation after the op -
> that sort of thing).

Although my platform does support integer operations, it
only supports integer +, -, *, not /. The documentation says that if I
need to do integer division, I have to convert the operands to floating
point first. Hence, I have similar problems.
So...
Does your method mean you write some code in your 'frontend' to emit
LLVM IR that converts the integers to floating point first, performs
the operation, and then converts the result back to integer?
Or do you write such code in your 'backend'?

Whatever your answer is, I think the 'frontend' approach is
cleaner than the 'backend' approach (the 'backend' approach is more
like a hack?). Am I right? Or does implementing such a mechanism in the
backend have other advantages?

> What I mean is that you can probably get away with LLVM working with
> float literals as f32, then converting them to your 24 bit format
> during code gen.
I think I got you here.
> Integers too: let LLVM work with i32 internally, and convert literals
> during code gen.
Huh.. I think I got you here, too.
But I don't know how you handle integer constants larger than
24 bits.
For example, if I see the following instruction during code gen:

int %a, add int %b, int 0x12345678

do I have to emit machine instructions similar to the following?

int %a, add int %b, int 0x5678
int %c, add int %d, int 0x1234
int %e, add int %c, 1 <--- depends on the result of the first addition

However, this means the backend has to remember that register %a now
stores the low bytes of the result and register %c stores the high
bytes of the result. This tracking is not an easy job, I think.
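
The split-and-carry idea above can be checked with a small Python model. It assumes a 32-bit value is held as two 16-bit halves, each in its own 24-bit register, so that the low sum (at most 17 bits) cannot overflow the register. The names `split32` and `add32_in_limbs` are invented for this sketch, and the shifts and masks are used only for clarity; the thread does not say the hardware has them.

```python
LIMB_BITS = 16
LIMB_MASK = (1 << LIMB_BITS) - 1


def split32(value):
    """Split a 32-bit value into (low, high) 16-bit halves."""
    value &= 0xFFFFFFFF
    return value & LIMB_MASK, value >> LIMB_BITS


def add32_in_limbs(b, const):
    """Add a 32-bit constant to b using only 16-bit halves plus a carry."""
    b_lo, b_hi = split32(b)
    c_lo, c_hi = split32(const)
    lo_sum = b_lo + c_lo          # at most 17 bits, fits a 24-bit register
    carry = lo_sum >> LIMB_BITS   # 1 if the low add overflowed 16 bits
    hi_sum = b_hi + c_hi + carry  # the high add depends on the low add
    return lo_sum & LIMB_MASK, hi_sum & LIMB_MASK


lo, hi = add32_in_limbs(0x0000FFFF, 0x12345678)
print(hex(lo), hex(hi))  # 0x5677 0x1235
```

The pairing of a "low" and a "high" register is exactly the bookkeeping the question worries about; here it is just a tuple, whereas a backend would have to carry that pairing through register allocation.
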
> I assume you'll be starting with the reference GLSL parser (from
> 3DLabs, IIRC - I don't even know if they stil exist, actually)
You can find the 3Dlabs frontend here:
http://l4.me.uk/static/glsl/

And I don't think anyone has ported this frontend onto LLVM before.
> The issue would be that LLVM would want to store register values as 32
> bits - and do all the pointer math that way.
I don't really get you here.
Why would LLVM do all the pointer math in 32 bits just because I store
register values as 32 bits?
> I haven't had to work with register constraints in LLVM, so I'm not
> sure what would be best approach if I/O is done through specific GPRs:
> you don't want to reserve those registers for I/O only.... it would
> take some exploration.
Unfortunately, my platform does use GPRs for input/output.
My current thought is to count the attributes/varyings used
in a shader and reserve the same number of GPRs for those
attributes/varyings ONLY. Because I have NO memory to
spill registers to, there is not much room for register
allocation. The method I might use is to INLINE all functions and then
perform register allocation. This strategy is bad, of course; can
you think of a better solution?


Wei.

