Tzu-Chien Chiu
2005-Jul-27 17:29 UTC
[LLVMdev] How to define complicated instruction in TableGen (Direct3D shader instruction)
Each register is a 4-component (namely r, g, b, a) vector register. They are defined as the LLVM packed type [4 x float].

The instruction:

    add_sat r0.a, r1_bias.xxyy, r3_x2.zzzz

Explanation:

'.a' is a writemask. Only the specified component is updated.

'.xxyy' and '.zzzz' are swizzle masks. They specify the component permutation, similar to the Intel SSE permutation instruction SHUFPD.

'_bias' and '_x2' are source modifiers. They modify the values of the source operands and send the modified values to the adder. '_bias' = source - 0.5, '_x2' = source * 2.

'_sat' is an instruction modifier. When specified, it saturates (clamps) the instruction result to the range [0, 1] before writing it to the destination register.

The writemask, swizzles, source modifiers, and instruction modifier are all optional.

How should I define the instruction in a TableGen .td file?

I have two alternatives:

1. Define extra operands:

    class WriteMask : Operand<i8> {}
    def WM : WriteMask;

    class Swizzle : Operand<i8> {}
    def SW : Swizzle;

    class InstructionModifier : Operand<i8> {}
    def IM : InstructionModifier;

    class SourceModifier : Operand<i8> {}
    def SM : SourceModifier;

    def ADD<0x01, (ops GPR:$dest, WM:$wm, IM:$im,
                       GPR:$src0, SW:$sw0, SM:$sm0,
                       GPR:$src1, SW:$sw1, SM:$sm1), ... >

2. Add LLVM intrinsics:

    ; add_sat r0.a, r1_bias.xxyy, r3_x2.zzzz
    r1_1 = llvm.bias( r1_0 )
    r1_2 = llvm.shuffle( xxyy )
    r3_1 = llvm.x2( r3_0 )
    r3_2 = llvm.shuffle( zzzz )
    r0_0 = add r1_2, r3_2
    r0_1 = llvm.saturate( r0_0 )
    r0_2 = llvm.select( a )

But this makes implementing the instruction selector very difficult. In this example, llvm.select() and llvm.saturate() are encountered first (bottom-up), but they must be 'remembered', and the instruction cannot be generated (BuildMI) until the opcode is known.

Which one should I do?

-- 
Tzu-Chien Chiu, 3D Graphics Hardware Architect
<URL:http://www.csie.nctu.edu.tw/~jwchiu>
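To make the semantics described above concrete, here is a minimal reference model in C++. It is only a sketch: the names (Vec4, SrcMod, Source, readSource, addInstr) and the choice of a 4-bit writemask with bit i standing for component i are assumptions made for illustration, not part of LLVM or Direct3D.

```cpp
#include <algorithm>
#include <array>
#include <cstdint>

// Hypothetical reference model; none of these names come from LLVM or Direct3D.
using Vec4 = std::array<float, 4>;        // components x/r, y/g, z/b, w/a

enum class SrcMod { None, Bias, X2 };     // '_bias' = source - 0.5, '_x2' = source * 2

struct Source {
    Vec4 reg;                             // register contents
    std::array<std::uint8_t, 4> swizzle;  // source component read by each lane
    SrcMod mod;                           // optional source modifier
};

// Read each lane through the swizzle, then apply the source modifier.
// The modifiers are component-wise, so the order does not affect the result.
Vec4 readSource(const Source &s) {
    Vec4 out{};
    for (int i = 0; i < 4; ++i) {
        float v = s.reg[s.swizzle[i]];
        if (s.mod == SrcMod::Bias) v -= 0.5f;
        if (s.mod == SrcMod::X2) v *= 2.0f;
        out[i] = v;
    }
    return out;
}

// dest = add[_sat] a, b. Assumed encoding: writemask bit i set means component i is written.
void addInstr(Vec4 &dest, std::uint8_t writeMask, bool saturate,
              const Source &a, const Source &b) {
    const Vec4 x = readSource(a);
    const Vec4 y = readSource(b);
    for (int i = 0; i < 4; ++i) {
        if (!(writeMask & (1u << i))) continue;   // masked-off components keep their old value
        float r = x[i] + y[i];
        if (saturate) r = std::min(1.0f, std::max(0.0f, r));  // clamp to [0, 1]
        dest[i] = r;
    }
}
```

With this model, add_sat r0.a, r1_bias.xxyy, r3_x2.zzzz corresponds to calling addInstr with writemask 0x8, saturate set to true, swizzles {0, 0, 1, 1} and {2, 2, 2, 2}, and the Bias and X2 modifiers on the two sources.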
Morten Ofstad
2005-Jul-29 07:48 UTC
[LLVMdev] How to define complicated instruction in TableGen (Direct3D shader instruction)
Actually, the problems that Tzu-Chien Chiu is encountering are similar to what should be done for generating SSE code in the X86 backend, and for other SIMD instruction sets as well.

I think LLVM needs to add instructions for permuting components and for extracting and injecting elements in packed types. If the architecture has instructions that can apply a permutation as part of each instruction (for example, 'add' with permutation), it should be the role of the pattern instruction selector to recognise the shuffle+add combination and emit a single instruction.

m.
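One way to picture the shuffle+add folding suggested here is a matcher that starts at the root of the expression and peels the optional wrappers off on the way down, so nothing has to be 'remembered' across separately visited nodes. The following is a toy sketch in C++ under stated assumptions: the Node tree, the 2-bits-per-lane swizzle byte (0xE4 = .xyzw), and the names matchSource and matchAdd are all hypothetical and are not LLVM's instruction selector API. The writemask is left out for brevity.

```cpp
#include <cstdint>
#include <memory>

// Toy expression tree; every name here is hypothetical, not an LLVM API.
enum class Op { Reg, Bias, X2, Shuffle, Add, Saturate };

struct Node {
    Op op = Op::Reg;
    int reg = -1;                  // Op::Reg: register number
    std::uint8_t swz = 0xE4;       // Op::Shuffle: 2 bits per lane, 0xE4 = identity (.xyzw)
    std::shared_ptr<Node> lhs, rhs;
};
using NodeP = std::shared_ptr<Node>;

NodeP reg(int r) {
    auto n = std::make_shared<Node>();
    n->op = Op::Reg; n->reg = r;
    return n;
}
NodeP unary(Op op, NodeP child, std::uint8_t swz = 0xE4) {
    auto n = std::make_shared<Node>();
    n->op = op; n->swz = swz; n->lhs = child;
    return n;
}
NodeP add(NodeP a, NodeP b) {
    auto n = std::make_shared<Node>();
    n->op = Op::Add; n->lhs = a; n->rhs = b;
    return n;
}

// Operands of the single machine instruction the matcher produces.
struct SrcOperand { int reg = -1; std::uint8_t swz = 0xE4; Op mod = Op::Reg; };
struct MachineAdd { bool sat = false; SrcOperand src0, src1; };

// Peel an optional shuffle, then an optional source modifier, down to a register.
bool matchSource(NodeP cur, SrcOperand &out) {
    if (cur && cur->op == Op::Shuffle) { out.swz = cur->swz; cur = cur->lhs; }
    if (cur && (cur->op == Op::Bias || cur->op == Op::X2)) { out.mod = cur->op; cur = cur->lhs; }
    if (!cur || cur->op != Op::Reg) return false;
    out.reg = cur->reg;
    return true;
}

// Match [saturate] (add src, src) starting from the root of the tree.
bool matchAdd(NodeP cur, MachineAdd &out) {
    if (cur && cur->op == Op::Saturate) { out.sat = true; cur = cur->lhs; }
    if (!cur || cur->op != Op::Add) return false;
    return matchSource(cur->lhs, out.src0) && matchSource(cur->rhs, out.src1);
}
```

For the example instruction, the tree saturate(add(shuffle(bias(r1)), shuffle(x2(r3)))) with swizzle bytes 0x50 (.xxyy) and 0xAA (.zzzz) matches in one pass, and the filled-in MachineAdd carries everything needed to emit a single add_sat.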
Robert L. Bocchino Jr.
2005-Jul-29 23:15 UTC
[LLVMdev] How to define complicated instruction in TableGen (Direct3D shader instruction)
Hi,

I am working on this. Part of my Ph.D. thesis work involves extending the LLVM instruction set to express vector parallelism, including but not limited to subword SIMD-style parallelism. We already have extract and inject (we call it combine) instructions. Permutation is something we are going to add.

All of this will be checked into LLVM at some point, but I'm not sure when. If you would like to discuss this or have suggestions, your input would be welcome.

Rob

Robert L. Bocchino Jr.
Ph.D. Student
University of Illinois, Urbana-Champaign
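As a rough illustration of how a permutation relates to element extraction and insertion, the sketch below spells a 4-component permutation out as four extract/insert pairs in C++. The exact semantics of the extract and combine instructions are not given in this thread, so extractElement, insertElement, and permute are hypothetical helpers with assumed element-wise behaviour.

```cpp
#include <array>
#include <cstddef>

using Vec4 = std::array<float, 4>;

// Hypothetical element-wise primitives with assumed semantics.
float extractElement(const Vec4 &v, std::size_t i) { return v[i]; }
Vec4 insertElement(Vec4 v, std::size_t i, float x) { v[i] = x; return v; }

// A permutation written as four extract/insert pairs: lane i of the result
// takes component sel[i] of the input.
Vec4 permute(const Vec4 &v, const std::array<std::size_t, 4> &sel) {
    Vec4 out{};
    for (std::size_t i = 0; i < 4; ++i)
        out = insertElement(out, i, extractElement(v, sel[i]));
    return out;
}
```

A single permutation operation carrying the whole selector is presumably easier for a pattern matcher to recognise than the equivalent chain of eight element operations.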
Chris Lattner
2005-Aug-02 22:11 UTC
[LLVMdev] How to define complicated instruction in TableGen (Direct3D shader instruction)
On Fri, 29 Jul 2005, Morten Ofstad wrote:
> Actually, the problems that Tzu-Chien Chiu is encountering are similar to
> what should be done for generating SSE code in the X86 backend, and for other
> SIMD instruction sets as well. I think LLVM needs to add instructions for
> permuting components and for extracting and injecting elements in packed
> types. If the architecture has instructions that can apply a permutation as
> part of each instruction (for example, 'add' with permutation), it should be
> the role of the pattern instruction selector to recognise the shuffle+add
> combination and emit a single instruction.

Agreed 100%.

-Chris

-- 
http://nondot.org/sabre/
http://llvm.org/