Cy Cheng via llvm-dev
2018-Dec-07 01:46 UTC
[llvm-dev] Implement VLIW Backend on LLVM (Assembler Related Questions)
Hello,

I want to implement an LLVM backend for a specific VLIW hardware. I am working on defining its instruction set and assembly language.

The hardware has two pipelines, int and float. Each pipeline can do 3 operations per cycle, and 3 operations form an instruction.

One of the integer instructions looks like this:

    add Ri, Rj, Rk; add Rl, Rm, Rn; add Ro, Rp, Rq

An int instruction and a float instruction form a VLIW instruction (bundle), e.g.

    {
      add Ri, Rj, Rk; add Rl, Rm, Rn; add Ro, Rp, Rq
      fadd Fi, Fj, Fk; fadd Fl, Fm, Fn; fadd Fo, Fp, Fq
    }

I want to express the above concept in this way:

    // Assembly Language
    {
      add Ri, Rj, Rk
      add Rl, Rm, Rn
      add Ro, Rp, Rq
      fadd Fi, Fj, Fk
      fadd Fl, Fm, Fn
      fadd Fo, Fp, Fq
    }

Q1:
The instruction encoding can only be determined after the parser has finished parsing the entire bundle. For example, when the parser sees "add Ri, Rj, Rk", it generates one encoding, but when it then sees another "add Ri, Rj, Rk" in the same bundle, it has to modify the previously generated encoding.

Can LLVM's assembler support this? Or should I define my instructions in this way:

    add_type1 Ri, Rj, Rk
    add_type2 Ri, Rj, Rk, Rl, Rm, Rn
    add_type3 Ri, Rj, Rk, Rl, Rm, Rn, Ro, Rp, Rq

Q2:
Some of the instructions need additional configuration to be set up, e.g.

    {
      scache wa             ; Set cache mode: write allocate
      ssize 64              ; Set write size = 64 bits
      sendian big           ; Set big endian writing
      store R0, 0x1000000   ; Write "R0" to 0x1000000
    }

So, again, the parser has to parse the entire bundle to generate the correct encoding. Or should I define my instruction in this way:

    store R0, 0x1000000, wa, 64, big, .... (10 options can be set)

Q3:
The destination register can be omitted, e.g.

    add , Rj, Rk

Can I use this form to express an omitted destination, or should I define a new instruction for it? e.g.

    add_no_dest Rj, Rk

Q4:
Can I define instructions that have the same name but a different number of operands? e.g.

    fadd Fi, Fj, Fk
    fadd Fl, Fm, Fn, rounding_mode

So fadd has two versions:
(a) normal rounding
(b) special rounding mode

Or should I define it in this way:

    fadd
    fadd_round_mode1
    fadd_round_mode2
    ..
    fadd_round_mode15
    (16 rounding modes)

Thank You,
CY
via llvm-dev
2018-Dec-10 18:09 UTC
[llvm-dev] Implement VLIW Backend on LLVM (Assembler Related Questions)
I believe the assembler parser does not immediately emit the object-file encoding, but produces an internal machine-instruction form that is later encoded and emitted. This should give you an opportunity to make choices about encoding after the parsing is complete.

I don't know enough about how instruction syntax is specified to answer your other questions.

--paulr
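The parse-then-encode split described in this reply can be illustrated without any LLVM machinery. What follows is a minimal, self-contained C++ sketch and not LLVM's assembler: `Op`, `parseBundle`, and `encodeBundle` are hypothetical names, the syntax rules (one operation per line, `;` starts a comment, braces on their own lines) are assumptions, and the bit layout (a 3-bit operation count followed by 8 bits per operation) is invented purely for illustration. The point is only that nothing is encoded until the closing brace has been seen.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// One parsed operation: a mnemonic plus its raw operand tokens.
// Hypothetical type standing in for an assembler's internal form.
struct Op {
  std::string mnemonic;
  std::vector<std::string> operands;
};

// Strip leading and trailing spaces and tabs.
static std::string trim(const std::string &s) {
  const auto b = s.find_first_not_of(" \t");
  if (b == std::string::npos) return "";
  const auto e = s.find_last_not_of(" \t");
  return s.substr(b, e - b + 1);
}

// Phase 1: turn the text of one bundle into operations. No bits are produced.
// An empty operand slot (as in "add , Rj, Rk") is kept as an empty string.
std::vector<Op> parseBundle(const std::string &text) {
  std::vector<Op> ops;
  std::istringstream lines(text);
  std::string line;
  bool open = false;
  while (std::getline(lines, line)) {
    line = line.substr(0, line.find(';'));  // assumed comment syntax
    std::istringstream words(line);
    std::string first;
    if (!(words >> first)) continue;        // blank or comment-only line
    if (first == "{") { open = true; continue; }
    if (!open) throw std::runtime_error("operation outside a bundle");
    if (first == "}") return ops;           // bundle boundary reached
    Op op{first, {}};
    std::string tok;
    while (std::getline(words, tok, ',')) op.operands.push_back(trim(tok));
    ops.push_back(op);
  }
  throw std::runtime_error("missing '}'");
}

// Toy opcode table (invented values).
static std::uint64_t opcodeFor(const std::string &m) {
  if (m == "add") return 1;
  if (m == "fadd") return 2;
  throw std::runtime_error("unknown mnemonic: " + m);
}

// Phase 2: encode the whole bundle at once. The count field depends on how
// many operations were parsed, so it cannot be chosen per operation.
std::uint64_t encodeBundle(const std::vector<Op> &ops) {
  if (ops.empty() || ops.size() > 6)
    throw std::runtime_error("bundle must hold 1 to 6 operations");
  std::uint64_t word = ops.size();          // bits 0..2: operation count
  for (std::size_t i = 0; i < ops.size(); ++i) {
    const std::uint64_t opcode = opcodeFor(ops[i].mnemonic);
    assert(opcode < 256 && "opcode must fit in its 8-bit field");
    word |= opcode << (3 + 8 * i);          // bits 3 + 8*i ..: opcode i
  }
  return word;
}
```

With this split, a bundle containing one `add` and a bundle containing two `add`s produce different count fields from the same per-operation parsing code, which is the situation Q1 asks about. Operand encoding is left out to keep the sketch short.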
Krzysztof Parzyszek via llvm-dev
2018-Dec-10 19:19 UTC
[llvm-dev] Implement VLIW Backend on LLVM (Assembler Related Questions)
In the intermediate language that the assembler works on, an instruction is represented by an MCInst. An MCInst can have other instructions as operands, and this is how the Hexagon backend implements bundles.

A top-level MCInst (i.e. the entire bundle) is encoded all at once from the point of view of the target-independent mechanisms. Those mechanisms use target-specific code that each implementation needs to provide, and in your code you can handle each bundle as you want.

Check MCCodeEmitter and how different targets implement it.

As for the syntax: the parser needs to be able to determine the bundle boundary. (For example, Hexagon uses braces {} to enclose each bundle.) The way the assembler works is that it constructs an instruction and passes it to the associated streamer. The streamer is typically an assembly streamer (i.e. one that prints the instruction as assembly) or an object file streamer (e.g. ELF).

The answers to all your questions are "yes" or "it's doable", but the degree of complexity may vary between different choices.

The major suggestion that I have is to make sure that the syntax is unambiguous, specifically when it comes to bundle boundaries. Another suggestion is to maintain the "mnemonic op, op, ..." syntax for individual instructions (i.e. a mnemonic followed by a list of operands). Hexagon has its own assembly syntax that doesn't follow that, and it makes things a bit more complicated.

-Krzysztof

--
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation
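The idea of a bundle as one top-level instruction whose operands are themselves instructions can be modeled in a few lines. The sketch below is a toy C++ model under stated assumptions, not LLVM code: `Inst`, `Operand`, `makeOp`, `encodeOp`, and `encodeBundle` are hypothetical stand-ins for `MCInst`, `MCOperand`, and a target's `MCCodeEmitter`, and the 32-bit word layout (opcode in bits 24..31, an end-of-bundle flag in bit 23, 5-bit register fields from bit 0) is invented. It shows the encoder receiving the whole bundle and making a per-operation decision (the end-of-bundle flag) that depends on the operation's position in it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <initializer_list>
#include <memory>
#include <vector>

struct Inst;

// An operand is either a register number or a nested instruction.
struct Operand {
  enum Kind { Reg, SubInst } kind;
  std::int64_t value;            // register number when kind == Reg
  std::shared_ptr<Inst> inst;    // nested instruction when kind == SubInst

  static Operand reg(std::int64_t n) {
    Operand o;
    o.kind = Reg;
    o.value = n;
    return o;
  }
  static Operand sub(std::shared_ptr<Inst> i) {
    Operand o;
    o.kind = SubInst;
    o.value = 0;
    o.inst = std::move(i);
    return o;
  }
};

// Toy opcodes (invented values). BUNDLE marks the top-level container.
enum Opcode : unsigned { BUNDLE = 0, ADD = 1, FADD = 2 };

struct Inst {
  unsigned opcode;
  std::vector<Operand> operands;
};

// Build a three-register-style operation from register numbers.
std::shared_ptr<Inst> makeOp(unsigned opcode, std::initializer_list<int> regs) {
  auto inst = std::make_shared<Inst>();
  inst->opcode = opcode;
  for (int r : regs) inst->operands.push_back(Operand::reg(r));
  return inst;
}

// Encode one operation; 'last' is known only once the bundle is complete.
std::uint32_t encodeOp(const Inst &op, bool last) {
  std::uint32_t word = op.opcode << 24;
  if (last) word |= 1u << 23;    // end-of-bundle flag
  unsigned shift = 0;
  for (const Operand &o : op.operands) {
    assert(o.kind == Operand::Reg && o.value >= 0 && o.value < 32);
    word |= static_cast<std::uint32_t>(o.value) << shift;
    shift += 5;
  }
  return word;
}

// Encode the top-level instruction: walk its nested instructions in order.
std::vector<std::uint32_t> encodeBundle(const Inst &bundle) {
  assert(bundle.opcode == BUNDLE);
  std::vector<std::uint32_t> words;
  for (std::size_t i = 0; i < bundle.operands.size(); ++i) {
    const Operand &o = bundle.operands[i];
    assert(o.kind == Operand::SubInst && o.inst);
    words.push_back(encodeOp(*o.inst, i + 1 == bundle.operands.size()));
  }
  return words;
}
```

A parser for the braces syntax would build one `Inst` with opcode `BUNDLE`, append each parsed operation with `Operand::sub(makeOp(...))`, and hand the finished bundle to `encodeBundle` once the closing brace is seen. In LLVM the equivalent hand-off goes through the streamer to the target's code emitter.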
Cy Cheng via llvm-dev
2018-Dec-11 22:36 UTC
[llvm-dev] Implement VLIW Backend on LLVM (Assembler Related Questions)
Hi paulr,

Thank you for your response :)

Hi Krzysztof,

This is really helpful! Thank you for your guidance!!

I would like to trace Hexagon's LLVM implementation. I am very interested in how Hexagon implements instruction pattern matching, instruction scheduling, and register allocation. Could you give me some suggestions or a reading list to help me understand Hexagon's LLVM implementation?

Thank you :)
CY