Cy Cheng via llvm-dev
2018-Dec-07 01:46 UTC
[llvm-dev] Implement VLIW Backend on LLVM (Assembler Related Questions)
Hello,
I want to implement an LLVM backend for a specific VLIW hardware target. I am
working on defining its instruction set and assembly language.
The hardware has two pipelines, int and float. Each pipeline can do 3
operations/cycle; 3 operations form an instruction.
One of the integer instructions looks like this:
add Ri, Rj, Rk; add Rl, Rm, Rn; add Ro, Rp, Rq
An int instruction and a float instruction form a VLIW instruction
(bundle), e.g.
{
add Ri, Rj, Rk; add Rl, Rm, Rn; add Ro, Rp, Rq
fadd Fi, Fj, Fk; fadd Fl, Fm, Fn; fadd Fo, Fp, Fq
}
I want to express the above concept in this way:
// Assembly Language
{
add Ri, Rj, Rk
add Rl, Rm, Rn
add Ro, Rp, Rq
fadd Fi, Fj, Fk
fadd Fl, Fm, Fn
fadd Fo, Fp, Fq
}
Q1:
My first question: the instruction encoding can only be determined after
the parser has finished parsing the entire bundle.
e.g. When the parser sees "add Ri, Rj, Rk", it generates one encoding, but
when the parser sees another "add Ri, Rj, Rk", it has to modify the
previously generated encoding.
Can LLVM's assembler support this?
Or should I define my instructions in this way:
add_type1 Ri, Rj, Rk
add_type2 Ri, Rj, Rk, Rl, Rm, Rn
add_type3 Ri, Rj, Rk, Rl, Rm, Rn, Ro, Rp, Rq
Q2.
Some of the instructions need to set up additional configuration, e.g.
{
scache wa ; Set cache mode: write allocate
ssize 64 ; Set write size = 64 bits
sendian big ; Set big endian writing
store R0, 0x1000000 ; Write "R0" to 0x1000000
}
So, again, the parser has to parse the entire bundle to generate the correct
encoding.
Or should I define my instruction in this way:
store R0, 0x1000000, wa, 64, big, .... (10 options can be set)
Q3.
The destination register can be omitted, e.g.
add , Rj, Rk
So can I use this form to express an omitted destination, or should I define
a new instruction for it?
e.g.
add_no_dest Rj, Rk
Q4.
Can I define instructions that have the same name but a different
number of operands, e.g.
fadd Fi, Fj, Fk
fadd Fl, Fm, Fn, rounding_mode
So fadd has two versions
(a) normal rounding
(b) special rounding mode
Or should I define it in this way:
fadd
fadd_round_mode1
fadd_round_mode2
..
fadd_round_mode15
(16 rounding modes)
Thank You,
CY
via llvm-dev
2018-Dec-10 18:09 UTC
[llvm-dev] Implement VLIW Backend on LLVM (Assembler Related Questions)
I believe the assembler parser does not immediately emit the object-file
encoding, but produces an internal machine-instruction form that is later
encoded and emitted. This should give you an opportunity to make choices about
encoding after the parsing is complete.
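To make the "decide the encoding after parsing" idea concrete for the Q2 case, here is a minimal, self-contained C++ sketch. It does not use any LLVM API: `Op`, `StoreConfig`, `resolveStoreConfig`, `encodeConfig`, the default values, and the bit layout of the configuration byte are all hypothetical and chosen only for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical internal form of one parsed operation: a mnemonic plus one
// textual argument. This is NOT LLVM's MCInst.
struct Op {
  std::string mnemonic;
  std::string arg;
};

// Hypothetical store settings; the defaults are assumptions.
struct StoreConfig {
  bool writeAllocate = false;
  unsigned sizeBits = 32;
  bool bigEndian = false;
};

// Phase 2: runs only after the whole bundle has been parsed, so the
// configuration operations can be folded into the store's settings.
StoreConfig resolveStoreConfig(const std::vector<Op> &bundle) {
  StoreConfig cfg;
  for (const Op &op : bundle) {
    if (op.mnemonic == "scache")
      cfg.writeAllocate = (op.arg == "wa");
    else if (op.mnemonic == "ssize")
      cfg.sizeBits = static_cast<unsigned>(std::stoul(op.arg));
    else if (op.mnemonic == "sendian")
      cfg.bigEndian = (op.arg == "big");
  }
  return cfg;
}

// Made-up layout: bit 0 = write allocate, bit 1 = big endian,
// bits 2 and up = write size in bytes.
std::uint8_t encodeConfig(const StoreConfig &cfg) {
  unsigned bits = (cfg.writeAllocate ? 1u : 0u) | (cfg.bigEndian ? 2u : 0u) |
                  ((cfg.sizeBits / 8u) << 2);
  assert(bits <= 0xFFu && "configuration does not fit in one byte");
  return static_cast<std::uint8_t>(bits);
}
```

With this split, the assembly can keep separate `scache`/`ssize`/`sendian` lines, and the single `store` encoding is produced only once the closing brace has been seen.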
I don't know enough about how instruction syntax is specified to answer your
other questions.
--paulr
Krzysztof Parzyszek via llvm-dev
2018-Dec-10 19:19 UTC
[llvm-dev] Implement VLIW Backend on LLVM (Assembler Related Questions)
In the intermediate language that the assembler works on, an instruction is
represented by an MCInst. An MCInst can have other instructions as
operands, and this is how the Hexagon backend implements bundles.
A top-level MCInst (i.e. the entire bundle) is encoded all at once from
the point of view of the target-independent mechanisms. Those mechanisms
use target-specific code that each implementation needs to provide, and
in your code you can handle each bundle as you want.
Check MCCodeEmitter and how different targets implement it.
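As a rough illustration of the nesting idea, here is a self-contained C++ sketch. The `Inst` and `Operand` types are simplified stand-ins written for this example (they are not LLVM's MCInst and MCOperand), and the byte format produced by `encodeBundle` is invented: one header byte with the number of slots, then an opcode byte and the register numbers for each sub-instruction.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

struct Inst;

// Hypothetical operand: either a register number or a nested instruction.
struct Operand {
  enum Kind { Reg, SubInst };
  Kind kind;
  unsigned reg;     // valid when kind == Reg
  const Inst *inst; // valid when kind == SubInst

  static Operand makeReg(unsigned r) { return Operand{Reg, r, nullptr}; }
  static Operand makeInst(const Inst *i) { return Operand{SubInst, 0, i}; }
};

// Hypothetical instruction: an opcode name plus operands. A bundle is just
// an instruction whose operands are other instructions.
struct Inst {
  std::string opcode;
  std::vector<Operand> operands;
};

// Encode the whole bundle in one call. The header byte depends on every
// slot, so it can only be written once the complete bundle is available.
std::vector<std::uint8_t> encodeBundle(const Inst &bundle) {
  assert(bundle.opcode == "BUNDLE" && "expected a top-level bundle");
  std::vector<std::uint8_t> bytes;
  bytes.push_back(static_cast<std::uint8_t>(bundle.operands.size()));
  for (const Operand &slot : bundle.operands) {
    assert(slot.kind == Operand::SubInst && "bundle slots must be instructions");
    const Inst &sub = *slot.inst;
    // Made-up opcode numbers: 1 for "add", 2 for anything else.
    bytes.push_back(static_cast<std::uint8_t>(sub.opcode == "add" ? 1 : 2));
    for (const Operand &op : sub.operands)
      bytes.push_back(static_cast<std::uint8_t>(op.reg));
  }
  return bytes;
}
```

Because the target-specific emitter receives the bundle as one unit, a choice such as add_type1/add_type2/add_type3 from Q1 can be made there by counting the slots instead of being spelled out in the assembly syntax.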
As for the syntax---the parser needs to be able to determine the bundle
boundary. (For example Hexagon uses braces {} to enclose each bundle.)
The way the assembler works is that it constructs an instruction and
passes it to the associated streamer. The streamer is typically an
assembly streamer (i.e. printing the instruction assembly), or an object
file streamer (e.g. ELF, etc.)
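The parse-then-stream flow could look roughly like the following self-contained C++ sketch. `Bundle`, `Streamer`, `CollectingStreamer`, and `parseAssembly` are hypothetical names invented here, not LLVM's streamer classes. The sketch assumes one operation per line, braces on their own lines, and ';' starting a comment (as in the Q2 example).

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical bundle: the raw text of each operation between '{' and '}'.
struct Bundle {
  std::vector<std::string> ops;
};

// Hypothetical streamer interface: receives one complete bundle at a time.
struct Streamer {
  virtual ~Streamer() = default;
  virtual void emitBundle(const Bundle &b) = 0;
};

// A streamer that just records what it was given (stands in for an
// assembly printer or an object-file writer).
struct CollectingStreamer : Streamer {
  std::vector<Bundle> bundles;
  void emitBundle(const Bundle &b) override { bundles.push_back(b); }
};

static std::string trim(const std::string &s) {
  const char *ws = " \t\r";
  std::string::size_type b = s.find_first_not_of(ws);
  if (b == std::string::npos)
    return "";
  std::string::size_type e = s.find_last_not_of(ws);
  return s.substr(b, e - b + 1);
}

// Returns false on a nested '{', a stray '}', an operation outside a
// bundle, or an unterminated bundle.
bool parseAssembly(const std::string &text, Streamer &out) {
  std::istringstream in(text);
  std::string line;
  Bundle current;
  bool inBundle = false;
  while (std::getline(in, line)) {
    line = trim(line.substr(0, line.find(';'))); // drop ';' comments
    if (line.empty())
      continue;
    if (line == "{") {
      if (inBundle)
        return false;
      inBundle = true;
      current.ops.clear();
    } else if (line == "}") {
      if (!inBundle)
        return false;
      out.emitBundle(current); // the bundle is complete: hand it over
      inBundle = false;
    } else {
      if (!inBundle)
        return false;
      current.ops.push_back(line);
    }
  }
  return !inBundle;
}
```

The streamer is only called at the closing brace, which is why an explicit, unambiguous bundle delimiter matters for the parser.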
The answers to all your questions are "yes", or "it's doable", but the
degree of complexity may vary between different choices.
The major suggestion that I have is to make sure that the syntax is
unambiguous, specifically when it comes to bundle boundaries. Another
suggestion is to maintain the "mnemonic op, op, ..." syntax for
individual instructions (i.e. mnemonic followed by a list of operands).
Hexagon has its own assembly syntax that doesn't follow that, and it
makes things a bit more complicated.
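To show why the plain "mnemonic op, op, ..." form keeps Q3 and Q4 manageable, here is a small self-contained C++ sketch. It is hand-written for illustration; in LLVM the operand matching is normally generated from the target description rather than coded like this. `ParsedInst`, `parseInst`, `selectOpcode`, and the internal opcode names (`FADD`, `FADD_RM`, `ADD`, `ADD_NODEST`) are all hypothetical.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical parse result: a mnemonic followed by a list of operands.
struct ParsedInst {
  std::string mnemonic;
  std::vector<std::string> operands;
};

static std::string stripSpaces(const std::string &s) {
  const char *ws = " \t\r";
  std::string::size_type b = s.find_first_not_of(ws);
  if (b == std::string::npos)
    return "";
  std::string::size_type e = s.find_last_not_of(ws);
  return s.substr(b, e - b + 1);
}

// Split "mnemonic op, op, ..." into its parts. An empty operand slot is
// kept as an empty string, so "add , Rj, Rk" yields three operands.
ParsedInst parseInst(const std::string &line) {
  ParsedInst inst;
  std::string text = stripSpaces(line);
  std::string::size_type sp = text.find_first_of(" \t");
  inst.mnemonic = text.substr(0, sp);
  if (sp == std::string::npos)
    return inst; // mnemonic only, no operands
  std::string rest = text.substr(sp);
  std::string::size_type start = 0;
  while (true) {
    std::string::size_type comma = rest.find(',', start);
    if (comma == std::string::npos) {
      inst.operands.push_back(stripSpaces(rest.substr(start)));
      break;
    }
    inst.operands.push_back(stripSpaces(rest.substr(start, comma - start)));
    start = comma + 1;
  }
  return inst;
}

// Pick a hypothetical internal opcode from the mnemonic and operand list.
// Returns an empty string when nothing matches.
std::string selectOpcode(const ParsedInst &inst) {
  if (inst.mnemonic == "fadd" && inst.operands.size() == 3)
    return "FADD"; // default rounding
  if (inst.mnemonic == "fadd" && inst.operands.size() == 4)
    return "FADD_RM"; // explicit rounding-mode operand
  if (inst.mnemonic == "add" && inst.operands.size() == 3)
    return inst.operands[0].empty() ? "ADD_NODEST" : "ADD";
  return "";
}
```

Under these assumptions the user-visible syntax keeps one `add` and one `fadd` mnemonic, while the distinction between variants lives in the internal opcode that the matcher selects.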
-Krzysztof
--
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
hosted by The Linux Foundation
Cy Cheng via llvm-dev
2018-Dec-11 22:36 UTC
[llvm-dev] Implement VLIW Backend on LLVM (Assembler Related Questions)
Hi paulr,
Thank you for your response :)
Hi Krzysztof,
This is really helpful! Thank you for your guidance!!
I would like to trace Hexagon's LLVM implementation. I am very interested in
how Hexagon implements instruction pattern matching, instruction scheduling,
and register allocation. Could you give me some suggestions or a reading list
to help me understand Hexagon's LLVM implementation?
Thank you :)
CY