thr3ads.net - llvm dev - [llvm-dev] Parse Instruction [Sep 2015]

Pierre-Andre Saulais via llvm-dev

2015-Sep-28 13:32 UTC

[llvm-dev] Parse Instruction

Hi ES,

From what I understand, instruction parsing is divided into two parts:

- Parsing an operand list (XXXAsmParser::ParseInstruction)
- Turning the operand list into an actual instruction (XXXAsmParser::MatchAndEmitInstruction)

The second part does the validation (e.g. how many operands, what kind, 
etc) while the first part only does the parsing. That's why I think in 
the first part you have to handle all possible operand combinations 
(i.e. parse the first operand, and keep parsing operands as long as you 
see spaces). LLVM will reject instructions with too many operands (as 
defined in the .td files).
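
A minimal sketch of this two-phase split, using toy types rather than LLVM's real classes. ParsedInstruction, parseOperands, matchInstruction and the "neg"/"add" table entries are all hypothetical names made up for illustration; in LLVM the second phase is driven by the .td files, not by a hand-written table.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <vector>

// Toy stand-in for the operand list that ParseInstruction would build.
struct ParsedInstruction {
  std::string Mnemonic;
  std::vector<std::string> Operands;
};

// Phase 1: take the mnemonic, then every remaining token as an operand.
// No operand-count check happens here.
ParsedInstruction parseOperands(const std::vector<std::string> &Tokens) {
  assert(!Tokens.empty() && "a statement needs at least a mnemonic");
  ParsedInstruction Inst;
  Inst.Mnemonic = Tokens.front();
  Inst.Operands.assign(Tokens.begin() + 1, Tokens.end());
  return Inst;
}

// Phase 2: validate the operand count against a table of accepted forms.
// The table is an assumption for this sketch; one mnemonic may list
// several valid operand counts.
bool matchInstruction(const ParsedInstruction &Inst) {
  static const std::map<std::string, std::set<std::size_t>> AcceptedCounts = {
      {"neg", {1, 2}}, // hypothetical mnemonic with two valid forms
      {"add", {3}},    // hypothetical mnemonic with one valid form
  };
  auto It = AcceptedCounts.find(Inst.Mnemonic);
  return It != AcceptedCounts.end() &&
         It->second.count(Inst.Operands.size()) > 0;
}
```

Because the first phase never needs to know how many operands a mnemonic takes, a mnemonic with several valid forms needs no special handling there; only the second phase rejects a bad count.
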

Is this something that would work with your assembly syntax?

Cheers,
Pierre-Andre

On 28/09/15 14:21, Sky Flyer via llvm-dev wrote:
> In practice I cannot use a function like *getMnemonicAcceptInfo*
> (mnemonic as input, number of possible operands as output), because
> there are mnemonics that accept different numbers of operands! :-/
>
> Any help is highly appreciated.
>
> On Mon, Sep 28, 2015 at 10:53 AM, Sky Flyer <skylake007 at googlemail.com> wrote:
>
>     Hi all,
>
>     In most architectures, assembly operands are comma-separated.
>     I would like to parse assembly code that is space-separated, and
>     I am having a bit of a problem.
>     In the *ParseInstruction* function, I don't know the easiest
>     way to figure out how many operands a mnemonic is expected to have.
>     For comma-separated assembly code, the parser just keeps consuming
>     commas (while (getLexer().is(AsmToken::Comma))) and adding operands,
>     but that does not work for spaces...
>
>     I have a dirty hack: I manually provide this information
>     (the number of operands) in a function called, for example,
>     getMnemonicAcceptInfo, and parse the operands in a for loop!
>
>     What would you suggest for parsing space-separated assembly code
>     when it comes to figuring out whether a mnemonic has one operand or two?
>
>     Cheers,
>     ES
>
>
>
>
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

Sky Flyer via llvm-dev

2015-Sep-28 13:41 UTC

[llvm-dev] Parse Instruction

Hi Pierre-Andre

Thanks for your prompt reply.
What I mean is at line 4192 of MipsAsmParser.cpp
(http://code.woboq.org/llvm/llvm/lib/Target/Mips/AsmParser/MipsAsmParser.cpp.html#4192).
It first has to parse the instruction, and then, based on the number of
operands, it uses a pattern in MatchAndEmit.
My problem is: what would be a suitable substitute if the operands in the
assembly code are space-separated rather than comma-separated? (As you
know, spaces are automatically removed, so I cannot simply switch
AsmToken::Comma to AsmToken::Space.)

Thanks a lot. :-)

Daniel Sanders via llvm-dev

2015-Sep-28 14:53 UTC

head link

[llvm-dev] Parse Instruction

Would getLexer().isNot(AsmToken::EndOfStatement)
(http://code.woboq.org/llvm/llvm/include/llvm/MC/MCParser/MCAsmLexer.h.html#llvm::AsmToken::TokenKind::EndOfStatement)
in that condition do the trick? The lexer is already splitting the input at spaces.
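
A rough sketch of that loop shape, written against a toy token list so it stands alone. TokenKind, Token and parseOperandList are hypothetical names for this illustration and only loosely mirror LLVM's AsmToken; this is not the Mips parser's code.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy token model; only loosely mirrors LLVM's AsmToken kinds.
enum class TokenKind { Identifier, EndOfStatement };

struct Token {
  TokenKind Kind;
  std::string Text;
};

// Collects the operands of one statement. Tokens[Pos] must be the first
// token after the mnemonic. The loop is driven by "not yet at the end of
// the statement" instead of "saw a comma", and leaves Pos on the
// EndOfStatement token (or at the end of the input).
std::vector<std::string> parseOperandList(const std::vector<Token> &Tokens,
                                          std::size_t &Pos) {
  assert(Pos <= Tokens.size() && "position must lie within the token list");
  std::vector<std::string> Operands;
  while (Pos < Tokens.size() &&
         Tokens[Pos].Kind != TokenKind::EndOfStatement) {
    Operands.push_back(Tokens[Pos].Text);
    ++Pos;
  }
  return Operands;
}
```

The sketch treats every token as one operand. An operand that spans several tokens (an expression, for instance) would need a real operand parser called once per loop iteration instead of the single push_back.
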

Pierre-Andre Saulais via llvm-dev

2015-Sep-28 14:58 UTC

[llvm-dev] Parse Instruction

Ah I see, I didn't think about spaces being ignored :)

I just checked and MCAsmLexer has a setSkipSpace function that could be 
used to not ignore whitespace when parsing. I haven't tried it out though.
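
To illustrate what such a switch changes, here is a toy lexer with a SkipSpace flag. The lex function is hypothetical and does not reproduce LLVM's lexer; it only assumes that, with skipping turned off, each run of whitespace shows up as its own token that a parser could use as a separator.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy lexer, not LLVM's. Splits a line on runs of spaces and tabs.
// With SkipSpace == true the whitespace is dropped; otherwise each run
// is returned as a single " " token standing in for a Space token.
std::vector<std::string> lex(const std::string &Line, bool SkipSpace) {
  auto IsBlank = [](char C) { return C == ' ' || C == '\t'; };
  std::vector<std::string> Tokens;
  std::size_t I = 0;
  while (I < Line.size()) {
    std::size_t Start = I;
    if (IsBlank(Line[I])) {
      // Consume the whole whitespace run as one unit.
      while (I < Line.size() && IsBlank(Line[I]))
        ++I;
      if (!SkipSpace)
        Tokens.push_back(" ");
    } else {
      // Consume one non-whitespace word.
      while (I < Line.size() && !IsBlank(Line[I]))
        ++I;
      Tokens.push_back(Line.substr(Start, I - Start));
    }
    assert(I > Start && "each iteration must consume input");
  }
  return Tokens;
}
```

With skipping on, "add r1  r2" yields three tokens and the separators are invisible to the parser; with it off, the separators appear in the token stream and can be tested for explicitly.
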

Cheers,
Pierre-Andre
