Hi ES,

From what I understand, instruction parsing is divided into two parts:

- Parsing an operand list (XXXAsmParser::ParseInstruction)
- Turning the operand list into an actual instruction
  (XXXAsmParser::MatchAndEmitInstruction)

The second part does the validation (e.g. how many operands, what kind,
etc.) while the first part only does the parsing. That's why I think the
first part has to handle all possible operand combinations (i.e. parse
the first operand, and keep parsing operands as long as you see spaces).
LLVM will reject instructions with too many operands (as defined in the
.td files).

Is this something that would work with your assembly syntax?

Cheers,
Pierre-Andre

On 28/09/15 14:21, Sky Flyer via llvm-dev wrote:
> In practice I cannot use a function such as *getMnemonicAcceptInfo*
> (mnemonic as input, number of operands as output), because there are
> mnemonics that accept different numbers of operands! :-/
>
> Any help is highly appreciated.
>
> On Mon, Sep 28, 2015 at 10:53 AM, Sky Flyer
> <skylake007 at googlemail.com> wrote:
>
>> Hi all,
>>
>> In most architectures, assembly operands are comma-separated.
>> I would like to parse assembly code that is space-separated, and I am
>> having a bit of a problem.
>> In the *ParseInstruction* function, I don't know the easiest way to
>> figure out how many operands a mnemonic is expected to have.
>> With comma-separated assembly code, the parser just consumes commas
>> (while (getLexer().is(AsmToken::Comma))) and adds operands, but that
>> does not work for spaces...
>>
>> I have a dirty hack: I manually provide this information (the number
>> of operands) in a function called, for example, getMnemonicAcceptInfo,
>> and parse the operands in a for loop!
>>
>> What would you suggest for parsing space-separated assembly code when
>> it comes to figuring out whether a mnemonic has two operands or one?
>>
>> Cheers,
>> ES
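The split described in this reply can be illustrated with a small toy model. This is only a sketch under stated assumptions: it does not use the LLVM API, and parseOperandList, matchInstruction, acceptedOperandCounts and the mnemonic table are all hypothetical names invented for the illustration. The table merely stands in for the matcher that LLVM generates from the .td files.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <set>
#include <sstream>
#include <string>
#include <vector>

// Toy stand-in for the operand list that a ParseInstruction-like step builds.
struct ParsedInstruction {
  std::string Mnemonic;
  std::vector<std::string> Operands;
};

// Phase 1: collect the mnemonic and every operand on the line.
// No per-mnemonic operand count is consulted here.
ParsedInstruction parseOperandList(const std::string &Line) {
  std::istringstream In(Line);
  ParsedInstruction Inst;
  In >> Inst.Mnemonic;
  assert(!Inst.Mnemonic.empty() && "precondition: line starts with a mnemonic");
  std::string Operand;
  while (In >> Operand) // operands are separated by whitespace
    Inst.Operands.push_back(Operand);
  return Inst;
}

// Hypothetical table playing the role of the generated matcher:
// each mnemonic lists every operand count it accepts.
const std::map<std::string, std::set<std::size_t>> &acceptedOperandCounts() {
  static const std::map<std::string, std::set<std::size_t>> Table = {
      {"add", {2, 3}}, // a mnemonic with two valid forms
      {"jmp", {1}},
      {"nop", {0}},
  };
  return Table;
}

// Phase 2: validation happens only here, after parsing is complete.
bool matchInstruction(const ParsedInstruction &Inst) {
  auto It = acceptedOperandCounts().find(Inst.Mnemonic);
  if (It == acceptedOperandCounts().end())
    return false; // unknown mnemonic
  return It->second.count(Inst.Operands.size()) != 0;
}
```

In this model, matchInstruction(parseOperandList("add r1 r2")) and the three-operand form are both accepted, while a four-operand "add" is rejected in the second phase only. The first phase never needs something like getMnemonicAcceptInfo.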
Hi Pierre-Andre,

Thanks for your prompt reply.
What I mean is located at line 4192
(http://code.woboq.org/llvm/llvm/lib/Target/Mips/AsmParser/MipsAsmParser.cpp.html#4192).
The parser first has to parse the instruction, and then, based on the
number of operands, it uses a pattern in MatchAndEmit.
My problem is: what would be a suitable substitute if the operands in the
assembly code are space-separated rather than comma-separated? (As you
know, spaces are automatically removed, so I cannot simply switch
AsmToken::Comma to AsmToken::Space.)

Thanks a lot. :-)

On Mon, Sep 28, 2015 at 3:32 PM, Pierre-Andre Saulais
<pierre-andre at codeplay.com> wrote:
> [...] That's why I think in the first part you have to handle all
> possible operand combinations (i.e. parse the first operand, and keep
> parsing operands as long as you see spaces). [...]
Would getLexer().isNot(AsmToken::EndOfStatement)
<http://code.woboq.org/llvm/llvm/include/llvm/MC/MCParser/MCAsmLexer.h.html#llvm::AsmToken::TokenKind::EndOfStatement>
in that condition do the trick? The lexer is already splitting the input
at spaces.

________________________________
From: llvm-dev [llvm-dev-bounces at lists.llvm.org] on behalf of Sky Flyer via llvm-dev [llvm-dev at lists.llvm.org]
Sent: 28 September 2015 14:41
To: Pierre-Andre Saulais
Cc: llvm-dev at lists.llvm.org
Subject: Re: [llvm-dev] Parse Instruction

> What I mean, is located at line 4192
> (http://code.woboq.org/llvm/llvm/lib/Target/Mips/AsmParser/MipsAsmParser.cpp.html#4192).
> [...] My problem is, what would be a suitable substitute if operands in
> the assembly code are not comma-separated, instead space-separated. [...]
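The suggestion in this message, looping until the end-of-statement token instead of looping on commas, might look roughly like the toy model below. It is a sketch, not LLVM code: ToyLexer, TokenKind and parseOperandsAfterMnemonic are hypothetical names, and the member names isNot, getTok and Lex are only chosen to resemble the calls mentioned in the thread. The one property the model borrows is that whitespace is dropped by the lexer and every statement ends with an EndOfStatement token.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Toy token kinds; Identifier here means any run of non-space characters.
enum class TokenKind { Identifier, EndOfStatement };

struct Token {
  TokenKind Kind;
  std::string Text;
};

// Toy lexer: whitespace is dropped, and the token stream always ends
// with a single EndOfStatement token.
class ToyLexer {
  std::vector<Token> Tokens;
  std::size_t Pos = 0;

public:
  explicit ToyLexer(const std::string &Line) {
    std::size_t I = 0;
    while (I < Line.size()) {
      if (std::isspace(static_cast<unsigned char>(Line[I]))) {
        ++I; // spaces never become tokens
        continue;
      }
      std::size_t Start = I;
      while (I < Line.size() &&
             !std::isspace(static_cast<unsigned char>(Line[I])))
        ++I;
      Tokens.push_back({TokenKind::Identifier, Line.substr(Start, I - Start)});
    }
    Tokens.push_back({TokenKind::EndOfStatement, ""});
  }

  const Token &getTok() const {
    assert(Pos < Tokens.size() && "lexer position out of range");
    return Tokens[Pos];
  }
  bool isNot(TokenKind K) const { return getTok().Kind != K; }
  // Advance to the next token; stays on EndOfStatement once reached.
  void Lex() {
    if (Pos + 1 < Tokens.size())
      ++Pos;
  }
};

// Skip the mnemonic, then keep taking operands until EndOfStatement.
// No separator token and no per-mnemonic operand count is needed.
std::vector<std::string> parseOperandsAfterMnemonic(const std::string &Line) {
  ToyLexer Lexer(Line);
  if (Lexer.isNot(TokenKind::EndOfStatement))
    Lexer.Lex(); // consume the mnemonic
  std::vector<std::string> Operands;
  while (Lexer.isNot(TokenKind::EndOfStatement)) {
    Operands.push_back(Lexer.getTok().Text);
    Lexer.Lex();
  }
  return Operands;
}
```

In this model every operand is a single token. A real operand parser may consume several tokens per operand, in which case the loop body would call the operand parser rather than take one token at a time.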
Ah I see, I didn't think about spaces being ignored :)

I just checked, and MCAsmLexer has a setSkipSpace function that could be
used to stop the lexer from ignoring whitespace when parsing. I haven't
tried it out, though.

Cheers,
Pierre-Andre

On 28/09/15 14:41, Sky Flyer wrote:
> [...] (as you know, space is automatically removed so I cannot simply
> switch AsmToken::Comma to AsmToken::Space.) [...]
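To show why keeping whitespace tokens could help, here is a toy model of a lexer with a skip-space switch. It is a sketch under assumptions, not the LLVM implementation: ToyLexer, its lex method, the Word and Minus token kinds and splitOperands are hypothetical, and only the name setSkipSpace is taken from the message above (which notes the real function was not tried out).

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

enum class TokenKind { Word, Minus, Space, EndOfStatement };

struct Token {
  TokenKind Kind;
  std::string Text;
};

// Toy lexer: runs of spaces/tabs are either dropped (default) or
// reported as one Space token, depending on the SkipSpace switch.
class ToyLexer {
  bool SkipSpace = true;

public:
  void setSkipSpace(bool Val) { SkipSpace = Val; }

  std::vector<Token> lex(const std::string &Line) const {
    std::vector<Token> Tokens;
    std::size_t I = 0;
    while (I < Line.size()) {
      char C = Line[I];
      if (C == ' ' || C == '\t') {
        while (I < Line.size() && (Line[I] == ' ' || Line[I] == '\t'))
          ++I;
        if (!SkipSpace)
          Tokens.push_back({TokenKind::Space, " "});
      } else if (C == '-') {
        Tokens.push_back({TokenKind::Minus, "-"});
        ++I;
      } else {
        std::size_t Start = I;
        while (I < Line.size() && Line[I] != ' ' && Line[I] != '\t' &&
               Line[I] != '-')
          ++I;
        Tokens.push_back({TokenKind::Word, Line.substr(Start, I - Start)});
      }
    }
    Tokens.push_back({TokenKind::EndOfStatement, ""});
    return Tokens;
  }
};

// Split a statement into operands at Space tokens (mnemonic removed).
// This only works because whitespace skipping is turned off.
std::vector<std::string> splitOperands(const std::string &Line) {
  ToyLexer Lexer;
  Lexer.setSkipSpace(false);
  std::vector<std::string> Fields;
  std::string Current;
  for (const Token &T : Lexer.lex(Line)) {
    if (T.Kind == TokenKind::Space || T.Kind == TokenKind::EndOfStatement) {
      if (!Current.empty()) {
        Fields.push_back(Current);
        Current.clear();
      }
    } else {
      Current += T.Text;
    }
  }
  assert(Current.empty() && "EndOfStatement flushes the last field");
  if (!Fields.empty())
    Fields.erase(Fields.begin()); // drop the mnemonic
  return Fields;
}
```

With skipping left on, "sub r1 -4" and "sub r1-4" produce the same five tokens in this model, so a parser cannot tell two operands from one expression. With skipping off, splitOperands yields "r1" and "-4" for the first line and the single operand "r1-4" for the second.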