thr3ads.net - llvm dev - [llvm-dev] Upper case vs lower case in printed and parsed MIR [Apr 2019]

If this information is useful, please help other people find it:
Share via:

Markus Lavin via llvm-dev

2019-Apr-11 17:40 UTC

[llvm-dev] Upper case vs lower case in printed and parsed MIR

I am confused about the rules for when upper and lower case letters should be
used in MIR.

As an example our downstream target has upper case letters in its sub-register
indices and as a result we cannot import exported MIR without manually
'lower casing' it first which is obviously rather annoying.

Looking in https://llvm.org/docs/MIRLangRef.html it is stated that instruction
names are case sensitive.

For register names it appear that they are lower cased before printing (see
printReg in TargetRegisterInfo.cpp) and to match the definitions are also lower
cased before loaded into the parser (see PerTargetMIParsingState::initNames2Regs
in MIParser.cpp). For sub-register index names the latter happens but they are
currently printed with their original casing witch leads to our problem.

What is the right solution here, should they be lower cased when printing as
well (as I tried to do in https://reviews.llvm.org/D60311)?

To me it seems that preserving the original casing from the .td file would be
the most correct thing to do but then it would be inconsistent with e.g.
register names and would only add to the confusion it seems.

Thanks
Markus
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20190411/b5aa817e/attachment.html>

Björn Pettersson A via llvm-dev

2019-Apr-15 09:05 UTC

head link

[llvm-dev] Upper case vs lower case in printed and parsed MIR

> -----Original Message-----
> From: llvm-dev <llvm-dev-bounces at lists.llvm.org> On Behalf Of
Markus Lavin
> via llvm-dev
> Sent: den 11 april 2019 19:40
> To: llvm-dev at lists.llvm.org
> Subject: [llvm-dev] Upper case vs lower case in printed and parsed MIR
> 
> I am confused about the rules for when upper and lower case letters should
> be used in MIR.
> 
> As an example our downstream target has upper case letters in its sub-
> register indices and as a result we cannot import exported MIR without
> manually 'lower casing' it first which is obviously rather
annoying.
> 
> Looking in https://llvm.org/docs/MIRLangRef.html it is stated that
> instruction names are case sensitive.
> 
> For register names it appear that they are lower cased before printing (see
> printReg in TargetRegisterInfo.cpp) and to match the definitions are also
> lower cased before loaded into the parser
> (see PerTargetMIParsingState::initNames2Regs in MIParser.cpp). For sub-
> register index names the latter happens but they are currently printed with
> their original casing witch leads to our problem.
Do we know if making register names case sensitive would be a big churn
(e.g. in .mir test cases)? I assume it would have quite big impact.
> 
> What is the right solution here, should they be lower cased when printing
> as well (as I tried to do in https://reviews.llvm.org/D60311)?
> 
> To me it seems that preserving the original casing from the .td file would
> be the most correct thing to do but then it would be inconsistent with e.g.
> register names and would only add to the confusion it seems.
Handling register names and subregister names consistently actually sounds
reasonable. Even though my first feeling was that it would be nice to print
the subregister names with the same casing as used in the code.

There are however more strings so it is hard to know where to draw the line
when it comes to forcing lower case in the API, if instruction names are
still should be case sensitive. Should for example register class names
follow the same rule as for registers?

If we want to force lower case (e.g. for register/subregister names), then
I think we want to avoid doing the lower casing at runtime and instead
tablegen should put the lower case names in the tables already from the start.

There is ofcourse the option of making the parsing case-insensitive (for
register names, subregister names). I guess that would make least churn,
since things would be backwards compatible with existing test cases (and
printouts could be handled just like today, even if it isn't consistent).


@Markus: We could even rename things in the RegisterInfo.td file for our OOT
target to use lower case names (and by that hide the problems we currently see).
But that would not help the community. So it would be nice to hear what
others have to say about this.

Björn Pettersson A via llvm-dev

2019-May-02 13:18 UTC

head link

[llvm-dev] Upper case vs lower case in printed and parsed MIR

Today the MIR parser expects lower case subregister names only.
It then matches the parsed name with an (at runtime) lower casing
of the subregister names stored in tablegen'd tables.

Since there has been no more comments on these questions from
Markus I suggest that D60311 should be abandoned, and then we should
make a solution (focusing on subregister names) where the MIR parser
should be case sensitive when parsing subregister names.

(If we run into trouble with an unexpected amount of churn in any MIR
testcases we could go for making the parser case insensitive.)

Thus, the MIR parser won't be fully backwards compatible. And for example
downstream repo maintainers (like myself) would need to update MIR
tests to use the correct case in subregister names.

/Björn
> -----Original Message-----
> From: llvm-dev <llvm-dev-bounces at lists.llvm.org> On Behalf Of
Björn
> Pettersson A via llvm-dev
> Sent: den 15 april 2019 11:06
> To: Markus Lavin <markus.lavin at ericsson.com>; llvm-dev at
lists.llvm.org
> Subject: Re: [llvm-dev] Upper case vs lower case in printed and parsed MIR
> 
> > -----Original Message-----
> > From: llvm-dev <llvm-dev-bounces at lists.llvm.org> On Behalf Of
Markus
> Lavin
> > via llvm-dev
> > Sent: den 11 april 2019 19:40
> > To: llvm-dev at lists.llvm.org
> > Subject: [llvm-dev] Upper case vs lower case in printed and parsed MIR
> >
> > I am confused about the rules for when upper and lower case letters
> should
> > be used in MIR.
> >
> > As an example our downstream target has upper case letters in its sub-
> > register indices and as a result we cannot import exported MIR without
> > manually 'lower casing' it first which is obviously rather
annoying.
> >
> > Looking in https://llvm.org/docs/MIRLangRef.html it is stated that
> > instruction names are case sensitive.
> >
> > For register names it appear that they are lower cased before printing
> (see
> > printReg in TargetRegisterInfo.cpp) and to match the definitions are
also
> > lower cased before loaded into the parser
> > (see PerTargetMIParsingState::initNames2Regs in MIParser.cpp). For
sub-
> > register index names the latter happens but they are currently printed
> with
> > their original casing witch leads to our problem.
> 
> Do we know if making register names case sensitive would be a big churn
> (e.g. in .mir test cases)? I assume it would have quite big impact.
> 
> >
> > What is the right solution here, should they be lower cased when
printing
> > as well (as I tried to do in https://reviews.llvm.org/D60311)?
> >
> > To me it seems that preserving the original casing from the .td file
> would
> > be the most correct thing to do but then it would be inconsistent with
> e.g.
> > register names and would only add to the confusion it seems.
> 
> Handling register names and subregister names consistently actually sounds
> reasonable. Even though my first feeling was that it would be nice to print
> the subregister names with the same casing as used in the code.
> 
> There are however more strings so it is hard to know where to draw the line
> when it comes to forcing lower case in the API, if instruction names are
> still should be case sensitive. Should for example register class names
> follow the same rule as for registers?
> 
> If we want to force lower case (e.g. for register/subregister names), then
> I think we want to avoid doing the lower casing at runtime and instead
> tablegen should put the lower case names in the tables already from the
> start.
> 
> There is ofcourse the option of making the parsing case-insensitive (for
> register names, subregister names). I guess that would make least churn,
> since things would be backwards compatible with existing test cases (and
> printouts could be handled just like today, even if it isn't
consistent).
> 
> 
> @Markus: We could even rename things in the RegisterInfo.td file for our
> OOT
> target to use lower case names (and by that hide the problems we currently
> see).
> But that would not help the community. So it would be nice to hear what
> others have to say about this.
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

Seemingly Similar Threads

Search for more maybe matching threads

llvm dev - Apr 2019 - Upper case vs lower case in printed and parsed MIR

[llvm-dev] Upper case vs lower case in printed and parsed MIR

[llvm-dev] Upper case vs lower case in printed and parsed MIR

[llvm-dev] Upper case vs lower case in printed and parsed MIR

Seemingly Similar Threads