Thomas Preudhomme via llvm-dev
2018-Jul-17 09:02 UTC
[llvm-dev] Syntax for FileCheck numeric variables and expressions
To be clear, I do not intend to add support for a hex specifier in the current patch; I just want to make sure the syntax we choose is going to allow it later. My immediate use case is decimal integers, and I intend to write the code so that it's easy to extend to more types of numeric variables and expressions later. This way we'll only add specifiers that are actually required by actual testcases.

Best regards,

Thomas

On Mon, 16 Jul 2018 at 18:39, <paul.robinson at sony.com> wrote:
>
> > -----Original Message-----
> > From: llvm-dev [mailto:llvm-dev-bounces at lists.llvm.org] On Behalf Of
> > Thomas Preudhomme via llvm-dev
> > Sent: Monday, July 16, 2018 6:24 AM
> > To: jh7370.2008 at my.bristol.ac.uk
> > Cc: llvm-dev at lists.llvm.org
> > Subject: Re: [llvm-dev] Syntax for FileCheck numeric variables and
> > expressions
> >
> > Hi James,
> >
> > I like that suggestion very much but I think keeping the order of the
> > two sides as initially proposed makes more sense. In printf/scanf the
> > string is first because the primary use of these functions is to do
> > I/O and so you first specify what you are going to output/input and
> > then where to capture variables. The primary objective of FileCheck
> > variables and expressions is to capture/print them, the specifier is
> > an addon to allow some conversion. Does it make sense?
>
> My immediate reaction is that I'd rather not have FileCheck get into
> the business of handling printf specifiers. OTOH, while LLVM tools
> do typically print lowercase hex, that's not guaranteed, and looking
> at the output of other tools can be useful too. So, a way to specify
> the case for a hex conversion seems worthwhile.
>
> I had also been thinking in terms of the trailing colon to distinguish
> definition from use, as James suggested, that's sort-of consistent
> with the current syntax.
>
> This is starting to make parsing the insides of [[]] much more involved,
> so you'll want to pay attention to making that code well-structured and
> readable.
> --paulr
>
> > In the interest of speeding things up I plan to start implementing
> > this proposal starting tomorrow unless someone gives some more
> > feedback.
> >
> > Best regards,
> >
> > Thomas
> >
> > On Fri, 13 Jul 2018 at 15:51, James Henderson
> > <jh7370.2008 at my.bristol.ac.uk> wrote:
> > >
> > > Hi Thomas,
> > >
> > > In general, I think this is a good proposal. However, I don't think
> > > that using ">" or "<" to specify base (at least alone) is a good idea,
> > > as it might clash with future ideas to do comparisons etc. I also
> > > think it would be nice to have the syntax consistent between
> > > definition and use. My first thought on a reasonable alternative was
> > > to use commas to separate the two parts, so something like:
> > >
> > > [[# VAR, 16:]] to capture a hexadecimal number (where the spaces are
> > > optional). [[# VAR, 16]] to use a variable, converted to a hexadecimal
> > > string. In both cases, the base component is optional, and defaults to
> > > decimal.
> > >
> > > This led me to think that it might be better to use something similar
> > > to printf style for the latter half, so to capture a hexadecimal
> > > number with a leading "0x" would be: "0x[[# VAR, %x:]]" and to use it
> > > would be "0x[[# VAR, %x]]". Indeed, that would allow straightforward
> > > conversions between formats, so say you defined it by capturing a
> > > decimal integer and using it to match a hexadecimal in upper case,
> > > with leading 0x and 8 digits following the 0x:
> > >
> > > CHECK: [[# VAR, %d:]] # Defines
> > > CHECK: 0x[[# VAR + 1, %8X]] # Uses
> > >
> > > Of course, if we go down that route, it would probably make more sense
> > > to reverse the two sides (e.g. to become "[[# %d, VAR:]]" to capture a
> > > decimal and "[[# %8X, VAR + 1]]" to use it).
> > >
> > > Regards,
> > >
> > > James
> > >
> > > On 12 July 2018 at 15:34, Thomas Preudhomme via llvm-dev
> > > <llvm-dev at lists.llvm.org> wrote:
> > >>
> > >> Hi all,
> > >>
> > >> I've written a patch to extend FileCheck to support matching
> > >> arithmetic expressions involving variables [1] (eg. to match REG+1
> > >> where REG is a variable with a numeric value). It was suggested to me
> > >> in the review to introduce the concept of numeric variables and to
> > >> allow for specifying the base the values are written in.
> > >>
> > >> [1] https://reviews.llvm.org/D49084
> > >>
> > >> I think the syntax should satisfy the below requirements:
> > >>
> > >> * based off the [[]] construct since anything else might overload an
> > >>   existing valid syntax (eg. $$ is supposed to match literally now)
> > >> * consistent with syntax for expressions using @LINE
> > >> * consistent with using ':' to define regular variable
> > >> * allows to specify base of the number a numeric variable is being
> > >>   set to
> > >> * allows to specify base of the result of the numeric expression
> > >>
> > >> I've come up with the following syntax for which I'd like feedback:
> > >>
> > >> Numeric variable definition: [[#X<base:]] (eg. [[#ADDR<16:]]) where X
> > >> is the numeric variable being defined and <base is optional in which
> > >> case base defaults to 10
> > >> Numeric variable use: [[#X>base]] (eg. [[#ADDR>2]]) where >base is
> > >> optional in which case base defaults to 10
> > >> Numeric expression: [[exp>base]] (eg. [[#ADDR+2>16]]) where expression
> > >> must contain at least one numeric variable
> > >>
> > >> I'm not a big fan of the > for the output base being inside the
> > >> expression but [[exp]]>base would match >base literally.
> > >>
> > >> Any suggestions / opinions?
> > >>
> > >> Best regards,
> > >>
> > >> Thomas
> > >> _______________________________________________
> > >> LLVM Developers mailing list
> > >> llvm-dev at lists.llvm.org
> > >> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
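For readers following the printf-style idea in James's quoted message, here is a minimal Python sketch of what "capture as %d, use as hex" could mean. The helper names `capture` and `render` are hypothetical and this is not FileCheck code; it only leans on Python's own printf-style `%` operator to show the conversion semantics under that assumption.

```python
# Hypothetical helpers illustrating printf-style conversion; not FileCheck's code.

# Base implied by each plain specifier when parsing captured text (assumption).
BASES = {"%d": 10, "%x": 16, "%X": 16}


def capture(text: str, spec: str = "%d") -> int:
    """Parse matched text into an integer, as a definition like [[# VAR, %d:]] might."""
    return int(text, BASES[spec])


def render(value: int, spec: str = "%d") -> str:
    """Format a value into the string a use like [[# VAR + 1, %X]] would need to match."""
    return spec % value


var = capture("41", "%d")               # the decimal text "41" becomes the integer 41
print("0x" + render(var + 1, "%08X"))   # zero-padded to 8 upper-case hex digits
print(repr(render(var + 1, "%8X")))     # width 8 without the 0 flag pads with spaces
```

One thing this makes visible: under standard printf semantics `%8X` pads with spaces, so "8 digits following the 0x" would need `%08X` if the specifiers are meant to behave exactly like printf.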
Alexander Richardson via llvm-dev
2018-Jul-17 20:59 UTC
[llvm-dev] Syntax for FileCheck numeric variables and expressions
On Tue, 17 Jul 2018 at 10:02 Thomas Preudhomme via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> To be clear, I do not intend to add support for hex specifier in the
> current patch, I just want to make sure the syntax we choose is going
> to allow it later. My immediate use case is decimal integer and I
> intend to write the code so that it's easy to extend to more type of
> numeric variables and expressions later. This way we'll only add
> specifier that are actually required by actual testcases.

I also added FileCheck expressions to our fork of LLVM in order to allow testing both a 128-bit and a 256-bit version of our CHERI ISA in a single test case [1].

I used [[@EXPR foo * 2 + 1]] for FileCheck expressions [2]. I'm not particularly happy with this syntax since it is quite verbose (but then again we don't need it that often, so it doesn't really matter). It also doesn't allow saving the expression result, so the expression needs to be repeated everywhere. I could probably use [[@EXPR:OUTVAR INVAR + 42]] but I haven't really had the need for that yet.

We currently need the following two features:

- Simple arithmetic with multiple operations. Example:
  `cld $gp, $zero, [[@EXPR 2 * $CAP_SIZE - 8]]($c11)`

- Conversion to hex (upper and lower case, since not all tools are consistent here) and to decimal. Example:
  // READOBJ-NEXT: 0x50 R_MIPS_64/R_MIPS_NONE/R_MIPS_NONE .data 0x[[@EXPR hex($CAP_SIZE * 2)]]

Alex

[1] For most test cases the simple -DVAR=value flag in FileCheck is good enough: we have a %cheri_FileCheck lit substitution that expands to `FileCheck '-D$CAP_SIZE=16/32'`. This works for most IR level tests since usually the only thing that is different is "align 16" vs "align 32". However, when checking the assembly output or linker addresses we often need something more complex.
[2] A test case showing all the currently supported expressions can be found here: <https://github.com/CTSRD-CHERI/llvm/blob/master/test/FileCheck/expressions.txt>
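To make the two features Alex lists concrete, here is a small Python sketch of evaluating an @EXPR-style body once `$CAP_SIZE` is known. It is only an illustration: the `evaluate` function is hypothetical, it supports just `+`, `-`, `*` and a top-level `hex(...)`, and it assumes `hex()` yields lower-case digits without a `0x` prefix (the prefix is written outside the brackets in the READOBJ example). The CHERI fork's actual implementation may differ.

```python
import ast
import operator
import re

# Arithmetic operators accepted by this sketch (an assumption, not the fork's full set).
OPS = {ast.Add: operator.add, ast.Sub: operator.sub, ast.Mult: operator.mul}


def _eval_int(node: ast.AST) -> int:
    """Evaluate an integer-only arithmetic AST; reject anything else."""
    if isinstance(node, ast.Constant) and type(node.value) is int:
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in OPS:
        return OPS[type(node.op)](_eval_int(node.left), _eval_int(node.right))
    raise ValueError("unsupported expression")


def evaluate(expr: str, variables: dict[str, int]) -> str:
    """Return the text an expression such as '2 * $CAP_SIZE - 8' should match."""
    # Substitute each $NAME with its integer value before parsing.
    text = re.sub(r"\$(\w+)", lambda m: str(variables[m.group(1)]), expr).strip()
    node = ast.parse(text, mode="eval").body
    # A top-level hex(...) call selects lower-case hex output without a prefix.
    if (isinstance(node, ast.Call) and isinstance(node.func, ast.Name)
            and node.func.id == "hex" and len(node.args) == 1 and not node.keywords):
        return format(_eval_int(node.args[0]), "x")
    return str(_eval_int(node))


for cap_size in (16, 32):
    env = {"CAP_SIZE": cap_size}
    print(evaluate("2 * $CAP_SIZE - 8", env), evaluate("hex($CAP_SIZE * 2)", env))
```

With this, one check line can serve both the 128-bit and 256-bit configurations, since the expected offset and hex size are derived from the same variable.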
Thomas Preudhomme via llvm-dev
2018-Jul-18 12:50 UTC
[llvm-dev] Syntax for FileCheck numeric variables and expressions
Hi Alex,

Thanks for the feedback. My first thought was that introducing the new pseudo variable @EXPR is a nice way to generalize that syntax beyond @LINE, since it would also evaluate to an arithmetic value. On the other hand, there is a small inconsistency: @LINE evaluates to a value which can be part of an expression, while @EXPR is an expression. The @ syntax as a whole would then be defined as introducing something which is not a regular variable, i.e. a negative definition. I'll stick with the # syntax because # is usually associated with numbers and can be defined as introducing an integer expression/variable.

The one thing I wonder is whether the # should be next to the variable name or next to the [[ as proposed by James. I like the former better *but* I think the latter makes more sense, since [[#VAR + 1]] would suggest that the [[<something>]] syntax already allows numeric expressions without numeric variables, which is not the case. Having the # right at the start also clearly indicates that the whole expression might have a conversion specifier.

Finally, the # syntax can allow defining a variable with the result of an arithmetic expression:

[[#BAR, %x:]]
[[# FOO:BAR+12]]

So BAR takes a hex value in lower case, 12 (in decimal) gets added to that value, and the result is put into FOO. In that case there should be no format specifier when defining FOO, i.e. a format specifier is only allowed in a definition when there is nothing after the colon. Of course we could allow hex immediates with 0x syntax if needed.

Again, I'm not advocating implementing all this from the start, but I want to make sure that the syntax would allow it if we realize we need it later, and I think James's proposal does.

It seems this syntax would suit all your current uses (albeit with some rewriting necessary); did I miss something?
Best regards,

Thomas
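Since the thread notes that parsing the insides of [[]] is getting more involved, here is a rough Python sketch of how the three forms discussed in this message could be told apart: capture (`NAME, %spec:`), definition from an expression (`NAME:expr`), and use (`expr, %spec`). The `NumericBlock` type, the `parse_numeric_block` function and the exact grammar are assumptions made for illustration; nothing here reflects the syntax that was eventually implemented.

```python
import re
from typing import NamedTuple, Optional


class NumericBlock(NamedTuple):
    kind: str              # "capture", "define", or "use"
    name: Optional[str]    # variable being defined, if any
    expr: Optional[str]    # expression text, if any
    spec: Optional[str]    # conversion specifier, if one applies


_NAME = r"[A-Za-z_]\w*"
_SPEC = r"%\d*[dxX]"       # hypothetical subset of printf specifiers


def parse_numeric_block(body: str) -> NumericBlock:
    """Classify the text found between '[[#' and ']]'."""
    body = body.strip()
    # NAME[, %spec]:  nothing after the colon, so the value is captured from input.
    m = re.fullmatch(rf"({_NAME})\s*(?:,\s*({_SPEC}))?\s*:", body)
    if m:
        return NumericBlock("capture", m.group(1), None, m.group(2) or "%d")
    # NAME: expr  defines NAME from an expression; no specifier is allowed here.
    m = re.fullmatch(rf"({_NAME})\s*:\s*(.+)", body)
    if m:
        return NumericBlock("define", m.group(1), m.group(2).strip(), None)
    # expr[, %spec]  is a plain use, printed with the given (or default) specifier.
    m = re.fullmatch(rf"(.+?)\s*(?:,\s*({_SPEC}))?", body)
    if m:
        return NumericBlock("use", None, m.group(1), m.group(2) or "%d")
    raise ValueError(f"cannot parse numeric block: {body!r}")


for text in ("BAR, %x:", " FOO:BAR+12", "VAR + 1, %8X"):
    print(parse_numeric_block(text))
```

Keeping the classification in one small function like this is one way to follow the advice about keeping the parsing code well-structured: the rule that a specifier is only valid when nothing follows the colon falls out of the order of the patterns.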