Ivan Krylov
2022-Jul-16 09:24 UTC
[Rd] Feature Request: Allow Underscore Separated Numbers
On Fri, 15 Jul 2022 12:34:24 -0700
Bill Dunlap <williamwdunlap at gmail.com> wrote:

> The token '._1' (period underscore digit) is currently parsed as a
> symbol (name). It would become a number if underscore were ignored
> as in the first proposal. The just-between-digits alternative would
> avoid this change.

Thanks for spotting this! Here's a patch that allows underscores only
between digits and only inside the significand of a number:

--- src/main/gram.y (revision 82598)
+++ src/main/gram.y (working copy)
@@ -2526,7 +2526,7 @@
         YYTEXT_PUSH(c, yyp);
         /* We don't care about other than ASCII digits */
         while (isdigit(c = xxgetc()) || c == '.' || c == 'e' || c == 'E'
-               || c == 'x' || c == 'X' || c == 'L')
+               || c == 'x' || c == 'X' || c == 'L' || c == '_')
         {
             count++;
             if (c == 'L') /* must be at the end. Won't allow 1Le3 (at present). */
@@ -2538,11 +2538,16 @@
             if (count > 2 || last != '0') break; /* 0x must be first */
             YYTEXT_PUSH(c, yyp);
             while(isdigit(c = xxgetc()) || ('a' <= c && c <= 'f') ||
-                  ('A' <= c && c <= 'F') || c == '.') {
+                  ('A' <= c && c <= 'F') || c == '.' || c == '_') {
                 if (c == '.') {
                     if (seendot) return ERROR;
                     seendot = 1;
                 }
+                if (c == '_') {
+                    /* disallow underscores following 0x or followed by non-digit */
+                    if (nd == 0 || typeofnext() >= 2) break;
+                    continue;
+                }
                 YYTEXT_PUSH(c, yyp);
                 nd++;
             }
@@ -2588,6 +2593,11 @@
                     break;
                 seendot = 1;
             }
+            /* underscores in significand followed by a digit must be skipped */
+            if (c == '_') {
+                if (seenexp || typeofnext() >= 2) break;
+                continue;
+            }
             YYTEXT_PUSH(c, yyp);
             last = c;
         }

-- 
Best regards,
Ivan
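The rule described above (underscores only between digits, and only in the significand) can be illustrated outside the tokenizer. The following is a minimal stand-alone sketch for decimal literals, not the patch itself: strip_separators is a hypothetical helper that does not exist in gram.y, and it rejects a misplaced underscore outright rather than ending the token as the patched loop does.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper, not part of gram.y: copy the decimal literal `in`
 * to `out` with digit separators removed.
 * Returns 0 on success, -1 if an underscore is misplaced or `out` is
 * too small to hold the result. */
int strip_separators(const char *in, char *out, size_t outsz)
{
    size_t n = 0;
    int seenexp = 0;    /* set once the exponent marker has been read */

    for (size_t i = 0; in[i] != '\0'; i++) {
        unsigned char c = (unsigned char) in[i];

        if (c == 'e' || c == 'E') seenexp = 1;
        if (c == '_') {
            /* valid only in the significand, with a digit on each side */
            if (seenexp || i == 0
                || !isdigit((unsigned char) in[i - 1])
                || !isdigit((unsigned char) in[i + 1]))
                return -1;
            continue;   /* drop the separator */
        }
        if (n + 1 >= outsz) return -1;  /* keep room for the terminator */
        out[n++] = (char) c;
    }
    if (outsz == 0) return -1;
    out[n] = '\0';
    return 0;
}
```

Under this sketch "1_000_000" becomes "1000000", while "._1", "1__0", "1_" and "1e1_0" are all rejected, so the '._1' token Bill Dunlap mentions would not turn into a number.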
Duncan Murdoch
2022-Jul-16 15:17 UTC
[Rd] Feature Request: Allow Underscore Separated Numbers
On 16/07/2022 5:24 a.m., Ivan Krylov wrote:
> On Fri, 15 Jul 2022 12:34:24 -0700
> Bill Dunlap <williamwdunlap at gmail.com> wrote:
>
>> The token '._1' (period underscore digit) is currently parsed as a
>> symbol (name). It would become a number if underscore were ignored
>> as in the first proposal. The just-between-digits alternative would
>> avoid this change.
>
> Thanks for spotting this! Here's a patch that allows underscores
> only between digits and only inside the significand of a number:

I think there's an issue with hex values. For example:

  > 0xa_2
  [1] 162
  > 0x2_a
  Error: unexpected input in "0x2_"

So "a" counts as a digit in 0xa_2, but not as a digit in 0x2_a.

Duncan Murdoch
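The asymmetry reported above is what one would expect if the look-ahead test after an underscore accepts only the decimal digits 0-9. One possible way to treat both cases alike is to classify the following character with isxdigit() when inside a hex literal. The sketch below shows that idea in a stand-alone parser; parse_hex_literal is a hypothetical function, not R's tokenizer, and it ignores overflow and hex fractions.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical stand-alone parser, not R's tokenizer: read a hex integer
 * literal such as "0x2_a", allowing '_' only between two hex digits.
 * Returns 0 and stores the value on success, -1 on malformed input.
 * Overflow is not checked. */
int parse_hex_literal(const char *s, unsigned long *value)
{
    if (s[0] != '0' || (s[1] != 'x' && s[1] != 'X')) return -1;

    unsigned long v = 0;
    size_t nd = 0;      /* hex digits consumed so far */

    for (size_t i = 2; s[i] != '\0'; i++) {
        unsigned char c = (unsigned char) s[i];

        if (c == '_') {
            /* no underscore directly after 0x, and the next character
             * must be a hex digit: isxdigit() accepts a-f and A-F,
             * which a decimal-only isdigit() test would reject */
            if (nd == 0 || !isxdigit((unsigned char) s[i + 1])) return -1;
            continue;
        }
        if (!isxdigit(c)) return -1;
        v = v * 16 + (unsigned long) (isdigit(c) ? c - '0'
                                                 : tolower(c) - 'a' + 10);
        nd++;
    }
    if (nd == 0) return -1;
    *value = v;
    return 0;
}
```

With this check both 0xa_2 (162) and 0x2_a (42) are accepted, while "0x_a2", "0x2_" and "0x2__a" are still rejected.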