Feature Request: Allow Underscore Separated Numbers
On 16/07/2022 5:24 a.m., Ivan Krylov wrote:
On Fri, 15 Jul 2022 12:34:24 -0700 Bill Dunlap <williamwdunlap at gmail.com> wrote:
The token '._1' (period underscore digit) is currently parsed as a symbol (name). It would become a number if underscore were ignored as in the first proposal. The just-between-digits alternative would avoid this change.
Thanks for spotting this! Here's a patch that allows underscores only between digits and only inside the significand of a number:
I think there's an issue with hex values. For example:

  > 0xa_2
  [1] 162
  > 0x2_a
  Error: unexpected input in "0x2_"

So "a" counts as a digit in 0xa_2, but not as a digit in 0x2_a.

Duncan Murdoch

--- src/main/gram.y (revision 82598)
+++ src/main/gram.y (working copy)
@@ -2526,7 +2526,7 @@
     YYTEXT_PUSH(c, yyp);
     /* We don't care about other than ASCII digits */
     while (isdigit(c = xxgetc()) || c == '.' || c == 'e' || c == 'E'
-           || c == 'x' || c == 'X' || c == 'L')
+           || c == 'x' || c == 'X' || c == 'L' || c == '_')
     {
         count++;
         if (c == 'L') /* must be at the end. Won't allow 1Le3 (at present). */
@@ -2538,11 +2538,16 @@
             if (count > 2 || last != '0') break; /* 0x must be first */
             YYTEXT_PUSH(c, yyp);
             while(isdigit(c = xxgetc()) || ('a' <= c && c <= 'f') ||
-                  ('A' <= c && c <= 'F') || c == '.') {
+                  ('A' <= c && c <= 'F') || c == '.' || c == '_') {
                 if (c == '.') {
                     if (seendot) return ERROR;
                     seendot = 1;
                 }
+                if (c == '_') {
+                    /* disallow underscores following 0x or followed by non-digit */
+                    if (nd == 0 || typeofnext() >= 2) break;
+                    continue;
+                }
                 YYTEXT_PUSH(c, yyp);
                 nd++;
             }
@@ -2588,6 +2593,11 @@
                 break;
             seendot = 1;
         }
+        /* underscores in significand followed by a digit must be skipped */
+        if (c == '_') {
+            if (seenexp || typeofnext() >= 2) break;
+            continue;
+        }
         YYTEXT_PUSH(c, yyp);
         last = c;
     }