Hello R-devel!

The following sequence of commands results in an error message on a
POSIX system:

tab="`echo -ne "\t"`"
LC_ALL=C Rscript -e " $tab 1"
# ARGUMENT '~+~1' __ignored__

Tabs can sneak into the -e argument from indented multi-line arguments
in shell scripts:

Rscript -e '
    foo()
    bar()
    ...
'

R.sh does a good job of escaping spaces and newlines, but since shells
are also supposed to split on a tab [*], it's a good idea to escape
tabs too:

Index: src/scripts/R.sh.in
===================================================================
--- src/scripts/R.sh.in	(revision 80090)
+++ src/scripts/R.sh.in	(working copy)
@@ -192,7 +192,7 @@
       -e)
         if test -n "`echo ${2} | ${SED} 's/^-.*//'`"; then
           a=`(echo "${2}" && echo) | ${SED} -e 's/ /~+~/g' | \
-            ${SED} -e :a -e N -e '$!ba' -e 's/\n/~n~/g' -e 's/~n~$//g'`
+            ${SED} -e :a -e N -e '$!ba' -e 's/\n/~n~/g' -e 's/~n~$//g' -e 's/\t/~t~/g'`
           shift
         else
           error "option '${1}' requires a non-empty argument"
Index: src/unix/system.c
===================================================================
--- src/unix/system.c	(revision 80090)
+++ src/unix/system.c	(working copy)
@@ -170,6 +170,9 @@
 	} else if(*q == '~' && *(q+1) == 'n' && *(q+2) == '~') {
 	    q += 2;
 	    *p++ = '\n';
+	} else if(*q == '~' && *(q+1) == 't' && *(q+2) == '~') {
+	    q += 2;
+	    *p++ = '\t';
 	} else *p++ = *q;
     }
     return p;

I have verified that with the patch above, Rscript -e " $tab 1" no
longer fails.
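
One portability caveat about the sed rule: POSIX leaves the meaning of
'\t' inside a sed regular expression undefined (GNU sed reads it as a
tab, other implementations may match a literal 't'). The following is
a minimal sketch of the escaping step that hands sed a literal tab
instead. The helper name escape_arg is hypothetical and not part of
R.sh, and plain "sed" stands in for ${SED}:

```shell
#!/bin/sh
# Sketch only: escape_arg is a made-up helper, not code from R.sh.in.

# A literal tab character; avoids relying on sed understanding '\t'.
tab=$(printf '\t')

escape_arg() {
  # The extra 'echo' appends an empty line so that the N command in the
  # second sed always has a following line to join, even for one-line input.
  (printf '%s\n' "$1" && echo) |
    sed -e 's/ /~+~/g' -e "s/${tab}/~t~/g" |
    sed -e :a -e N -e '$!ba' -e 's/\n/~n~/g' -e 's/~n~$//g'
}

# Space, tab, space, "1" (the argument from the failing command above):
escape_arg " ${tab} 1"    # prints ~+~~t~~+~1
```

With the tab escaped, the result contains no characters that the shell
would split on when ${args} is later expanded unquoted.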

While we're at it, perhaps it would be a good idea to replace the magic
number 10000 with the size of the character array above it:

Index: src/unix/system.c
===================================================================
--- src/unix/system.c	(revision 80090)
+++ src/unix/system.c	(working copy)
@@ -429,7 +432,7 @@
 	} else if(!strcmp(*av, "-e")) {
 	    ac--; av++;
 	    Rp->R_Interactive = FALSE;
-	    if(strlen(cmdlines) + strlen(*av) + 2 <= 10000) {
+	    if(strlen(cmdlines) + strlen(*av) + 2 <= sizeof(cmdlines)) {
 		char *p = cmdlines+strlen(cmdlines);
 		p = unescape_arg(p, *av);
 		*p++ = '\n'; *p = '\0';

It might also be a good idea to make it possible to represent the
escape sequences themselves in the unescaped stream by using a fully
reversible transformation ('~' <-> '~~~', ' ' <-> '~+~', '\n' <->
'~n~', '\t' <-> '~t~'). That would make it possible to round-trip
character sequences like '~+~' through the escaping and unescaping
process (thankfully, '~+~' is not frequently needed in R programs),
though expressing that as a sed command is beyond me. Right now,
Rscript -e '"~+~"' doesn't print "~+~".

Perhaps the bigger question to ask is whether this escaping is
unavoidable. Is it documented? Since the args variable is only appended
to (never prepended to), it is likely possible to rewrite the
'while test -n "${1}"; do' loop in terms of 'set -- "$@" ...', which is
POSIX-compatible and doesn't require any escaping:

set -- "${@}" dummy # append one argument to skip it later
for arg in "${@}"; do # it's safe to modify $@ in the for loop [**]
  # TODO: on first iteration only, empty the $@ and don't check $prev_arg
  case "${prev_arg}" in
  # ...
  -g|--gui)
    if test -n "`echo "${arg}" | ${SED} 's/^-.*//'`"; then
      gui="${arg}"
      set -- "${@}" "${prev_arg}" "${arg}"
    else
      error "option '${prev_arg}' requires an argument"
    fi
    ;;
  # ...
  -e)
    if ! test -n "`echo "${arg}" | ${SED} 's/^-.*//'`"; then
      error "option '${prev_arg}' requires a non-empty argument"
    fi
    set -- "${@}" -e "${arg}"
    ;;
  # ...
  esac
  prev_arg="${arg}" # no shift needed
done
# Later: use "${@}" instead of ${args}

Or is it documented behaviour that arguments following an empty
argument are not escaped by the shell script but are passed as-is to
"${R_HOME}/bin/exec${R_ARCH}/R"?

LC_ALL=C R -q -e " $tab 1"
# ARGUMENT '~+~1' __ignored__
#
# >
# >
# >

LC_ALL=C R '' -q -e " $tab 1"
# ARGUMENT '' __ignored__
#
# > 1
# [1] 1
# >
# >

-- 
Best regards,
Ivan

[*]
https://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_06_05

[**]
"First, the list of words following in shall be expanded to generate a
list of items..."
https://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_09_04_03
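
The reversible transformation mentioned above ('~' <-> '~~~', ' ' <->
'~+~', '\n' <-> '~n~', '\t' <-> '~t~') might be expressible in sed
simply by escaping '~' before any other rule runs, so that every '~'
in the escaped stream starts a three-character token. The following is
a sketch under that assumption, not a tested patch for R.sh.in. The
names encode_arg and decode_arg are hypothetical, and the awk decoder
only stands in for unescape_arg() in src/unix/system.c so that the
round trip can be checked from the shell:

```shell
#!/bin/sh
# Sketch only: encode_arg/decode_arg are made-up names for illustration.

tab=$(printf '\t')   # literal tab, so sed does not need to understand '\t'

encode_arg() {
  # '~' must be escaped first; otherwise the '~' characters produced by
  # the later rules would themselves be tripled.
  (printf '%s\n' "$1" && echo) |
    sed -e 's/~/~~~/g' -e 's/ /~+~/g' -e "s/${tab}/~t~/g" |
    sed -e :a -e N -e '$!ba' -e 's/\n/~n~/g' -e 's/~n~$//g'
}

decode_arg() {
  # Left-to-right scan: at each '~', read a three-character token.
  printf '%s\n' "$1" | awk '{
    out = ""; i = 1; n = length($0)
    while (i <= n) {
      c = substr($0, i, 1)
      t = substr($0, i + 1, 1)
      if (c == "~" && substr($0, i + 2, 1) == "~" &&
          (t == "~" || t == "+" || t == "n" || t == "t")) {
        if (t == "~")      out = out "~"
        else if (t == "+") out = out " "
        else if (t == "n") out = out "\n"
        else               out = out "\t"
        i += 3
      } else {
        out = out c
        i++
      }
    }
    printf "%s", out
  }'
}

encode_arg '~+~'                  # prints ~~~+~~~
decode_arg "$(encode_arg '~+~')"  # prints ~+~
```

Decoding has to proceed token by token from the left. A global
's/~+~/ /g' applied to '~~~+~~~' would match across token boundaries,
which is why the decoder here is a scanner rather than a chain of sed
substitutions.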