maechler at stat.math.ethz.ch
2009-Apr-25 13:00 UTC
[Rd] incorrect output and segfaults from sprintf with %*d (PR#13675)
On Fri, Apr 24, 2009 at 14:40, Wacek Kusnierczyk <Waclaw.Marcin.Kusnierczyk at idi.ntnu.no> wrote:> maechler at stat.math.ethz.ch wrote: >> >> =A0 =A0 vQ> sprintf has a documented limit on strings included in the output using the >> =A0 =A0 vQ> format '%s'. =A0It appears that there is a limit on the length of strings included >> =A0 =A0 vQ> with, e.g., the format '%d' beyond which surprising things happen (output >> =A0 =A0 vQ> modified for conciseness): >> =A0 =A0 >>> >> >> =A0 =A0 vQ> ... and this limit is *not* documented. >> >> =A0 =A0 MM> well, it is basically (+ a few bytes ?) >> =A0 =A0 MM> the same =A08192 =A0limit that *is* documented. >> >> indeed, I was right with that.. >> > > hmm, i'd guess this limit is valid for all strings included in the > output with any format? =A0not just %s (and, as it appears, undocumentedly > %d)?yes.>> =A0 =A0 vQ> while snprintf would help avoid buffer overflow, it may not be a >> =A0 =A0 vQ> solution to the issue of confused output. >> >> =A0 =A0 MM> I think it would / will. =A0We would be able to give warnings and >> =A0 =A0 MM> errors, by checking the =A0snprintf() =A0return codes. >> >> My current working code gives an error for all the above >> examples, e.g., >> >> =A0> sprintf('%9999d', 1) >> =A0Error in sprintf("%9999d", 1) : >> =A0 =A0required resulting string length 9999 is > maximal 8191 >> >> it passes =A0'make check-devel' and I am inclined to commit that >> code to R-devel (e.g. tomorrow). >> >> Yes, the documentation will also have to be amended, but apart >> from that, would people see a big problem with the "8192" limit >> which now is suddenly of greater importance >> {{as I said all along; =A0hence my question to Wacek (and the >> =A0 R-develers) =A0if anybody found that limit too low}} >> > > i didn't find the limit itself problematic. =A0(so far?)ok.> btw. (i do know what that means ;)), after your recent fix: > > =A0 =A0sprintf('%q%s', 1) > =A0 =A0# Error in sprintf("%q%s", 1) : > =A0 =A0# =A0use format %f, %e, %g or %a for numeric objects > > =A0 =A0sprintf('%s', 1) > =A0 =A0# [1] "1" > > you may want to add '%s' (and '%x', and ...) to the error message. =A0or > perhaps make it say sth like 'invalid format: ...'. =A0the problem is not > that %q is not applicable to numeric, but that it is not a valid format > at all.yes. As a matter of fact, "%q%s" is dealt with as *one* format chunk, since "%q" is not a valid format. The code I have just committed now gives a longer erro message, that should be more helpful. Thank you for the suggestion!> there's also an issue with the additional arguments supplied after the > format: =A0any superfluous arguments are ignored (this is not documented, > as far as i can see),I think we should signal an error if there are too many arguments. Could anyone think of a case where the current behavior is desirable ?> but they *are* evaluated nevertheless, e.g.: > > =A0 =A0sprintf('%d', 0, {print(1)}) > =A0 =A0# "1" > =A0 =A0# [1] "0" > > it might be a good idea to document this behaviour.actually I think it should be changed to be more strict (see above). Thank you for the constructive feedback! Martin