Trying to copy the (binary) header of a input file directly to an output file, I've had repeatable seg faults. The call: writeChar(hdr, outfh, nchars=6144) when hdr just contains one empty string seems to be the culprit. The stack traces weren't all that illuminating, with sig 11 in memory-related functions following this. But in src/main/connections.c it looks like do_writechar doesn't check the length of strings when given an explicit nchars argument; so I think the strncpy() call will read too far. [This happened because I didn't remember that R lets null terminate strings; so I did a readChar(infh, nchars=6144) through some nulls at the start of the header, and ended up with a much shorter string than I was expecting. As far as I can tell do_readchar still behaves in these circumstances, and in any case I can produce the fault without it.] Using readBin and writeBin with integer() and size=1 seems to be the solution for header copying, but the faults still seemed worth reporting. I'm currently using R 1.8.0 on NetBSD/i386 1.6.1. Mark <><
Could you please give a reproducible example? On Fri, 14 Nov 2003 mjw@celos.net wrote:> Trying to copy the (binary) header of a input file directly > to an output file, I've had repeatable seg faults. The call: > > writeChar(hdr, outfh, nchars=6144) > > when hdr just contains one empty string seems to be the > culprit. The stack traces weren't all that illuminating, > with sig 11 in memory-related functions following this. But > in src/main/connections.c it looks like do_writechar doesn't > check the length of strings when given an explicit nchars > argument; so I think the strncpy() call will read too far.All R strings should be null-terminated, so strncpy will only copy the number of characters present (plus the null terminator) if less than n. I can see that writeChars might write rubbish out, but not why it should segfault. It is also unclear to me what to do in this case: flag a user error?> [This happened because I didn't remember that R lets null > terminate strings; so I did a readChar(infh, nchars=6144) > through some nulls at the start of the header, and ended up > with a much shorter string than I was expecting. As far as > I can tell do_readchar still behaves in these circumstances, > and in any case I can produce the fault without it.] > > Using readBin and writeBin with integer() and size=1 seems > to be the solution for header copying, but the faults still > seemed worth reporting.It's certainly the documented way. -- Brian D. Ripley, ripley@stats.ox.ac.uk Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UK Fax: +44 1865 272595
Gabor Grothendieck
2003-Nov-15 02:25 UTC
[Rd] writeChar potential buffer overrun (PR#5090)
On Windows 2000 and R 1.8.0 it crashes R. I get a popup windows with a message that says: "Rgui.exe has generated errors and will be closed by Windows. You will need to restart the program. An error log is being created." although it does not say where this error log is. I have attached a jpg screenshot although I am not sure it will get through to the list. --- Date: Fri, 14 Nov 2003 23:17:06 +0000 From: Mark White <mjw@celos.net> To: Prof Brian Ripley <ripley@stats.ox.ac.uk> Cc: <R-bugs@biostat.ku.dk>, <r-devel@stat.math.ethz.ch> Subject: Re: [Rd] writeChar potential buffer overrun (PR#5090) Prof Brian Ripley writes:> Could you please give a reproducible example?This breaks a new R session for me: zz <- file("/tmp/test.raw", "wb") hdr <- "" writeChar(hdr,zz,nchars=10000000) Causes sig 11 on two machines, one with R 1.7.1 and the other with R 1.8.0 (both on NetBSD, so YMMV on other OSs, I suppose.)> On Fri, 14 Nov 2003 mjw@celos.net wrote: > > Trying to copy the (binary) header of a input file directly > > to an output file, I've had repeatable seg faults. The call: > > > > writeChar(hdr, outfh, nchars=6144) > > > > when hdr just contains one empty string seems to be the > > culprit. The stack traces weren't all that illuminating, > > with sig 11 in memory-related functions following this. But > > in src/main/connections.c it looks like do_writechar doesn't > > check the length of strings when given an explicit nchars > > argument; so I think the strncpy() call will read too far. > > All R strings should be null-terminated, so strncpy will only copy the > number of characters present (plus the null terminator) if less than n.Quite true; I'd forgotten strncpy stopped at null.> I can see that writeChars might write rubbish out, but not why it should > segfault.Ok, I've just had a poke at it with ddd. The above example faults in the memset() call (line 2772 connections.c) in both R versions. I think the problem is underallocation of buf: len = 0; /* line 2757 */ for(i = 0; i < n; i++) { tlen = strlen(CHAR(STRING_ELT(object, i))); if (tlen > len) len = tlen; } buf = (char *) R_alloc(len + slen, sizeof(char)); which sets len to the longest string in object (in this case, 0 bytes), then allocates len+slen to buf. gdb confirms len=0 and slen=1 at this point. But a little later len = INTEGER(nchars)[i]; /* line 2770 */ s = CHAR(STRING_ELT(object, i)); memset(buf, '\0', len + slen); strncpy(buf, s, len); len is now set to [the first element of] nchars, which hasn't been checked, and is 10000000 (gdb confirms). So the call to memset() copies way over the end of allocated buf. Does that sound rational? I'm not very familiar with R's internals. I guess small overruns might not actually fault, because buf is within R's existing heap?> It is also unclear to me what to do in this case: flag a user > error?I'm not sure; perhaps the most logical thing is indeed to pad with nulls (as I think it would do with this fault fixed), and not raise an error at all.> > Using readBin and writeBin with integer() and size=1 seems > > to be the solution for header copying, but the faults still > > seemed worth reporting. > > It's certainly the documented way.Yup. I tried readChar/writeChar thinking it might be quicker (no type conversion?) but it's obviously not feasible with null-terminated strings. Mark <>< --- On Fri 11/14, Mark White < mjw@celos.net > wrote: From: Mark White [mailto: mjw@celos.net] To: ripley@stats.ox.ac.uk Cc: R-bugs@biostat.ku.dk, r-devel@stat.math.ethz.ch Date: Fri, 14 Nov 2003 23:17:06 +0000 Subject: Re: [Rd] writeChar potential buffer overrun (PR#5090) Prof Brian Ripley writes:<br>> Could you please give a reproducible example?<br><br>This breaks a new R session for me:<br><br> zz <- file("/tmp/test.raw", "wb")<br> hdr <- ""<br> writeChar(hdr,zz,nchars=10000000)<br><br>Causes sig 11 on two machines, one with R 1.7.1 and the<br>other with R 1.8.0 (both on NetBSD, so YMMV on other OSs, I<br>suppose.)<br><br>> On Fri, 14 Nov 2003 mjw@celos.net wrote:<br>> > Trying to copy the (binary) header of a input file directly<br>> > to an output file, I've had repeatable seg faults. The call:<br>> > <br>> > writeChar(hdr, outfh, nchars=6144)<br>> > <br>> > when hdr just contains one empty string seems to be the<br>> > culprit. The stack traces weren't all that illuminating,<br>> > with sig 11 in memory-related functions following this. But<br>> > in src/main/connections.c it looks like do_writechar doesn't<br>> > check the length of strings when given an explicit nchars<br>> > argument; so I think the strncpy() call will read too far.<br>> <br>> All R strings should be null-terminated, so strncpy will only copy the<br>> number of characters present (plus the null terminator) if less than n.<br><br>Quite true; I'd forgotten strncpy stopped at null.<br><br>> I can see that writeChars might write rubbish out, but not why it should <br>> segfault. <br><br>Ok, I've just had a poke at it with ddd. The above example<br>faults in the memset() call (line 2772 connections.c) in both<br>R versions. I think the problem is underallocation of buf:<br><br> len = 0; /* line 2757 */<br> for(i = 0; i < n; i++) {<br> tlen = strlen(CHAR(STRING_ELT(object, i)));<br> if (tlen > len) len = tlen;<br> }<br> buf = (char *) R_alloc(len + slen, sizeof(char));<br><br>which sets len to the longest string in object (in this<br>case, 0 bytes), then allocates len+slen to buf. gdb<br>confirms len=0 and slen=1 at this point. But a little later<br><br> len = INTEGER(nchars)[i]; /* line 2770 */<br> s = CHAR(STRING_ELT(object, i));<br> mems et(buf, '\0', len + slen);<br> strncpy(buf, s, len);<br><br>len is now set to [the first element of] nchars, which<br>hasn't been checked, and is 10000000 (gdb confirms). So the<br>call to memset() copies way over the end of allocated buf.<br><br>Does that sound rational? I'm not very familiar with R's<br>internals. I guess small overruns might not actually fault,<br>because buf is within R's existing heap?<br><br>> It is also unclear to me what to do in this case: flag a user <br>> error?<br><br>I'm not sure; perhaps the most logical thing is indeed to<br>pad with nulls (as I think it would do with this fault<br>fixed), and not raise an error at all.<br><br>> > Using readBin and writeBin with integer() and size=1 seems<br>> > to be the solution for header copying, but the faults still<br>> > seemed worth reporting.<br>> <br>> It's certainly the documented way.<br><br>Yup. I tried readChar/writeChar thinking it might be<br>quicker (no type conversion?) but it's obviously not<br>feasible with null-terminated strings.<br><br>Mark <><<br><br>______________________________________________<br>R-devel@stat.math.ethz.ch mailing list<br>https://www.stat.math.ethz.ch/mailman/listinfo/r-devel<br> _______________________________________________