On Wed, May 25, 2016 at 12:31 PM, Martin Maechler <maechler at stat.math.ethz.ch> wrote:> Better than segfaulting, yes, but really agree with Bill (and > Gabe), also for Rf_mkChar(NULL): > I think both functions should give an error in such a case > rather than returning NA_character_ > > It is an accident of some kind if they got NULL, no?Not necessarily. A char* of NULL can be a string which is not initiated or simply unavailable due to configuration. The example from my original email was in curl package which exposes the version string of libz that was used to build libcurl: mkString(data->libz_version) This worked on all platforms that I tested. However a user found that if libcurl was configured --without-libz (which is uncommon) the libz_version string does not get set by libcurl and is always NULL. I had not foreseen this and it would lead to a segfault. I think making mkString() return NA for null strings lead to the most robust behavior. Raising an exception seems a little harsh to me, as there is no way the user would be able to recover from this, and there might not be an actual problem at all.
On Wed, May 25, 2016 at 4:23 AM, Jeroen Ooms <jeroen.ooms at stat.ucla.edu> wrote:> On Wed, May 25, 2016 at 12:31 PM, Martin Maechler > <maechler at stat.math.ethz.ch> wrote: > > Better than segfaulting, yes, but really agree with Bill (and > > Gabe), also for Rf_mkChar(NULL): > > I think both functions should give an error in such a case > > rather than returning NA_character_ > > > > It is an accident of some kind if they got NULL, no? > > Not necessarily. A char* of NULL can be a string which is not > initiated or simply unavailable due to configuration. > > The example from my original email was in curl package which exposes > the version string of libz that was used to build libcurl: > > mkString(data->libz_version) > > This worked on all platforms that I tested. However a user found that > if libcurl was configured --without-libz (which is uncommon) the > libz_version string does not get set by libcurl and is always NULL. I > had not foreseen this and it would lead to a segfault. > > I think making mkString() return NA for null strings lead to the most > robust behavior. Raising an exception seems a little harsh to me, as > there is no way the user would be able to recover from this, and there > might not be an actual problem at all. > >Robust in the sense of no error being thrown, but perhaps only correct by accident. NULL is not a valid C string --- should functions always return NA on invalid input? As Gabe mentions, in the cited use case, it's not clear whether the appropriate value is NA, "", or something else entirely. Generalization seems risky at this point.> ______________________________________________ > R-devel at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-devel > >[[alternative HTML version deleted]]
On Wed, May 25, 2016 at 7:22 AM, Michael Lawrence <lawrence.michael at gene.com> wrote:> On Wed, May 25, 2016 at 4:23 AM, Jeroen Ooms <jeroen.ooms at stat.ucla.edu> > wrote: >I'm not disagreeing with what's been said in this thread, but I can't help but recall that I brought up this exact issue probably 15 years ago and was told (by Brian, I believe) "don't do that" (pass a null pointer), which was perfectly fine. The real issue was not the behavior but that it was not documented or consistent. I've lived by the mantra since that you can never trust a pointer in R code. User must always check for NULL. I just wrote my own functions mkXXX_safe that wrap the internals and check the pointer. THK http://www.keittlab.org/ [[alternative HTML version deleted]]