I would like to propose that Rf_mkString(NULL) and Rf_mkChar(NULL) return NA rather than segfault.

Case: the mkString() and mkChar() functions are a convenient way to wrap strings returned by, e.g., external C libraries into an R vector. However, a library sometimes returns NULL instead of a string when the result is unavailable. In some C libraries this can happen unexpectedly or is even undocumented.

A good R package author always checks results for a null pointer and deals with it accordingly. But sometimes we make assumptions. There was an example in the 'curl' package where a documented version string was suddenly NULL if libcurl was built with some unusual configuration. These problems are hard to catch, and I don't see any benefit in segfaulting for such edge cases.

Some packages use a macro like this to protect against such problems:

    #define make_string(x) ((x) ? Rf_mkString(x) : ScalarString(NA_STRING))

But I think it would make sense if this were the default behavior in Rf_mkString and Rf_mkChar.
On Thu, May 12, 2016 at 1:20 PM, Jeroen Ooms <jeroen.ooms at stat.ucla.edu> wrote:
> I would like to propose that Rf_mkString(NULL) and Rf_mkChar(NULL)
> return NA rather than segfault.

An example implementation: https://git.io/vroxm

With this patch, mkChar(NULL) and mkCharCE(NULL, ce) would return NA_STRING rather than segfault at strlen(NULL). This automatically fixes mkString(NULL) as well, which wraps mkChar (see Rinlinedfuns.h).
Shouldn't Rf_mkString(NULL) return (the C-level equivalent of) character() rather than NA_character_? An empty string and NULL aren't the same.

It seems reasonable for Rf_mkChar to give NA_character_, though.

~G

On Tue, May 24, 2016 at 8:42 AM, Jeroen Ooms <jeroen.ooms at stat.ucla.edu> wrote:
> An example implementation: https://git.io/vroxm
>
> With this patch, mkChar(NULL), mkCharCE(NULL, ce) would return
> NA_STRING rather than segfault at strlen(NULL). This automatically
> fixes mkString(NULL) as well which wraps mkChar (See Rinlinedfuns.h).

--
Gabriel Becker, PhD
Associate Scientist (Bioinformatics)
Genentech Research
Why should Rf_mkString(NULL) produce NA_STRING instead of "" (R_BlankString)?

I would prefer that passing in a nil pointer cause an error instead, as the nil may arise by accident, perhaps a pointer to freed memory, and I would like to be notified that my code is bad instead of getting a random NA_STRING.

Bill Dunlap
TIBCO Software
wdunlap tibco.com
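For readers comparing the three positions in this thread (NULL becomes NA, NULL becomes an empty string, NULL is an error), the following is a rough plain-C model of each policy. It deliberately does not use R's headers: `model_str`, `mk_char_na`, `mk_char_blank`, and `mk_char_strict` are hypothetical names invented for illustration, `STR_NA` only stands in for NA_STRING, and nothing here is claimed to match the linked patch.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for an R string element; this is NOT R's API. */
typedef enum { STR_VALUE, STR_NA } str_kind;

typedef struct {
    str_kind kind;     /* STR_NA models NA_STRING */
    const char *data;  /* borrowed pointer, meaningful only for STR_VALUE */
    size_t len;        /* length in bytes, 0 for STR_NA */
} model_str;

/* Policy 1 (the proposal): a NULL input maps to NA, so strlen() is
 * never called on a null pointer. */
model_str mk_char_na(const char *s) {
    model_str out = { STR_NA, NULL, 0 };
    if (s == NULL)
        return out;
    out.kind = STR_VALUE;
    out.data = s;
    out.len = strlen(s);
    return out;
}

/* Policy 2: a NULL input maps to the empty string (models R_BlankString). */
model_str mk_char_blank(const char *s) {
    return mk_char_na(s == NULL ? "" : s);
}

/* Policy 3: a NULL input is reported to the caller as an error.
 * Returns 0 on success and -1 on NULL; *out is untouched on failure. */
int mk_char_strict(const char *s, model_str *out) {
    if (s == NULL)
        return -1;
    *out = mk_char_na(s);
    return 0;
}
```

In this model the first two policies hide the null pointer from the caller, while the third forces the caller to handle it, which is the trade-off the replies above are debating.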