Hi,

I am troubled by the use of NULL or NA to indicate missing/non-specified function arguments.

In the R code that I have looked at, it seems that both forms are used (NULL seems to be used more often though). Sometimes both variants are in the same declaration, e.g.

    format.default <-
        function(x, trim = FALSE, digits = NULL, nsmall = 0,
                 justify = c("left", "right", "centre", "none"),
                 width = NULL, na.encode = TRUE, scientific = NA,
                 big.mark = "", big.interval = 3,
                 small.mark = "", small.interval = 5, decimal.mark = ".",
                 zero.print = NULL, ...)

Is there a right way? And if both forms are used, how do I know which one is right?

Thanks a lot and best regards,
Hans-Peter
There is also a third way, namely to use the missing() function in the code:

    f <- function(x) if (missing(x)) print("missing") else print(x)
    f()

On 10/16/06, Hans-Peter <gchappi at gmail.com> wrote:
> I am troubled by the use of NULL or NA to indicate
> missing/non-specified function arguments.
> [...]
> Is there a right way? And if both forms are used, how do I know which
> one is right?
On 10/16/2006 8:47 AM, Hans-Peter wrote:
> I am troubled by the use of NULL or NA to indicate
> missing/non-specified function arguments.
> [...]
> Is there a right way? And if both forms are used, how do I know which
> one is right?

As Gabor said, the third way is to give no default, but test missing() in the code. There are differences between the options, but I don't think there's a single "right way". Some differences:

If you want to allow a vector of parameters, some of which are missing and some of which are not, then you'd probably want NA, so that something like c(1,2,NA) is possible.

The length of NA is 1, but the length of NULL is 0, so it would be harder to expand NULL to the same length as x. Taking the length of a truly missing parameter, or trying to change it, will trigger an error. That is, rep(param, length(x)) will work for NA, but not for the others: rep(NULL, n) is still NULL, and a missing param raises an error. It's also convenient to declare an error if length(scientific) != 1.

Using NULL or NA is a little clearer to a user who just takes a quick look at the function header, rather than carefully reading the man page, to find what parameters are needed.

NA is logical, NULL is NULL. So in format.default, there could be a test is.logical(scientific), which would be TRUE for the default.

So generally my advice would be:
- Be consistent with similar existing functions.
- Choose what you think will be convenient in current and predicted future versions of your function.

Duncan Murdoch
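The differences Duncan lists can be checked directly at the R prompt. The following is only a minimal sketch: the helper names expand_param and f_missing are made up for illustration and are not part of base R or of format.default.

```r
# Hypothetical helper: repeat a parameter once per element of x,
# using the same expression as in the message above.
expand_param <- function(x, param = NA) {
  rep(param, length(x))
}

x <- c(10, 20, 30)

length(NA)    # 1
length(NULL)  # 0

expanded_na   <- expand_param(x, NA)    # NA NA NA
expanded_null <- expand_param(x, NULL)  # NULL: there is nothing to repeat

# Hypothetical function with no default for 'param': evaluating the
# missing argument raises an error, which is caught here.
f_missing <- function(x, param) rep(param, length(x))
missing_result <- tryCatch(f_missing(x), error = function(e) "error")

# NA is a logical constant, so it passes an is.logical() test; NULL does not.
is.logical(NA)    # TRUE
is.logical(NULL)  # FALSE

# A mix of specified and unspecified values is only possible with NA.
partial <- c(1, 2, NA)
is.na(partial)    # FALSE FALSE TRUE
```

Whether the silent NULL result or the error from the missing argument is preferable depends on how the function is meant to fail; the sketch only shows that the three options behave differently.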
2006/10/16, Duncan Murdoch <murdoch at stats.uwo.ca>:
> As Gabor said, the third way is to give no default, but test missing()
> in the code.

I forgot this one, thank you. In my case it is probably not suitable, as I just pass the arguments to a C (Pascal) function and do the checking there.

[explanations snipped]

> So generally my advice would be:
> - Be consistent with similar existing functions.
> - Choose what you think will be convenient in current and predicted
>   future versions of your function.

Ok, thank you. Until now I have always used NA, but (apart from your advice) I will prefer NULL from now on. In C (Pascal) code, NULL is also easier to check than NA*.

Thanks again and best regards,
Hans-Peter Suter

--
* function riIsNull( _s: pSExp ): aRBoolean; cdecl;

  vs.

  function IsNaScalar( _x: pSExp ): boolean;
  begin
    result:= (riLength( _x ) = 1) and
        (riTypeOf( _x ) in [setLglSxp, setRealSxp]) and
        (rIsNa( riReal( riCoerceVector( _x, setRealSxp ) )[0] ) <> 0);
  end;
On 10/16/06, Hans-Peter <gchappi at gmail.com> wrote:
> 2006/10/16, Duncan Murdoch <murdoch at stats.uwo.ca>:
> > As Gabor said, the third way is to give no default, but test missing()
> > in the code.
>
> I forgot this one, thank you. In my case it is probably not suited as
> I just pass the arguments to a C (Pascal) function and do the checking
> there.

The R interface need not be identical to the C or Pascal interface. It's pretty easy to convert, making use of the fact that an if with no else leg returns NULL when its condition is false:

    f <- function(x) { x <- if (!missing(x)) x; x }
    f() # NULL
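This idiom makes it possible to keep a missing()-style R interface while still handing a definite value to the compiled code. A minimal sketch, in which low_level() is a hypothetical stand-in for the real C or Pascal entry point and wrapper() is an invented name:

```r
# Hypothetical stand-in for the compiled routine; it only needs to
# tell NULL apart from a supplied value.
low_level <- function(width) {
  if (is.null(width)) "width not specified" else paste("width =", width)
}

# R-level wrapper: 'width' has no default. A missing argument is turned
# into NULL because an if without an else leg yields NULL.
wrapper <- function(width) {
  width <- if (!missing(width)) width
  low_level(width)
}

wrapper()     # "width not specified"
wrapper(12)   # "width = 12"
```

The compiled side then only ever has to test for NULL, which matches the simpler check mentioned earlier in the thread.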
Hans-Peter <gchappi at gmail.com> wrote:
> I am troubled by the use of NULL or NA to indicate
> missing/non-specified function arguments.

I suggest using NULL for arguments which are vectors or lists of unspecified length, and NA for "scalars" (arguments whose length should always be one, such as na.rm). I rarely use missing arguments, as they are harder to pass down to other functions.

--
David Brahm (brahm at alum.mit.edu)
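As an illustration of this convention, here is a minimal sketch. The function wmean() is invented for this example, and treating an unspecified na.rm (NA) as FALSE is an assumption of the sketch, not something the convention prescribes.

```r
# Hypothetical function following the convention:
# 'weights' is a vector of unspecified length, so its default is NULL;
# 'na.rm' must have length one, so its default is NA.
wmean <- function(x, weights = NULL, na.rm = NA) {
  stopifnot(is.logical(na.rm), length(na.rm) == 1)
  if (is.null(weights)) weights <- rep(1, length(x))  # equal weights
  if (is.na(na.rm)) na.rm <- FALSE  # assumption: unspecified means keep NAs
  if (na.rm) {
    keep <- !is.na(x)
    x <- x[keep]
    weights <- weights[keep]
  }
  sum(x * weights) / sum(weights)
}

wmean(c(1, 2, 3))                         # 2
wmean(c(1, 2, 3), weights = c(1, 1, 2))   # 2.25
wmean(c(1, NA, 3), na.rm = TRUE)          # 2
```

Because both defaults are ordinary values, a calling function can pass weights and na.rm straight through without knowing whether its own caller supplied them.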
On 10/16/2006 5:06 PM, Brahm, David wrote:
> I suggest using NULL for arguments which are vectors or lists of
> unspecified length, and NA for "scalars" (arguments whose length
> should always be one, such as na.rm). I rarely use missing arguments,
> as they are harder to pass down to other functions.

To be a little more precise: they are easy to pass down, but hard to do anything else with. For example:

    > f <- function(x) if (missing(x)) print("missing!") else print(x)
    > g <- function(x) f(x)
    > h <- function(x) { y <- x; f(y) }
    > g()
    [1] "missing!"
    > h()
    Error in h() : argument "x" is missing, with no default

Duncan Murdoch