Fabian Scheipl
2007-Nov-05 15:16 UTC
[Rd] Should numeric()/character() etc initialize with NA instead of 0 or ""?
Wouldn't it make programming more error-resistant if vectors were initialized with missing data, instead of zeroes or ""?

That way, if you assign values to a vector elementwise and you miss some elements (because their indices were not selected or because the assignment didn't work out, see below for code examples), this would be immediately obvious from the values of the vector elements themselves, and programming errors would be far harder to overlook.

e.g.

    x <- numeric(n)
    for( i in seq(along = x) )
    {
        try(x[i] <- function.which.might.crash( args[i] ))
    }

or

    x <- numeric(n)
    x[condition1] <- foo(args1)
    x[condition2] <- foo(args2)
    ...
    x[conditionN] <- foo(argsN)

will produce x without any NAs even if function.which.might.crash() actually did crash during the loop, or if there are indices for which none of conditions 1 to N were true, and you cannot distinguish between zeroes which are real results and zeroes that remained unchanged since initialization of the vector.

In a sense, initializing with NAs would also be more consistent with vector(n, mode = "list"), which produces a list of n NULL-objects. (numeric(10) is just a wrapper for vector(10, mode="numeric"))

Let me know what you think.

Regards,
Fabian

[[alternative HTML version deleted]]
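A minimal sketch of the first scenario described above, assuming a hypothetical might_crash() as a stand-in for function.which.might.crash() and made-up input values. It only illustrates that a failed assignment inside try() leaves the initial zero in place, where it looks the same as a genuine zero result.

```r
# Hypothetical stand-in for function.which.might.crash():
# it fails on negative input and otherwise returns the square root.
might_crash <- function(a) {
  if (a < 0) stop("negative input")
  sqrt(a)
}

args <- c(4, -1, 0, 9)          # illustrative inputs; the second one triggers the error

x <- numeric(length(args))      # every element starts as 0
for (i in seq_along(x)) {
  # On error the assignment never happens, so x[i] keeps its initial value.
  try(x[i] <- might_crash(args[i]), silent = TRUE)
}

# Element 2 (failed call) and element 3 (genuine result sqrt(0)) are both 0,
# and there is no NA to flag the failure.
failed_looks_like_zero <- x[2] == x[3]
has_na <- anyNA(x)
```

Here has_na is FALSE even though one call failed, which is the situation the message is concerned about.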
Prof Brian Ripley
2007-Nov-05 17:30 UTC
[Rd] Should numeric()/character() etc initialize with NA instead of 0 or ""?
On Mon, 5 Nov 2007, Fabian Scheipl wrote:

> Wouldn't it make programming more error-resistant if vectors were
> initialized with missing data, instead of zeroes or ""?

Lots of code relies on this. It's common programming practice (and not just in R/S).

> That way, if you assign values to a vector elementwise and you miss some
> elements
> (because their indices were not selected or because the assignment didn't
> work out, see below for code examples)
> this would be immediately obvious from the value of the vector elements
> themselves
> and programming errors would be far less easy to overlook.

But using

    x <- rep(NA_real_, n)

does this for you, and is much clearer to the reader. Using

    x <- numeric(n)

is only appropriate if you want '0.0' elements.

> e.g.
>
> x <- numeric(n)
> for( i in seq(along = x) )
> {
> try(x[i] <- function.which.might.crash( args[i] ))
> }
>
> or
>
> x <- numeric(n)
> x[condition1] <- foo(args1)
> x[condition2] <- foo(args2)
> ...
> x[conditionN] <- foo(argsN)
>
> will produce x without any NAs even if function.which.might.crash() actually
> did crash during the loop or
> if there are indices for which none of conditions 1 to N were true and you
> cannot distinguish between zeroes which
> are real results and zeroes that remained unchanged since initialization of
> the vector.
>
> In a sense, initializing with NAs would also be more consistent with
> vector(n, mode = "list"), which produces a list of n NULL-objects.
> (numeric(10) is just a wrapper for vector(10, mode="numeric"))
>
> Let me know what you think.
>
> Regards,
> Fabian
>
> [[alternative HTML version deleted]]

You were specifically asked not to do that.

> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>

--
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
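A short sketch of the explicit NA initialization suggested in the reply, applied to the conditional-assignment case. The score vector and the two thresholds are illustrative assumptions, not taken from the thread; which() is used so that the NA in the data does not end up in the subscript.

```r
n <- 6
score <- c(0.2, 0.9, NA, 0.5, 0.7, 0.1)   # made-up data for illustration

x <- rep(NA_real_, n)                     # explicit "not yet filled" marker
x[which(score < 0.3)] <- 0                # 0 is a legitimate result here
x[which(score > 0.6)] <- 1

# Positions that no condition covered are still NA and can be listed directly.
unfilled <- which(is.na(x))

# The analogous initialization for a character vector, instead of "".
labels <- rep(NA_character_, n)
```

With this setup, unfilled holds the positions 3 and 4, while the zeros at positions 1 and 6 remain recognizable as real results.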