ryszard.czerminski@pharma.novartis.com
2003-Oct-24 17:45 UTC
[R] how to remove NaN columns ?
How can I remove columns with NaN entries? Here is my simple example:

> data <- read.csv("test.csv")
> xdata <- data[3:length(data)]
> xs <- lapply(xdata, function(x){(x - mean(x))/sqrt(var(x))})
> x <- data.frame(xs)
> x
           C   D          E          F
1 -0.7071068 NaN -0.7071068 -0.7071068
2  0.7071068 NaN  0.7071068  0.7071068

I am sure it is possible to remove column D (with NaN's) in some simple fashion, using the is.nan function without explicitly looping through, and I am sure I was able to do it in the past, but I cannot recall how.

Your help will be greatly appreciated.

Ryszard
As an example:

> dat <- data.frame(x=1:5, y=NaN, z=5:1)
> dat
  x   y z
1 1 NaN 5
2 2 NaN 4
3 3 NaN 3
4 4 NaN 2
5 5 NaN 1
> bad <- sapply(dat, function(x) all(is.nan(x)))
> dat[,!bad]
  x z
1 1 5
2 2 4
3 3 3
4 4 2
5 5 1

HTH,
Andy
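The original question asks about columns "with NaN entries", which could also mean columns that are only partly NaN. A minimal sketch of that variant follows, assuming that dropping a column on a single NaN is the intent. The data frame and the column `w` are made up for illustration.

```r
# Hypothetical data: y is entirely NaN, w has a single NaN.
dat <- data.frame(x = 1:5, y = NaN, z = 5:1, w = c(1, NaN, 3, 4, 5))

# TRUE for every column that contains at least one NaN.
has_nan <- sapply(dat, function(col) any(is.nan(col)))

# drop = FALSE keeps the result a data frame even if only one column survives.
clean <- dat[, !has_nan, drop = FALSE]
names(clean)
```

Swapping any() back to all() gives the behaviour of the example above, where only all-NaN columns are removed.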
> xs <- lapply(xdata, function(x){(x - mean(x))/sqrt(var(x))})

Incidentally, the function 'scale' does just that.

--
Giovanni Petris  GPetris at uark.edu
Department of Mathematical Sciences
University of Arkansas - Fayetteville, AR 72701
Ph: (479) 575-6324, 575-8630 (fax)
http://definetti.uark.edu/~gpetris/
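To make the equivalence concrete, here is a small sketch comparing the hand-written standardization with scale(). The input values are invented for illustration and are not the poster's test.csv. It also shows where the NaN column comes from: a constant column has zero variance, so the division is 0/0.

```r
# Hypothetical input; column D is constant, so its variance is 0.
xdata <- data.frame(C = c(1, 2), D = c(5, 5), E = c(3, 7))

# Hand-written version from the original post.
manual <- as.data.frame(lapply(xdata, function(x) (x - mean(x)) / sqrt(var(x))))

# scale() centers each column and divides by its standard deviation.
scaled <- scale(xdata)

# Compare the raw values column by column, ignoring names and attributes.
same_values <- all.equal(unlist(manual, use.names = FALSE), as.vector(scaled))
same_values

# The constant column becomes NaN (0 / 0) in both versions.
is.nan(scaled[, "D"])
```

One difference worth keeping in mind: scale() returns a matrix rather than a data frame.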
ryszard.czerminski at pharma.novartis.com wrote:
> How can I remove columns with NaN entries ?

In addition to Andy's helpful suggestion, if your data is a matrix rather than a data.frame, you can use which() with arr.ind=TRUE. For this example, Andy's suggestion is cleaner, however.

> foo <- as.matrix(foo)
> foo
           C   D          E          F
1 -0.7071068 NaN -0.7071068 -0.7071068
2  0.7071068 NaN  0.7071068  0.7071068
> which(is.nan(foo))
[1] 3 4
> which(is.nan(foo),arr.ind=TRUE)
  row col
1   1   2
2   2   2
> unique(which(is.nan(foo),arr.ind=TRUE)[,2])
[1] 2

--
Indigo Industrial Controls Ltd.
http://www.indigoindustrial.co.nz
64-21-343-545
jasont at indigoindustrial.co.nz
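The session above stops at finding the column number. One possible way to finish the job is sketched below, assuming any column containing a NaN should go. The matrix is rebuilt by hand from the printed output, and the names `bad_cols`, `keep`, and `cleaned` are introduced here only for illustration.

```r
# Hypothetical matrix mirroring the printed example: column D is all NaN.
foo <- cbind(C = c(-0.7071068, 0.7071068),
             D = c(NaN, NaN),
             E = c(-0.7071068, 0.7071068),
             F = c(-0.7071068, 0.7071068))

# Column numbers that contain at least one NaN.
bad_cols <- unique(which(is.nan(foo), arr.ind = TRUE)[, "col"])

# setdiff() is used instead of negative indexing, because foo[, -integer(0)]
# would select no columns at all when nothing is NaN.
keep <- setdiff(seq_len(ncol(foo)), bad_cols)

# drop = FALSE keeps the result a matrix even if one column remains.
cleaned <- foo[, keep, drop = FALSE]
colnames(cleaned)
```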
ryszard.czerminski@pharma.novartis.com
2003-Oct-24 19:23 UTC
[R] how to remove NaN columns ?
Nice! I noticed that the structure generated by scale() has two attributes, attr(,"scaled:center") and attr(,"scaled:scale"). How can I access them?

R
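For what it is worth, attributes like these can be read with attr(), giving the attribute name as a string. A short sketch with made-up input follows; the variable names `centers` and `scales` are chosen here for illustration.

```r
# Hypothetical input data.
xdata <- data.frame(A = 3:1, B = 1:3)
xs <- scale(xdata)

# The column means that were subtracted.
centers <- attr(xs, "scaled:center")

# The column standard deviations that were divided by.
scales <- attr(xs, "scaled:scale")

# attributes() lists everything attached to the object.
names(attributes(xs))
```

Both results are numeric vectors named by column, so centers["A"] picks out the mean of column A.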
ryszard.czerminski@pharma.novartis.com
2003-Oct-27 20:34 UTC
[R] how to remove NaN columns ?
I received a lot of good advice on how to remove NaN columns. Thank you all!

The simplest mechanism to center/scale and remove NaN columns seems to be

> xdata <- data.frame(A = 3:1, B = 1:3, rep(9, 3))
> xs <- scale(xdata)
> mask <- sapply(as.data.frame(xs), function(x) all(is.nan(x)))
> scaled.centered.x <- as.data.frame(xs)[!mask]
> scaled.centered.x
   A  B
1  1 -1
2  0  0
3 -1  1

Note that as.data.frame(xs) is important, because apparently the xs returned by scale() is not a data frame, and if as.data.frame() is omitted in the code fragment above we get something like this:

> mask <- sapply(xs, function(x) all(is.nan(x)))
> xs[!mask]
   1    2    3 <NA> <NA> <NA>
   1    0   -1   -1    0    1

Maybe somebody more knowledgeable could give some explanation for this behavior.

Ryszard
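A likely explanation, offered as a sketch rather than a definitive answer: scale() returns a matrix, and sapply() walks over a matrix one cell at a time, whereas it walks over a data frame one column at a time. The mask therefore has one entry per cell, and xs[!mask] indexes the matrix as a flat vector. The column name `K` below is added for readability and is not in the original post.

```r
xdata <- data.frame(A = 3:1, B = 1:3, K = rep(9, 3))  # K is constant
xs <- scale(xdata)

# scale() returns a matrix, not a data frame.
is.matrix(xs)

# sapply() visits each of the 9 cells, not each of the 3 columns.
cell_mask <- sapply(xs, function(x) all(is.nan(x)))
length(cell_mask)

# To stay with the matrix, apply() over columns (MARGIN = 2) instead.
mask <- apply(xs, 2, function(col) all(is.nan(col)))
kept <- xs[, !mask, drop = FALSE]
colnames(kept)
```

Staying with the matrix and using apply() avoids the conversion to a data frame, though the as.data.frame() route in the post works as well.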