siddu479
2012-Oct-16 08:08 UTC
[R] Excluding all the columns from a data frame if the standard deviation of that column is zero (0).
Hi all,

I have a data frame with nearly 10K columns, most of which have a standard deviation (across all rows) of zero. I want to exclude all of those columns from the data frame before further processing.

I tried the following:

    data <- read.csv("data.CSV", header=T)

    for(i in 2:ncol(data))
      if(sd(data[,i])==0){
        df[,i] <-NULL
      }

The data columns are 2:ncol, but I get the error "Error in df[, i] <- NULL : object of type 'closure' is not subsettable".

Can anyone suggest the right method to accomplish this?

-----
Sidda
Business Analyst Lead
Applied Materials Inc.
--
View this message in context: http://r.789695.n4.nabble.com/Excluding-all-teh-columns-from-a-data-frame-if-the-standard-deviation-of-that-column-is-zero-0-tp4646310.html
Sent from the R help mailing list archive at Nabble.com.
R. Michael Weylandt
2012-Oct-16 10:24 UTC
On Tue, Oct 16, 2012 at 9:08 AM, siddu479 <onlyfordigitalstuff at gmail.com> wrote:
> Hi All,
>
> I have a data frame where nearly 10K columns of data, where most of them
> have standard deviation( of all rows) as zero.
> I want to exclude all the columns from the data frame and proceed to further
> processing.
>
> I tried like blow.
> *data <- read.csv("data.CSV", header=T)
>
> for(i in 2:ncol(data))
> if(sd(data[,i])==0){
> df[,i] <-NULL
> }
> *
> where I have the data columns from 2:ncol, but getting the error "Error in
> df[, i] <- NULL : object of type 'closure' is not subsettable"
>
> Can any one suggest the right method to accomplish this.
>

A perfect example of why "df" is a bad variable name. Here you are getting the function (= closure, more or less) df, the density function of the F distribution, instead of the variable "df", which was never created. Since a function can't be subsetted, you get the error.

In fact, I think you really just want this one-liner:

    !(apply(data, 2, sd) == 0)

which gives a logical vector that can be used to subset the columns.

In the same vein as the df problem, "data" is also a bad variable name (it's also a pre-defined function used for loading, surprise surprise!, data), but R is smart enough to keep them straight in this simple example. In your real script, however, I'd strongly suggest you change it.

Cheers,
Michael
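As a rough illustration of both points (where the error comes from, and how the logical vector is used to subset), here is a minimal sketch. The data frame `measurements` and its column names are made up for the example and stand in for the poster's CSV; it also assumes every column is numeric, since `apply()` converts the data frame to a matrix first.

```r
# Hypothetical stand-in for the poster's data; all columns are numeric.
measurements <- data.frame(
  id       = 1:5,
  constant = rep(3, 5),
  varying  = c(1.2, 0.7, 3.1, 2.2, 1.9)
)

# No variable named 'df' has been created, so the name resolves to
# stats::df, the density function of the F distribution.
print(is.function(df))

# Trying to assign into that function fails; the error is captured
# here instead of stopping the script.
attempt <- tryCatch({ df[, 2] <- NULL; NULL }, error = function(e) e)
print(conditionMessage(attempt))

# The one-liner yields one TRUE/FALSE per column ...
keep <- !(apply(measurements, 2, sd) == 0)

# ... which selects the columns to retain.
cleaned <- measurements[, keep, drop = FALSE]
print(names(cleaned))
```

`drop = FALSE` is there so the result stays a data frame even if only one column survives.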
Rui Barradas
2012-Oct-16 10:35 UTC
Hello,

You're calling your dataset 'data' and 'df' in the same instruction, hence the error. (It would be just as wrong to call it by different names in different instructions.)

Also, both 'data' and 'df' are really bad names for objects: they are already R functions. Name your dataset something else, and be consistent in the use of that name.

Hope this helps,

Rui Barradas

______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
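For readers who want to keep the loop form of the original attempt, a sketch with one consistent object name might look like the following. The data frame `readings` and its columns are hypothetical. Note one extra assumption that is not raised in the thread: the sketch collects the column indices first and drops them afterwards, because deleting columns inside a forward loop over `2:ncol(...)` shifts the positions of the columns still to be checked.

```r
# Hypothetical example data; 'id' plays the role of the first, non-data column.
readings <- data.frame(
  id     = c("a", "b", "c", "d"),
  flat1  = rep(0, 4),
  signal = c(2.5, 3.5, 1.0, 4.0),
  flat2  = rep(7, 4)
)

# Collect the indices of zero-sd columns; nothing is deleted inside the loop.
drop_idx <- integer(0)
for (i in 2:ncol(readings)) {
  if (sd(readings[, i]) == 0) {
    drop_idx <- c(drop_idx, i)
  }
}

# Drop all flagged columns in one step (skipped if none were flagged,
# because indexing with an empty negative vector would select nothing).
if (length(drop_idx) > 0) {
  readings <- readings[, -drop_idx, drop = FALSE]
}

print(names(readings))
```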
arun
2012-Oct-16 12:04 UTC
Hi,

Maybe this helps.

    set.seed(1)
    dat1 <- data.frame(col1=rep(5,10), col2=rnorm(10,15), col3=rep(25,10), col4=rnorm(10,25))
    dat1[sapply(dat1, function(x) sd(x) != 0)]
    #       col2     col4
    #1  14.37355 26.51178
    #2  15.18364 25.38984
    #3  14.16437 24.37876
    #4  16.59528 22.78530
    #5  15.32951 26.12493
    #6  14.17953 24.95507
    #7  15.48743 24.98381
    #8  15.73832 25.94384
    #9  15.57578 25.82122
    #10 14.69461 25.59390

A.K.
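The same idea can be made a little more defensive for data read from a CSV, where the first column may not be numeric and some cells may be missing. The sketch below is only one possible variant, not something proposed in the thread: the helper `is_constant`, its tolerance `tol`, and the data frame `dat2` are all hypothetical. It treats non-numeric columns, and columns whose standard deviation cannot be computed, as columns to keep.

```r
# Hypothetical helper: TRUE only for numeric columns whose standard
# deviation is (nearly) zero. 'tol' is an assumed tolerance for
# floating-point noise, not a value taken from the thread.
is_constant <- function(x, tol = 1e-12) {
  if (!is.numeric(x)) return(FALSE)   # leave non-numeric columns alone
  s <- sd(x, na.rm = TRUE)            # ignore missing values
  !is.na(s) && s < tol                # undefined sd counts as "not constant"
}

# Made-up data: a label column, a constant column with one NA,
# and a column that really varies.
dat2 <- data.frame(
  label = c("x", "y", "z", "w"),
  const = c(5, 5, NA, 5),
  vary  = c(1, 2, 3, 4),
  stringsAsFactors = FALSE
)

# vapply() guarantees exactly one logical value per column.
constant_cols <- vapply(dat2, is_constant, logical(1))
cleaned2 <- dat2[!constant_cols]
print(names(cleaned2))
```

Whether a column with missing values should count as constant depends on the data, so the `na.rm = TRUE` choice here is worth revisiting for a real script.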