aajit75
2011-Oct-31 13:09 UTC
[R] How to get Quartiles when data contains both numeric variables and factors
When a data frame contains both factor and numeric variables, how do I get quartiles for all the numeric variables?

n <- 100
x1 <- runif(n)
x2 <- runif(n)
x3 <- x1 + x2 + runif(n)/10
x4 <- x1 + x2 + x3 + runif(n)/10
x5 <- factor(sample(c('a','b','c'),n,replace=TRUE))
x6 <- factor(1*(x5=='a' | x5=='c'))
data1 <- cbind(x1,x2,x3,x4,x5,x6)
data <- data.frame(data1)

data <- within(data,{x5 <- factor(x5)})
x <- data

qs <- sapply(x, function(x) quantile(x, c(0.01, 0.99)))

I get an error:

Error in quantile.default(x, c(min_pct, max_pct)) : factors are not allowed

Thanks for the help.
andrija djurovic
2011-Oct-31 15:39 UTC
[R] How to get Quartiles when data contains both numeric variables and factors
Hi,

you are almost there:

> sapply(x, function(x) quantile(as.numeric(x), c(0.01, 0.99)))
           x1          x2       x3        x4 x5 x6
1%  0.0351777 0.007628441 0.225533 0.4459064  1  1
99% 0.9938919 0.964901423 1.826894 3.6226944  3  2

Andrija
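A note on the x5 and x6 columns in the output above: as.numeric() applied to a factor returns the factor's internal integer level codes, so for a factor column the result is quantiles of codes rather than of data values. A minimal sketch with made-up vectors (f and g are hypothetical and do not come from the thread):

```r
# as.numeric() on a factor returns the internal level codes, not the labels.
f <- factor(c("a", "c", "b", "c"))
levels(f)                     # levels are sorted: "a" "b" "c"
codes_f <- as.numeric(f)      # 1 3 2 3, positions within levels(f)

# A factor whose labels look numeric still yields codes, not the labels.
g <- factor(c("10", "20", "20"))
codes_g  <- as.numeric(g)                  # 1 2 2
values_g <- as.numeric(as.character(g))    # 10 20 20, the labels as numbers
```

If the labels themselves are numbers, converting through as.character() first recovers them; for labels such as 'a', 'b', 'c' a quantile has no natural meaning at all.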
R. Michael Weylandt
2011-Oct-31 15:39 UTC
[R] How to get Quartiles when data contains both numeric variables and factors
Just add something to skip the non-numeric variables, e.g.,

lapply(x, function(x) if (is.numeric(x)) quantile(x, c(0.01, 0.99)) else levels(x))

If you want to use sapply(), you'll need the factor case to return something of length 2 (2x1), matching the two quantiles, so that it can all be simplified nicely into a matrix.

Michael
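The sapply() variant mentioned above is not spelled out in the thread. One possible sketch, assuming it is acceptable to report NA for factor columns, is shown here. The seed, the reduced two-column data frame, and the names probs and col are illustrative assumptions, not part of the original post.

```r
set.seed(1)  # arbitrary seed, only to make the sketch reproducible
n <- 100
x <- data.frame(
  x1 = runif(n),
  x5 = factor(sample(c("a", "b", "c"), n, replace = TRUE))
)

probs <- c(0.01, 0.99)

qs <- sapply(x, function(col) {
  if (is.numeric(col)) {
    quantile(col, probs)
  } else {
    # Placeholder with the same length as the quantile result, so that
    # sapply() can simplify all columns into one matrix.
    rep(NA_real_, length(probs))
  }
})

qs  # a 2-row matrix with one column per variable; the x5 column is NA
```

Returning NA keeps every column in the result, which can be handy when the output has to line up with the original column order. Dropping the factor columns before calling sapply() is the other obvious choice.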
andrija djurovic
2011-Oct-31 15:45 UTC
[R] How to get Quartiles when data contains both numeric variables and factors
Or this:

> str(x)
'data.frame': 100 obs. of 6 variables:
 $ x1: num 0.4548 0.0352 0.6353 0.6017 0.8588 ...
 $ x2: num 0.849 0.335 0.986 0.617 0.212 ...
 $ x3: num 1.35 0.46 1.67 1.23 1.14 ...
 $ x4: num 2.67 0.91 3.31 2.48 2.28 ...
 $ x5: Factor w/ 3 levels "1","2","3": 3 1 3 3 3 1 2 3 3 1 ...
 $ x6: num 2 2 2 2 2 2 1 2 2 2 ...

> sapply(x[,sapply(x,is.numeric)], function(x) quantile(as.numeric(x),c(0.01, 0.99)))
           x1          x2       x3        x4 x6
1%  0.0351777 0.007628441 0.225533 0.4459064  1
99% 0.9938919 0.964901423 1.826894 3.6226944  2

Hope this helps.

Andrija
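The str() output above shows x6 as numeric and x5 with levels "1","2","3" instead of 'a','b','c'. This follows from the cbind() call in the original post: cbind() on plain vectors builds a numeric matrix, so factors are replaced by their integer codes before data.frame() ever sees them. A minimal sketch of one way around it, assuming the data frame is built directly with data.frame() (the seed and the names m, dat and num_cols are illustrative, and only some of the original columns are used):

```r
set.seed(42)  # arbitrary seed, only to make the sketch reproducible
n  <- 100
x1 <- runif(n)
x2 <- runif(n)
x5 <- factor(sample(c("a", "b", "c"), n, replace = TRUE))
x6 <- factor(1 * (x5 == "a" | x5 == "c"))

# cbind() yields a numeric matrix: the factors become integer codes.
m <- cbind(x1, x5, x6)
is.numeric(m)

# data.frame() keeps each column's own type, so x5 and x6 stay factors.
dat <- data.frame(x1, x2, x5, x6)

# Select the numeric columns (drop = FALSE keeps a data frame even when
# only one column matches), then compute the quantiles column by column.
num_cols <- vapply(dat, is.numeric, logical(1))
qs <- vapply(dat[, num_cols, drop = FALSE], quantile, numeric(2),
             probs = c(0.01, 0.99))
qs
```

With the factors preserved, the is.numeric() filter excludes both x5 and x6, so no as.numeric() conversion is needed. vapply() is used here in place of sapply() only because it checks that every column really returns two numbers.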