Li, Yan (IED)
2007-Oct-18 14:38 UTC
[R] How to avoid conversion to factors (data frame to zoo)
Hi all,

I was trying to convert a data frame to a zoo object so I can use some time series functions like lag(). But it seems that everything then became a factor, so I have to convert it back to numeric to run the correct regressions. Is there a way to avoid it? Here is an example:

#############################
library(zoo)

a <- data.frame(nn = as.character(c("a", "b", "c", "d")),
                dd = as.Date("2007-08-01") + c(1, 2, 3, 4),
                x = rnorm(4), y = rnorm(4))
a
str(a)

b <- zoo(a, order.by = a$dd)
b
str(b)

c <- data.frame(b)
str(c)
#############################

Results of this example:

> a
  nn         dd      x      y
1  a 2007-08-02  0.388 -0.394
2  b 2007-08-03 -0.054 -0.059
3  c 2007-08-04 -1.377  1.100
4  d 2007-08-05 -0.415  0.763
> str(a)
'data.frame': 4 obs. of 4 variables:
 $ nn: Factor w/ 4 levels "a","b","c","d": 1 2 3 4
 $ dd:Class 'Date' num [1:4] 13727 13728 13729 13730
 $ x : num 0.3877 -0.0538 -1.3771 -0.4150
 $ y : num -0.3943 -0.0593 1.1000 0.7632
> b <- zoo(a, order.by = a$dd)
> b
           nn dd         x      y
2007-08-02 a  2007-08-02  0.388 -0.394
2007-08-03 b  2007-08-03 -0.054 -0.059
2007-08-04 c  2007-08-04 -1.377  1.100
2007-08-05 d  2007-08-05 -0.415  0.763
> str(b)
 chr [1:4, 1:4] "a" "b" "c" "d" "2007-08-02" "2007-08-03" ...
 - attr(*, "dimnames")=List of 2
  ..$ : chr [1:4] "1" "2" "3" "4"
  ..$ : chr [1:4] "nn" "dd" "x" "y"
 - attr(*, "index")=Class 'Date' num [1:4] 13727 13728 13729 13730
> c <- data.frame(b)
> str(c)
'data.frame': 4 obs. of 4 variables:
 $ nn: Factor w/ 4 levels "a","b","c","d": 1 2 3 4
  ..- attr(*, "names")= chr "1" "2" "3" "4"
 $ dd: Factor w/ 4 levels "2007-08-02","2007-08-03",..: 1 2 3 4
  ..- attr(*, "names")= chr "1" "2" "3" "4"
 $ x : Factor w/ 4 levels " 0.388","-0.054",..: 1 2 4 3
  ..- attr(*, "names")= chr "1" "2" "3" "4"
 $ y : Factor w/ 4 levels " 0.763"," 1.100",..: 4 3 2 1
  ..- attr(*, "names")= chr "1" "2" "3" "4"
#############################

So after converting to zoo, all the variables became factors. How can I keep this from happening? Thank you very much for any advice.
Yan
Gabor Grothendieck
2007-Oct-18 15:16 UTC
[R] How to avoid conversion to factors (data frame to zoo)
A zoo variable is a vector or matrix with an index, so you can't mix types (factors and numeric) in a single zoo variable; however, you can represent the factor numerically:

library(zoo)
set.seed(1)
a <- data.frame(
  nn = letters[1:4],
  dd = as.Date("2007-08-01") + 1:4,
  x = rnorm(4),
  y = rnorm(4)
)
z <- zoo(data.matrix(a[-2]), a$dd)
z

# or you can put the factor in a second zoo variable:
z2 <- zoo(as.matrix(a[3:4]), a$dd)
zf <- zoo(a$nn, a$dd)
z2
zf
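To see why the mixed data frame ended up as all character, here is a minimal base R sketch. It assumes that zoo() coerces a data frame to a matrix in roughly the way as.matrix() does, which is consistent with the str(b) output in the question but is not checked against the zoo source here. The names m_chr and m_num are made up for illustration, and stringsAsFactors = TRUE is set explicitly so that nn is a factor as in the original post.

```r
# Same toy data as above; stringsAsFactors = TRUE is an explicit assumption
# so that nn is a factor regardless of the R version's default.
set.seed(1)
a <- data.frame(
  nn = letters[1:4],
  dd = as.Date("2007-08-01") + 1:4,
  x = rnorm(4),
  y = rnorm(4),
  stringsAsFactors = TRUE
)

# A matrix holds one type only, so a data frame mixing factor, Date and
# numeric columns collapses to a character matrix.
m_chr <- as.matrix(a)
mode(m_chr)

# data.matrix() replaces the factor by its integer codes, so the result
# stays numeric; the Date column is dropped because it becomes the index.
m_num <- data.matrix(a[-2])
mode(m_num)
m_num[, "nn"]
```

The character matrix is what later turns into factors when it is passed back through data.frame() on a setup where strings default to factors.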
John Kane
2007-Oct-18 15:21 UTC
[R] How to avoid conversion to factors (data frame to zoo)
http://finzi.psych.upenn.edu/R/Rhelp02a/archive/98227.html

Your Rprofile has the setting

options(stringsAsFactors = TRUE)

If you override it globally by using

options(stringsAsFactors = FALSE)

it will give you what you want, but observe Gabor's caveat.

I don't know whether the other solution, setting stringsAsFactors in the data.frame() call, will work, but it looks like it should.
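One thing to keep in mind with the stringsAsFactors route: it only decides whether strings become factors or stay character, it does not turn them back into numbers. The sketch below uses base R only and assumes the data have already been collapsed to a character matrix, as in the str(b) output of the original post. The matrix m is a hypothetical stand-in for the data held in the zoo object, and d0 and d are illustrative names.

```r
# Hypothetical stand-in for the character matrix inside the zoo object.
m <- as.matrix(data.frame(
  nn = c("a", "b", "c", "d"),
  x = c(0.388, -0.054, -1.377, -0.415),
  y = c(-0.394, -0.059, 1.100, 0.763),
  stringsAsFactors = TRUE
))

# With stringsAsFactors = FALSE the columns come back as character,
# not as factors, but still not as numeric.
d0 <- data.frame(m, stringsAsFactors = FALSE)
sapply(d0, class)

# The numeric columns have to be converted explicitly.
d <- d0
d$x <- as.numeric(d$x)
d$y <- as.numeric(d$y)
str(d)
```

Keeping the numeric columns in their own zoo object, as Gabor suggests, avoids this round trip altogether.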