Li, Yan (IED)
2007-Oct-18 14:38 UTC
[R] How to avoid conversion to factors (data frame to zoo)
Hi all,
I was trying to convert a data frame to a zoo object so I can use some
time series functions like lag(). But it seems then everything became a
factor, so I have to convert it back to numeric to run the correct
regressions. Is there a way to avoid it? Here is an example:
#############################
library(zoo)
a <- data.frame(nn = as.character(c("a", "b", "c", "d")),
                dd = as.Date("2007-08-01") + c(1, 2, 3, 4),
                x = rnorm(4), y = rnorm(4))
a
str(a)
b <- zoo(a, order.by = a$dd)
b
str(b)
c <- data.frame(b)
str(c)
#############################
Results of this example:
> a
  nn         dd      x      y
1  a 2007-08-02  0.388 -0.394
2  b 2007-08-03 -0.054 -0.059
3  c 2007-08-04 -1.377  1.100
4  d 2007-08-05 -0.415  0.763
> str(a)
'data.frame': 4 obs. of 4 variables:
 $ nn: Factor w/ 4 levels "a","b","c","d": 1 2 3 4
 $ dd:Class 'Date' num [1:4] 13727 13728 13729 13730
 $ x : num 0.3877 -0.0538 -1.3771 -0.4150
 $ y : num -0.3943 -0.0593 1.1000 0.7632
> b <- zoo(a, order.by = a$dd)
> b
           nn dd         x      y
2007-08-02 a  2007-08-02  0.388 -0.394
2007-08-03 b  2007-08-03 -0.054 -0.059
2007-08-04 c  2007-08-04 -1.377  1.100
2007-08-05 d  2007-08-05 -0.415  0.763
> str(b)
 chr [1:4, 1:4] "a" "b" "c" "d" "2007-08-02" "2007-08-03" ...
 - attr(*, "dimnames")=List of 2
 ..$ : chr [1:4] "1" "2" "3" "4"
 ..$ : chr [1:4] "nn" "dd" "x" "y"
 - attr(*, "index")=Class 'Date' num [1:4] 13727 13728 13729 13730
> c <- data.frame(b)
> str(c)
'data.frame': 4 obs. of 4 variables:
 $ nn: Factor w/ 4 levels "a","b","c","d": 1 2 3 4
 ..- attr(*, "names")= chr "1" "2" "3" "4"
 $ dd: Factor w/ 4 levels "2007-08-02","2007-08-03",..: 1 2 3 4
 ..- attr(*, "names")= chr "1" "2" "3" "4"
 $ x : Factor w/ 4 levels " 0.388","-0.054",..: 1 2 4 3
 ..- attr(*, "names")= chr "1" "2" "3" "4"
 $ y : Factor w/ 4 levels " 0.763"," 1.100",..: 4 3 2 1
 ..- attr(*, "names")= chr "1" "2" "3" "4"
#############################
So after converting to zoo, all the variables became factors. How can I
keep this from happening? Thank you very much for any advice.
Yan
Gabor Grothendieck
2007-Oct-18 15:16 UTC
[R] How to avoid conversion to factors (data frame to zoo)
A zoo variable is a vector or matrix with an index so you can't
mix types (factors and numeric) in a single zoo variable; however,
you can represent the factor numerically:
library(zoo)
set.seed(1)
a <- data.frame(
  nn = letters[1:4],
  dd = as.Date("2007-08-01") + 1:4,
  x = rnorm(4),
  y = rnorm(4)
)
z <- zoo(data.matrix(a[-2]), a$dd)
z
# or you can put the factor in a second zoo variable:
z2 <- zoo(as.matrix(a[3:4]), a$dd)
zf <- zoo(a$nn, a$dd)
z2
zf
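
To see where the coercion comes from and what the numeric representation gives you, here is a minimal sketch. It assumes the zoo package is installed and reuses the same toy data; the names m_chr, z_lag and d are only illustrative. A zoo object stores a single matrix, so mixed columns fall back to character, while data.matrix() keeps everything numeric, so lag() and a conversion back to a data frame stay numeric.

```r
library(zoo)

set.seed(1)
a <- data.frame(
  nn = factor(letters[1:4]),
  dd = as.Date("2007-08-01") + 1:4,
  x  = rnorm(4),
  y  = rnorm(4)
)

# as.matrix() on mixed columns falls back to character,
# which is the kind of matrix str(b) showed in the original post.
m_chr <- as.matrix(a)
mode(m_chr)            # "character"

# data.matrix() maps the factor to its integer codes instead,
# so the matrix (and the zoo object built on it) stays numeric.
z <- zoo(data.matrix(a[c("nn", "x", "y")]), order.by = a$dd)
mode(coredata(z))      # "numeric"

# lag() with k = -1 pairs each date with the previous observation;
# the first date is dropped because na.pad defaults to FALSE.
z_lag <- lag(z, k = -1)

# Back to a data frame: the columns are numeric, not factors.
d <- as.data.frame(z)
str(d)
```

The integer codes in nn are only labels (1 for "a", 2 for "b", and so on), so treat that column as an identifier rather than as a regressor.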
John Kane
2007-Oct-18 15:21 UTC
[R] How to avoid conversion to factors (data frame to zoo)
http://finzi.psych.upenn.edu/R/Rhelp02a/archive/98227.html

Your Rprofile has the setting options(stringsAsFactors = TRUE). If you override it globally by using options(stringsAsFactors = FALSE), it will give you what you want, but observe Gabor's caveat.

I don't know whether the other solution, setting stringsAsFactors in the data.frame() call, will work, but it looks like it should.
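
A short sketch of the per-call variant, assuming the zoo package is installed. The numbers are the rounded values printed in the original post, and the explicit as.matrix() call is used here only to make the coercion visible. It suggests that stringsAsFactors = FALSE keeps nn as character in the data frame, but the caveat about a single matrix type still applies once the data go into zoo.

```r
library(zoo)

a <- data.frame(
  nn = c("a", "b", "c", "d"),
  dd = as.Date("2007-08-01") + 1:4,
  x  = c(0.388, -0.054, -1.377, -0.415),
  y  = c(-0.394, -0.059, 1.100, 0.763),
  stringsAsFactors = FALSE
)
is.character(a$nn)          # TRUE: nn is not a factor here

# The caveat still applies: one matrix, one type. Mixing nn and dd
# with x and y gives a character matrix, so x and y become strings.
b <- zoo(as.matrix(a), order.by = a$dd)
is.character(coredata(b))   # TRUE

# Keeping only the numeric columns avoids the coercion.
z <- zoo(as.matrix(a[c("x", "y")]), order.by = a$dd)
is.numeric(coredata(z))     # TRUE
```

So the option changes what the data frame holds, not what zoo can store; the numeric columns still need to go into their own zoo object.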