df<-data.frame()
df[1:8,1]<-c("1","2","5","3","1","4","3","5") ##identifier 1
df[1:8,2]<-c("c","a","b","c","a","b","b","a") ##identifier 2
df[1:8,3]<-c(1,2,3,4,5,6,7,8) ##value

##Each unique combination of identifiers identifies a datapoint

##What I am trying to do is create a matrix with the values of the third column
##and row/column names taken from the 1st and 2nd columns. As you can see, a lot
##of combinations of identifier 1 and identifier 2, for example (1,b), do not
##have a value attached to them, and I would like those values in the matrix
##to be NA.

##From the ecodist package, I have tried crosstab(df[,1],df[,2],df[,3])
##It works perfectly in this case, but when I use it on my actual data.frame
##it ALWAYS crashes. Differences from my real data.frame:
## 1) the actual df is 3 million rows, 2) df[,1] and df[,2] are
## of class "factor" rather than class "character".
##Knowing this, I have tried as.character(df[,1]) and the same for df[,2],
##but to no avail.

--
View this message in context: http://r.789695.n4.nabble.com/Converting-3-columns-of-a-data-frame-into-a-matrix-tp2263540p2263540.html
Sent from the R help mailing list archive at Nabble.com.
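One way to build the matrix described above without any extra package is to allocate an NA-filled matrix once and fill it by matrix indexing. This is only a sketch under assumptions: the column names id1, id2 and value are made up for illustration, and the identifiers are stored as factors to mirror the real data. It is not what crosstab does internally.

```r
# Toy data from the post; the column names are illustrative assumptions.
df <- data.frame(
  id1   = c("1", "2", "5", "3", "1", "4", "3", "5"),
  id2   = c("c", "a", "b", "c", "a", "b", "b", "a"),
  value = c(1, 2, 3, 4, 5, 6, 7, 8),
  stringsAsFactors = TRUE
)

# factor() drops unused levels, so the matrix gets one row/column per
# identifier that actually occurs.
rows <- factor(df$id1)
cols <- factor(df$id2)

# Allocate the result once, filled with NA for missing combinations.
m <- matrix(NA_real_,
            nrow = nlevels(rows), ncol = nlevels(cols),
            dimnames = list(levels(rows), levels(cols)))

# A two-column index matrix assigns each value to its (row, column) cell.
m[cbind(as.integer(rows), as.integer(cols))] <- df$value

print(m)
```

Because the only large objects are the result matrix and the two integer index vectors, this avoids the intermediate copies that a melt/cast round trip creates. It assumes each (id1, id2) pair occurs at most once; with duplicates, the last value would silently win.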
##I have also tried the reshape package
colnames(df)<-c("date_","id","totret") ##the code below assumes these column names
library(reshape)
mm <- melt(df, id=c("date_", "id"))
mm1 <- cast(mm, date_~id)

aba <- mm1[,-1]

final <- as.matrix(aba)
colnames(final) <- colnames(aba) ##one name per id column; df$id has one entry per row of df
rownames(final) <- mm1$date_
final

##Again it works perfectly here, but when I get to my real dataset, I get
##a "Cannot allocate vector of size 27.4 MB" error

It seems that the main problem is memory size. Is there an efficient way to do this?

colnames(df)<-c("date_","id","totret")
library(reshape)
mm <- melt(df, id=c("date_", "id"))
mm1 <- cast(mm, date_~id)

aba <- mm1[,2:2365]

final <- as.matrix(aba)
colnames(final) <- Returns.nodup$id
rownames(final) <- mm1$date_

--
View this message in context: http://r.789695.n4.nabble.com/Converting-3-columns-of-a-data-frame-into-a-matrix-tp2263540p2263695.html
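For what it is worth, the size in that error message is the size of the single allocation that failed, not the total memory in use, so a 27.4 MB request can fail when the session is already nearly full. A rough sketch of how to estimate what the final dense matrix alone would need is below. The helper name dense_matrix_mb is hypothetical, the 2364 columns come from the 2:2365 range in the post, and the row count of 1000 is purely an illustrative assumption since the number of dates is not given.

```r
# Hypothetical helper: approximate size in MB of a dense numeric matrix.
# Each double takes 8 bytes; the small fixed header is ignored.
dense_matrix_mb <- function(n_rows, n_cols) {
  n_rows * n_cols * 8 / 2^20
}

n_ids   <- length(2:2365)  # 2364 id columns, taken from the post
n_dates <- 1000            # ASSUMPTION: the real number of dates is unknown

estimate <- dense_matrix_mb(n_dates, n_ids)
print(estimate)

# Sanity check of the 8-bytes-per-cell rule on a small real matrix.
small <- matrix(NA_real_, nrow = 100, ncol = 100)
print(object.size(small))
```

If the estimate for the real dimensions is modest, the pressure is more likely coming from intermediate copies (the melted data frame, the cast result, and the as.matrix copy) and from other objects already in the workspace than from the final matrix itself.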
Okay, crosstab seems to work when I clear out a bunch of my variables (went from 800 MB Vcol to 100 MB).

Can anyone explain how memory works in R? My lack of understanding of memory was clearly the problem.

--
View this message in context: http://r.789695.n4.nabble.com/Converting-3-columns-of-a-data-frame-into-a-matrix-tp2263540p2263824.html
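As a starting point on the memory question: R keeps every object in the workspace in RAM, so large leftovers reduce what is available for the next allocation. The sketch below shows one way to see which objects are largest and then release one of them. The helper workspace_sizes and the environment demo_env are hypothetical names used for illustration; in a real session the default globalenv() would be inspected instead. The "Vcol" figure in the post may correspond to the Vcells row printed by gc(), but that is an assumption.

```r
# Hypothetical helper: sizes in bytes of all objects in an environment,
# largest first.
workspace_sizes <- function(env = globalenv()) {
  nms <- ls(envir = env)
  sizes <- vapply(
    nms,
    function(nm) as.numeric(object.size(get(nm, envir = env))),
    numeric(1)
  )
  sort(sizes, decreasing = TRUE)
}

# A throwaway environment stands in for a cluttered workspace.
demo_env <- new.env()
assign("big", numeric(1e6), envir = demo_env)  # about 8 MB of doubles
assign("small", 1:10, envir = demo_env)

sizes <- workspace_sizes(demo_env)
print(round(sizes / 2^20, 3))  # sizes in MB

# Remove the large object, then let the garbage collector reclaim it.
rm(list = "big", envir = demo_env)
print(gc())  # the Vcells row covers vector storage
```

rm() only removes the name; the memory is actually reclaimed at the next garbage collection, which R triggers automatically when needed. Calling gc() by hand mainly serves to report current usage.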