Dear Listmembers,

I'm trying to fill up a dataframe depending on an arbitrary list of references.

Here is my code, which works:

dat <- data.frame(c(60001,60001,60050,60050,60050),c(27,129,618,27,1579))
LR <- sort(unique(dat[,1]))
LC <- sort(unique(dat[,2]))
m <- as.data.frame(matrix(data=NA, nrow=length(LR), ncol=length(LC),
                          dimnames=list(LR,LC)))

for(i in 1:nrow(dat)){
  m[as.character(dat[i,1]), as.character(dat[i,2])] <- 1
}
m[is.na(m)] <- 0

Now I'm trying to avoid the loop, because it takes ages for a list of 20000 entries, but I have run out of ideas.
Should I inflate my list beforehand, and if so, how? Can I address the dataframe fields more efficiently?

Thanks for your help.

-- 
Dr. Florian Jansen
Geobotany & Nature Conservation
Institute of Botany and Landscape Ecology
Ernst-Moritz-Arndt-University
Grimmer Str. 88
17487 Greifswald
Germany
+49 (0)3834 86 4147
Try to assign some names to your initial variables:

dat <- data.frame(A=c(60001,60001,60050,60050,60050), B=c(27,129,618,27,1579))

And what you want is simply:

> table(dat)
       B
A       27 129 618 1579
  60001  1   1   0    0
  60050  1   0   1    1

Why do you need it as a dataframe anyway?

Hth,
Adrian

On Monday 24 September 2007, Florian Jansen wrote:
> Now I'm trying to prevent the loop, because it takes ages for a list of
> 20000 entries, but I run out of ideas.
> Should I inflate my list beforehand and how? Can I address the dataframe
> fields more efficiently?

-- 
Adrian Dusa
Romanian Social Data Archive
1, Schitu Magureanu Bd
050025 Bucharest sector 5
Romania
Tel./Fax: +40 21 3126618 \
          +40 21 3120210 / int.101
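If a dataframe of 0/1 flags really is needed, the table() result can be converted afterwards. The following is only a sketch under two assumptions that are not stated in the thread: that the same (A, B) pair may occur more than once in the real data (table() would then return a count above 1, whereas the original loop always stores 1), and that the wide layout with the A values as row names is what is wanted.

```r
# Example data from the thread, with named columns as suggested above.
dat <- data.frame(A = c(60001, 60001, 60050, 60050, 60050),
                  B = c(27, 129, 618, 27, 1579))

# Cross-tabulate: one row per A value, one column per B value (counts).
tab <- table(dat)

# Convert the 2-d table to a wide data frame; row names are the A values,
# column names are the B values (kept as "27", "129", ...).
m <- as.data.frame.matrix(tab)

# Assumption: duplicates may exist, so collapse counts to presence/absence.
m[m > 0] <- 1
```

Cells are then addressed as in the original code, e.g. m["60050", "618"].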
Wayne.W.Jones at shell.com
2007-Sep-24 12:16 UTC
[R] Performance problems to fill up a dataframe
Use table:

dat <- data.frame(c(60001,60001,60050,60050,60050),c(27,129,618,27,1579))
table(dat[,1],dat[,2])

         27 129 618 1579
  60001   1   1   0    0
  60050   1   0   1    1

Regards

Wayne

-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Florian Jansen
Sent: 24 September 2007 10:37
To: r-help at r-project.org
Subject: [R] Performance problems to fill up a dataframe

[...]
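For completeness, the loop in the original post can also be replaced without table(), by indexing a matrix with a two-column matrix of (row, column) positions. This is a sketch of that alternative, not something proposed in the replies; the name idx is introduced here for illustration only, and the dimnames are converted with as.character() explicitly.

```r
# Example data from the original post.
dat <- data.frame(c(60001, 60001, 60050, 60050, 60050),
                  c(27, 129, 618, 27, 1579))

LR <- sort(unique(dat[, 1]))   # row labels
LC <- sort(unique(dat[, 2]))   # column labels

# Start from a matrix of zeros, so no NA replacement is needed afterwards.
m <- matrix(0, nrow = length(LR), ncol = length(LC),
            dimnames = list(as.character(LR), as.character(LC)))

# idx (illustrative name): one (row, column) position per row of dat.
idx <- cbind(match(dat[, 1], LR), match(dat[, 2], LC))

# A single vectorised assignment replaces the for loop.
m[idx] <- 1

# Convert to a dataframe only at the end, if one is required.
m <- as.data.frame(m)
```

Filling a matrix and converting once at the end avoids the repeated dataframe assignments, which are the slow part of the loop version.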