Marius Hofert
2011-Jun-21 22:13 UTC
[R] Tricky (?) conversion from data.frame to matrix where not all pairs exist
Dear expeRts,

In the minimal example below, I have a data.frame containing three "blocks" of years (the years are subsets of 2000 to 2002). For each year and block a certain "value" is given. I would like to create a matrix that has row names given by all years ("2000", "2001", "2002"), and column names given by all blocks ("a", "b", "c"); the entries are then given by the corresponding value, or zero if no such year-block combination exists.

What's a short way to achieve this?

Of course one can set up a matrix and use for loops (see below)... but that's not nice. The problem is that the years are not running from 2000 to 2002 for all three "blocks" (the second block only has year 2001, the third one has only 2000 and 2001). In principle, table() nicely solves such a problem (see below) and fills in zeros. This is what I would like in the end, but all non-zero entries should be given by df$value, not (as table() does) by their counts.

Cheers,

Marius

(df <- data.frame(year=c(2000, 2001, 2002, 2001, 2000, 2001),
                  block=c("a","a","a","b","c","c"), value=1:6))
table(df[,1:2]) # complements the years and fills in 0

year <- c(2000, 2001, 2002)
block <- c("a", "b", "c")
res <- matrix(0, nrow=3, ncol=3, dimnames=list(year, block))
for(i in 1:3){ # year
    for(j in 1:3){ # block
        for(k in 1:nrow(df)){
            if(df[k,"year"]==year[i] && df[k,"block"]==block[j]) res[i,j] <- df[k,"value"]
        }
    }
}
res # does the job; but seems complicated
Dennis Murphy
2011-Jun-21 22:35 UTC
[R] Tricky (?) conversion from data.frame to matrix where not all pairs exist
Hi:

xtabs(value ~ year + block, data = df)
      block
year   a b c
  2000 1 0 5
  2001 2 4 6
  2002 3 0 0

HTH,
Dennis
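For readers who need a plain matrix rather than a table object, the xtabs() approach can be sketched as follows. This is only a minimal illustration using the data from the original post; the variable names tab and res are chosen here for the example. Note that xtabs() sums the left-hand side within each cell, so it reproduces df$value only when each year-block pair occurs at most once.

```r
# Data from the original post.
df <- data.frame(year  = c(2000, 2001, 2002, 2001, 2000, 2001),
                 block = c("a", "a", "a", "b", "c", "c"),
                 value = 1:6)

# xtabs() sums 'value' within each year/block cell; cells with no rows become 0.
tab <- xtabs(value ~ year + block, data = df)

# Strip the "xtabs"/"table" classes and the stored call to leave a plain matrix.
res <- unclass(tab)
attr(res, "call") <- NULL
res
```

The dimnames of res keep their names ("year" and "block"); they can be dropped with names(dimnames(res)) <- NULL if that is not wanted.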
William Dunlap
2011-Jun-21 22:35 UTC
[R] Tricky (?) conversion from data.frame to matrix where not all pairs exist
Subscripting with a 2-column integer matrix (one column of row indices and one column of the corresponding column indices) will do the job:

> res <- matrix(0, nrow=3, ncol=3, dimnames=list(year, block))
> res[cbind(match(df$year,rownames(res)), match(df$block,colnames(res)))] <- df$value
> res
     a b c
2000 1 0 5
2001 2 4 6
2002 3 0 0

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com
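The same matrix-subscript idea can be sketched without hard-coding the years and blocks, by deriving them from the data. This is an illustrative variant, not part of the reply above; the names years, blocks and idx are assumptions made for the example.

```r
# Data from the original post.
df <- data.frame(year  = c(2000, 2001, 2002, 2001, 2000, 2001),
                 block = c("a", "a", "a", "b", "c", "c"),
                 value = 1:6)

# Derive the row and column labels from the data itself.
years  <- sort(unique(df$year))
blocks <- sort(unique(as.character(df$block)))

# Start from an all-zero matrix so that missing pairs stay 0.
res <- matrix(0, nrow = length(years), ncol = length(blocks),
              dimnames = list(years, blocks))

# Each row of idx is one (row, column) position; one value is assigned per row.
idx <- cbind(match(df$year, years),
             match(as.character(df$block), blocks))
res[idx] <- df$value
res
```

Unlike xtabs(), this assignment does not sum: if a year-block pair occurred more than once, a later value would overwrite an earlier one.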