VinceD
2008-Jul-09 16:15 UTC
[R] build matrix with the content of one column of a data frame in function of two factors
Hello,

First, thanks for your help (and sorry for my English!).

I have a data frame in which each row represents one person's vote on one piece of content (a percentage, restricted to 20, 40, 60, 80 or 100). It has three columns: name (the voter's name), content_id, and value (the vote):

  str(votesredac)
  'data.frame': 1000 obs. of 3 variables:
   $ name      : chr "Guillemette Faure" "Guillemette Faure" "Guillemette Faure" "Pascal Rich\xe9" ...
   $ content_id: num 385241 384926 384635 383266 384814 ...
   $ value     : num 100 100 100 20 100 100 20 100 100 100 ...

I want to build a matrix with one row for each voter and one column for each content item, containing the votes. The matrix may contain NAs, since not every person has voted on every item. Labelled rows and columns would be a plus.

Thanks again!

--
View this message in context: http://www.nabble.com/build-matrix-with-the-content-of-one-column-of-a-data-frame-in-function-of-two-factors-tp18364752p18364752.html
Sent from the R help mailing list archive at Nabble.com.
VinceD
2008-Jul-09 16:59 UTC
[R] build matrix with the content of one column of a data frame in function of two factors
It seems that the following does what I want:

  attach(votesredac)
  tapply(value, list(name, content_id), mean)

The only thing is that I don't really need to take a mean: each (name, content_id) cell holds either one value or none.
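Since the mean is only a placeholder when each (name, content_id) pair occurs at most once, one possible variant is to pass tapply() a function that simply returns the single value and stops if it ever sees more than one. This is only a sketch: the small `votes` data frame and the helper `only_value` are made up for illustration and are not part of the original data. It also uses with() rather than attach(), so nothing is left on the search path.

```r
# Hypothetical stand-in for votesredac, with the same three columns.
votes <- data.frame(
  name       = c("A", "A", "B"),
  content_id = c(1, 2, 1),
  value      = c(100, 20, 60),
  stringsAsFactors = FALSE
)

# Hypothetical helper: return the single vote found in a cell and
# stop with an error if a (name, content_id) pair occurs more than once.
only_value <- function(v) {
  stopifnot(length(v) == 1)
  v
}

# with() evaluates the call inside the data frame, so attach() is not needed.
# Cells with no vote are never passed to only_value() and come back as NA.
vote_matrix <- with(votes, tapply(value, list(name, content_id), only_value))
vote_matrix
```

The result is still a labelled matrix, with voters as row names and content ids as column names. Unlike mean, this version fails loudly if the data contain a duplicate vote instead of silently averaging it.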
VinceD
2008-Jul-10 13:11 UTC
[R] build matrix with the content of one column of a data frame in function of two factors
So the solution is:

  tapply(content, list(factor1, factor2), mean)

An example of what it does:

> my.data
     name  item vote
1 Ricardo  Coke   20
2 Ricardo Fanta   60
3 Ricardo Pepsi  100
4   Marie Pepsi   40
5   Marie  Coke   60
6   Julia Fanta   60
7   Julia  Coke  100
> attach(my.data)
> tapply(vote, list(name, item), mean) -> tastes
> tastes
        Coke Fanta Pepsi
Julia    100    60    NA
Marie     60    NA    40
Ricardo   20    60   100

And then you can compute the distance between people if you want:

> dist(tastes, diag = T)
            Julia    Marie  Ricardo
Julia     0.00000
Marie    69.28203  0.00000
Ricardo  97.97959 88.31761  0.00000

That's it!
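A note on how dist() treats the NAs in a matrix like this, as far as I understand its documentation: for the default Euclidean distance, columns where either row is NA are left out, and the sum of squares is scaled up by the ratio of total columns to columns actually used. The sketch below rebuilds the example data and redoes the Julia/Marie distance by hand to check this; `shared` and `by_hand` are names introduced here for illustration only.

```r
# Rebuild the example data frame from the session above.
my.data <- data.frame(
  name = c("Ricardo", "Ricardo", "Ricardo", "Marie", "Marie", "Julia", "Julia"),
  item = c("Coke", "Fanta", "Pepsi", "Pepsi", "Coke", "Fanta", "Coke"),
  vote = c(20, 60, 100, 40, 60, 60, 100),
  stringsAsFactors = FALSE
)

# Same cross-tabulation as before, using with() instead of attach().
tastes <- with(my.data, tapply(vote, list(name, item), mean))
d <- dist(tastes)

# Items that both Julia and Marie have voted on.
shared <- !is.na(tastes["Julia", ]) & !is.na(tastes["Marie", ])

# Sum of squared differences over the shared items, scaled up by
# (number of items) / (number of shared items), then square-rooted.
by_hand <- sqrt(
  sum((tastes["Julia", shared] - tastes["Marie", shared])^2) *
    ncol(tastes) / sum(shared)
)

by_hand
as.matrix(d)["Marie", "Julia"]
```

Both values should agree. Keep in mind that when two people share only one item, as Julia and Marie do here, the distance rests on that single vote, so it is worth checking how many items each pair has in common before reading much into it.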