Hi guys,

I need some help analyzing my data.
Let me start by describing it: I have 21 matrices. In every matrix the rows are users and the columns are items, in my case films.
The element at index (i, j) is the rating given by user i to item j.
I have one matrix for each profession.
An example of this type of matrix is:

            item 1  item 2  item 3  item 4
id user 1     1       ?       ?       5
id user 2     ?       3       3       ?
id user 3     2       ?       3       2
id user 4     ?       ?       ?       4
...

So user 1 doesn't like item 1 but likes item 4 very much; for items 2 and 3 he hasn't given a rating, etc.
I need to construct a tensor with n users, m items and 21 occupations.
After I have constructed the tensor I want to apply Parafac.
I read the data from a CSV file and build one matrix for each occupation.

Didier Leibovici (author of the PTAk package) suggested to me:

ok that's a bit clearer: you have 21 matrices (1 for each occupation) of users rating their preferences (from 1 to 5, but without rating all of them: missing values) of m items.
but I suppose the users are not the same across the 21 occupations (one has only one occupation .... if you're talking about working/living occupation)
so you can't create a tensor n users x m items x 21 occupations
but you can build the contingencies of preferences m items x 21 occupations x 5 ratings

One way to build your tensor m x 21 x 5 is:
M1 is the first occupation (users x m) ...
    UserItem <- rbind(M1, M2, ...M21)

    m = 1682

    for (j in 1:m) {
      UserItem[, j] = factor(UserItem[, j], levels=1:5)
    }
    occ = factor(c(rep(1, dim(M1)[1]), rep(2, dim(M2)[1]),
                   ..., rep(21, dim(M21)[1])), levels=1:21)

    Z <- array(rep(0, m*21*5), c(m, 21, 5),
               list(paste("item", 1:m, sep=""), paste("Occ", 1:21, sep=""),
                    c("pr1", "pr2", "pr3", "pr4", "pr5")))
    for (i in 1:m) {
      as.matrix(table(occ, UserItem[, 2]))
      Z[i, , ] = table(occ, UserItem[, i])
    }

    Z.CAND <- CANPARA(Z, dim=7)

I have implemented this code, but I get an error at:

    for (i in 1:m) {
      Z[i, , ] = table(occ, UserItem[, i])
    }

and the error is:

    Error in Z[i,,]=table(occ,UserItem[,i])
    the number of elements to be replaced is not a multiple of the length of substitution

Can anyone help me understand this code and how I can resolve the error?
Thanks.
Best regards.
Giuseppe
Amusing that someone named RICCI is asking about tensors .... (sorry!)

Kjetil

On Tue, Jul 17, 2012 at 6:31 AM, Peppe Ricci <peppepegasus at gmail.com> wrote:
> [...]
On Tue, Jul 17, 2012 at 12:31:38PM +0200, Peppe Ricci wrote:
> [...]
> I have implemented this code but I have one error in correspondance of:
>
> for ( i in 1:m){
> Z[i,,]=table(occ,UserItem[,i])
> }
>
> and error is:
>
> Error in
> Z[i,,]=table(occ,UserItem[,i])
> the number of elements to be replaced is not a multiple of the length
> of substitution

Hi.

The problem in this code is that the command

    UserItem <- rbind(M1, M2, ..., M21)

produces a matrix and not a data.frame. Because of this, the commands

    UserItem[, j] <- factor(UserItem[, j], levels=1:5)

do not convert the columns to factors; the columns remain numeric. As a result, the table created by

    table(occ, UserItem[, i])

may not have the full size 21 x 5, since its columns correspond only to the preferences that actually occur in UserItem[, i], and not to all possible preferences.

Changing

    UserItem <- rbind(M1, M2, ..., M21)

to

    UserItem <- data.frame(rbind(M1, M2, ..., M21))

can resolve the problem, since assigning a factor to a column of a data.frame stores it as a factor, whose list of levels is complete even if some level is not used.

For better clarity, consider the definition of the array in an equivalent form

    Z <- array(0, dim=c(m, 21, 5), dimnames=list(paste("item", 1:m, sep=""),
               paste("Occ", 1:21, sep=""), c("pr1", "pr2", "pr3", "pr4", "pr5")))

which names the arguments of the function array() that are used.

Hope this helps.

Petr Savicky.
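To make the matrix versus data.frame point concrete, here is a minimal sketch on made-up data. The three small matrices M1, M2, M3 (3 occupations, 4 items, NA for a missing rating) and the names UserItemMat, n_occ and res are assumptions for illustration only; the real data has 21 occupations and m = 1682 items.

```r
# Hypothetical toy data: 3 occupations, 4 items, NA = no rating given.
M1 <- matrix(c( 1, NA, NA,  5,
               NA,  3,  3, NA), nrow = 2, byrow = TRUE)
M2 <- matrix(c( 2, NA,  3,  2), nrow = 1, byrow = TRUE)
M3 <- matrix(c(NA, NA, NA,  4,
                1,  3, NA,  4), nrow = 2, byrow = TRUE)
m     <- ncol(M1)
n_occ <- 3

# Occupation label for every row of the stacked matrix.
occ <- factor(rep(1:n_occ, times = c(nrow(M1), nrow(M2), nrow(M3))),
              levels = 1:n_occ)

Z <- array(0, dim = c(m, n_occ, 5),
           dimnames = list(paste("item", 1:m, sep = ""),
                           paste("Occ", 1:n_occ, sep = ""),
                           paste("pr", 1:5, sep = "")))

# Matrix version: a factor assigned into a matrix column is not kept as a
# factor, so table() only gets columns for the ratings that occur.
UserItemMat <- rbind(M1, M2, M3)
for (j in 1:m) {
  UserItemMat[, j] <- factor(UserItemMat[, j], levels = 1:5)
}
is.factor(UserItemMat[, 1])          # FALSE
dim(table(occ, UserItemMat[, 1]))    # fewer than 5 columns for item 1

# Filling a 3 x 5 slice from that smaller table fails.
res <- try(Z[1, , ] <- table(occ, UserItemMat[, 1]), silent = TRUE)
inherits(res, "try-error")           # TRUE

# data.frame version: the columns really become factors with levels 1:5,
# so every table has all 5 rating columns and fits the slice.
UserItem <- data.frame(rbind(M1, M2, M3))
for (j in 1:m) {
  UserItem[, j] <- factor(UserItem[, j], levels = 1:5)
}
for (i in 1:m) {
  Z[i, , ] <- table(occ, UserItem[, i])
}
Z["item4", , ]                       # occupation x rating counts for item 4
```

With the real data the same loop would run over 21 occupations, and the resulting array Z is what would be passed on to CANPARA(Z, dim=7) as in the original code. Note that table() drops NA entries by default, so missing ratings simply do not contribute to the counts.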