You need to spend some time with a basic R tutorial. Your data is messed up because you did not use a simple text editor somewhere along the way. R understands ', but not ‘ or ’. The best way to send data to the list is to use dput:

> dput(myData)
structure(list(X = c(1, 2, 3, 4, 5, 6, 7, 8), Y = c(8, 7, 6,
5, 4, 3, 2, 1), S = structure(c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L
), .Label = c("S1", "S2"), class = "factor"), Z = structure(c(1L,
1L, 2L, 2L, 1L, 1L, 2L, 2L), .Label = c("A", "B"), class = "factor")), .Names = c("X",
"Y", "S", "Z"), row.names = c(NA, -8L), class = "data.frame")

Combining two labels just requires the paste0() function:

> sapply(split(myData, paste0(myData$S, myData$Z)), function(x) crossprod(x[, 1], x[, 2]))
S1A S1B S2A S2B
 22  38  38  22

David C

-----Original Message-----
From: Gang Chen [mailto:gangchen6 at gmail.com]
Sent: Wednesday, August 24, 2016 11:56 AM
To: David L Carlson
Cc: Jim Lemon; r-help mailing list
Subject: Re: [R] aggregate

Thanks a lot, David! I want to further expand the operation a little
bit. With a new dataframe:

myData <- data.frame(X=c(1, 2, 3, 4, 5, 6, 7, 8), Y=c(8, 7, 6, 5, 4,
3, 2, 1), S=c(‘S1’, ‘S1’, ‘S1’, ‘S1’, ‘S2’, ‘S2’, ‘S2’, ‘S2’),
Z=c(‘A’, ‘A’, ‘B’, ‘B’, ‘A’, ‘A’, ‘B’, ‘B’))

> myData

  X Y  S Z
1 1 8 S1 A
2 2 7 S1 A
3 3 6 S1 B
4 4 5 S1 B
5 5 4 S2 A
6 6 3 S2 A
7 7 2 S2 B
8 8 1 S2 B

I would like to obtain the same cross product between columns X and Y,
but at each combination level of factors S and Z. In other words, the
cross product would still be performed on each pair of rows in the new
dataframe myData. How can I achieve that?

On Wed, Aug 24, 2016 at 11:54 AM, David L Carlson <dcarlson at tamu.edu> wrote:
> Yours is fine, but it will be a little simpler if you use sapply() instead:
>
>> data.frame(Z=levels(myData$Z), CP=sapply(split(myData, myData$Z),
> +     function(x) crossprod(x[, 1], x[, 2])))
>   Z CP
> A A 10
> B B 10
>
> David C
>
>
> -----Original Message-----
> From: Gang Chen [mailto:gangchen6 at gmail.com]
> Sent: Wednesday, August 24, 2016 10:17 AM
> To: David L Carlson
> Cc: Jim Lemon; r-help mailing list
> Subject: Re: [R] aggregate
>
> Thank you all for the suggestions! Yes, I'm looking for the cross
> product between the two columns of X and Y.
>
> A follow-up question: what is a nice way to merge the output of
>
> lapply(split(myData, myData$Z), function(x) crossprod(x[, 1], x[, 2]))
>
> with the column Z in myData so that I would get a new dataframe as the
> following (the 2nd column is the cross product between X and Y)?
>
> Z CP
> A 10
> B 10
>
> Is the following legitimate?
>
> data.frame(Z=levels(myData$Z), CP= unlist(lapply(split(myData,
> myData$Z), function(x) crossprod(x[, 1], x[, 2]))))
>
>
> On Wed, Aug 24, 2016 at 10:37 AM, David L Carlson <dcarlson at tamu.edu> wrote:
>> Thank you for the reproducible example, but it is not clear what cross product you want. Jim's solution gives you the cross product of the 2-column matrix with itself. If you want the cross product between the columns you need something else.
>> The aggregate function will not work since it will treat the columns separately:
>>
>>> A <- as.matrix(myData[myData$Z=="A", 1:2])
>>> A
>>   X Y
>> 1 1 4
>> 2 2 3
>>> crossprod(A) # Same as t(A) %*% A
>>    X  Y
>> X  5 10
>> Y 10 25
>>> crossprod(A[, 1], A[, 2]) # Same as t(A[, 1]) %*% A[, 2]
>>      [,1]
>> [1,]   10
>>>
>>> # For all the groups
>>> lapply(split(myData, myData$Z), function(x) crossprod(as.matrix(x[, 1:2])))
>> $A
>>    X  Y
>> X  5 10
>> Y 10 25
>>
>> $B
>>    X  Y
>> X 25 10
>> Y 10  5
>>
>>> lapply(split(myData, myData$Z), function(x) crossprod(x[, 1], x[, 2]))
>> $A
>>      [,1]
>> [1,]   10
>>
>> $B
>>      [,1]
>> [1,]   10
>>
>> -------------------------------------
>> David L Carlson
>> Department of Anthropology
>> Texas A&M University
>> College Station, TX 77840-4352
>>
>>
>> -----Original Message-----
>> From: R-help [mailto:r-help-bounces at r-project.org] On Behalf Of Jim Lemon
>> Sent: Tuesday, August 23, 2016 6:02 PM
>> To: Gang Chen; r-help mailing list
>> Subject: Re: [R] aggregate
>>
>> Hi Gang Chen,
>> If I have the right idea:
>>
>> for(zval in levels(myData$Z))
>>  crossprod(as.matrix(myData[myData$Z==zval,c("X","Y")]))
>>
>> Jim
>>
>> On Wed, Aug 24, 2016 at 8:03 AM, Gang Chen <gangchen6 at gmail.com> wrote:
>>> This is a simple question: With a dataframe like the following
>>>
>>> myData <- data.frame(X=c(1, 2, 3, 4), Y=c(4, 3, 2, 1), Z=c('A', 'A', 'B', 'B'))
>>>
>>> how can I get the cross product between X and Y for each level of
>>> factor Z? My difficulty is that I don't know how to deal with the fact
>>> that crossprod() acts on two variables in this case.
>>>
>>> ______________________________________________
>>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
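
For the S-by-Z cross products discussed at the top of this message, here is a minimal sketch of the same grouping written with interaction() instead of paste0(). It assumes the columns are named X, Y, S and Z as in the example, and it uses sum(x * y), which for two numeric vectors gives the same number as crossprod(x, y). The names key and CP are only illustrative.

```r
# Example data from the thread, typed with straight quotes.
# factor() is explicit because data.frame() has not converted strings
# to factors by default since R 4.0.0.
myData <- data.frame(
  X = c(1, 2, 3, 4, 5, 6, 7, 8),
  Y = c(8, 7, 6, 5, 4, 3, 2, 1),
  S = factor(c("S1", "S1", "S1", "S1", "S2", "S2", "S2", "S2")),
  Z = factor(c("A", "A", "B", "B", "A", "A", "B", "B"))
)

# One grouping factor with a level for every S-Z combination.
# sep = "" gives labels like "S1A"; lex.order = TRUE keeps S varying slowest.
key <- interaction(myData$S, myData$Z, sep = "", lex.order = TRUE)

# Cross product of X and Y within each group, referring to columns by name.
# vapply() insists that every group returns exactly one number.
CP <- vapply(split(myData, key), function(g) sum(g$X * g$Y), numeric(1))
CP
```

Referring to the columns by name rather than by position means the result does not change if columns are later added or reordered.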
Thanks again for patiently offering great help, David! I just learned dput() and paste0() now. Hopefully this is my last question. Suppose a new dataframe is as below (one more numeric column):

myData <- structure(list(X = c(1, 2, 3, 4, 5, 6, 7, 8), Y = c(8, 7, 6,
5, 4, 3, 2, 1), N = c(rep(2.1, 4), rep(3.2, 4)), S = structure(c(1L,
1L, 1L, 1L, 2L, 2L, 2L, 2L), .Label = c("S1", "S2"), class = "factor"),
Z = structure(c(1L, 1L, 2L, 2L, 1L, 1L, 2L, 2L), .Label = c("A", "B"),
class = "factor")), .Names = c("X", "Y", "N", "S", "Z"),
row.names = c(NA, -8L), class = "data.frame")

> myData

  X Y   N  S Z
1 1 8 2.1 S1 A
2 2 7 2.1 S1 A
3 3 6 2.1 S1 B
4 4 5 2.1 S1 B
5 5 4 3.2 S2 A
6 6 3 3.2 S2 A
7 7 2 3.2 S2 B
8 8 1 3.2 S2 B

Once I obtain the cross product,

> sapply(split(myData, paste0(myData$S, myData$Z)), function(x) crossprod(x[, 1], x[, 2]))
S1A S1B S2A S2B
 22  38  38  22

how can I easily add the other 3 columns (N, S, and Z) in a new dataframe? For S and Z, I can play with the names from the cross product output, but I have trouble dealing with the numeric column N.

On Wed, Aug 24, 2016 at 1:07 PM, David L Carlson <dcarlson at tamu.edu> wrote:
> You need to spend some time with a basic R tutorial. Your data is messed up because you did not use a simple text editor somewhere along the way. R understands ', but not ‘ or ’.
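
On the question of carrying N along with the per-group result: one thing that may be worth checking first is whether N really takes a single value within each S-Z combination. The sketch below is one way to do that with tapply(); the name n_distinct is made up for illustration, and the data frame is retyped from the dput() output above.

```r
# Data from the message above, rebuilt with data.frame() for readability.
myData <- data.frame(
  X = c(1, 2, 3, 4, 5, 6, 7, 8),
  Y = c(8, 7, 6, 5, 4, 3, 2, 1),
  N = c(rep(2.1, 4), rep(3.2, 4)),
  S = factor(c("S1", "S1", "S1", "S1", "S2", "S2", "S2", "S2")),
  Z = factor(c("A", "A", "B", "B", "A", "A", "B", "B"))
)

# Count the distinct values of N in every S-Z cell (rows = S, columns = Z).
n_distinct <- tapply(myData$N, list(myData$S, myData$Z),
                     function(n) length(unique(n)))
n_distinct

# TRUE means each S-Z combination has exactly one value of N,
# so N can be attached to a one-row-per-group result without ambiguity.
all(n_distinct == 1)
```

If the last line were FALSE, N would have to be summarised in some way (a mean, for instance) before it could sit next to one cross product per group.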
This will work, but you should double-check to be certain that CP and unique(myData[, 3:5]) are in the same order. It will fail if N is not identical for all rows of the same S-Z combination.

> CP <- sapply(split(myData, paste0(myData$S, myData$Z)), function(x)
+     crossprod(x[, 1], x[, 2]))
> data.frame(CP, unique(myData[, 3:5]))
    CP   N  S Z
S1A 22 2.1 S1 A
S1B 38 2.1 S1 B
S2A 38 3.2 S2 A
S2B 22 3.2 S2 B

David C

-----Original Message-----
From: Gang Chen [mailto:gangchen6 at gmail.com]
Sent: Wednesday, August 24, 2016 2:51 PM
To: David L Carlson
Cc: r-help mailing list
Subject: Re: [R] aggregate

Thanks again for patiently offering great help, David! I just learned dput() and paste0() now. Hopefully this is my last question. Suppose a new dataframe is as below (one more numeric column):

myData <- structure(list(X = c(1, 2, 3, 4, 5, 6, 7, 8), Y = c(8, 7, 6,
5, 4, 3, 2, 1), N = c(rep(2.1, 4), rep(3.2, 4)), S = structure(c(1L,
1L, 1L, 1L, 2L, 2L, 2L, 2L), .Label = c("S1", "S2"), class = "factor"),
Z = structure(c(1L, 1L, 2L, 2L, 1L, 1L, 2L, 2L), .Label = c("A", "B"),
class = "factor")), .Names = c("X", "Y", "N", "S", "Z"),
row.names = c(NA, -8L), class = "data.frame")

> myData

  X Y   N  S Z
1 1 8 2.1 S1 A
2 2 7 2.1 S1 A
3 3 6 2.1 S1 B
4 4 5 2.1 S1 B
5 5 4 3.2 S2 A
6 6 3 3.2 S2 A
7 7 2 3.2 S2 B
8 8 1 3.2 S2 B

Once I obtain the cross product,

> sapply(split(myData, paste0(myData$S, myData$Z)), function(x) crossprod(x[, 1], x[, 2]))
S1A S1B S2A S2B
 22  38  38  22

how can I easily add the other 3 columns (N, S, and Z) in a new dataframe? For S and Z, I can play with the names from the cross product output, but I have trouble dealing with the numeric column N.

On Wed, Aug 24, 2016 at 1:07 PM, David L Carlson <dcarlson at tamu.edu> wrote:
> You need to spend some time with a basic R tutorial. Your data is messed up because you did not use a simple text editor somewhere along the way. R understands ', but not ‘ or ’.
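
On the ordering caveat in David's last reply: a variant that may sidestep it is to build each output row inside the per-group function, so that CP, N, S and Z for a group always come from the same subset of rows. This is only a sketch under the thread's assumptions (columns X, Y, N, S, Z; N constant within each S-Z group); the names per_group and result are illustrative.

```r
# Data from the thread, rebuilt with data.frame() for readability.
myData <- data.frame(
  X = c(1, 2, 3, 4, 5, 6, 7, 8),
  Y = c(8, 7, 6, 5, 4, 3, 2, 1),
  N = c(rep(2.1, 4), rep(3.2, 4)),
  S = factor(c("S1", "S1", "S1", "S1", "S2", "S2", "S2", "S2")),
  Z = factor(c("A", "A", "B", "B", "A", "A", "B", "B"))
)

# One single-row data frame per S-Z group.
per_group <- lapply(split(myData, paste0(myData$S, myData$Z)), function(g) {
  # Stop with an error if N is not constant inside this group.
  stopifnot(length(unique(g$N)) == 1)
  data.frame(CP = sum(g$X * g$Y),  # same value as crossprod(g$X, g$Y)
             N  = g$N[1],
             S  = g$S[1],
             Z  = g$Z[1])
})

# Stack the per-group rows into one data frame.
result <- do.call(rbind, per_group)
result
```

Because every row is assembled from a single group, there is no need to compare the order of CP with the order of unique(myData[, 3:5]), and a group with more than one value of N raises an error instead of producing misaligned rows.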