I hope this time I'm using the "iris" dataset correctly:

    ir <- rbind(iris3[,,1], iris3[,,2], iris3[,,3])
    lir <- data.frame(log(ir))
    names(lir) <- c("a","b","c","d")

I'm trying to understand the meaning of expressions like "~ a+b+c+d", used with princomp, e.g.

    princomp(~ a+b+c+d, data=lir, cor=T)

By inspection, it looks like the result is the same as in

    princomp(lir, cor = T)

Does "a+b+c+d" simply specify the columns to be included? Could someone provide a meaningful example of a princomp formula that uses operators other than "+"?

In linear model examples, E(y) = xb, "~" is usually placed between "y" and "x". What is the meaning of "~" here?
On 4/13/06, Sasha Pustota <popgen at gmail.com> wrote:
> I hope this time I'm using the "iris" dataset correctly:
>
> ir <- rbind(iris3[,,1], iris3[,,2], iris3[,,3])
> lir <- data.frame(log(ir))
> names(lir) <- c("a","b","c","d")
>
> I'm trying to understand the meaning of expressions like "~ a+b+c+d",
> used with princomp, e.g.
>
> princomp(~ a+b+c+d, data=lir, cor=T)
>
> By inspection, it looks like the result is the same as in
>
> princomp(lir, cor = T)

Yes, princomp.formula just takes the model matrix of the formula and passes it to princomp.default.

> Does "a+b+c+d" simply specify the columns to be included? Could someone
> provide a meaningful example of a princomp formula that uses operators
> other than "+"?

    colnames(model.matrix(~., lir))
    princomp(~., lir)
    colnames(model.matrix(~(.)^2, lir))
    princomp(~(.)^2, lir)

> In linear model examples, E(y) = xb, "~" is usually placed between "y"
> and "x". What is the meaning of "~" here?

It's just a way to specify a formula that you can take a model matrix of. See ?model.matrix and try playing with it a bit on small examples.
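For anyone who wants to try the suggestion above, here is a small sketch in base R. It rebuilds lir as in the question, lists the columns each formula expands to, and checks that the formula and data-frame calls give the same component standard deviations. The object names (cols_main, cols_int, p_formula, p_default, p_drop) are made up for illustration, and the "~ . - d" formula is just one possible example of an operator other than "+". As far as I can tell, princomp's formula method leaves out the intercept column that model.matrix() shows here.

```r
# Rebuild the data from the question
ir <- rbind(iris3[,,1], iris3[,,2], iris3[,,3])
lir <- data.frame(log(ir))
names(lir) <- c("a", "b", "c", "d")

# Columns each formula expands to (model.matrix adds an intercept column)
cols_main <- colnames(model.matrix(~ ., lir))      # main effects only
cols_int  <- colnames(model.matrix(~ (.)^2, lir))  # plus pairwise products
print(cols_main)
print(cols_int)

# Formula interface versus passing the data frame directly
p_formula <- princomp(~ a + b + c + d, data = lir, cor = TRUE)
p_default <- princomp(lir, cor = TRUE)
same_sdev <- isTRUE(all.equal(p_formula$sdev, p_default$sdev))
print(same_sdev)

# "-" removes a term: principal components of a, b and c only
p_drop <- princomp(~ . - d, data = lir, cor = TRUE)
print(length(p_drop$sdev))
```

So "+" and "-" choose which columns go in, while "^2" adds product columns such as a:b to the matrix that is decomposed.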
    aggregate(DF[,-1], DF[, 1, drop = FALSE], mean)

On 4/15/06, Srinivas Iyyer <srini_iyyer_bio at yahoo.com> wrote:
> dear group,
>
> i have a sample matrix
> name  v1   v2   v3   v4
> cat   10   11   12   15
> dog    3   12   10   14
> cat    9   12   12   15
> cat    5   12   10   11
> dog   12  113  123   31
> ...
>
> since cat is repeated 3 times, I want a mean value for
> it. Like wise for every element of the name column.
> cat v1 = mean(c(10,9,5))
> cat v3 = mean(c(11,12,13))
> ..etc.
>
> name  v1    v2    v3    v4
> cat   8     11.6  11.3  13.6
> dog   7.5   62.5  66.5  22.5
>
> could any one help me in solving this mystery. thank you.
>
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
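A quick way to check the one-liner is to type in the five rows shown in the question and run it. This is only a sketch: the rows hidden behind the "..." are not included, and DF and res are just the names used here.

```r
# The five rows shown in the question (anything after "..." is omitted)
DF <- data.frame(
  name = c("cat", "dog", "cat", "cat", "dog"),
  v1   = c(10, 3, 9, 5, 12),
  v2   = c(11, 12, 12, 12, 113),
  v3   = c(12, 10, 12, 10, 123),
  v4   = c(15, 14, 15, 11, 31)
)

# Group by the first column and average every other column.
# drop = FALSE keeps DF[, 1] as a one-column data frame, so the
# grouping column in the result is still called "name".
res <- aggregate(DF[, -1], DF[, 1, drop = FALSE], mean)
print(res)
```

The result has one row per distinct name, with the column means in v1 to v4.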
Dear Srinivas,

Your data are likely in a data frame rather than a matrix (since the columns are heterogeneous), and name is a variable, not the row names of the data frame.

There are several ways to do what you want; one simple way, assuming that the data are in a data frame named Data, is

    by(Data[,2:5], Data$name, mean)

If you want the result in the form of a matrix, then you could do

    aggregate(Data[,2:5], list(Data$name), mean)

I hope this helps,
 John

--------------------------------
John Fox
Department of Sociology
McMaster University
Hamilton, Ontario
Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox
--------------------------------

> -----Original Message-----
> From: r-help-bounces at stat.math.ethz.ch
> [mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of Srinivas Iyyer
> Sent: Friday, April 14, 2006 11:58 PM
> To: r-help at stat.math.ethz.ch
> Subject: [R] matching identical row names
>
> [...]
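A note for readers on a recent version of R: as far as I know, mean() applied to a whole data frame no longer returns the column means, so the sketch below swaps in colMeans() for the by() call. It also shows one way to get an actual matrix, using split() and sapply(); aggregate() itself returns a data frame. Data holds only the five rows shown in the question, and by_means and mat are names chosen for this example.

```r
# Sample rows from the question, in a data frame called Data
Data <- data.frame(
  name = c("cat", "dog", "cat", "cat", "dog"),
  v1   = c(10, 3, 9, 5, 12),
  v2   = c(11, 12, 12, 12, 113),
  v3   = c(12, 10, 12, 10, 123),
  v4   = c(15, 14, 15, 11, 31)
)

# by() with colMeans in place of mean: one vector of means per name
by_means <- by(Data[, 2:5], Data$name, colMeans)
print(by_means)

# Same numbers as a plain numeric matrix, one row per name
mat <- t(sapply(split(Data[, 2:5], Data$name), colMeans))
print(mat)
```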
Dear Dr. Fox

Your reply to Srinivas Iyyer was most helpful to me. I am trying to collapse some categories of a data.frame in a similar way. I have a data frame in the form below

Prog  Sub.Program  Job  V1  V2  V3
1     Alpha        A    1   2   3
2     Alpha        B    2   3   1
2     Gamma        B    1   3   3
2     Alpha        A    3   4   1
2     Gamma        B    2   2   3
1     Alpha        A    2   2   2

What I want is to sum the values of V1, V2 and V3 and end up with a new data.frame that would look like

Prog  Subprog  Job  Sum(V1)  Sum(V2)  Sum(V3)
1     Alpha    A    3        4        5
2     Alpha    A    3        4        1
2     Gamma    B    3        5        6

I thought that I could use by() to create a vector for each of V1:V3 but I cannot see any way to capture the values. temp1 <- by(Data[,4] simply gives me the complete output.

An example of what I have done is
-------------------------------------------------------------
Prog <- c(1, 2, 2, 2, 2, 1)
Sub.Program <- c("Alpha", "Alpha", "Gamma", "Alpha", "Gamma", "Alpha")
Job <- c("A", "B", "B", "A", "B", "A")
V1 <- c(1, 2, 1, 3, 2, 2)
V2 <- c(2, 3, 3, 4, 2, 2)
V3 <- c(3, 1, 3, 1, 3, 2)
# data.frame() without cbind(), so that V1, V2 and V3 stay numeric
MyData <- data.frame(Prog, Sub.Program, Job, V1, V2, V3)

by(MyData[,4], list(Sub.Program=Sub.Program, Job=Job), sum)
----------------------------------------------------------------

I also get the expected <NA> for cells that do not exist. Is there any way to set them to "0" in the operation?

Any help would be greatly appreciated.
Thanks
John

----- Original Message ----
From: John Fox <jfox at mcmaster.ca>
To: Srinivas Iyyer <srini_iyyer_bio at yahoo.com>
Cc: r-help at stat.math.ethz.ch
Sent: Saturday, April 15, 2006 9:35:46 AM
Subject: Re: [R] matching identical row names

[...]
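On the question of getting 0 instead of NA for combinations that never occur, here is one possible sketch. It uses tapply() in place of by(), because tapply() returns an ordinary matrix whose NA cells are easy to overwrite, and it shows xtabs() as an alternative that sums V1 within each cell and reports 0 for empty ones. This is only one way to do it; tab, tab0 and xt are names invented for the example.

```r
MyData <- data.frame(
  Prog        = c(1, 2, 2, 2, 2, 1),
  Sub.Program = c("Alpha", "Alpha", "Gamma", "Alpha", "Gamma", "Alpha"),
  Job         = c("A", "B", "B", "A", "B", "A"),
  V1          = c(1, 2, 1, 3, 2, 2),
  V2          = c(2, 3, 3, 4, 2, 2),
  V3          = c(3, 1, 3, 1, 3, 2)
)

# Sums of V1 by Sub.Program and Job; the Gamma/A cell has no rows, so it is NA
tab <- with(MyData, tapply(V1, list(Sub.Program = Sub.Program, Job = Job), sum))
print(tab)

# Overwrite the NA cells with 0 afterwards
tab0 <- tab
tab0[is.na(tab0)] <- 0
print(tab0)

# xtabs() sums the left-hand side per cell and gives 0 for empty cells
xt <- xtabs(V1 ~ Sub.Program + Job, data = MyData)
print(xt)
```

Whether 0 is the right value for a cell with no data depends on the use; for sums it is usually harmless, for means it would be misleading.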
Dear John,

You can use aggregate(), also described in my suggestion to Srinivas:

> aggregate(Data[, 4:6], Data[1:3], sum)
  Prog Sub.Program Job V1 V2 V3
1    1       Alpha   A  3  4  5
2    2       Alpha   A  3  4  1
3    2       Alpha   B  2  3  1
4    2       Gamma   B  3  5  6

I hope this helps,
 John

--------------------------------
John Fox
Department of Sociology
McMaster University
Hamilton, Ontario
Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox
--------------------------------

> -----Original Message-----
> From: r-help-bounces at stat.math.ethz.ch
> [mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of John Kane
> Sent: Sunday, April 16, 2006 10:29 AM
> To: John Fox; Srinivas Iyyer
> Cc: r-help at stat.math.ethz.ch
> Subject: Re: [R] matching identical row names
>
> [...]
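For completeness, a self-contained sketch of the aggregate() approach from this thread, using the six rows John Kane posted. The data frame is called MyData here, as in his example, and sums and sums2 are names chosen for illustration. The second call uses aggregate()'s formula interface, which is available in current versions of R and should give the same table.

```r
MyData <- data.frame(
  Prog        = c(1, 2, 2, 2, 2, 1),
  Sub.Program = c("Alpha", "Alpha", "Gamma", "Alpha", "Gamma", "Alpha"),
  Job         = c("A", "B", "B", "A", "B", "A"),
  V1          = c(1, 2, 1, 3, 2, 2),
  V2          = c(2, 3, 3, 4, 2, 2),
  V3          = c(3, 1, 3, 1, 3, 2)
)

# Columns 1:3 define the groups, columns 4:6 are summed within each group
sums <- aggregate(MyData[, 4:6], MyData[1:3], sum)
print(sums)

# Formula interface: variables to sum on the left, grouping variables on the right
sums2 <- aggregate(cbind(V1, V2, V3) ~ Prog + Sub.Program + Job,
                   data = MyData, FUN = sum)
print(sums2)
```

Only combinations that actually occur in the data appear as rows, so there are no NA cells to deal with in this form.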