thr3ads.net - R help - [R] Questions on formula in princomp [Apr 2006]

Sasha Pustota

2006-Apr-13 18:26 UTC

[R] Questions on formula in princomp

I hope this time I'm using the "iris" dataset correctly:

ir <- rbind(iris3[,,1], iris3[,,2], iris3[,,3])
lir <- data.frame(log(ir))
names(lir) <- c("a","b","c","d")

I'm trying to understand the meaning of expressions like "~ a+b+c+d",
used with princomp, e.g.

princomp(~ a+b+c+d, data=lir, cor=T)

By inspection, it looks like the result is the same as in

princomp(lir, cor = T).

Does "a+b+c+d" simply specify the columns to be included? Could someone
provide a meaningful example of a princomp formula that uses operators
other than "+"?

In linear-model examples, E(y) = xb, "~" is usually placed between "y"
and "x". What is the meaning of "~" here?

Gabor Grothendieck

2006-Apr-13 23:12 UTC

[R] Questions on formula in princomp

On 4/13/06, Sasha Pustota <popgen at gmail.com> wrote:
> I hope this time I'm using the "iris" dataset correctly:
>
> ir <- rbind(iris3[,,1], iris3[,,2], iris3[,,3])
> lir <- data.frame(log(ir))
> names(lir) <- c("a","b","c","d")
>
> I'm trying to understand the meaning of expressions like "~ a+b+c+d",
> used with princomp, e.g.
>
> princomp(~ a+b+c+d, data=lir, cor=T)
>
> By inspection, it looks like the result is the same as in
>
> princomp(lir, cor = T).
Yes, princomp.formula just takes the model matrix of the formula
and passes it to princomp.default.
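
A quick way to check this equivalence is to compare the two fits directly. The sketch below rebuilds lir exactly as in the question; the object names p_formula and p_matrix are illustrative only and do not come from the thread.

```r
# Rebuild the log-transformed iris data from the question.
ir  <- rbind(iris3[, , 1], iris3[, , 2], iris3[, , 3])
lir <- data.frame(log(ir))
names(lir) <- c("a", "b", "c", "d")

# Formula interface versus passing the data frame directly.
p_formula <- princomp(~ a + b + c + d, data = lir, cor = TRUE)
p_matrix  <- princomp(lir, cor = TRUE)

# The component standard deviations should agree.
all.equal(unname(p_formula$sdev), unname(p_matrix$sdev))
```

Comparing loadings the same way should also work, since both calls end up decomposing the same four columns.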
>
> Does "a+b+c+d" simply specify the columns to be included? Could someone
> provide a meaningful example of a princomp formula that uses operators
> other than "+"?
colnames(model.matrix(~., lir))
princomp(~., lir)

colnames(model.matrix(~(.)^2, lir))
princomp(~(.)^2, lir)
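
To see what such formulas expand to, it can help to look only at the column names of the model matrix. This is a small sketch using the lir data from the question; the I(c - d) term is an added illustration (a log-ratio, since the columns are already logged) and is not part of the original reply.

```r
# Rebuild the log-transformed iris data from the question.
ir  <- rbind(iris3[, , 1], iris3[, , 2], iris3[, , 3])
lir <- data.frame(log(ir))
names(lir) <- c("a", "b", "c", "d")

# "." means every column: an intercept plus a, b, c, d.
cn1 <- colnames(model.matrix(~ ., lir))

# (.)^2 adds all pairwise products, named a:b, a:c, and so on.
cn2 <- colnames(model.matrix(~ (.)^2, lir))

# I() protects ordinary arithmetic inside a formula.
cn3 <- colnames(model.matrix(~ a + b + I(c - d), lir))

print(cn1)
print(cn2)
print(cn3)
```

Note that model.matrix() includes an "(Intercept)" column here, whereas princomp's formula method drops the intercept before the decomposition.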

>
> In linear model, E(y)= xb, examples, "~" is usually placed between "y"
> and "x". What is the meaning of "~" here?
It's just a way to specify a formula that you can take a model matrix
of.  See ?model.matrix and try playing with it a bit on small examples.
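
As one such small example, the difference between a one-sided and a two-sided formula can be inspected directly. The names f_one, f_two and the variables y, a, b are placeholders chosen for this sketch; no data is needed because the formulas are never evaluated.

```r
f_one <- ~ a + b      # one-sided: nothing to the left of "~"
f_two <- y ~ a + b    # two-sided: y is the response

# A formula is a call to "~" with one or two arguments.
length(f_one)   # 2
length(f_two)   # 3

# terms() records whether a response is present.
resp_one <- attr(terms(f_one), "response")   # 0
resp_two <- attr(terms(f_two), "response")   # 1
```

With no response, the right-hand side only names the variables (and their combinations) that go into the model matrix.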

Gabor Grothendieck

2006-Apr-15 11:22 UTC

[R] matching identical row names

aggregate(DF[,-1], DF[, 1, drop = FALSE], mean)
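
For a concrete check, the sketch below types in the five rows shown in the quoted question (the rows elided as "..." are unknown and left out) and assumes, as the call above does, that the data frame is called DF with the grouping column first.

```r
# The five visible rows from the question; DF is an assumed name.
DF <- data.frame(
  name = c("cat", "dog", "cat", "cat", "dog"),
  v1   = c(10, 3, 9, 5, 12),
  v2   = c(11, 12, 12, 12, 113),
  v3   = c(12, 10, 12, 10, 123),
  v4   = c(15, 14, 15, 11, 31)
)

# Group by the first column (kept as a one-column data frame so the
# result keeps the name "name") and average every other column.
res <- aggregate(DF[, -1], DF[, 1, drop = FALSE], mean)
print(res)
```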

On 4/15/06, Srinivas Iyyer <srini_iyyer_bio at yahoo.com> wrote:
> dear group,
>
> i have a sample matrix
> name   v1   v2   v3   v4
> cat    10   11   12   15
> dog     3   12   10   14
> cat     9   12   12   15
> cat     5   12   10   11
> dog    12  113  123   31
> ...
>
>
> since cat is repeated 3 times, I want a mean value for
> it. Likewise for every element of the name column.
> cat v1 = mean(c(10,9,5))
> cat v2 = mean(c(11,12,12))
> ..etc.
>
> name v1   v2     v3   v4
> cat  8   11.6   11.3  13.6
> dog  7.5 62.5   66.5  22.5
>
> could any one help me in solving this mystery. thank you.
>
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
>

John Fox

2006-Apr-15 13:35 UTC

[R] matching identical row names

Dear Srinivas,

Your data are likely in a data frame rather than a matrix (since the columns
are heterogeneous), and name is a variable, not the row names of the data
frame.

There are several ways to do what you want; one simple way, assuming that
the data are in a data frame named Data, is

 by(Data[,2:5], Data$name, mean)

If you want the result in the form of a matrix, then you could do

 aggregate(Data[,2:5], list(Data$name), mean)
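
A runnable sketch of both suggestions follows, using the five visible rows of the question. One assumption to flag: in recent versions of R, mean() no longer accepts a whole data frame, so colMeans is substituted in the by() call here; that substitution is not part of the original message. Naming the list element (name = ...) is also an addition, made so the grouping column gets a readable header.

```r
# The five visible rows from the question, in a data frame named Data.
Data <- data.frame(
  name = c("cat", "dog", "cat", "cat", "dog"),
  v1   = c(10, 3, 9, 5, 12),
  v2   = c(11, 12, 12, 12, 113),
  v3   = c(12, 10, 12, 10, 123),
  v4   = c(15, 14, 15, 11, 31)
)

# by(): one vector of column means per level of name.
means_by <- by(Data[, 2:5], Data$name, colMeans)

# Stack the per-group vectors into a matrix, one row per group.
means_mat <- do.call(rbind, means_by)

# aggregate(): returns a data frame with the group as a column.
means_df <- aggregate(Data[, 2:5], list(name = Data$name), mean)

print(means_mat)
print(means_df)
```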

I hope this helps,
 John

--------------------------------
John Fox
Department of Sociology
McMaster University
Hamilton, Ontario
Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox 
-------------------------------- 

John Kane

2006-Apr-16 15:29 UTC

[R] matching identical row names

Dear Dr. Fox,
Your reply to Srinivas Iyyer was most helpful to me. I am trying to collapse
some categories of a data.frame in a similar way.
I have a data frame in the form below:

Prog  Sub.Program  Job  V1  V2  V3
   1  Alpha        A     1   2   3
   2  Alpha        B     2   3   1
   2  Gamma        B     1   3   3
   2  Alpha        A     3   4   1
   2  Gamma        B     2   2   3
   1  Alpha        A     2   2   2
 
What I want is to sum the values of V1, V2 and V3 and end up with a new
data.frame that would look like

Prog  Subprog  Job  Sum(V1)  Sum(V2)  Sum(V3)
   1  Alpha    A          3        4        5
   2  Alpha    A          3        4        1
   2  Gamma    B          3        5        6

I thought that I could use by() to create a vector for each of V1:V3 but I
cannot see any way to capture the values.
temp1 <- by(Data[,4] simply gives me the complete output.

An example of what I have done is
-------------------------------------------------------------

Prog <- c(1, 2, 2, 2, 2, 1)
Sub.Program <- c("Alpha", "Alpha", "Gamma", "Alpha", "Gamma", "Alpha")
Job <- c("A", "B", "B", "A", "B", "A")
V1 <- c(1, 2, 1, 3, 2, 2)
V2 <- c(2, 3, 3, 4, 2, 2)
V3 <- c(3, 1, 3, 1, 3, 2)
# data.frame() directly rather than cbind(), so V1:V3 stay numeric
MyData <- data.frame(Prog, Sub.Program, Job, V1, V2, V3)

by(MyData[, 4], list(Sub.Program = Sub.Program, Job = Job), sum)
----------------------------------------------------------------
 
I also get the expected <NA> for cells that do not exist. Is there any way to
set them to "0" in the operation?
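
The thread does not answer this last point, so the following is only one possible approach, sketched under the assumption that a Sub.Program by Job table of V1 sums is what is wanted. It shows two options: replacing the NA cells after tapply(), or using xtabs(), which reports empty cells as 0. The object names cell and tab are illustrative.

```r
# Same example data as above, built with data.frame() so V1 is numeric.
MyData <- data.frame(
  Prog        = c(1, 2, 2, 2, 2, 1),
  Sub.Program = c("Alpha", "Alpha", "Gamma", "Alpha", "Gamma", "Alpha"),
  Job         = c("A", "B", "B", "A", "B", "A"),
  V1          = c(1, 2, 1, 3, 2, 2)
)

# Option 1: tapply() leaves NA where no rows exist; overwrite afterwards.
cell <- tapply(MyData$V1,
               list(Sub.Program = MyData$Sub.Program, Job = MyData$Job),
               sum)
cell[is.na(cell)] <- 0

# Option 2: xtabs() sums V1 within each cell and gives 0 for empty cells.
tab <- xtabs(V1 ~ Sub.Program + Job, data = MyData)

print(cell)
print(tab)
```

Whether 0 is the right value for a combination that never occurs depends on the analysis; NA may be the more honest answer in some settings.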

 

Any help would be greatly appreciated. 
Thanks 
John

John Fox

2006-Apr-16 15:51 UTC

[R] matching identical row names

Dear John,

You can use aggregate(), also described in my suggestion to Srinivas:

> aggregate(Data[, 4:6], Data[1:3], sum)
  Prog Sub.Program Job V1 V2 V3
1    1       Alpha   A  3  4  5
2    2       Alpha   A  3  4  1
3    2       Alpha   B  2  3  1
4    2       Gamma   B  3  5  6
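
To reproduce this from scratch, the data frame can be typed in from the table in the question. This is a sketch that assumes the frame is named Data, as in the call above, with the three grouping columns first.

```r
# Example data from the question; columns 1:3 group, columns 4:6 are summed.
Data <- data.frame(
  Prog        = c(1, 2, 2, 2, 2, 1),
  Sub.Program = c("Alpha", "Alpha", "Gamma", "Alpha", "Gamma", "Alpha"),
  Job         = c("A", "B", "B", "A", "B", "A"),
  V1          = c(1, 2, 1, 3, 2, 2),
  V2          = c(2, 3, 3, 4, 2, 2),
  V3          = c(3, 1, 3, 1, 3, 2)
)

# Data[1:3] is itself a list of grouping variables, so it can be
# passed as the 'by' argument without wrapping it in list().
sums <- aggregate(Data[, 4:6], Data[1:3], sum)
print(sums)
```

Only combinations that actually occur in the data appear as rows, which is why there is no Gamma/A line.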

I hope this helps,
 John
