thr3ads.net - R help - [R] Joining two (or more) frequency tables [May 2010]

If this information is useful, please help other people find it:
Share via:

Jan.Sunde at biokapital.no

2010-May-19 09:56 UTC

[R] Joining two (or more) frequency tables

Hi,
I am still new to R and have, in spite of searching all over, found no 
"understandable" documentation for how to join two (or more) frequency
tables.
Why would I want to do that and what do I mean by "joining" ? Let me
try
to explain:

I have a diagnosis form that allows registration of up to three diagnosis 
codes on animals with birth defects, like this:

father.id diagnosis diagnosis2 diagnosis3
a             100         200            300
b              200        100            340
....

and so on.

As you can see the same diagnosis can show up in either field (1, 2 or 3). 

I am interested in how many times each diagnosis shows up for offspring 
from the same father (looking at inheritance)

The best I have come up with so far is

table(father.id, diagnosis)
table(father.id, diagnosis2)
table(father.id, diagnosis3)

which gives me the frequency counts for each combination of father.id and 
diagnosis code

and then I manually join these tables together by cutting and pasting (in 
Excel!!). Oh the horror!

The resulting table looks like this though and gives me what I want:

father.id      100      200      300   340  
a                   1         1         1     0
b                   1          1         0    1
.....

This is incredibly cumbersome and therefore I want to find a way of doing 
this completely in R script so that I have the code ready when I need it 
again.
So far I have tried with several varieties of merge() and xtabs() but have 
given up. I can not afford to spend more time on this and as a last resort 
I'm hoping anyone here could pass me a tip.
I use R for modelling but I am so far finding it hard to wrap my head 
around it when it comes to the pre-analysis data manipulation like this as 
I am not proficient enough (i.e. do not quite understand) in the language 
and data structure intricacies.

Regards, 
Jan 
	[[alternative HTML version deleted]]

jim holtman

2010-May-19 12:00 UTC

head link

[R] Joining two (or more) frequency tables

use the 'reshape package:
> require(reshape)Loading required package: reshape
Loading required package: plyr> x  father.id diagnosis diagnosis2 diagnosis3
1         a       100        200        300
2         b       200        100        340
> melt(x)Using father.id as id variables
  father.id   variable value
1         a  diagnosis   100
2         b  diagnosis   200
3         a diagnosis2   200
4         b diagnosis2   100
5         a diagnosis3   300
6         b diagnosis3   340> z <- melt(x)
Using father.id as id variables> table(z$father.id, z$value)
    100 200 300 340
  a   1   1   1   0
  b   1   1   0   1>

On Wed, May 19, 2010 at 5:56 AM,  <Jan.Sunde at biokapital.no>
wrote:> Hi,
> I am still new to R and have, in spite of searching all over, found no
> "understandable" documentation for how to join two (or more)
frequency
> tables.
> Why would I want to do that and what do I mean by "joining" ? Let
me try
> to explain:
>
> I have a diagnosis form that allows registration of up to three diagnosis
> codes on animals with birth defects, like this:
>
> father.id diagnosis diagnosis2 diagnosis3
> a ? ? ? ? ? ? 100 ? ? ? ? 200 ? ? ? ? ? ?300
> b ? ? ? ? ? ? ?200 ? ? ? ?100 ? ? ? ? ? ?340
> ....
>
> and so on.
>
> As you can see the same diagnosis can show up in either field (1, 2 or 3).
>
> I am interested in how many times each diagnosis shows up for offspring
> from the same father (looking at inheritance)
>
> The best I have come up with so far is
>
> table(father.id, diagnosis)
> table(father.id, diagnosis2)
> table(father.id, diagnosis3)
>
> which gives me the frequency counts for each combination of father.id and
> diagnosis code
>
> and then I manually join these tables together by cutting and pasting (in
> Excel!!). Oh the horror!
>
> The resulting table looks like this though and gives me what I want:
>
> father.id ? ? ?100 ? ? ?200 ? ? ?300 ? 340
> a ? ? ? ? ? ? ? ? ? 1 ? ? ? ? 1 ? ? ? ? 1 ? ? 0
> b ? ? ? ? ? ? ? ? ? 1 ? ? ? ? ?1 ? ? ? ? 0 ? ?1
> .....
>
> This is incredibly cumbersome and therefore I want to find a way of doing
> this completely in R script so that I have the code ready when I need it
> again.
> So far I have tried with several varieties of merge() and xtabs() but have
> given up. I can not afford to spend more time on this and as a last resort
> I'm hoping anyone here could pass me a tip.
> I use R for modelling but I am so far finding it hard to wrap my head
> around it when it comes to the pre-analysis data manipulation like this as
> I am not proficient enough (i.e. do not quite understand) in the language
> and data structure intricacies.
>
> Regards,
> Jan
> ? ? ? ?[[alternative HTML version deleted]]
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>


-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?

Wu Gong

2010-May-19 14:36 UTC

head link

[R] Joining two (or more) frequency tables

> ## Create a sample data.
> data <- data.frame(father.id = letters[1:5],+ diagnosis = sample(c(100,200,300,340),5,replace=TRUE),
+ diagnosis1 = sample(c(100,200,300,340),5,replace=TRUE),
+ diagnosis2 =
sample(c(100,200,300,340),5,replace=TRUE))> data  father.id diagnosis diagnosis1 diagnosis2
1         a       340        100        300
2         b       200        300        340
3         c       200        300        100
4         d       200        100        300
5         e       200        100        100> 
> ## Create a matrix by replicating the father.id column 3 times
> ## Because we have 3 diagnosis columns to match when use table() function
> match <- as.matrix(data[,rep(1:2,c(3,0))])
> match     father.id father.id.1 father.id.2
[1,] "a"       "a"         "a"        
[2,] "b"       "b"         "b"        
[3,] "c"       "c"         "c"        
[4,] "d"       "d"         "d"        
[5,] "e"       "e"         "e"       
> 
> ## Count
> table(match,as.matrix(data[,2:4]))     
match 100 200 300 340
    a   1   0   1   1
    b   0   1   1   1
    c   1   1   1   0
    d   1   1   1   0
    e   2   1   0   0> 

Please refer to:
http://r.789695.n4.nabble.com/Counting-Frequencies-in-Data-Frame-tt2221342.html#a2221487
http://r.789695.n4.nabble.com/Counting-Frequencies-in-Data-Frame-tt2221342.html#a2221487

-----
A R learner.
-- 
View this message in context:
http://r.789695.n4.nabble.com/Joining-two-or-more-frequency-tables-tp2222576p2222946.html
Sent from the R help mailing list archive at Nabble.com.

Maybe Matching Threads

Search for more apparently analagous threads

R help - May 2010 - Joining two (or more) frequency tables

[R] Joining two (or more) frequency tables

[R] Joining two (or more) frequency tables

[R] Joining two (or more) frequency tables

Maybe Matching Threads