thr3ads.net - R help - [R] Crosstabbing multiple response data [Feb 2007]


Michael Wexler

2007-Feb-22 15:29 UTC

[R] Crosstabbing multiple response data

Using R version 2.4.1 (2006-12-18) on Windows, I have a dataset which resembles
this:

id    att1    att2    att3
1    1        1        0
2    1        0        0
3    0        1        1
4    1        1        1

ratings <- data.frame(id = c(1,2,3,4), att1 = c(1,1,0,1), att2 = c(1,0,1,1),
att3 = c(0,0,1,1))

I would like to get a cross tab of counts of co-occurrence, which might resemble
this:

        att1    att2    att3
att1            2       1
att2    2               2
att3    1       2

with the hope of understanding, at least pairwise, what things "hang
together".   (Yes, there are much, much better ways to do this
statistically including clustering and binary corrected correlation, but the
audience I am working with asked for this version for a specific reason.)

(Later on, I would also like to convert to percentages of the total unique
population, so the final version of the table would be

        att1    att2    att3
att1            50%     25%
att2    50%             50%
att3    25%     50%

But I can do this in Excel if I can get the first table out.)

I have tried the reshape library, but could not get anything resembling this
(both on its own, as well as feeding in to table()).  (I have also played with
transposing and using some comments from this list from 2002 and 2004, but the
questioners appear to assume more knowledge than I have in use of R; the example
in the posting guide was also more complex than I was ready for, I'm
afraid.)

Sample of some of my efforts:
library(reshape)
melt(ratings, id = c("id"))

ds1 <- melt(ratings, id = c("id"))
# returns only rowcounts, 3 along diagonal
table(ds1$variable, ds1$variable)
# returns only a single row of collapsed counts; appears to not allow
# 1 variable in multiple uses
xtabs(formula = value ~ ds1$variable + ds1$variable, data = ds1)

I suspect I am close, so any nudges in the right direction would be helpful.

Thanks much, Michael

PS: www.rseek.org is very impressive, I heartily encourage its use.
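
For what it is worth, the long-format route attempted above can be made to work. Crossing a variable with itself, as in table(ds1$variable, ds1$variable), can only ever fill the diagonal, because each row carries a single attribute. A minimal sketch, assuming only base R: keep the rows where the attribute was selected, join that table to itself on id, and tabulate the resulting attribute pairs. The names long, picked, and pairs are illustrative, and ratings is re-entered from the four rows printed at the top of the question.

```r
# Data re-entered from the four rows printed in the question.
ratings <- data.frame(id   = c(1, 2, 3, 4),
                      att1 = c(1, 1, 0, 1),
                      att2 = c(1, 0, 1, 1),
                      att3 = c(0, 0, 1, 1))

# Long format: one row per (id, attribute), the same shape melt() produces.
atts <- names(ratings)[-1]
long <- data.frame(id       = rep(ratings$id, times = length(atts)),
                   variable = rep(atts, each = nrow(ratings)),
                   value    = unlist(ratings[atts], use.names = FALSE))

# Keep only the attributes each respondent actually selected.
picked <- long[long$value == 1, c("id", "variable")]

# Self-join on id: every pair of attributes chosen by the same respondent.
# merge() suffixes the clashing column names with .x and .y.
pairs <- merge(picked, picked, by = "id")

# Tabulate the pairs; the diagonal holds each attribute's own total.
tab <- table(pairs$variable.x, pairs$variable.y)
print(tab)
```

The off-diagonal cells are the co-occurrence counts; the diagonal can be blanked afterwards if it is not wanted.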



Gabor Grothendieck

2007-Feb-22 16:16 UTC


Try this:

tab <- crossprod(as.matrix(ratings[,-1]))
tab <- tab - diag(diag(tab))
tab

tab / nrow(ratings)
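
As a hedged aside on why this works: with X the 0/1 matrix of attributes, crossprod(X) computes t(X) %*% X, whose (i, j) entry is the sum over respondents of X[, i] * X[, j]. That product is 1 only when a respondent has both attributes, so the sum counts co-occurrences. The diagonal holds each attribute's own total, which the second line above zeroes out. A small check of those claims follows; the name X is illustrative, and ratings is re-entered from the four rows printed in the question.

```r
# Data re-entered from the four rows printed in the question.
ratings <- data.frame(id   = c(1, 2, 3, 4),
                      att1 = c(1, 1, 0, 1),
                      att2 = c(1, 0, 1, 1),
                      att3 = c(0, 0, 1, 1))

X   <- as.matrix(ratings[, -1])   # drop the id column, keep the 0/1 flags
tab <- crossprod(X)

# crossprod(X) is the same matrix as t(X) %*% X.
stopifnot(isTRUE(all.equal(tab, t(X) %*% X)))

# An off-diagonal entry counts respondents who have both attributes.
stopifnot(tab["att1", "att2"] == sum(X[, "att1"] == 1 & X[, "att2"] == 1))

# The diagonal is just the column totals.
stopifnot(all(diag(tab) == colSums(X)))

# Zero the diagonal, then express counts as a share of respondents.
tab <- tab - diag(diag(tab))
print(tab)
print(tab / nrow(ratings))
```

Note that this relies on the columns being strictly 0/1; other codings (for example 1/2, or NA for missing) would need recoding first.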



Charles C. Berry

2007-Feb-22 18:17 UTC


> res <- crossprod( as.matrix( ratings[ , -1] ) )
> diag(res) <- ""
> print(res, quote=F)
     att1 att2 att3
att1      2    1
att2 2         2
att3 1    2
>
> res2 <- crossprod(as.matrix( ratings[ , -1])) * 100 / nrow( ratings )
> res2[] <- paste( res2, "%", sep="" )
> diag(res2) <- ""
> print(res2, quote=F)
     att1 att2 att3
att1      50%  25%
att2 50%       50%
att3 25%  50%
>

Be sure to bone up on format and sprintf before taking this into 
production.
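
On that last point, a minimal sketch of the sprintf() route, offered as one option rather than the definitive one. paste() prints whatever digits the number happens to have, so a share such as 100/3 comes out with a long run of decimals, whereas sprintf() fixes the number of decimals. The names pct and out are illustrative, and ratings is re-entered from the four rows printed in the question.

```r
# Data re-entered from the four rows printed in the question.
ratings <- data.frame(id   = c(1, 2, 3, 4),
                      att1 = c(1, 1, 0, 1),
                      att2 = c(1, 0, 1, 1),
                      att3 = c(0, 0, 1, 1))

# Co-occurrence counts as a percentage of all respondents.
pct <- crossprod(as.matrix(ratings[, -1])) * 100 / nrow(ratings)

# Copy first so dim and dimnames are kept, then fill in formatted strings.
out   <- pct
out[] <- sprintf("%.0f%%", pct)   # "%%" is a literal percent sign
diag(out) <- ""                   # blank the diagonal for display
print(out, quote = FALSE)

# With shares that are not round numbers, one decimal keeps the table tidy.
third <- sprintf("%.1f%%", 100 / 3)   # "33.3%"
print(third)
```

Assigning into out[] rather than out is what preserves the matrix shape, since sprintf() itself returns a plain character vector.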

Charles C. Berry                        (858) 534-2098
                                          Dept of Family/Preventive Medicine
E mailto:cberry at tajo.ucsd.edu	         UC San Diego
http://biostat.ucsd.edu/~cberry/         La Jolla, San Diego 92093-0901

Michael Wexler

2007-Feb-26 23:27 UTC


Thanks to Charles, Gabor, and a private message from Frank E Harrell with some
good ideas and help.  This crossprod approach was very clever; I would never
have thought of it.

Best, Michael



John Kane

2007-Feb-27 13:33 UTC


Thanks to everyone for this.  I was looking at the same problem last night and
was just going to write a posting to R-help when I saw this.
