On Jan 21, 2010, at 12:38 PM, b k wrote:
> Hello,
>
> I know there must be a simple solution to this problem, but it eludes me
> currently.
>
> My data is partitioned into two subsets; each subset has a common
> factor column, but with differing levels:
>
> levels(fdf_ghc$AgeDemo)
> [1] "26TO35" "36TO45" "46TO55" "56TO65" "66TO75" "76TO85"
> levels(fdf_ghcnull$AgeDemo)
> [1] "26TO35" "36TO45" "46TO55" "56TO65" "66TO75" "76TO85" "86TO100"
> table(fdf_ghc$AgeDemo)
> 26TO35 36TO45 46TO55 56TO65 66TO75 76TO85
>      6     14     21     31     19     14
> table(fdf_ghcnull$AgeDemo)
>  26TO35  36TO45  46TO55  56TO65  66TO75  76TO85 86TO100
>       5       5      10       7       8       4       1
>
> I need to construct a common contingency table from the two subsets,
> but rbind recycles values because the levels differ:
>
> rbind(table(fdf_ghc$AgeDemo), table(fdf_ghcnull$AgeDemo))
>      26TO35 36TO45 46TO55 56TO65 66TO75 76TO85 86TO100
> [1,]      6     14     21     31     19     14       6
> [2,]      5      5     10      7      8      4       1
> Warning message:
> In rbind(table(fdf_ghc$AgeDemo), table(fdf_ghcnull$AgeDemo)) :
>   number of columns of result is not a multiple of vector length (arg 1)

fdf_ghc <- data.frame(AgeDemo = factor(c("26TO35", "36TO45", "46TO55",
                                         "56TO65", "66TO75", "76TO85")))
fdf_ghcnull <- data.frame(AgeDemo = factor(c("26TO35", "36TO45", "46TO55",
                                             "56TO65", "66TO75", "76TO85",
                                             "86TO100")))
table(rbind(subset(fdf_ghcnull, select = AgeDemo),
            subset(fdf_ghc, select = AgeDemo)))
# You need a strategy that preserves the factor level labels.
# In my test case, rbind() on the data frames gives:
 26TO35  36TO45  46TO55  56TO65  66TO75  76TO85 86TO100
      2       2       2       2       2       2       1
By contrast, concatenating the factors with c() dropped the level labels,
although it kept the counts:
> factor(c(fdf_ghcnull$AgeDemo, fdf_ghc$AgeDemo))
[1] 1 2 3 4 5 6 7 1 2 3 4 5 6
Levels: 1 2 3 4 5 6 7
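
One way to concatenate without losing the labels is to go through character vectors first. This is only a sketch using the same one-value-per-level test data as above; the name `combined` is an illustrative choice. (I believe that in R 4.1.0 and later c() on factors already returns a factor with merged levels, so the integer codes shown above reflect older behavior; the as.character() route should work either way.)

```r
# Same test data as above: one observation per level
fdf_ghc <- data.frame(AgeDemo = factor(c("26TO35", "36TO45", "46TO55",
                                         "56TO65", "66TO75", "76TO85")))
fdf_ghcnull <- data.frame(AgeDemo = factor(c("26TO35", "36TO45", "46TO55",
                                             "56TO65", "66TO75", "76TO85",
                                             "86TO100")))

# Convert each factor to its labels before concatenating, then rebuild
# a single factor; the labels (not the integer codes) become the levels.
combined <- factor(c(as.character(fdf_ghcnull$AgeDemo),
                     as.character(fdf_ghc$AgeDemo)))

levels(combined)
table(combined)
```

The resulting table should match the rbind() result on the data frames shown earlier, with the labels intact.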
>
> I need something I can pass to fisher.test() or chisq.test().
I'm not sure I see how those would accept a single factor or a one-row
table for a test.
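
If what is wanted is a two-row table (one row per subset), a minimal sketch follows. It assumes the counts from the table() output quoted above, rebuilds the two factors from them, and re-levels both to the union of their levels so that the missing "86TO100" cell becomes an explicit zero instead of being recycled. The names `all_levels`, `tab`, `ft`, and `ct` are illustrative, and the simulated p-values are my own choice to sidestep the sparse last column, not something from the original post.

```r
# Rebuild the two factors from the counts quoted in the question
lv_ghc  <- c("26TO35", "36TO45", "46TO55", "56TO65", "66TO75", "76TO85")
lv_null <- c(lv_ghc, "86TO100")
fdf_ghc     <- data.frame(AgeDemo = factor(rep(lv_ghc,
                                               times = c(6, 14, 21, 31, 19, 14))))
fdf_ghcnull <- data.frame(AgeDemo = factor(rep(lv_null,
                                               times = c(5, 5, 10, 7, 8, 4, 1))))

# Give both factors the same level set before tabulating
all_levels <- union(levels(fdf_ghc$AgeDemo), levels(fdf_ghcnull$AgeDemo))

# Each table() now has seven cells, so rbind() aligns them without recycling
tab <- rbind(
  ghc     = table(factor(fdf_ghc$AgeDemo,     levels = all_levels)),
  ghcnull = table(factor(fdf_ghcnull$AgeDemo, levels = all_levels))
)
tab

# Monte Carlo p-values; the seed only makes the run repeatable
set.seed(1)
ft <- fisher.test(tab, simulate.p.value = TRUE, B = 2000)
ct <- chisq.test(tab, simulate.p.value = TRUE, B = 2000)
ft$p.value
ct$p.value
```

Whether a test on such a sparse 2 x 7 table answers the real question is a separate matter; collapsing the top two age bands might be worth considering.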
--
David Winsemius, MD
Heritage Laboratories
West Hartford, CT