I'm trying to form contingency tables among a set of character variables
which were read from a .csv file and have missing values represented as "".
I want to exclude the missing levels from the table.

> levels(CPIC)
[1] ""  "N" "Y"
> levels(Manix)
[1] ""  "N" "Y"

> xtabs(~CPIC + Manix, exclude=c("",NA))
    Manix
CPIC         N    Y
      272    4   15
   N  154 2812 1472
   Y  158  466 4870

> table(CPIC, Manix, exclude=c("",NA))
    Manix
CPIC         N    Y
      272    4   15
   N  154 2812 1472
   Y  158  466 4870

The only way I can exclude them is by

t <- table(CPIC, Manix)
t <- t[-1,-1]

That's not too hard in this case, but my application is to a much
larger table, where this gets unwieldy.

--
Michael Friendly              friendly at yorku.ca
York University               http://www.math.yorku.ca/SCS/friendly.html
Psychology Department
4700 Keele Street             Tel:  (416) 736-5115 x66249
Toronto, Ontario, M3J 1P3     Fax:  (416) 736-5814
Dear Mike,

You could read the file specifying the na.strings argument to read.csv.
Does that do what you need?

Regards,
 John

At 10:15 AM 12/11/2002 -0500, Michael Friendly wrote:
>I'm trying to form contingency tables among a set of character variables
>which were read from a .csv file and have missing values represented as "".
>I want to exclude the missing levels from the table.

____________________________
John Fox
Department of Sociology
McMaster University
email: jfox at mcmaster.ca
web: http://www.socsci.mcmaster.ca/jfox
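A minimal sketch of the na.strings suggestion, for readers who want to try it. The inline csv_text and the data frame name d are hypothetical stand-ins for the real file, which is not shown in the thread; stringsAsFactors = TRUE is set only so the columns come out as factors, as in the question.

```r
# Hypothetical stand-in for the .csv file: missing values are empty fields.
csv_text <- "CPIC,Manix
Y,Y
N,
,N
Y,N
N,N
,
Y,Y"

# na.strings = "" turns empty fields into NA while the file is read, so ""
# never becomes a factor level.
d <- read.csv(text = csv_text, na.strings = "", stringsAsFactors = TRUE)

levels(d$CPIC)

# table() drops NA by default, so only complete pairs are counted.
tab <- table(d$CPIC, d$Manix, dnn = c("CPIC", "Manix"))
tab
```

The trade-off is that the empty strings become NA everywhere in the data, not only in the tables.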
On Wed, 11 Dec 2002, Michael Friendly wrote:

> I'm trying to form contingency tables among a set of character variables
> which were read from a .csv file and have missing values represented as "".
> I want to exclude the missing levels from the table.

I think this is a bug. The exclude= argument doesn't work for factors,
because the argument is passed to factor(), and its exclude argument has a
different format when the main argument is a factor.

	-thomas

Thomas Lumley                   Asst. Professor, Biostatistics
tlumley at u.washington.edu     University of Washington, Seattle
^^^^^^^^^^^^^^^^^^^^^^^^  - NOTE NEW EMAIL ADDRESS
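One way to sidestep the issue without relying on how table() treats exclude= for factors is to recode the factors just before tabulating. This is only a sketch: the helper name drop_blank and the small example vectors are hypothetical, not from the thread.

```r
# Hypothetical factors with "" as the missing level, as in the question.
CPIC  <- factor(c("Y", "N", "",  "Y", "N", "", "Y"))
Manix <- factor(c("Y", "",  "N", "N", "N", "", "Y"))

# Recode "" to NA and rebuild the factor, so "" is no longer a level.
drop_blank <- function(f) {
  x <- as.character(f)
  x[which(x == "")] <- NA
  factor(x)
}

# The original factors are left untouched; only the table sees the recoding.
tab <- table(CPIC = drop_blank(CPIC), Manix = drop_blank(Manix))
tab
```

Because the recoding happens inside the call, the "" categories stay available in the data for any other use.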
Having looked over the replies and examined the code, I can't see any
reason for table (and xtabs) not to honor the exclude= argument for
factors. There are often reasons for wanting to exclude certain levels,
even non-missing ones, when making a table.

In my application, John Fox suggested that I could circumvent the problem
by reading in the .csv file with na.strings="". However, it was only for
making tables that I wanted to exclude the "" categories.

The change to table() to have it honor the exclude option for factors is
quite straightforward. I wonder if the R team will consider placing this
on its list. (revised version below)

More generally, in working with tables I often find the need to collapse
or reorder the levels of some dimensions of an n-way table. I've written
a collapse.table to do the first, e.g.,

sex <- c("Male", "Female")
age <- letters[1:6]
education <- c("low", 'med', 'high')
data <- expand.grid(sex=sex, age=age, education=education)
data <- cbind(data, count=rpois(36, 100))
# (assumed) cross-tabulate the counts to get the table t1 used below
t1 <- xtabs(count ~ sex + age + education, data=data)

# collapse age to 3 levels
t2 <- collapse.table(t1, age=c("A", "A", "B", "B", "C", "C"))
# collapse age to 3 levels and education to 2 levels
t3 <- collapse.table(t1, age=c("A", "A", "B", "B", "C", "C"),
                     education=c("low", "low", "high"))

and it's not too hard to do the second. However, I wonder if some more
general and convenient tools for working with tables are available
somewhere I've missed. For example, for mosaicplots it is often crucial
to be able to treat table variables as ordered factors, where the
ordering is the one that shows the pattern of association, not the
default. For a data frame, this can be done with

subset$Skin.Colour <- factor(subset$Skin.Colour,
                             levels=c("White", "Brown", "Other", "Black"))

but it's more unwieldy with a table object.

-Michael

------- table.R ------
# modified to respect the exclude argument for factors
# use exclude=NULL for former behavior for factors (or change default)

table <- function (..., exclude = c(NA, NaN),
                   dnn = list.names(...), deparse.level = 1)
{
    # build dimnames names from the argument names or deparsed expressions
    list.names <- function(...) {
        l <- as.list(substitute(list(...)))[-1]
        nm <- names(l)
        fixup <- if (is.null(nm)) seq(along = l) else nm == ""
        dep <- sapply(l[fixup], function(x)
            switch (deparse.level + 1,
                    "",
                    if (is.symbol(x)) as.character(x) else "",
                    deparse(x)[1]
                    )
                      )
        if (is.null(nm))
            dep
        else {
            nm[fixup] <- dep
            nm
        }
    }

    args <- list(...)
    if (length(args) == 0)
        stop("nothing to tabulate")
    # a single list argument is treated as the list of classifying variables
    if (length(args) == 1 && is.list(args[[1]])) {
        args <- args[[1]]
        if (length(dnn) != length(args))
            dnn <- if (!is.null(argn <- names(args)))
                argn
            else
                paste(dnn[1],1:length(args),sep='.')
    }
    bin <- 0
    lens <- NULL
    dims <- integer(0)
    pd <- 1
    dn <- NULL
    for (a in args) {
        if (is.null(lens)) lens <- length(a)
        else if (length(a) != lens)
            stop("all arguments must have the same length")
        # MF: make exclude work for factors too
#        if (is.factor(a))
#            cat <- a
#        else
        cat <- factor(a, exclude = exclude)
        nl <- length(l <- levels(cat))
        dims <- c(dims, nl)
        dn <- c(dn, list(l))
        ## requiring   all(unique(as.integer(cat)) == 1:nlevels(cat))  :
        bin <- bin + pd * (as.integer(cat) - 1)
        pd <- pd * nl
    }
    names(dn) <- dnn
    bin <- bin[!is.na(bin)]
    if (length(bin)) bin <- bin + 1 # otherwise, that makes bin NA
    y <- array(tabulate(bin, pd), dims, dimnames = dn)
    class(y) <- "table"
    y
}

--
Michael Friendly              friendly at yorku.ca
York University               http://www.math.yorku.ca/SCS/friendly.html
Psychology Department
4700 Keele Street             Tel:  (416) 736-5115 x66249
Toronto, Ontario, M3J 1P3     Fax:  (416) 736-5814
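The source of collapse.table is not shown in this thread, so the following is not that function. It is a base-R sketch of the two operations mentioned above (reordering and collapsing levels of a table), under the assumption that going through the data-frame form of the table is acceptable. The names d, t_ord, df and the set.seed value are illustrative only.

```r
set.seed(1)  # arbitrary seed, only to make the example reproducible

# Sample data shaped like the example in the post.
d <- expand.grid(sex = c("Male", "Female"),
                 age = letters[1:6],
                 education = c("low", "med", "high"))
d$count <- rpois(nrow(d), 100)
t1 <- xtabs(count ~ sex + age + education, data = d)

# Reorder a dimension: index it by level names in the desired order.
t_ord <- t1[, , c("high", "med", "low")]

# Collapse a dimension: convert to a data frame (one row per cell, counts
# in Freq), merge factor levels, and re-tabulate.
df <- as.data.frame(t1)
levels(df$age) <- list(A = c("a", "b"), B = c("c", "d"), C = c("e", "f"))
t2 <- xtabs(Freq ~ sex + age + education, data = df)

dim(t2)
```

The round trip through a data frame is slower than working on the array directly, but it handles an n-way table with the same few lines and keeps the level bookkeeping in ordinary factor operations.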