Kamil Sijko
2010-Apr-16 14:27 UTC
[R] Removing empty (or very underpopulated) sub-populations
Hi,

I'm trying to develop a function that will simplify the most common analyses in my area of interest (social sciences) by computing all required statistics in one run. For example, in the case of a factor and a numeric variable: 1) a normality test, then, if the variable is normal in every group, 2) ANOVA, 3) with effect-size estimation and an appropriate graph.

I test normality in each group with this code:

    are.normal <- c()
    group <- as.factor(group)
    for (i in 1:length(levels(factor(group)))) {
      are.normal[i] <- normality(response[group==levels(factor(group))[i]])
    }

where:
1) response is the response (a numeric variable),
2) group is the grouping variable (a factor),
3) normality is a function that takes one variable as its argument and tries to figure out whether it is normal (TRUE) or not (FALSE).

My problem is that some combinations of response~group produce empty or very underpopulated groups (e.g. when you examine the relation between country of origin and age of respondents, and it turns out that you have only one respondent from some country). This causes my function to fail.

I've been wondering whether there is some way to exclude those underpopulated groups from the analysis?

Best regards,
Kamil Sijko
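
One possible way to exclude underpopulated groups before the per-group test is sketched below. It is only an illustration under stated assumptions: `drop_small_groups` and its `min_n` threshold are hypothetical names not present in the original post, the sample data are made up, and `normality` is replaced by a stand-in based on `shapiro.test` (which needs at least 3 observations), since the original function was not shown.

```r
# Hypothetical helper: keep only groups with at least `min_n`
# non-missing responses, and drop the now-unused factor levels.
drop_small_groups <- function(response, group, min_n = 3) {
  group <- factor(group)
  # Number of non-missing responses in each level (NA for empty levels)
  n_per_group <- tapply(!is.na(response), group, sum)
  keep_levels <- names(n_per_group)[!is.na(n_per_group) & n_per_group >= min_n]
  keep <- group %in% keep_levels
  list(response = response[keep],
       group    = droplevels(group[keep]))
}

# Stand-in for the poster's normality(); an assumption, not the original.
normality <- function(x) shapiro.test(x)$p.value > 0.05

# Made-up example data: "FR" has a single respondent.
response <- c(5.1, 4.8, 5.6, 5.0, 6.2, 5.9, 6.4, 7.0)
group    <- c("PL", "PL", "PL", "PL", "DE", "DE", "DE", "FR")

kept <- drop_small_groups(response, group, min_n = 3)

# One logical value per remaining group, replacing the explicit loop.
are.normal <- vapply(split(kept$response, kept$group), normality, logical(1))
```

Because `droplevels()` removes the excluded levels, later steps that iterate over `levels(kept$group)` no longer meet empty groups. The threshold of 3 is chosen here only to satisfy `shapiro.test`; a larger minimum may be more sensible for a real analysis.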