Hi,

Suppose I have a categorical variable called STREET, and I have 30 levels for it (i.e. 30 different streets). I want to find all those streets with only 15 observations or below, then collapse them into a level called OTHER. Is there a quick way, other than using a for() loop, to do it? Currently what I'm doing is something like:

### Collapse STREET (those < 15)
st <- c()
STREET <- as.vector(STREET)
for(i in 1:length(STREET)) {
  if(STREET[i] == "BOYNE AVE" ||
     STREET[i] == "CHAPEL ST" ||
     STREET[i] == "CONE PL" ||
     STREET[i] == "LACEBARK LANE" ||
     STREET[i] == "PRUDHOE LANE" ||
     STREET[i] == "VIRGIL PL" ||
     STREET[i] == "WILMOT ST" ) st[i] <- "Other"
  else st[i] <- STREET[i]
}

But I'm sure there is a better way....

Kevin

--------------------------------------------
Ko-Kang Kevin Wang, MSc(Hon)
Statistics Workshops Co-ordinator
Student Learning Centre
University of Auckland
New Zealand
Ko-Kang Kevin Wang wrote:
> Suppose I have a categorical variable called STREET, and I have 30
> levels for it (i.e. 30 different streets). I want to find all those
> streets with only 15 observations or below then collapse them into a
> level called OTHER. Is there a quick way, other than using a for()
> loop, to do it?
> [...]

How about:

tab <- table(STREET)
small <- names(tab[tab <= 15])  # "15 observations or below"
# as.character() so that a factor yields labels rather than integer codes
st <- ifelse(STREET %in% small, "Other", as.character(STREET))

/untested

-sundar
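A quick check of the table() / %in% idea on made-up data may help here. The street names and counts below are invented for illustration only. The as.character() call is included because ifelse() applied directly to a factor gives back the integer level codes instead of the labels.

```r
# Toy data (invented): three streets with 20, 15 and 3 observations
STREET <- factor(rep(c("QUEEN ST", "CHAPEL ST", "CONE PL"), times = c(20, 15, 3)))

tab   <- table(STREET)            # observations per street
small <- names(tab)[tab <= 15]    # streets with 15 observations or fewer

# Convert to character first so ifelse() returns labels, not factor codes
st <- ifelse(STREET %in% small, "Other", as.character(STREET))
table(st)
```

The result st is a character vector; wrap it in factor() if a factor is needed downstream.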
Kevin,

something like ....

table(STREET)
STREET <- as.character(STREET)
# factor() and table() sort the street names the same way, so the
# level codes line up with the positions returned by which()
STREET[as.numeric(factor(STREET)) %in% which(table(STREET) <= 15)] <- "Other"
STREET <- factor(STREET)
table(STREET)

Andrew

On Thursday 26 February 2004 15:25, Ko-Kang Kevin Wang wrote:
> Suppose I have a categorical variable called STREET, and I have 30
> levels for it (i.e. 30 different streets). I want to find all those
> streets with only 15 observations or below then collapse them into a
> level called OTHER. Is there a quick way, other than using a for()
> loop, to do it?
> [...]

-- 
Andrew Robinson                      Ph: 208 885 7115
Department of Forest Resources       Fa: 208 885 6226
University of Idaho                  E : andrewr at uidaho.edu
PO Box 441133                        W : http://www.uidaho.edu/~andrewr
Moscow ID 83843                      Or: http://www.biometrics.uidaho.edu
No statement above necessarily represents my employer's opinion.
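If STREET should stay a factor throughout, a related sketch is to rename the rare levels in place: giving several levels the same label merges them. This is not the code posted above, only an alternative under the same 15-or-fewer rule, and the data are invented.

```r
# Toy data (invented): one common street and two rare ones
STREET <- factor(rep(c("QUEEN ST", "VIRGIL PL", "WILMOT ST"), times = c(30, 4, 2)))

counts <- table(STREET)                   # counts, in the same order as levels(STREET)
rare   <- names(counts)[counts <= 15]     # levels with 15 observations or fewer

# Assigning the same label to several levels merges them into one level
levels(STREET)[levels(STREET) %in% rare] <- "Other"
table(STREET)
```

No conversion to character is needed with this variant, and no unused levels are left behind.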
> -----Original Message-----
> From: Phil Spector [mailto:spector at stat.Berkeley.EDU]
>
> How about something like:
>
> tstreet = table(STREET)
> collapsestreets = names(tstreet[tstreet <= 15])
> STREET[STREET %in% collapsestreets] = 'OTHER'

Thanks a lot! This is exactly what I want. I had a feeling my way to use a for() loop was rather silly....;D

Kevin
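For reuse, the quoted approach could be wrapped in a small function. The sketch below assumes the input may be a factor, in which case assigning 'OTHER' directly would give NA with a warning because 'OTHER' is not yet a level, so it converts to character first. The name collapse_rare and its arguments are hypothetical, not part of base R, and the data are invented.

```r
# collapse_rare() is a hypothetical helper name, not a base R function
collapse_rare <- function(x, max_count = 15, other = "OTHER") {
  x   <- as.character(x)                  # avoid NA from assigning a non-level to a factor
  tab <- table(x)                         # observations per value
  x[x %in% names(tab)[tab <= max_count]] <- other
  factor(x)                               # return a factor with the collapsed level
}

# Invented example data: 16, 15 and 1 observations
STREET <- factor(rep(c("QUEEN ST", "BOYNE AVE", "CONE PL"), times = c(16, 15, 1)))
res <- collapse_rare(STREET)
table(res)
```

Changing max_count moves the cut-off without touching the rest of the code.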