Hi,
Suppose I have a categorical variable called STREET, and I have 30
levels for it (i.e. 30 different streets). I want to find all those
streets with only 15 observations or below then collapse them into a
level called OTHER. Is there a quick way, other than using a for()
loop, to do it? Currently what I'm doing is something like:
### Collapse STREET (those < 15)
st <- c()
STREET <- as.vector(STREET)
for(i in 1:length(STREET)) {
  if(STREET[i] == "BOYNE AVE" ||
     STREET[i] == "CHAPEL ST" ||
     STREET[i] == "CONE PL" ||
     STREET[i] == "LACEBARK LANE" ||
     STREET[i] == "PRUDHOE LANE" ||
     STREET[i] == "VIRGIL PL" ||
     STREET[i] == "WILMOT ST" ) st[i] <- "Other"
  else st[i] <- STREET[i]
}
But I'm sure there is a better way....
Kevin
--------------------------------------------
Ko-Kang Kevin Wang, MSc(Hon)
Statistics Workshops Co-ordinator
Student Learning Centre
University of Auckland
New Zealand

Ko-Kang Kevin Wang wrote:
> Is there a quick way, other than using a for() loop, to do it?

How about:

tab <- table(STREET)
small <- names(tab[tab < 15])
st <- ifelse(STREET %in% small, "Other", STREET)

/untested

-sundar
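
A caveat on the ifelse() suggestion above: if STREET is still a factor rather than a character vector, ifelse() hands back the internal integer codes for the streets that are kept, not their names. A minimal sketch of a factor-safe variant follows. The street names and counts are made up for illustration, and `<= 15` is used to match "15 observations or below" in the question.

```r
# Illustrative data only: these streets and counts are assumptions.
STREET <- factor(rep(c("QUEEN ST", "HIGH ST", "CONE PL", "VIRGIL PL"),
                     times = c(20, 16, 3, 15)))

tab   <- table(STREET)
small <- names(tab[tab <= 15])   # levels with 15 observations or fewer

# Convert to character first so ifelse() returns names, not factor codes.
st <- ifelse(STREET %in% small, "Other", as.character(STREET))
st <- factor(st)                 # back to a factor with the collapsed levels
table(st)
```

If STREET has already been converted with as.vector(), as in the original loop, the as.character() call is harmless.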

Kevin,

something like ....

table(STREET)
STREET <- as.character(STREET)
STREET[as.numeric(factor(STREET)) %in% which(table(STREET) < 15)] <- "Other"
STREET <- factor(STREET)
table(STREET)

Andrew

On Thursday 26 February 2004 15:25, Ko-Kang Kevin Wang wrote:
> Is there a quick way, other than using a for() loop, to do it?

--
Andrew Robinson                      Ph: 208 885 7115
Department of Forest Resources       Fa: 208 885 6226
University of Idaho                  E : andrewr at uidaho.edu
PO Box 441133                        W : http://www.uidaho.edu/~andrewr
Moscow ID 83843                      Or: http://www.biometrics.uidaho.edu
No statement above necessarily represents my employer's opinion.

> -----Original Message-----
> From: Phil Spector [mailto:spector at stat.Berkeley.EDU]
>
> How about something like:
>
> tstreet = table(STREET)
> collapsestreets = names(tstreet[tstreet <= 15])
> STREET[STREET %in% collapsestreets] = 'OTHER'

Thanks a lot! This is exactly what I want. I had a feeling my way to use a for() loop was rather silly....;D

Kevin
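
One thing to watch with the direct assignment quoted above: it works as written when STREET is a character vector, but if STREET is a factor, 'OTHER' is not an existing level, so the assignment produces NAs with a warning. A possible alternative in that case, sketched here on made-up data (the street names and counts are assumptions), is to rename the rare levels through levels(), which merges levels that end up with the same name.

```r
# Illustrative data only: these streets and counts are assumptions.
STREET <- factor(rep(c("QUEEN ST", "HIGH ST", "CONE PL", "VIRGIL PL"),
                     times = c(20, 16, 3, 15)))

tstreet <- table(STREET)
collapsestreets <- names(tstreet[tstreet <= 15])

# Rename every rare level to "OTHER"; duplicated level names are merged,
# so the factor ends up with a single OTHER level and no NAs.
levels(STREET)[levels(STREET) %in% collapsestreets] <- "OTHER"
table(STREET)
```

This keeps STREET as a factor throughout, so no round trip through as.character() and factor() is needed.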