I'm trying to categorize a continuous variable (yes, I know that's horrible, but I'm trying to reproduce some exercises from a textbook) and don't really know an efficient way to do this.

I have a data frame that looks like:

  surv_time relapse sex log_WBC rx
1        35       0   1    1.45  0
2        34       0   1    1.47  0
3        32       0   1    2.20  0
4        32       0   1    2.53  0

And I'm trying to categorize log_WBC into:

(0-2.30)    = "low"
(2.31-3.00) = "medium"
(>3.00)     = "high"

I've used an ifelse statement such as:

anderson$log_WBC <- ifelse(anderson$log_WBC<2.30,"low",anderson$log_WBC)

Is there a way to use "greater than" "less than" syntax within the context of an ifelse statement? Or can someone point me to a function that will do this easier.

Many Thanks,

Patrick

Dear Patrick,

Take a look at ?cut for some ideas.

HTH,
Jorge

On Tue, Oct 13, 2009 at 3:37 PM, Richardson, Patrick wrote:
> Is there a way to use "greater than" "less than" syntax within the context
> of an ifelse statement? Or can someone point me to a function that will do
> this easier.

Try this:

with(anderson, cut(log_WBC, c(0, 2.3, 3, max(log_WBC)), labels = c('low', 'medium', 'high')))

On Tue, Oct 13, 2009 at 4:37 PM, Richardson, Patrick wrote:
> Is there a way to use "greater than" "less than" syntax within the context of an ifelse statement? Or can someone point me to a function that will do this easier.

--
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O
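
For anyone reading the thread later, here is a minimal sketch of both approaches discussed above: cut() and a nested ifelse() using "less than or equal" comparisons. It rests on a few assumptions that are not in the original posts: the last two log_WBC values are invented so that every category shows up, the new column names wbc_cut and wbc_ifelse are hypothetical, and the top break is Inf rather than max(log_WBC) so the call does not depend on the largest value exceeding 3. Treat it as an illustration, not as the textbook's own coding.

```r
# Toy data: the first four log_WBC values come from the question;
# 3.10 and 4.36 are invented here so that "high" also appears.
anderson <- data.frame(log_WBC = c(1.45, 1.47, 2.20, 2.53, 3.10, 4.36))

# cut(): intervals are right-closed by default, so the bins are
# (0, 2.3], (2.3, 3] and (3, Inf]. The result is a factor.
anderson$wbc_cut <- cut(anderson$log_WBC,
                        breaks = c(0, 2.3, 3, Inf),
                        labels = c("low", "medium", "high"))

# Nested ifelse(): comparison operators work directly inside the test.
# Writing to a new column keeps the numeric log_WBC intact.
anderson$wbc_ifelse <- ifelse(anderson$log_WBC <= 2.3, "low",
                       ifelse(anderson$log_WBC <= 3, "medium", "high"))

# Cross-tabulate the two codings to confirm they agree on this data.
print(table(anderson$wbc_cut, anderson$wbc_ifelse))
```

The two versions differ at the edges: with these breaks cut() returns NA for a value of exactly 0 (pass include.lowest = TRUE to keep it), while the nested ifelse() would label it "low". Note also that the ifelse() call in the question overwrites log_WBC and mixes "low" with the remaining numbers, which coerces the whole column to character.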