Gesmann, Markus
2005-Nov-14 20:05 UTC
[R] change some levels of a factor column in data frame according to a condition
Dear R-users,

I am looking for an elegant way to change some levels of a factor column in a data frame according to a condition. Let's look at the following data frame:

> data.frame(crit1=gl(2,5), crit2=factor(letters[1:10]), x=rnorm(10))
   crit1 crit2           x
1      1     a -1.06957692
2      1     b  0.24368402
3      1     c -0.24958322
4      1     d -1.37577955
5      1     e -0.01713288
6      2     f -1.25203573
7      2     g -1.94348533
8      2     h -0.16041719
9      2     i -1.91572616
10     2     j -0.20256478

Now I would like to find, for each level of crit1, the two smallest values of x and change the corresponding values of crit2 to "small", so the result would look like this:

   crit1 crit2           x
1      1 small -1.06957692
2      1     b  0.24368402
3      1     c -0.24958322
4      1 small -1.37577955
5      1     e -0.01713288
6      2     f -1.25203573
7      2 small -1.94348533
8      2     h -0.16041719
9      2 small -1.91572616
10     2     j -0.20256478

Thank you for any advice!

Markus Gesmann
jim holtman
2005-Nov-14 21:02 UTC
[R] change some levels of a factor column in data frame according to a condition
Try this:

    # create data; factor() is explicit so this also works where
    # stringsAsFactors defaults to FALSE
    x.by <- data.frame(crit1 = rep(c(1, 2), c(10, 10)),
                       crit2 = factor(sample(letters[1:4], 20, TRUE)),
                       val = runif(20))
    levels(x.by$crit2) <- c(levels(x.by$crit2), 'small')  # add 'small' to the levels
    y <- by(x.by, x.by$crit1, function(.grp) {
        .small <- order(.grp$val)  # row positions in ascending order of val
        # relabel the two smallest; min() makes sure we don't exceed the vector
        .grp$crit2[.small[1:min(2, length(.small))]] <- 'small'
        .grp
    })
    do.call('rbind', y)  # put it back together

--
Jim Holtman
Cincinnati, OH
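The split-relabel-recombine approach above can be checked against the data from the original post. What follows is only a minimal sketch, assuming the ten x values quoted in the question are typed in by hand; the helper name `mark_small` is hypothetical, and `unsplit()` is swapped in for `do.call('rbind', ...)` so that the rows come back in their original order even when the groups are not contiguous.

```r
# Data from the original post, entered by hand so the result is reproducible
d <- data.frame(
  crit1 = gl(2, 5),
  crit2 = factor(letters[1:10]),
  x = c(-1.06957692, 0.24368402, -0.24958322, -1.37577955, -0.01713288,
        -1.25203573, -1.94348533, -0.16041719, -1.91572616, -0.20256478)
)

# 'small' must exist as a level before it can be assigned
levels(d$crit2) <- c(levels(d$crit2), "small")

# Hypothetical helper: relabel the n rows with the smallest x within one group
mark_small <- function(grp, n = 2) {
  idx <- order(grp$x)[seq_len(min(n, nrow(grp)))]
  grp$crit2[idx] <- "small"
  grp
}

# Split by crit1, relabel each piece, then restore the original row order
res <- unsplit(lapply(split(d, d$crit1), mark_small), d$crit1)
print(res)
```

With this data, rows 1, 4, 7 and 9 end up labelled "small", which matches the result requested in the question.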
Francisco J. Zagmutt
2005-Nov-14 22:01 UTC
[R] change some levels of a factor column in data frame according to a condi
Hi Gesmann,

There may be more elegant ways to do this, but here is one option:

    d <- data.frame(crit1=gl(2,5), crit2=factor(letters[1:10]), x=rnorm(10))  # create the data
    levels(d$crit2) <- c(levels(d$crit2), "small")      # add the level "small" to the factor crit2
    d2 <- d[order(d$crit1, d$x), ]                      # sort by crit1, then by x ascending
    idx <- do.call("rbind", by(d2, d2$crit1, head, 2))  # select the 2 smallest per crit1 and bind the results by row
    d2[d2$x %in% idx$x, "crit2"] <- "small"             # change the desired crit2 values to "small"

Note that the result, d2, is sorted by crit1 and x rather than kept in the original row order, and that matching on x assumes its values are unique.

Cheers

Francisco
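For comparison, the same relabelling can be done without sorting or splitting the data frame. This is a sketch rather than anything proposed in the thread: it assumes the x values from the original post are entered by hand, and it uses base R's `ave()` together with `rank()` to compute each row's rank within its crit1 group.

```r
# Data from the original post, entered by hand so the result is reproducible
d <- data.frame(
  crit1 = gl(2, 5),
  crit2 = factor(letters[1:10]),
  x = c(-1.06957692, 0.24368402, -0.24958322, -1.37577955, -0.01713288,
        -1.25203573, -1.94348533, -0.16041719, -1.91572616, -0.20256478)
)

# Rank of x within each crit1 group; ties.method = "first" breaks ties by position,
# so exactly two rows per group get rank 1 or 2
r <- ave(d$x, d$crit1, FUN = function(v) rank(v, ties.method = "first"))

# Add the new level, then relabel the two smallest values per group in place
levels(d$crit2) <- c(levels(d$crit2), "small")
d$crit2[r <= 2] <- "small"
print(d)
```

Because the selection is based on within-group ranks rather than on the x values themselves, equal values of x in different groups cannot be relabelled by accident, and the rows stay in their original order. The replaced levels (a, d, g and i here) remain as unused levels of crit2; `droplevels(d)` removes them if that matters.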