Denis Chabot
2005-Jan-19 13:56 UTC
[R] recoding large number of categories (select in SAS)
Hi,

I have data on stomach contents. Possible prey species number in the hundreds, so a list of prey codes has been in use in many labs doing this kind of work.

When it comes time to analyse these data, one often wants to regroup prey into broader categories, especially for rare prey.

In SAS you can nest a large number of "if-else" statements, or do this more cleanly with "select", like this:

    select;
      when (149 <= prey <= 150) preyGr = 150;
      when (186 <= prey <= 187) preyGr = 187;
      when (prey = 438) preyGr = 438;
      when (prey = 430) preyGr = 430;
      when (prey = 436) preyGr = 436;
      when (prey = 431) preyGr = 431;
      when (prey = 451) preyGr = 451;
      when (prey = 461) preyGr = 461;
      when (prey = 478) preyGr = 478;
      when (prey = 572) preyGr = 572;
      when (692 <= prey <= 695) preyGr = 692;
      when (808 <= prey <= 826, 830 <= prey <= 832) preyGr = 808;
      when (997 <= prey <= 998, 792 <= prey <= 796) preyGr = 792;
      when (882 <= prey <= 909) preyGr = 882;
      when (prey in (999, 125, 994)) preyGr = 9994;
      otherwise preyGr = 1;
    end; *select;

The number of transformations is usually much larger than in this short example.

What is the best way of doing this in R?

Sincerely,

Denis Chabot
Philippe Grosjean
2005-Jan-19 14:14 UTC
[R] recoding large number of categories (select in SAS)
Does ?cut answer your question?

Best,

Philippe Grosjean

Prof. Philippe Grosjean
Numerical Ecology of Aquatic Systems
Mons-Hainaut University, Belgium
web: http://www.umh.ac.be/~econum
     http://www.sciviews.org
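For readers following the ?cut suggestion, here is a minimal sketch of how cut() might be applied to this kind of recoding. It assumes integer prey codes and covers only three of the range rules from the original post; the values in `prey` and the names `breaks` and `group_value` are made up for illustration. Because cut() needs contiguous intervals, the gaps between groups become intervals of their own and are mapped to the default group 1.

```r
# Hypothetical prey codes, for illustration only
prey <- c(148, 149, 150, 186, 187, 500, 692, 695)

# Right-closed intervals (lo, hi]; with integer codes, (148, 150] means 149..150.
# Only three range rules from the original post are covered in this sketch.
breaks <- c(-Inf, 148, 150, 185, 187, 691, 695, Inf)

# One group value per interval; the gaps get the default group 1
group_value <- c(1, 150, 1, 187, 1, 692, 1)

# labels = FALSE makes cut() return the interval number, which is then
# used to index the vector of group values
preyGr <- group_value[cut(prey, breaks = breaks, labels = FALSE)]

data.frame(prey, preyGr)
```

Single-code rules and non-adjacent ranges that share a group (such as 808-826 and 830-832) can be handled the same way, but the break vector grows quickly, so a lookup table may be easier to maintain when there are many rules.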
james.holtman@convergys.com
2005-Jan-19 14:30 UTC
[R] recoding large number of categories (select in SAS)
Here is a way of doing it by setting up a matrix of values to test against. Easier than writing all the 'select' statements.

    # translation matrix: first column is min, second is max,
    # and third is the value to be returned
    x.trans <- matrix(c(
        149, 150, 150,
        186, 187, 187,
        438, 438, 438,
        430, 430, 430,
        808, 826, 808,
        830, 832, 808,
        997, 998, 792,
        792, 796, 792), ncol = 3, byrow = TRUE)
    colnames(x.trans) <- c('min', 'max', 'value')

    x.default <- 9999  # default/no-match value

    x.test <- c(150, 149, 148, 438, 997, 791, 795, 810, 820, 834)  # test data

    # test each value and, if it lies between min and max (inclusive),
    # return the third column
    newValues <- sapply(x.test, function(x) {
        .value <- x.trans[(x >= x.trans[, 'min']) & (x <= x.trans[, 'max']), 'value']
        if (length(.value) == 0) .value <- x.default  # on no match, take default
        .value[1]  # return first value if multiple matches
    })
    newValues
    # [1]  150  150 9999  438  792 9999  792  808  808 9999

__________________________________________________________
James Holtman        "What is the problem you are trying to solve?"
Executive Technical Consultant -- Office of Technology, Convergys
james.holtman at convergys.com
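The translation-matrix idea above tests one observation at a time. As a hedged variation (not part of the original reply), the sketch below loops over the rules instead, so each comparison is vectorised over the whole data vector; this may help when there are many observations and comparatively few rules. The function name `recode_ranges` and its arguments are hypothetical.

```r
# Same layout as the translation matrix above: min, max, value to return
trans <- matrix(c(
  149, 150, 150,
  186, 187, 187,
  438, 438, 438,
  430, 430, 430,
  808, 826, 808,
  830, 832, 808,
  997, 998, 792,
  792, 796, 792),
  ncol = 3, byrow = TRUE,
  dimnames = list(NULL, c("min", "max", "value")))

# Hypothetical helper: recode x using the inclusive ranges in trans
recode_ranges <- function(x, trans, default) {
  out <- rep(default, length(x))
  # Walk the rules in reverse so that, on overlap, the first rule wins
  for (i in rev(seq_len(nrow(trans)))) {
    hit <- which(x >= trans[i, "min"] & x <= trans[i, "max"])
    out[hit] <- trans[i, "value"]
  }
  out
}

x.test <- c(150, 149, 148, 438, 997, 791, 795, 810, 820, 834)
recoded <- recode_ranges(x.test, trans, default = 9999)
recoded
```

Using which() means missing values in x match no rule and fall through to the default; whether that is the desired treatment of NA is an assumption to revisit for real data.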