thr3ads.net - R help - [R] Creating factors from continuous variables [Aug 2005]

David James

2005-Aug-26 20:59 UTC

[R] Creating factors from continuous variables

What is the quickest way to create many categorical variables  
(factors) from continuous variables?

This is the approach that I have used:

# create sample data
N <- 20
x <- runif(N,0,1)

# setup ranges to define categories
x.a <- (x >= 0.0) & (x < 0.4)
x.b <- (x >= 0.4) & (x < 0.5)
x.c <- (x >= 0.5) & (x < 0.6)
x.d <- (x >= 0.6) & (x < 1.0)

# create factors
i <- runif(N,1,1)
x.new <- (i*1*x.a) + (i*2*x.b) + (i*3*x.c) + (i*4*x.d)
x.factor <- factor(x.new)

I'm looking for a better / simpler / more elegant / more robust (as  
the number of categories increases) way to do this.  I also don't  
like that my factor names can only be numbers in this example.  I  
would prefer a solution to take a form like the following (inspired  
by the "hist" function):

# define breakpoints
x.breaks = c(0, 0.4, 0.5, 0.6, 1.0)
x.factornames = c( "0 - 0.4", "0.4 - 0.5", "0.5 - 0.6", "0.6 - 1.0" )
x.factor = unknown.function( x, x.breaks, x.factornames )

Thanks,
David

P.S. Here's what I have read to try to find the answer to my problem:
* "Introductory Statistics with R"
* "A Brief Guide to R for Beginners in Econometrics"
* "Econometrics in R"

Berton Gunter

2005-Aug-26 21:02 UTC

[R] Creating factors from continuous variables

?cut
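
For readers following along, here is a minimal sketch of how `cut` might be applied to the example in the question. It reuses the breakpoints and labels from the original post. The `set.seed` call and the arguments `right = FALSE` and `include.lowest = TRUE` are assumptions added here so that the intervals match the half-open ranges [a, b) in the question; they are not part of the thread.

```r
# sample data as in the original post (seed added only for reproducibility)
set.seed(1)
N <- 20
x <- runif(N, 0, 1)

# breakpoints and labels taken from the question
x.breaks <- c(0, 0.4, 0.5, 0.6, 1.0)
x.factornames <- c("0 - 0.4", "0.4 - 0.5", "0.5 - 0.6", "0.6 - 1.0")

# right = FALSE makes each interval closed on the left, i.e. [a, b);
# include.lowest = TRUE then also assigns x == 1.0 to the last interval
x.factor <- cut(x, breaks = x.breaks, labels = x.factornames,
                right = FALSE, include.lowest = TRUE)

# count of observations per category
table(x.factor)
```

If `labels` is left out, `cut` generates interval labels from the breakpoints itself, which may be enough when custom names are not needed.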

-- Bert Gunter
Genentech Non-Clinical Statistics
South San Francisco, CA
 
"The business of the statistician is to catalyze the scientific learning
process."  - George E. P. Box
 
 

Prof Brian Ripley

2005-Aug-26 21:09 UTC

[R] Creating factors from continuous variables

?cut

This is in `An Introduction to R', the manual which ships with R and basic 
reading.
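
Since the question asks about creating many factors at once, the following is one possible sketch of applying `cut` to several continuous columns in a single step. The data frame `dat`, its column names, and the result name `dat.factors` are hypothetical and chosen only for illustration; the breakpoints are the ones from the question.

```r
# hypothetical data frame with several continuous columns
set.seed(1)
dat <- data.frame(a = runif(20), b = runif(20), c = runif(20))

# breakpoints from the question
x.breaks <- c(0, 0.4, 0.5, 0.6, 1.0)

# apply cut() to every column; with labels omitted, cut() builds
# interval labels such as "[0,0.4)" from the breakpoints
dat.factors <- as.data.frame(lapply(dat, cut, breaks = x.breaks,
                                    right = FALSE, include.lowest = TRUE))

# each column is now a factor with one level per interval
str(dat.factors)
```

Because every column shares the same breakpoints, all resulting factors have identical levels, which keeps later tabulations comparable.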

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
