Pascal A. Niklaus
2003-Nov-05 08:50 UTC
[R] converting column to factor *within* a data frame
Hi all,

I repeatedly encounter the following problem: After importing a data set into a data frame, I wish to set a column with numeric values to be a factor, but can't figure out how to do this. Also, I do not wish to write as.factor(x) all the time. I can create a new vector with x <- factor(x), but the new vector resides outside the attached data frame.

Pascal

> attach(ngrad)
> is.factor(STNW)
[1] FALSE
> ngrad$STNW<-factor(STNW)    ## doesn't work
> is.factor(STNW)
[1] FALSE
> is.factor(STNW) <- T        ## doesn't work either
Error: couldn't find function "is.factor<-"
Prof Brian Ripley
2003-Nov-05 09:01 UTC
[R] converting column to factor *within* a data frame
On Wed, 5 Nov 2003, Pascal A. Niklaus wrote:

> Hi all,
>
> I repeatedly encounter the following problem: After importing a data set
> into a data frame, I wish to set a column with numeric values to be a
> factor, but can't figure out how to do this. Also, I do not wish to
> write as.factor(x) all the time. I can create a new vector with x <-
> factor(x), but the new vector resides outside the attached data frame.
>
> Pascal
>
> > attach(ngrad)
> > is.factor(STNW)
> [1] FALSE
>
> > ngrad$STNW<-factor(STNW) ## doesn't work
> > is.factor(STNW)
> [1] FALSE

It does work. It changes ngrad, and not the copy you attached.

ngrad$STNW<-factor(ngrad$STNW)
attach(ngrad)

is the correct sequence.

-- 
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
Your problem is with scoping, not the conversion per se:

> > attach(ngrad)
> > is.factor(STNW)
> [1] FALSE

At this moment, STNW is the same as ngrad$STNW.

> > ngrad$STNW<-factor(STNW) ## doesn't work

Yes it does work, try looking at is.factor(ngrad$STNW).

> > is.factor(STNW)
> [1] FALSE

After you assign to ngrad$STNW, it is no longer the same thing as the attached STNW. You would need to detach and re-attach ngrad for this to be so. There's no automatic synchronisation between the attached STNW and ngrad$STNW; changing one will not change the other.

My advice is: never use attach() if you can help it. It's an accident waiting to happen. Get used to typing dataFrame$varname instead of just varname; that way you will always get what you expect. Or use with() instead of attach() in almost every case.

HTH

Simon Fear
Senior Statistician
Syne qua non Ltd
Tel: +44 (0) 1379 644449
Fax: +44 (0) 1379 644445
email: Simon.Fear at synequanon.com
web: http://www.synequanon.com
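The advice above (address the column as dataFrame$varname, or use with() rather than attach()) might look roughly like the sketch below. The contents of ngrad are not shown in this thread, so the data frame here is a made-up stand-in, and the yield column is purely hypothetical.

```r
# Hypothetical stand-in for the poster's data; the real ngrad is not shown.
ngrad <- data.frame(STNW  = c(1, 2, 2, 3),
                    yield = c(4.1, 5.3, 4.8, 6.0))

# Convert the column in place by addressing it through the data frame.
ngrad$STNW <- factor(ngrad$STNW)
is.factor(ngrad$STNW)   # TRUE

# with() evaluates the expression against the current contents of ngrad,
# so no stale attached copy is involved.
with(ngrad, is.factor(STNW))   # TRUE
with(ngrad, tapply(yield, STNW, mean))
```

Because nothing is attached, there is no second copy of STNW on the search path to fall out of step with ngrad$STNW.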
Philippe Glaziou
2003-Nov-05 10:01 UTC
[R] converting column to factor *within* a data frame
Pascal A. Niklaus <Pascal.Niklaus at unibas.ch> wrote:

> I repeatedly encounter the following problem: After importing a data set
> into a data frame, I wish to set a column with numeric values to be a
> factor, but can't figure out how to do this. Also, I do not wish to
> write as.factor(x) all the time. I can create a new vector with x <-
> factor(x), but the new vector resides outside the attached data frame.
>
> Pascal
>
> > attach(ngrad)
> > is.factor(STNW)
> [1] FALSE
>
> > ngrad$STNW<-factor(STNW) ## doesn't work
> > is.factor(STNW)
> [1] FALSE

That command checked the attached dataset, which has not been modified by the previous command. You seem to be confused by what attach() does. Read ?attach.

> data <- data.frame(a=1:3, b=4:6)
> data
  a b
1 1 4
2 2 5
3 3 6
> attach(data)
> is.factor(a)
[1] FALSE
> data$a <- as.factor(a)
> is.factor(a)
[1] FALSE
> is.factor(data$a)
[1] TRUE
> detach(data)
> attach(data)
> is.factor(a)
[1] TRUE

-- 
Philippe