All,

When I take a subset of a factor, the reduced factor still maintains all the original levels of the factor when, say, forming the key in a plot. The data is correct, but the variable still "remembers" the original levels. See below for reproducible code. Does anyone know how to fix this?

cheers,
dave

fact = as.factor(c(rep("A", 3), rep("B", 3), rep("C", 3)))
new.fact = fact[1:6]
> new.fact
[1] A A A B B B
Levels: A B C   ## should only show A B
Just add the following to your code:

new.fact = fact[1:6, drop=TRUE]
> new.fact
[1] A A A B B B
Levels: A B

> -----Original Message-----
> From: r-help-bounces at stat.math.ethz.ch
> [mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of
> Afshartous, David
> Sent: Tuesday, September 12, 2006 11:23 AM
> To: r-help at stat.math.ethz.ch
> Subject: [R] levels of factor when subsetting the factor
>
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
Try

> new.fact = fact[1:6, drop=TRUE]

--
David Barron
Said Business School
University of Oxford
Park End Street
Oxford OX1 1HP
On 9/12/06, Afshartous, David <afshart at exchange.sba.miami.edu> wrote:
> The data is correct, but the variable still "remembers" the original
> levels. See below for reproducible code. Does anyone know how to fix
> this?

Use the optional argument "drop = TRUE":

> fact[1:6, drop = TRUE]
[1] A A A B B B
Levels: A B
You have at least two choices:

R> factor(fact[1:6])
[1] A A A B B B
Levels: A B
R> fact[1:6, drop=TRUE]
[1] A A A B B B
Levels: A B

HTH,
Andy
factor(new.fact) will do the trick. But that will recode the levels, and that might be something you don't want.

> fact = as.factor(c(rep("A", 3), rep("B", 3), rep("C", 3)))
> new.fact = fact[1:6]
> new.fact
[1] A A A B B B
Levels: A B C
> factor(new.fact)
[1] A A A B B B
Levels: A B

Cheers,
Thierry

----
ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature and Forest
Cel biometrie, methodologie en kwaliteitszorg / Section biometrics, methodology and quality assurance
Gaverstraat 4
9500 Geraardsbergen
Belgium
tel. + 32 54/436 185
Thierry.Onkelinx at inbo.be
www.inbo.be
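One reading of "recode the levels" is that the underlying integer codes get renumbered when unused levels are dropped. The sketch below illustrates that reading; the name `sub` and the choice of rows 4 to 9 are only for illustration and are not from the thread.

```r
# Same factor as in the original post.
fact <- as.factor(c(rep("A", 3), rep("B", 3), rep("C", 3)))

# Take a subset that no longer contains level "A".
sub <- fact[4:9]

# Codes before dropping: "B" is level 2 and "C" is level 3.
codes.before <- as.integer(sub)

# Codes after dropping: "B" becomes level 1 and "C" becomes level 2.
codes.after <- as.integer(factor(sub))

# The labels themselves are unchanged; only the codes differ.
labels.same <- identical(as.character(sub), as.character(factor(sub)))

print(codes.before)
print(codes.after)
print(labels.same)
```

This matters mainly when other code relies on the integer codes, for example when they are used as indices into a vector of colours or plotting symbols.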
check ?"[.factor", you need:

fact[1:6, drop = TRUE]

Best,
Dimitris

----
Dimitris Rizopoulos
Ph.D. Student
Biostatistical Centre
School of Public Health
Catholic University of Leuven

Address: Kapucijnenvoer 35, Leuven, Belgium
Tel: +32/(0)16/336899
Fax: +32/(0)16/337015
Web: http://med.kuleuven.be/biostat/
     http://www.student.kuleuven.be/~m0390867/dimitris.htm
Also, it is probably easier to use gl() than to coerce your data into a factor:

fact <- gl(3, 3, labels = c("A", "B", "C"))

> -----Original Message-----
> From: r-help-bounces at stat.math.ethz.ch
> [mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of Liaw, Andy
> Sent: Tuesday, September 12, 2006 11:32 AM
> To: Afshartous, David; r-help at stat.math.ethz.ch
> Subject: Re: [R] levels of factor when subsetting the factor
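As a quick check of the gl() suggestion, the sketch below compares the gl() result with the as.factor() construction from the original post. The names `fact.gl` and `fact.af` are made up for this comparison.

```r
# Factor built with gl(): 3 levels, each repeated 3 times.
fact.gl <- gl(3, 3, labels = c("A", "B", "C"))

# Factor built as in the original post.
fact.af <- as.factor(c(rep("A", 3), rep("B", 3), rep("C", 3)))

# Both have the same values and the same levels.
same.values <- all(fact.gl == fact.af)
same.levels <- identical(levels(fact.gl), levels(fact.af))

# Subsetting a gl() factor keeps unused levels too, so drop = TRUE
# (or factor()) is still needed afterwards.
levels.kept <- levels(fact.gl[1:6])
levels.dropped <- levels(fact.gl[1:6, drop = TRUE])

print(same.values)
print(same.levels)
print(levels.kept)
print(levels.dropped)
```

So gl() only changes how the factor is built; it does not change the subsetting behaviour discussed in this thread.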
I think you want 'fact[1:6, drop = TRUE]'

-roger

--
Roger D. Peng | http://www.biostat.jhsph.edu/~rpeng/
"Afshartous, David" <afshart at exchange.sba.miami.edu> writes:> > All, > > When I take a subset of a factor the reduced factor still maintains all > the original levels of the factor when say forming the key in a plot. > The data is correct, but the variable still "remembers" the original > levels. See below for reproducible code. Does anyone know how to fix > this? > cheers, > dave > > fact = as.factor(c(rep("A", 3),rep("B", 3), rep("C", 3))) > new.fact = fact[1:6] > > new.fact > [1] A A A B B B > Levels: A B C ## should only show A BJust use> factor(new.fact)[1] A A A B B B Levels: A B or> fact[1:6, drop=T][1] A A A B B B Levels: A B And, no, it is not a bug. The fact that a subsample happens to consist only of males does not turn gender into a one-level factor... (Apart from the philosophy, it makes a real difference in tabulation.) -- O__ ---- Peter Dalgaard ?ster Farimagsgade 5, Entr.B c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K (*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918 ~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk) FAX: (+45) 35327907
Thanks to all for the quick replies!

If the factor is part of a dataframe, I can apply the subsetting to the entire dataframe, then use drop = TRUE on the factor separately, and then put it back into the new dataframe (code below). Is there a way to do this in a single step?

dat <- data.frame(fact = as.factor(c(rep("A", 3), rep("B", 3), rep("C", 3))), Y = rnorm(9))
dat.new = dat[1:6, ]
dat.new$fact = dat$fact[1:6, drop = TRUE]
Afshartous, David said the following on 9/12/2006 11:00 AM:
> is there a way to do this in a single step?

Yes. I do this periodically:

dat.new <- dat[1:6, ]
dat.new[] <- lapply(dat.new, function(x) if(is.factor(x)) factor(x) else x)

HTH,

--sundar
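For readers on a current R: base R later gained droplevels(), which was not available when this thread was written and which does the same thing for every factor column in one call. A minimal sketch, reusing the `dat` example from the thread, with set.seed() added only so that the run is repeatable:

```r
set.seed(1)  # only for repeatability of the rnorm() column

# Same data frame as in the follow-up question.
dat <- data.frame(fact = as.factor(c(rep("A", 3), rep("B", 3), rep("C", 3))),
                  Y = rnorm(9))

# Subset the rows and drop unused levels from all factor columns at once.
dat.new <- droplevels(dat[1:6, ])

# For comparison, the lapply() approach from the reply above.
dat.alt <- dat[1:6, ]
dat.alt[] <- lapply(dat.alt, function(x) if (is.factor(x)) factor(x) else x)

print(levels(dat.new$fact))
print(levels(dat.alt$fact))
```

Both approaches leave non-factor columns such as Y untouched.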