All,
When I take a subset of a factor, the reduced factor still keeps all
the original levels of the factor, for example when forming the key in a plot.
The data is correct, but the variable still "remembers" the original
levels. See below for reproducible code. Does anyone know how to fix
this?
cheers,
dave
fact = as.factor(c(rep("A", 3), rep("B", 3), rep("C", 3)))
new.fact = fact[1:6]
> new.fact
[1] A A A B B B
Levels: A B C ## should only show A B
Just add the following to your code:

new.fact = fact[1:6, drop=TRUE]
> new.fact
[1] A A A B B B
Levels: A B

> -----Original Message-----
> From: r-help-bounces at stat.math.ethz.ch
> [mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of
> Afshartous, David
> Sent: Tuesday, September 12, 2006 11:23 AM
> To: r-help at stat.math.ethz.ch
> Subject: [R] levels of factor when subsetting the factor
>
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
Try

> new.fact = fact[1:6, drop=TRUE]

--
David Barron
Said Business School
University of Oxford
Park End Street
Oxford OX1 1HP
On 9/12/06, Afshartous, David <afshart at exchange.sba.miami.edu> wrote:
> The data is correct, but the variable still "remembers" the original
> levels. See below for reproducible code. Does anyone know how to fix
> this?

Use the optional argument "drop = TRUE":

> fact[1:6, drop = TRUE]
[1] A A A B B B
Levels: A B
You have at least two choices:

R> factor(fact[1:6])
[1] A A A B B B
Levels: A B
R> fact[1:6, drop=TRUE]
[1] A A A B B B
Levels: A B

HTH,
Andy
factor(new.fact) will do the trick. But that will recode the levels, and that might be something you don't want.

> fact = as.factor(c(rep("A", 3), rep("B", 3), rep("C", 3)))
> new.fact = fact[1:6]
> new.fact
[1] A A A B B B
Levels: A B C
> factor(new.fact)
[1] A A A B B B
Levels: A B

Cheers,
Thierry

----------------------------------------------------------------------------
ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature and Forest
Cel biometrie, methodologie en kwaliteitszorg / Section biometrics, methodology and quality assurance
Gaverstraat 4
9500 Geraardsbergen
Belgium
tel. + 32 54/436 185
Thierry.Onkelinx at inbo.be
www.inbo.be
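A small sketch of what "recode the levels" can mean in practice: once unused levels are dropped, the integer codes stored behind the factor may change. The subset fact[7:9] and the names sub.fact, sub.dropped, codes.before and codes.after are only illustrative and do not appear in the thread.

```r
# Same construction as in the original post
fact <- as.factor(c(rep("A", 3), rep("B", 3), rep("C", 3)))

# Keep only the "C" observations; all three levels are retained
sub.fact <- fact[7:9]
codes.before <- as.integer(sub.fact)    # codes index into levels A, B, C

# Re-create the factor from the values that are actually present
sub.dropped <- factor(sub.fact)
codes.after <- as.integer(sub.dropped)  # codes now index into level C only

print(levels(sub.fact))
print(codes.before)
print(levels(sub.dropped))
print(codes.after)
```

The labels are unchanged, but any code that relies on the underlying integers (for instance as.integer(fact) used as an index) would see different values after the drop.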
check ?"[.factor", you need:
fact[1:6, drop = TRUE]
Best,
Dimitris
----
Dimitris Rizopoulos
Ph.D. Student
Biostatistical Centre
School of Public Health
Catholic University of Leuven
Address: Kapucijnenvoer 35, Leuven, Belgium
Tel: +32/(0)16/336899
Fax: +32/(0)16/337015
Web: http://med.kuleuven.be/biostat/
http://www.student.kuleuven.be/~m0390867/dimitris.htm
Also, it is probably easier to use gl() than to coerce your data into a
factor:
fact <- gl(3, 3, labels = c("A", "B", "C"))
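As a quick check of the gl() suggestion, the sketch below builds the factor both ways and compares them. The names fact.coerced and fact.gl are only for illustration; gl(n, k, labels = ...) generates n levels, each repeated k times.

```r
# Factor built by coercing a character vector, as in the original post
fact.coerced <- as.factor(c(rep("A", 3), rep("B", 3), rep("C", 3)))

# Same factor generated directly: 3 levels, each repeated 3 times
fact.gl <- gl(3, 3, labels = c("A", "B", "C"))

print(fact.gl)
# Compare the values and the level sets of the two constructions
print(all(as.character(fact.gl) == as.character(fact.coerced)))
print(identical(levels(fact.gl), levels(fact.coerced)))
```

Subsetting behaves the same either way, so fact.gl[1:6, drop = TRUE] should also drop the unused level.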
> -----Original Message-----
> From: r-help-bounces at stat.math.ethz.ch
> [mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of Liaw, Andy
> Sent: Tuesday, September 12, 2006 11:32 AM
> To: Afshartous, David; r-help at stat.math.ethz.ch
> Subject: Re: [R] levels of factor when subsetting the factor
I think you want 'fact[1:6, drop = TRUE]'

-roger

--
Roger D. Peng | http://www.biostat.jhsph.edu/~rpeng/

"Afshartous, David" <afshart at exchange.sba.miami.edu> writes:

> new.fact = fact[1:6]
> > new.fact
> [1] A A A B B B
> Levels: A B C ## should only show A B

Just use

> factor(new.fact)
[1] A A A B B B
Levels: A B

or

> fact[1:6, drop=TRUE]
[1] A A A B B B
Levels: A B

And, no, it is not a bug. The fact that a subsample happens to consist only of
males does not turn gender into a one-level factor... (Apart from the
philosophy, it makes a real difference in tabulation.)

--
   O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark          Ph:  (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)                  FAX: (+45) 35327907
thanks to all for the quick replies!

if the factor is part of a data frame, I can apply the subsetting
to the entire data frame, then apply drop=TRUE to the factor
separately and put it back into the new data frame (code below).
is there a way to do this in a single step?

dat <- data.frame(fact = as.factor(c(rep("A", 3), rep("B", 3), rep("C", 3))),
                  Y = rnorm(9))
dat.new = dat[1:6, ]
dat.new$fact = dat$fact[1:6, drop = TRUE]
-----Original Message-----
From: pd at pubhealth.ku.dk [mailto:pd at pubhealth.ku.dk] On Behalf Of Peter Dalgaard
Sent: Tuesday, September 12, 2006 11:45 AM
To: Afshartous, David
Cc: r-help at stat.math.ethz.ch
Subject: Re: [R] levels of factor when subsetting the factor
Yes. I do this periodically:
dat.new <- dat[1:6, ]
dat.new[] <- lapply(dat.new, function(x)
if(is.factor(x)) factor(x) else x)
HTH,
--sundar
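A self-contained sketch of the lapply() approach above, run on a data frame like the one in the question. The set.seed() call and the name dat.alt are only for illustration. droplevels() is not mentioned in this thread; it is a base R function added in later R versions, and it is assumed here to be available.

```r
set.seed(1)  # only so the illustration is repeatable
dat <- data.frame(fact = gl(3, 3, labels = c("A", "B", "C")), Y = rnorm(9))

# Subset the rows, then rebuild every factor column; other columns pass through
dat.new <- dat[1:6, ]
dat.new[] <- lapply(dat.new, function(x) if (is.factor(x)) factor(x) else x)

# Assumption: droplevels() exists in the R version in use; applied to a
# data frame it drops unused levels from every factor column
dat.alt <- droplevels(dat[1:6, ])

print(levels(dat.new$fact))
print(identical(levels(dat.new$fact), levels(dat.alt$fact)))
```

Assigning into dat.new[] rather than dat.new keeps the result a data frame instead of a plain list.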
Afshartous, David said the following on 9/12/2006 11:00 AM:
> if the factor is part of a dataframe, I can apply the subsetting
> to the entire dataframe, and then use drop=True to the factor
> separately and then put it back into the new dataframe (code below).
> is there a way to do this in a single step?