I'm sure this is simple enough, but an R site search on my subject
terms did not suggest a solution. I have a numeric vector with many
values, from which I wish to create a factor having only a few levels.
Here is a toy example.

> x <- 1:10
> x <- factor(x,levels=1:10,labels=c("A","A","A","B","B","B","C","C","C","C"))
> x
 [1] A A A B B B C C C C
Levels: A A A B B B C C C C
> summary(x)
A A A B B B C C C C
3 0 0 3 0 0 4 0 0 0

So, there are clearly still 10 underlying levels. The results I would
like to see from printing the value and summary(x) are:

> x
 [1] A A A B B B C C C C
Levels: A B C
> summary(x)
A B C
3 3 4

Hopefully this makes sense.

Thanks,

Kevin

--
Kevin E. Thorpe
Biostatistician/Trialist, Knowledge Translation Program
Assistant Professor, Dalla Lana School of Public Health
University of Toronto
email: kevin.thorpe at utoronto.ca  Tel: 416.864.5776  Fax: 416.864.3016

Hi Kevin,

Here are two suggestions:

# Combination of levels() and table()
table(levels(x))
# A B C
# 3 3 4

# Or defining a function
mysummary <- function(x) table(levels(x)) # you can easily improve it :-)
mysummary(x)
# A B C
# 3 3 4

HTH,
Jorge

On Sun, Nov 1, 2009 at 3:51 PM, Kevin E. Thorpe wrote:
> [...]
> > x <- 1:10
> > x <- factor(x,levels=1:10,labels=c("A","A","A","B","B","B","C","C","C","C"))
> [...]
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
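
A note on the suggestion above: table(levels(x)) tabulates the level labels, not the observations, so it reproduces the desired counts in the toy example only because each of the values 1 to 10 occurs exactly once. The sketch below uses a hypothetical vector y (not from the original post) in which the two counts differ; the names y, labs and f are made up for illustration.

```r
# Hypothetical data: the value 1 occurs three times, 2 and 10 once each
y <- c(1, 1, 1, 2, 10)
labs <- c("A","A","A","B","B","B","C","C","C","C")

f <- factor(y, levels = 1:10)
levels(f) <- labs   # assigning repeated labels merges the levels to A, B, C

table(f)      # counts observations per merged level: A = 4, B = 0, C = 1
table(labs)   # counts how many original levels carry each label: A = 3, B = 3, C = 4
```

So for data where values repeat or are missing, table(x) on a factor with properly merged levels is probably what is wanted.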

On Nov 1, 2009, at 3:51 PM, Kevin E. Thorpe wrote:

> [...]
> > x <- 1:10
> > x <- factor(x,levels=1:10,labels=c("A","A","A","B","B","B","C","C","C","C"))

You have thusly created a pathological situation. In 2.10.0 this is
what you might see:

> x <- factor(x,levels=1:10,labels=c("A","A","A","B","B","B","C","C","C","C"))
Warning message:
In `levels<-`(`*tmp*`, value = c("A", "A", "A", "B", "B", "B", "C",  :
  duplicated levels will not be allowed in factors anymore

What you _should_ have done was:

x2 <- factor(c("A","A","A","B","B","B","C","C","C","C"))

The usual approach to getting rid of unused factor levels is just to
apply the function factor() again without additional arguments.

> x <- factor(x)   # the "x" was from your code
Warning message:
In `levels<-`(`*tmp*`, value = c("A", "A", "A", "B", "B", "B", "C",  :
  duplicated levels will not be allowed in factors anymore
# but that will be the last time you will see the warning..
> summary(x)
A B C
3 3 4

--
David.

David Winsemius, MD
Heritage Laboratories
West Hartford, CT
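
To illustrate the point about re-applying factor() to drop unused levels, here is a minimal sketch on a separate, well-formed factor. The names f, g and h are hypothetical; droplevels() is assumed to be available, as it is in recent R releases, and is not part of the reply above.

```r
# A factor with three levels, subset so that level "C" is no longer used
f <- factor(c("A", "B", "C"))[1:2]
levels(f)            # "A" "B" "C"  (the unused level is still there)

# Re-applying factor() without further arguments drops the unused level
g <- factor(f)
levels(g)            # "A" "B"

# droplevels() does the same job and states the intent more directly
h <- droplevels(f)
levels(h)            # "A" "B"
```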

Kevin E. Thorpe wrote:
> [...]
> > x <- 1:10
> > x <- factor(x,levels=1:10,labels=c("A","A","A","B","B","B","C","C","C","C"))
> > x
>  [1] A A A B B B C C C C
> Levels: A A A B B B C C C C
> [...]

It's an anomaly inherited from S-PLUS (or so I have been told).
Actually, with the current R, you should get a warning:

> x <- 1:10
> x <- factor(x,levels=1:10,labels=c("A","A","A","B","B","B","C","C","C","C"))
Warning message:
In `levels<-`(`*tmp*`, value = c("A", "A", "A", "B", "B", "B", "C",  :
  duplicated levels will not be allowed in factors anymore

This works (as documented on the help page for levels!):

> x <- 1:10
> x <- factor(x,levels=1:10)
> levels(x) <- c("A","A","A","B","B","B","C","C","C","C")
> table(x)
x
A B C
3 3 4

--
   O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark      Ph:  (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)              FAX: (+45) 35327907
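
Two variations on the levels() assignment above, offered as a sketch rather than as anything proposed in the thread. The first uses the list form of the replacement function, which names each new level and lists the old levels it absorbs. The second uses cut(), which may be more convenient when the groups are contiguous numeric ranges. The break points c(0, 3, 6, 10) and the name y are assumptions chosen to match the toy example.

```r
# Variation 1: list form of levels<-, mapping old level names to new ones
x <- factor(1:10, levels = 1:10)
levels(x) <- list(A = c("1", "2", "3"),
                  B = c("4", "5", "6"),
                  C = c("7", "8", "9", "10"))
table(x)   # A = 3, B = 3, C = 4

# Variation 2: cut() bins the numeric values directly;
# intervals are (0,3], (3,6], (6,10] by default (right-closed)
y <- cut(1:10, breaks = c(0, 3, 6, 10), labels = c("A", "B", "C"))
table(y)   # A = 3, B = 3, C = 4
```

The list form is easier to read when the grouping is irregular; cut() avoids spelling out every value when there are many.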