thr3ads.net - R help - [R] redundant factor levels after subsetting a dataset [Nov 2009]

Daniel Malter

2009-Nov-12 02:00 UTC

[R] redundant factor levels after subsetting a dataset

#I have a data frame with a numeric and a character variable. 

x=c(1,2,3,2,0,2,-1,-2,-4)
md=c(rep("Miller",3),
rep("Richard",3),rep("Smith",3))
data1=data.frame(x,md)

#I subset this data.frame in a way such that one level of the character
#variable does not appear in the new dataset.

data2=data1[x>0,]
data3=subset(data1,x>0)

#However, when I check the levels of the factor variable in the subset data
#frame, it still shows the levels that are now unused.

unique(data2$md)
unique(data3$md)

#This leads to complications in table and tapply that I want to avoid.

table(data2$md)
tapply(data2$x,data2$md,mean)

table(data3$md)
tapply(data3$x,data3$md,mean)

#Basically, I want to completely remove "Smith" from data frame data2 or
#data3 so that it would not show up in table or tapply operations.

Thanks for any pointers,
Daniel

-----------------------------------------------
"Who has visions, should see a doctor," 
Helmut Schmidt, German Chancellor (1974-1982).
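
A minimal sketch of the behaviour described in the question, for readers who want to reproduce it. It assumes stringsAsFactors = TRUE, which was the data.frame() default when this thread was written. Since R 4.0.0 the default is FALSE, so md would stay a character vector and there would be no levels to drop.

```r
# Assumption: stringsAsFactors = TRUE mimics the pre-4.0.0 default,
# so that md is stored as a factor as in the original post.
x  <- c(1, 2, 3, 2, 0, 2, -1, -2, -4)
md <- c(rep("Miller", 3), rep("Richard", 3), rep("Smith", 3))
data1 <- data.frame(x, md, stringsAsFactors = TRUE)

# Keep only the rows with positive x; no "Smith" row survives.
data2 <- data1[data1$x > 0, ]

# The factor still carries all three levels, including the unused one.
lev <- levels(data2$md)

# table() reports a zero count for the unused level ...
counts <- table(data2$md)

# ... and tapply() returns NA for it.
means <- tapply(data2$x, data2$md, mean)

print(lev)
print(counts)
print(means)
```

Note that unique() only lists the values that occur, but its printed "Levels:" line still shows every level stored in the factor, which is what the question refers to.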

David Winsemius

2009-Nov-12 02:20 UTC

[R] redundant factor levels after subsetting a dataset

On Nov 11, 2009, at 9:00 PM, Daniel Malter wrote:
> #I have a data frame with a numeric and a character variable.
>
> x=c(1,2,3,2,0,2,-1,-2,-4)
> md=c(rep("Miller",3),
> rep("Richard",3),rep("Smith",3))
> data1=data.frame(x,md)
>
> #I subset this data.frame in a way such that one level of the
> #character variable does not appear in the new dataset.
>
> data2=data1[x>0,]
> data3=subset(data1,x>0)
I thought this was asked and answered yesterday ((???)):

 > data2 <- as.data.frame(lapply(data2, function(x) x[,drop=TRUE]))
 > data2
   x      md
1 1  Miller
2 2  Miller
3 3  Miller
4 2 Richard
5 2 Richard
 > data3 <- as.data.frame(lapply(data3, function(x) x[,drop=TRUE]))
 > data3
   x      md
1 1  Miller
2 2  Miller
3 3  Miller
4 2 Richard
5 2 Richard

 > unique(data2$md)
[1] Miller  Richard
Levels: Miller Richard
 > unique(data3$md)
[1] Miller  Richard
Levels: Miller Richard

-- 
David
David Winsemius, MD
Heritage Laboratories
West Hartford, CT
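
The reply above works because indexing a factor with drop=TRUE rebuilds it with only the levels that are in use. Rebuilding the whole data frame with as.data.frame(lapply(...)) also renumbers the rows, which is why the output shows rows 1 to 5. The following is a sketch of a variant that touches only the factor columns and keeps the original row names. The helper name drop_unused is hypothetical, and stringsAsFactors = TRUE is assumed so that md is a factor in current R versions.

```r
# Hypothetical helper: re-create every factor column with factor(),
# which keeps only the levels that actually occur in the data.
drop_unused <- function(df) {
  is_fac <- vapply(df, is.factor, logical(1))
  df[is_fac] <- lapply(df[is_fac], factor)
  df
}

# Assumption: stringsAsFactors = TRUE so that md is a factor.
x  <- c(1, 2, 3, 2, 0, 2, -1, -2, -4)
md <- c(rep("Miller", 3), rep("Richard", 3), rep("Smith", 3))
data1 <- data.frame(x, md, stringsAsFactors = TRUE)

data2 <- data1[data1$x > 0, ]
clean <- drop_unused(data2)

print(levels(clean$md))
print(table(clean$md))
print(tapply(clean$x, clean$md, mean))
```

Non-factor columns such as x are left untouched, so the helper can be applied to a data frame with mixed column types.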

Daniel Malter

2009-Nov-12 05:50 UTC

[R] redundant factor levels after subsetting a dataset

Thanks, works like a charm. I was not aware that it had been answered just
yesterday. The solution previously suggested in this thread did not work for
me.

Daniel
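
For later readers: base R gained droplevels() in version 2.12.0, which was released after this thread, so it was not an option at the time. A sketch of how the same cleanup might look with it, again assuming stringsAsFactors = TRUE so that md is a factor:

```r
# Assumption: stringsAsFactors = TRUE so that md is a factor.
x  <- c(1, 2, 3, 2, 0, 2, -1, -2, -4)
md <- c(rep("Miller", 3), rep("Richard", 3), rep("Smith", 3))
data1 <- data.frame(x, md, stringsAsFactors = TRUE)

# droplevels() on a data frame drops unused levels from all factor columns.
data3 <- droplevels(subset(data1, x > 0))

print(levels(data3$md))
print(table(data3$md))
print(tapply(data3$x, data3$md, mean))
```

droplevels() also accepts a single factor, so droplevels(data3$md) would clean one column at a time.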


-------------------------
cuncta stricte discussurus
-------------------------

