I have a data set with 382 rows and 63 columns. One of the columns is bay, and there are 6 bays. But the number of levels for this factor is 7 when it should be 6, because there is a 'blank' level "". When I subset for the blank level "", I get 0 rows. What in my data could be causing this? Thanks.

> dim(datmtx)
[1] 382  63

> datmtx$bay
  [1] TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB
 [51] TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB
[101] TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB HI TB HI TB TB
[151] TB TB TB TB TB HI TB HI HI HI TB HI HI HI TB HI HI HI HI HI HI HI HI TB TB TB TB CH CH TB CH CH CH CH CH CH CH CH CH CH TB TB CH CH CH CH CH CH CH CH
[201] CH CH CH CH CH CH CH CH CH CH CH CH CH CH CH CH CH CH CH CH TB HI HI HI TB HI HI TB TB TB TB TB TB TB TB TB TB HI TB TB TB TB TB TB TB TB TB TB TB TB
[251] TB HI HI HI CH CH CH CH CH CH CH CH CH CH HI HI CH CH CH CH CH CH CH CH CH CH CH CH TB TB TB TB TB TB TB TB TB TB CH CH AP AP AP AP AP AP HI HI HI CH
[301] CH CH CH AP AP TB TB AP AP AP AP AP AP SA BB BB TB TB TB TB AP HI AP SA AP HI AP AP HI HI TB HI AP SA AP AP AP AP AP AP AP AP SA AP AP SA AP AP AP SA
[351] SA SA AP AP AP CH CH CH CH CH AP BB BB BB BB BB TB CH CH CH CH CH CH CH CH CH CH CH CH CH CH CH
Levels: AP BB CH HI SA TB

> levels(datmtx$bay)
[1] ""   "AP" "BB" "CH" "HI" "SA" "TB"

> nlevels(datmtx$bay)
[1] 7

David Chagaris
Associate Research Scientist
Florida Fish and Wildlife Conservation Commission
Florida Fish and Wildlife Research Institute
100 8th Ave SE
St. Petersburg, FL 33701
(727) 896-8626 ext. 4305
(727) 893-1374 fax

On Oct 8, 2010, at 3:04 PM, Chagaris, Dave wrote:

> I have a data set 382 rows and 63 columns. One of the columns is
> bay, and there are 6 bays. But, the number of levels for this
> factor is 7 when it should be six because there is some 'blank'
> level "". When I subset for the blank level "", I get 0 rows.

How did you do the subset?

> What in my data could be causing this? Thanks.
>
>> levels(datmtx$bay)
> [1] "" "AP" "BB" "CH" "HI" "SA" "TB"

What do you get with:

which(!datmtx$bay %in% c("AP", "BB", "CH", "HI", "SA", "TB"))

-- 
David.

David Winsemius, MD
West Hartford, CT
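
A small sketch of the check suggested above, run on a made-up factor rather than the poster's data (the names `bay`, `expected`, `unexpected_rows` and `level_counts` are illustrative assumptions). It shows two things: `which()` with `%in%` lists any rows whose value lies outside the expected set, and `table()` reports a count of 0 for a level that is declared but never used, which would explain a subset returning 0 rows.

```r
# Hypothetical stand-in for datmtx$bay: the "" level is declared but never used.
expected <- c("AP", "BB", "CH", "HI", "SA", "TB")
bay <- factor(c("TB", "HI", "CH", "AP", "SA", "BB", "TB"),
              levels = c("", expected))

# Rows whose value is not one of the expected bays (none in this made-up data).
unexpected_rows <- which(!bay %in% expected)

# Per-level counts; an unused level appears with a count of 0.
level_counts <- table(bay)

print(unexpected_rows)
print(level_counts)
```

If `which()` returns an empty vector while `nlevels()` is still larger than expected, the extra level is most likely just an unused leftover and not a sign of stray values in the rows.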

> -----Original Message-----
> From: r-help-bounces at r-project.org
> [mailto:r-help-bounces at r-project.org] On Behalf Of Chagaris, Dave
> Sent: Friday, October 08, 2010 12:04 PM
> To: r-help at r-project.org
> Subject: [R] incorrect number of levels
>
> I have a data set 382 rows and 63 columns. One of the
> columns is bay, and there are 6 bays. But, the number of
> levels for this factor is 7 when it should be six because
> there is some 'blank' level "". When I subset for the blank
> level "", I get 0 rows. What in my data could be causing
> this? Thanks.

There are lots of ways to make such a dataset. It could be caused by read.table(sep=",", file) where file contains two adjacent commas and you later removed the offending row:

   > # stringsAsFactors=TRUE is needed in R >= 4.0.0, where the default became FALSE
   > d <- read.table(sep=",", stringsAsFactors=TRUE, textConnection("101,,201\n102,two,202\n103,three,203\n"))
   > d <- d[-1,]
   > levels(d$V2)
   [1] ""      "three" "two"

You can get rid of the unused levels by passing the column through factor()

   > levels(factor(d$V2))
   [1] "three" "two"

and you may as well use the opportunity to give it levels in the order you want

   > str(factor(d$V2, levels=c("two","three")))
    Factor w/ 2 levels "two","three": 1 2
   > d$V2 <- factor(d$V2, levels=c("two","three"))

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com
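
As a complement to the factor() approach above, here is a hedged sketch of two other options in current versions of R: `droplevels()`, which removes unused levels from every factor column of a data frame in one call, and `na.strings`, which turns blank fields into NA on input so that "" never becomes a level. The data and the variable names (`txt`, `d_clean`, `d_na`, and so on) are made up for illustration and are not the poster's data.

```r
# Illustrative data only: a blank field in the second column of the first row.
txt <- "101,,201\n102,two,202\n103,three,203\n"

# stringsAsFactors = TRUE is spelled out because R >= 4.0.0 defaults to FALSE.
d <- read.table(text = txt, sep = ",", stringsAsFactors = TRUE)
d <- d[-1, ]                  # drop the row with the blank; the "" level remains
levels_before <- levels(d$V2)

# droplevels() removes unused levels from all factor columns at once.
d_clean <- droplevels(d)
levels_after <- levels(d_clean$V2)

# Alternative: treat blank fields as NA when reading, so "" is never a level.
d_na <- read.table(text = txt, sep = ",", stringsAsFactors = TRUE,
                   na.strings = c("NA", ""))
levels_na <- levels(d_na$V2)

print(levels_before)
print(levels_after)
print(levels_na)
```

Which option fits depends on whether the blank entries are meaningful: `droplevels()` only tidies levels after rows have been removed, while `na.strings` changes how the blanks are recorded in the first place.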