Mark Hempelmann
2007-Nov-24 16:52 UTC
[R] Indexing and partially replacing 99, 999 in data frames
Dear WizaRds,

unfortunately, I have been unable to replace the '99' and '999' entries in

library(UsingR)
attach(babies)

as definitions for missing values NA, because sometimes the 99 entry is indeed a correct value. Usually, or so I thought, NAs can easily replace a, say, 999 entry via

mymat[mymat==999] <- "yodl"

in a matrix or data frame. Alas, the babies' dataset also includes 99 entries as true values. So, here is what I did:

#to remove all 999:
babies[babies==999] <- NA

, but to remove the 99 in columns nr. 10,12,17 I have come to a complete stop. The corny idea of

babies$ht[babies$ht==99] <- NA
babies$dht[babies$dht==99] <- NA
babies$dwt[babies$dwt==99] <- NA

works, but seems to show that I have not really understood the art of indexing, have I? The archives did not really offer enough insight for me to solve the problem, I am ashamed.

I tried something with
babies[is.element(babies[,c(10,12,17)], 99)] <- NA # beeep, wrong or
babies[babies[,c(10,12,17)]==99] # no way, indeed.

detach(babies)

There must be a more intelligent and elegant solution.

Also, what is the nr. of rows after I remove all NA entries? Easy example:

frog <- matrix(1:42, ncol=3)
frog[sample(42, 7)] <- NA

length(frog[!is.na(frog)])
# ok, but I want to know the nr of rows without NAs
dim(frog[!is.na(frog),]) #no
nrow(!is.na(frog)) # no

Thank you for your help and
Cheers
mark
jim holtman
2007-Nov-24 17:04 UTC
[R] Indexing and partially replacing 99, 999 in data frames
For your last case with 'frog':

sum(complete.cases(frog))

--
Jim Holtman
Cincinnati, OH

What is the problem you are trying to solve?
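The reply above covers only the 'frog' row count, so here is a minimal sketch of the column-restricted replacement as well. It is one common indexing idiom, not necessarily what the list would have recommended. The data frame babies_toy, its values and its 'age' column are made up for illustration; only the column names ht, dht and dwt come from the original post. The attempt babies[babies[,c(10,12,17)]==99] fails because the logical mask has only three columns and so does not line up with the full data frame; selecting the same columns on both sides of the assignment keeps the mask and the target the same shape.

```r
# Toy stand-in for the babies data (hypothetical values; 'age' is an
# invented column in which 99 is a legitimate value).
babies_toy <- data.frame(
  age = c(27, 99, 33, 999),
  ht  = c(62, 99, 64, 65),
  dht = c(99, 70, 71, 99),
  dwt = c(180, 99, 175, 190)
)

# 999 means "missing" in every column, so a whole-frame mask is fine.
babies_toy[babies_toy == 999] <- NA

# 99 means "missing" only in some columns: build the mask on that
# sub-frame and assign back into the same sub-frame.
cols <- c("ht", "dht", "dwt")
babies_toy[cols][babies_toy[cols] == 99] <- NA

# Rows with no NA at all, counted as in the reply above.
n_complete <- sum(complete.cases(babies_toy))

# The same count for the 'frog' matrix from the question; the seed is
# an addition here so the run is repeatable.
set.seed(1)
frog <- matrix(1:42, ncol = 3)
frog[sample(42, 7)] <- NA
n_frog <- sum(complete.cases(frog))
```

Selecting columns by name rather than by position (10, 12, 17 in the original post) is a choice made here so the sketch does not depend on column order. nrow(na.omit(frog)) should give the same count as sum(complete.cases(frog)) if the NA-free rows themselves are also needed.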