Paul Chatfield
2010-Jul-06 11:26 UTC
[R] information reduction-database management question
If you recode your NAs as some arbitrarily large number (5 below, which is larger than any real value in this data), then min() will work through each site. Any 5s left in the result can be replaced by NAs again just as easily. Not elegant, but effective.

site <- c("s1", "s1", "s1", "s2", "s2", "s2")
pref <- c(1, 2, 3, 1, 2, 3)
R1 <- c(NA, NA, 1, NA, NA, NA)
R2 <- c(NA, 0, 1, 1, NA, 1)
R3 <- c(NA, 1, 1, NA, 1, 1)
R4 <- c(0, NA, 0, 1, NA, 0)
R5 <- c(NA, 0, 1, NA, 1, 1)
datum <- data.frame(site, pref, R1, R2, R3, R4, R5)

## For 1 column: recode NA as 5, then take the minimum within each site
datum$R1[is.na(datum$R1)] <- 5
tapply(datum$R1, datum$site, min, na.rm = TRUE)

## Can loop this over all columns (columns 3 to 7 are R1 to R5);
## new has one row per R column and one column per site
new <- matrix(0, 5, 2)
for (i in 3:7) {
  datum[, i][is.na(datum[, i])] <- 5
  new[i - 2, ] <- tapply(datum[, i], datum$site, min, na.rm = TRUE)
}

--
View this message in context: http://r.789695.n4.nabble.com/information-reduction-database-management-question-tp2278863p2279385.html
Sent from the R help mailing list archive at Nabble.com.
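The last step mentioned in the message above, turning any leftover 5s back into NAs, is not shown in the posted code. A minimal sketch of the whole round trip might look like this, assuming the same toy data (without the pref column) and assuming 5 never occurs as a real value; the names sentinel and rcols are introduced here for illustration only.

```r
# Same toy data as in the message above
site <- c("s1", "s1", "s1", "s2", "s2", "s2")
R1 <- c(NA, NA, 1, NA, NA, NA)
R2 <- c(NA, 0, 1, 1, NA, 1)
R3 <- c(NA, 1, 1, NA, 1, 1)
R4 <- c(0, NA, 0, 1, NA, 0)
R5 <- c(NA, 0, 1, NA, 1, 1)
datum <- data.frame(site, R1, R2, R3, R4, R5)

sentinel <- 5                              # assumption: larger than any real value
rcols <- c("R1", "R2", "R3", "R4", "R5")   # columns to reduce

# One row per R column, one column per site
new <- matrix(0, length(rcols), 2, dimnames = list(rcols, c("s1", "s2")))
for (col in rcols) {
  x <- datum[[col]]
  x[is.na(x)] <- sentinel                  # recode NA so min() always has a value
  new[col, ] <- tapply(x, datum$site, min)
}

# A site whose minimum is still the sentinel had no real values at all
new[new == sentinel] <- NA
new
```

Working on a copy of each column (x) leaves datum itself untouched, so the original NAs are never overwritten in the data frame.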
Paul Chatfield
2010-Jul-06 14:55 UTC
[R] information reduction-database management question
I don't think the approach would change much with text. You would have to write a function that picks the 'min' (or whatever that means to you for text), and then it should work OK.

Paul

From: Brad Patrick Schneid [via R] [mailto:ml-node+2279677-1095983982-120784@n4.nabble.com]
Sent: 06 July 2010 15:48
To: Paul Chatfield
Subject: Re: information reduction-database management question

Thanks Paul,

I appreciate your time and this is an interesting approach. Unfortunately, I need it to work for all types of information, including character data (i.e. text). Again, thanks for your suggestion!

Brad

--
View this message in context: http://r.789695.n4.nabble.com/information-reduction-database-management-question-tp2278863p2279688.html
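Paul's suggestion of a function that picks the 'min' for text could be sketched roughly as follows. The data below (T1, T2) is made up for illustration and does not come from the thread, and pick_text is a hypothetical helper; it treats 'min' as the alphabetically first non-missing value, which is only one of several reasonable meanings.

```r
# Hypothetical text data, made up for illustration (not from the thread)
site <- c("s1", "s1", "s1", "s2", "s2", "s2")
T1 <- c(NA, "oak", "elm", NA, NA, NA)
T2 <- c("sand", NA, "clay", NA, "silt", "clay")
txt <- data.frame(site, T1, T2, stringsAsFactors = FALSE)

# One possible meaning of 'min' for text: the alphabetically first
# non-missing value, or NA when every value in the group is missing
pick_text <- function(x) {
  x <- x[!is.na(x)]
  if (length(x) == 0) NA_character_ else min(x)
}

# One row per site, one column per text column
res <- sapply(txt[c("T1", "T2")], function(col) tapply(col, txt$site, pick_text))
res
```

Because pick_text handles the all-missing case itself, no sentinel value is needed here. Swapping in a different rule (for example, the first non-missing value in row order) only means changing the body of pick_text.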