John Dennison
2011-May-12 16:33 UTC
[R] Saving misclassified records into dataframe within a loop
Greetings R world,

I know some version of this question has been asked before, but I need to save the output of a loop into a data frame, to eventually be written to a Postgres database with dbWriteTable. Some background: I have developed classification models to help identify problem accounts. The logic is this: if the model classifies the record as including variable X and it turns out that the record does not have X, then it should be reviewed (i.e. I need the row number/ID saved to a database). Generally, I want to look at the misclassified records. This is a bit of a hack, I know; if anyone has a better idea, please let me know. Here is an example:

library(rpart)

# grow tree
fit <- rpart(Kyphosis ~ Age + Number + Start,
  method="class", data=kyphosis)
# predict
prediction <- predict(fit, kyphosis)

# misclassification index function
predict.function <- function(x){
  for (i in 1:length(kyphosis$Kyphosis)) {
    # the idea is that if the record is "absent" but the prediction is otherwise then show me that record
    if (((kyphosis$Kyphosis[i]=="absent")==(prediction[i,1]==1)) == 0 ){
      # THIS WORKS
      print( row.names(kyphosis[c(i),]))
    }
  }
}

predict.function(x)

Now my issue is that I want to save these IDs to a data.frame so I can later save them to a database. Is this an incorrect approach? Can I save each ID to the Postgres instance as it is found? I have an ignorant fear of lapply, but it seems it may hold the key.

I've tried:

predict.function <- function(x){
  results<-as.data.frame(1)
  for (i in 1:length(kyphosis$Kyphosis)) {
    # the idea is that if the record is "absent" but the prediction is otherwise then show me that record
    if (((kyphosis$Kyphosis[i]=="absent")==(prediction[i,1]==1)) == 0 ){
      # THIS WORKS
      results[i,]<- as.data.frame(row.names(kyphosis[c(i),]))
    }
  }
}

This does not work: the results object does not get saved. Any help would be greatly appreciated.

Thanks,
John Dennison
Phil Spector
2011-May-12 17:50 UTC
[R] Saving misclassified records into dataframe within a loop
John -
   In your example, the misclassified observations (as defined by your predict.function) will be

  kyphosis[kyphosis$Kyphosis == 'absent' & prediction[,1] != 1,]

so you could start from there.

- Phil Spector
  Statistical Computing Facility
  Department of Statistics
  UC Berkeley
  spector at stat.berkeley.edu
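As a rough sketch of how Phil's logical index could be turned into the data frame John asked for: the column names row_id and p_absent, the helper find_flagged, the connection object con, and the table name flagged_records are all hypothetical and do not come from the thread. The sketch also suggests why the loop version appeared to save nothing: that function never returns results, so the object is discarded when the function exits.

```r
library(rpart)  # provides rpart() and the kyphosis data set

# Grow the tree and get class probabilities, as in the original post
fit <- rpart(Kyphosis ~ Age + Number + Start, method = "class", data = kyphosis)
prediction <- predict(fit, kyphosis)  # matrix with columns "absent" and "present"

# Phil's logical index: actual class is "absent" but P(absent) is not exactly 1
flagged <- kyphosis$Kyphosis == "absent" & prediction[, "absent"] != 1

# Collect the row IDs in a data frame (column names are hypothetical)
results <- data.frame(
  row_id = row.names(kyphosis)[flagged],
  p_absent = unname(prediction[flagged, "absent"]),
  stringsAsFactors = FALSE
)

# The same idea wrapped in a function; unlike the loop version,
# the data frame is the last expression, so it is returned to the caller
find_flagged <- function(data, prob) {
  keep <- data$Kyphosis == "absent" & prob[, "absent"] != 1
  data.frame(row_id = row.names(data)[keep], stringsAsFactors = FALSE)
}
results2 <- find_flagged(kyphosis, prediction)

# Writing to Postgres would then be one call, assuming an open DBI
# connection `con` and a table name of your choosing (both hypothetical):
# DBI::dbWriteTable(con, "flagged_records", results, append = TRUE)
```

Building the whole data frame first and writing it once is usually simpler than writing each ID to the database as it is found, although either should work.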