John Dennison
2011-May-12 16:33 UTC
[R] Saving misclassified records into dataframe within a loop
Greetings R world,
I know some version of this question has been asked before, but I need
to save the output of a loop into a data frame so that it can eventually be
written to a Postgres database with dbWriteTable. Some background: I have
developed classification models to help identify problem accounts. The logic
is this: if the model classifies the record as including variable X and it
turns out that the record does not have X, then it should be reviewed (i.e. I
need the row number/ID saved to a database). Generally, I want to look at the
misclassified records. This is a bit of a hack, I know; if anyone has a better
idea, please let me know. Here is an example:
library(rpart)
# grow tree
fit <- rpart(Kyphosis ~ Age + Number + Start,
             method="class", data=kyphosis)
# predict
prediction <- predict(fit, kyphosis)
# misclassification index function
predict.function <- function(x){
  for (i in 1:length(kyphosis$Kyphosis)) {
    # the idea is that if the record is "absent" but the prediction is
    # otherwise, then show me that record
    if (((kyphosis$Kyphosis[i]=="absent")==(prediction[i,1]==1)) == 0 ){
      # THIS WORKS
      print( row.names(kyphosis[c(i),]))
    }
  }
}
predict.function(x)
Now my issue is that I want to save these IDs to a data.frame so I can later
save them to a database. Is this an incorrect approach? Could I instead save
each ID to the Postgres instance as it is found? I have an ignorant fear of
lapply, but it seems it may hold the key.
I've tried:
predict.function <- function(x){
  results<-as.data.frame(1)
  for (i in 1:length(kyphosis$Kyphosis)) {
    # the idea is that if the record is "absent" but the prediction is
    # otherwise, then show me that record
    if (((kyphosis$Kyphosis[i]=="absent")==(prediction[i,1]==1)) == 0 ){
      # THIS WORKS
      results[i,]<- as.data.frame(row.names(kyphosis[c(i),]))
    }
  }
}
This does not work: the results object does not get saved. Any help would be
greatly appreciated.
Thanks
John Dennison
Phil Spector
2011-May-12 17:50 UTC
[R] Saving misclassified records into dataframe within a loop
John -
In your example, the misclassified observations (as defined by
your predict.function) will be
kyphosis[kyphosis$Kyphosis == 'absent' & prediction[,1] != 1,]
so you could start from there.
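As a minimal sketch of where that starting point could lead, the logical index can be used to pull the row IDs straight into a data frame, with no loop at all. The names `miss`, `misclassified`, `id`, `con` and the table name "review_ids" are illustrative assumptions, not from the original post, and the database call is left commented out because it needs an open DBI connection.

```r
library(rpart)

# Fit the tree and get class probabilities, as in the original post
fit <- rpart(Kyphosis ~ Age + Number + Start,
             method = "class", data = kyphosis)
prediction <- predict(fit, kyphosis)

# Logical index: actually "absent", but predicted P(absent) is not exactly 1
miss <- kyphosis$Kyphosis == "absent" & prediction[, 1] != 1

# Collect the row IDs in a data frame ('misclassified' and 'id' are
# illustrative names)
misclassified <- data.frame(id = row.names(kyphosis)[miss],
                            stringsAsFactors = FALSE)

# Writing to Postgres, assuming an open DBI connection 'con' and a
# hypothetical table name "review_ids":
# dbWriteTable(con, "review_ids", misclassified, append = TRUE, row.names = FALSE)
```

Note that this keeps the original test (probability of "absent" not exactly 1). If ordinary misclassification is what is wanted, comparing `predict(fit, kyphosis, type = "class")` with `kyphosis$Kyphosis` may be closer to the intent.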
- Phil Spector
Statistical Computing Facility
Department of Statistics
UC Berkeley
spector at stat.berkeley.edu
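On why the loop attempt in the question leaves nothing behind: `results` is created inside the function and never returned, and the last expression evaluated is the `for` loop, whose value is NULL, so the object disappears when the function exits. The following is only a sketch of one way to keep the loop and still get a data frame back; `find.misclassified` and its arguments are hypothetical names, and the test inside the loop is the same "the two conditions disagree" check as in the original code.

```r
library(rpart)

fit <- rpart(Kyphosis ~ Age + Number + Start,
             method = "class", data = kyphosis)
prediction <- predict(fit, kyphosis)

# 'find.misclassified' is a hypothetical name; the data and the prediction
# matrix are passed in rather than read from the global environment
find.misclassified <- function(data, prediction) {
  ids <- character(0)
  for (i in seq_len(nrow(data))) {
    # same test as the original loop: flag rows where the two conditions disagree
    if ((data$Kyphosis[i] == "absent") != (prediction[i, 1] == 1)) {
      ids <- c(ids, row.names(data)[i])
    }
  }
  # return the collected IDs so the caller can keep them
  data.frame(id = ids, stringsAsFactors = FALSE)
}

results <- find.misclassified(kyphosis, prediction)
```

Growing `ids` with `c()` is fine for a data set of this size; for large tables a vectorized logical index is likely to be both shorter and faster.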