Whew, figured it out through trial and error.
In case anyone else runs into this problem, the issue turned out to be the
data in one of the columns. I knew I didn't have any actual missing values,
but one of the columns is a text field that can contain the literal value
"NA". My guess is that R was interpreting those strings as missing values (NA)
rather than as text, which caused the problems later. Once I replaced "NA"
with another value, the classifier saw the right number of rows, and summary()
now runs fine.
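
For anyone who wants to check for the same thing, here is a minimal sketch in
base R. The toy data and the replacement label "NotAvailable" are made up for
illustration, and I'm using read.csv() in place of my RODBC import (I believe
sqlQuery() takes a similar na.strings argument, but check the RODBC docs). The
model.frame() call is only my assumption about how the formula interface builds
its training set; it does show how rows with NA get dropped by default.

```r
# Hypothetical toy data: the Dinosaur column contains the literal text "NA".
csv_text <- "Site,Dinosaur
A,TRex
B,NA
C,Raptor
B,NA"

# Default import: the string "NA" is converted to a missing value.
q_default <- read.csv(text = csv_text, stringsAsFactors = TRUE)
na_counts <- colSums(is.na(q_default))  # missing values per column
print(na_counts)

# With the default na.action (na.omit), model.frame() silently drops
# every row that has an NA in any column used by the formula.
rows_used <- nrow(model.frame(Site ~ ., data = q_default))
print(c(total = nrow(q_default), used = rows_used))

# Option 1: treat only empty fields as missing, so "NA" stays ordinary text.
q_literal <- read.csv(text = csv_text, na.strings = "",
                      stringsAsFactors = TRUE)

# Option 2: recode after import ("NotAvailable" is an arbitrary label).
q_fixed <- q_default
q_fixed$Dinosaur <- as.character(q_fixed$Dinosaur)
q_fixed$Dinosaur[is.na(q_fixed$Dinosaur)] <- "NotAvailable"
q_fixed$Dinosaur <- factor(q_fixed$Dinosaur)
```

Running colSums(is.na(q)) on the real data frame before training would have
pointed me at the offending column right away.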
> From: d.daschle@hotmail.com
> To: r-help@r-project.org
> Date: Wed, 7 Apr 2010 18:44:34 -0400
> Subject: [R] RWeka - Error when attempting to summary() model
>
>
> I'm a big fan of both Weka and R (quite new at R :) ), and jumped at
> the chance to use them together. Unfortunately, I'm running into what is
> probably a dumb error when trying to view info about my model. A Google search
> turned up 0 hits for the actual error I got (last line), but you all are
> smarter!
>
> My code is below, but basically my data frame (q) is imported via RODBC and
> has 1586 rows (as you can see from nrow() ). q$Site is the column I hope to
> classify by using the JRip classifier. When I view the m object, the model seems
> to have been trained on a lot fewer rows than expected (10 vs 1586?), and the
> summary() command fails with the error I mentioned I haven't seen anyone run
> into. My guess is something is wrong with the specification of the training set,
> but when I add control=Weka_control(F=1) to specify only one fold, the end
> result is the same with the degenerate confusion matrix error. Is there some
> other way I should be forcing it to train on more rows? Is that issue related to
> not being able to generate a confusion matrix?
>
>
>
> > attach(q)
> > nrow(q)
> [1] 1586
> > summary(Site)
>   A   B   C   D   E   F
> 265 190 260 344 329 198
> > m <- JRip(Site~.,data=q)
> > m
> JRIP rules:
> ==========>
>
> (Dinosaur = TRex) => Site=A (3.0/0.0)
> => Site=B (5.0/2.0)
>
> Number of Rules : 2
>
> > summary(m)
> Error in evaluate_Weka_classifier(object, ...) :
>   Cannot set dimnames on degenerate confusion matrix.
>
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.