Apologies for being late to this thread.
On 02/12/10 17:39, Sascha Wolfer wrote:
> I seem to have a problem with the openNLP package, I'm actually stuck
> in the very beginning. Here's what I did:
> > install.packages("openNLP")
> > install.packages("openNLPmodels.de", repos =
> "http://datacube.wu.ac.at/", type = "source")
>
> > library(openNLPmodels.de)
> > library(openNLP)
>
> So I installed the main package as well as the supplementary german
> model. Now, I try to use the "sentDetect" function:
>
> > s <- c("Das hier ist ein Satz. Und hier ist noch einer - sogar mit
> Gedankenstrich. Ist das nicht toll?")
> > sentDetect(s, language = "de", model = "openNLPmodels.de")
>
> I get the following error message which I can't make any sense of:
>
> Fehler in .jnew("opennlp/maxent/io/SuffixSensitiveGISModelReader",
> .jnew("java.io.File", :
> java.io.FileNotFoundException: openNLPmodels.de (No such file or
> directory)
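
Judging by the FileNotFoundException, the model argument appears to expect
the path to a model file rather than a package name. The correct syntax
seems to be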
sentDetect(s, model = system.file("models", "de-sent.bin",
package = "openNLPmodels.de"))
but unfortunately I get
Error in .jcall(.jnew("opennlp/maxent/io/SuffixSensitiveGISModelReader", :
  java.io.UTFDataFormatException: malformed input around byte 48
YMMV, but that shows the syntax of the model= argument. The following
"works", although it applies the English sentence model to the German text:
sentDetect(s, model = system.file("models", "sentdetect",
"EnglishSD.bin.gz", package = "openNLPmodels.en"))
# [1] "Das hier ist ein Satz. "
# [2] "Und hier ist noch einer - sogar mit Gedankenstrich. "
# [3] "Ist das nicht toll?"
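
In case it is useful, here is a minimal sketch of how one might check the
model path before handing it to sentDetect(). It uses only base R.
find_model() is a hypothetical helper of my own, not part of openNLP, and
the path components are simply the ones from the calls above, so adjust
them to whatever your installed model package actually contains:

```r
# Hypothetical helper (not part of openNLP): resolve a file shipped inside
# an installed package and fail early if it cannot be found.
find_model <- function(..., package) {
  path <- system.file(..., package = package)
  # system.file() returns "" when the package or the file does not exist
  if (!nzchar(path)) {
    stop("no such file in package '", package, "': ",
         paste(..., sep = "/"), call. = FALSE)
  }
  path
}

# Path components as used earlier in this message (an assumption about
# the layout of openNLPmodels.de on your machine).
model <- tryCatch(
  find_model("models", "de-sent.bin", package = "openNLPmodels.de"),
  error = function(e) {
    message(conditionMessage(e))
    NULL
  }
)

if (!is.null(model)) {
  print(model)  # pass this on as sentDetect(s, model = model)
}
```

This only catches a wrong or missing path (the FileNotFoundException
case). It does not help if the model file exists but cannot be read, as
with the UTFDataFormatException above.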
Hope this helps you a little.
Allan