Apologies for being late to this thread.
On 02/12/10 17:39, Sascha Wolfer wrote:
I seem to have a problem with the openNLP package; I'm stuck at the very beginning. Here's what I did:
install.packages("openNLP")
install.packages("openNLPmodels.de", repos =
"http://datacube.wu.ac.at/", type = "source")
library(openNLPmodels.de)
library(openNLP)
So I installed the main package as well as the supplementary German model. Now I try to use the "sentDetect" function:
s <- c("Das hier ist ein Satz. Und hier ist noch einer - sogar mit Gedankenstrich. Ist das nicht toll?")
sentDetect(s, language = "de", model = "openNLPmodels.de")
I get the following error message which I can't make any sense of:
Fehler in .jnew("opennlp/maxent/io/SuffixSensitiveGISModelReader",
.jnew("java.io.File", :
java.io.FileNotFoundException: openNLPmodels.de (No such file or
directory)
The correct syntax seems to be
sentDetect(s, model = system.file("models", "de-sent.bin", package = "openNLPmodels.de"))
but unfortunately I get
Error in .jcall(.jnew("opennlp/maxent/io/SuffixSensitiveGISModelReader", :
java.io.UTFDataFormatException: malformed input around byte 48
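YMMV, but you get the idea of the syntax for the model= argument: it expects the path to a model file, not the name of a package. This "works", although it uses the English model from openNLPmodels.en: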
sentDetect(s, model = system.file("models", "sentdetect", "EnglishSD.bin.gz", package = "openNLPmodels.en"))
# [1] "Das hier ist ein Satz. "
# [2] "Und hier ist noch einer - sogar mit Gedankenstrich. "
# [3] "Ist das nicht toll?"
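For what it's worth, here is a minimal sketch of how one might check a model path before handing it to sentDetect. The helper names find_model_file and list_model_files are made up for illustration and are not part of openNLP. The only thing they rely on is base R's system.file(), which returns an empty string when the package or file cannot be found.

```r
# Hypothetical helper (not part of openNLP): resolve a file shipped inside an
# installed package and fail early with a readable message if it is missing.
find_model_file <- function(package, ...) {
  path <- system.file(..., package = package)
  # system.file() returns "" when the package or the file cannot be found
  if (!nzchar(path)) {
    stop("no such file in package '", package, "': ",
         paste(c(...), collapse = "/"), call. = FALSE)
  }
  path
}

# Hypothetical helper: list whatever an installed package ships under a
# subdirectory ("models" is assumed here, as in the calls above).
list_model_files <- function(package, subdir = "models") {
  root <- system.file(subdir, package = package)
  if (!nzchar(root)) return(character(0))
  list.files(root, recursive = TRUE)
}

# Intended use (commented out because it needs openNLP and the model package;
# the file name is the one tried above and may differ on your installation):
# list_model_files("openNLPmodels.de")
# sentDetect(s, model = find_model_file("openNLPmodels.de", "models", "de-sent.bin"))
```

This does not cure the UTFDataFormatException above. It only makes sure a wrong path is reported as such before Java gets involved.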
Hope this helps you a little.
Allan