StringIndexOutOfBoundsException in RWeka

Tue, Jan 12, 2010 9:14 AM

I have narrowed the problem down to this:

NGramTokenizer("-", control = Weka_control(min = 1, max = 4))

The string "-" actually occurs as the fourth segment of the 21,226th sentence.  I find this strange, 
since I am using the default delimiters ' \r\n\t.,;:'"()?!', which do not contain a hyphen.
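
For anyone trying to narrow down a similar failure, the search for the offending segment could be automated roughly as follows. This is only a sketch: `find_failing_inputs` and `fake_tokenizer` are hypothetical names introduced for illustration, and the stand-in tokenizer merely simulates an error on a lone hyphen so that the example runs without RWeka or Java. It assumes that the Java exception reaches R as an ordinary error condition that `tryCatch` can intercept.

```r
# Hypothetical helper (not part of RWeka): apply `tokenizer` to each input
# and record the index, the input and the error message for every failure.
find_failing_inputs <- function(inputs, tokenizer) {
  failures <- list()
  for (i in seq_along(inputs)) {
    msg <- tryCatch(
      {
        tokenizer(inputs[[i]])
        NULL  # no error for this input
      },
      error = function(e) conditionMessage(e)
    )
    if (!is.null(msg)) {
      failures[[length(failures) + 1]] <-
        list(index = i, input = inputs[[i]], message = msg)
    }
  }
  failures
}

# Stand-in tokenizer for demonstration only: it simulates a failure on a
# lone hyphen. It does NOT reproduce RWeka's actual behaviour.
fake_tokenizer <- function(x) {
  if (identical(x, "-")) stop("simulated failure on a lone hyphen")
  strsplit(x, " ", fixed = TRUE)[[1]]
}

segments <- c("a b c", "-", "d e")
failures <- find_failing_inputs(segments, fake_tokenizer)

# With RWeka installed, the real tokenizer could be passed in instead:
# library(RWeka)
# tok <- function(x) NGramTokenizer(x, control = Weka_control(min = 1, max = 4))
# find_failing_inputs(segments, tok)
```

Passing the tokenizer as an argument keeps the search loop independent of RWeka, so it can be checked with a dummy function before running it over the full set of sentences.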

Regards,
Richard

On Tue, 12 Jan 2010 16:50:16 +0100, Richard R. Liu wrote

--
Richard R. Liu
Dittingerstr. 33
CH-4053 Basel
Switzerland

Tel.:  +41 61 331 10 47
Email:  richard.liu at pueo-owl.ch
