Word boundaries and gregexpr in R 2.2.1

1 message · Stefan Th. Gries

Hi

I have a question about matching word boundaries. I suspect it has a very simple answer, but I have found it neither by trial and error nor by searching the help archives for the terms in the subject line. The problem is this: I have a vector of two character strings.

text <- c("This is a first example sentence.", "And this is a second example sentence.")

If I now look for word boundaries with regexpr, this is what I get:
[1] 1 1
attr(,"match.length")
[1] 0 0
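
The call that produced this output is not shown in the post. As a sketch only, assuming the pattern was "\\b" and that perl = TRUE was set (both are guesses, and the variable names below are hypothetical), a call of the following shape gives a first match at position 1 with length 0 for each string:

```r
# Assumption: the pattern "\\b" and perl = TRUE are guesses;
# the original call is not shown in the post.
text <- c("This is a first example sentence.",
          "And this is a second example sentence.")

# regexpr() reports only the first match in each string.
first_boundary <- regexpr("\\b", text, perl = TRUE)

# Start position and length of that first match, one entry per string.
pos <- as.integer(first_boundary)
len <- attr(first_boundary, "match.length")
```

A length of 0 is expected here, since a word boundary is a position between characters rather than a character itself.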

So far, so good. But with gregexpr I get:
Error: cannot allocate vector of size 524288 Kb
In addition: Warning messages:
1: Reached total allocation of 1015Mb: see help(memory.size) 
2: Reached total allocation of 1015Mb: see help(memory.size) 

Why don't I get the positions and match lengths of all word boundaries?
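
One possible way to sidestep the problem, offered as a sketch and not as a fix for gregexpr itself: the zero-length matches seem a likely trigger for the runaway allocation, so this version matches whole words (which are never empty) and derives the boundary positions from where each word starts and ends. The character class "[[:alnum:]]+" is an assumption about what counts as a word, and the names words and boundaries are hypothetical.

```r
text <- c("This is a first example sentence.",
          "And this is a second example sentence.")

# Match whole words, so every match has a length of at least 1.
words <- gregexpr("[[:alnum:]]+", text)

# For each string, a boundary lies at the start of every word and at the
# position immediately after its last character.
boundaries <- lapply(words, function(m) {
  starts <- as.integer(m)
  ends <- starts + attr(m, "match.length")
  sort(unique(c(starts, ends)))
})

# Boundaries of the first string: 1 5 6 8 9 10 11 16 17 24 25 33
print(boundaries[[1]])
```

The returned positions follow the regexpr convention of pointing at the character that comes right after the boundary, so 33 refers to the final full stop.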

I am using R 2.2.1 on a machine running Windows XP:
_
platform i386-pc-mingw32
arch     i386
os       mingw32
system   i386, mingw32
status
major    2
minor    2.1
year     2005
month    12
day      20
svn rev  36812
language R

Thanks a lot,
STG