Help: stemming and stem completion with package tm in R

2 messages · Yanchang Zhao, Felix Andrews

Hi Yanchang,

The problem seems to be that stemCompletion only looks for words that
begin with "mine", and "mining" does not strictly begin with "mine". I
don't think there is any easy way to modify stemCompletion to get
around that.
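
To illustrate the prefix issue, here is a minimal base-R sketch. It does not call tm at all; the `dictionary` and `stem` values are made up for the example, and the anchored `grep` only approximates the kind of prefix lookup described above.

```r
# Hypothetical inputs, not taken from the original thread.
dictionary <- c("mining", "miners", "mine")
stem <- "mine"

# Which dictionary words literally start with the stem?
starts <- startsWith(dictionary, stem)
print(setNames(starts, dictionary))

# The same prefix test written as an anchored regular expression.
candidates <- grep(paste0("^", stem), dictionary, value = TRUE)
print(candidates)
```

"mining" fails the test because its fourth letter is "i", not "e", so a pure prefix lookup can never offer it as a completion of "mine".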

However, you could instead replace each stemmed word with the most
prevalent original word in your document that shares its stem. Then you
would not need to use stemCompletion at all. For example:

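# topfreq: return the most frequent value in x
topfreq <- function(x) rev(names(sort(table(x))))[1]
# within each group defined by b, replace every element of a with
# that group's most frequent value
(d <- ave(a, b, FUN = topfreq))
# [1] "mining" "miners" "mining"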
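
For a self-contained version of the same idea, the sketch below spells out the inputs. It assumes `a` holds the original words and `b` their stems; both vectors here are written by hand for illustration rather than produced by a stemmer, and `lookup` is a hypothetical helper name.

```r
# Assumed inputs: original words and their stems (typed by hand here).
a <- c("mining", "miners", "mining")
b <- c("mine", "miner", "mine")

# Most frequent value in x.
topfreq <- function(x) rev(names(sort(table(x))))[1]

# Replace each word with the most frequent word sharing its stem.
d <- ave(a, b, FUN = topfreq)
print(d)

# Optional: a stem -> word lookup table built from the same grouping.
lookup <- tapply(a, b, topfreq)
print(lookup)
```

The lookup table can be reused on other text that was stemmed the same way, as long as every stem in that text also occurs in `b`.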

Cheers
Felix
On 4 November 2011 12:28, Yanchang Zhao <yanchangzhao at gmail.com> wrote: