Text Mining - Remove punctuation not removing quotes and dashes

1 message · Anindya Sankar Dey

Hi,

I have been doing some text mining. I created the document-term matrix
(DTM) using the following steps.

corpus1<-VCorpus(VectorSource(resume1$Dat1))

corpus1<-tm_map(corpus1,content_transformer(tolower))

dtm<-DocumentTermMatrix(corpus1,
                        control = list(removePunctuation = TRUE,
                                       removeNumbers = TRUE,
                                       removeSparseTerms = TRUE,
                                       stopwords = TRUE))


After running all of this, I am still getting terms like -quotation,
"fun, model", etc.

What can I do about it? I do not need these dashes and extra quotation marks.