Parsing and counting expressions in .txt-files
Also check out this CRAN task view:
https://cran.r-project.org/web/views/NaturalLanguageProcessing.html

Cheers,
Bert

Bert Gunter
"The trouble with having an open mind is that people keep coming along and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip)

On Wed, Apr 20, 2016 at 9:07 AM, Alexander Nikles <24790 at novasbe.pt> wrote:
Dear Community,
I hope I have picked the right category, as I am relatively new
to the R world. I have a fairly challenging problem. I would like R to
read text files (there are several hundred in my folder) one after
another and screen each one for specific terms. If a term is found, the
program should write a 1, otherwise a 0. A second task is to extract a
ten-digit number that follows a particular keyword in the file, so that
I can map the results. Ideally, the program should write its output to
a .txt file.
A brief example:
Keywords: "surpassed", "achieved", "very motivated"
Text1:
"Personnel number: 0123456789
The employee has exceeded the set targets and was also otherwise always
motivated (...) "
For this case, I would like my program to produce the following (in
rows and columns):
Personnel number;surpassed;achieved;very motivated (do not write this header line)
0123456789;1;0;1
For the remaining files, the program should continue analogously in
lines 2, 3, 4 and so on.
Could you give a brief assessment of how to implement this? How should I
best start, and have you perhaps already come across something similar
in R? I am grateful for any suggestions.
Thank you in advance,
Alex
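
One possible starting point in base R, offered as a rough sketch rather than a finished solution. The function names scan_text() and scan_folder(), the label "Personnel number:", and the demo file names are assumptions made for illustration. Matching here is literal and case-insensitive, so the sample text above (which says "exceeded", not "surpassed") would yield 0;0;0; synonyms would have to be added to the keyword list. The demo therefore uses two made-up texts that contain the keywords verbatim.

```r
keywords <- c("surpassed", "achieved", "very motivated")

# One row per text: the ten-digit number after the label, then a 0/1 flag
# for each keyword. `label` is an assumed marker; adjust it to the real files.
scan_text <- function(txt, keywords, label = "Personnel number:") {
  txt <- paste(txt, collapse = " ")          # join lines so phrases may span a line break
  txt <- gsub("[[:space:]]+", " ", txt)      # collapse runs of whitespace
  pattern <- paste0(label, "[[:space:]]*([0-9]{10})")
  m <- regmatches(txt, regexec(pattern, txt))[[1]]
  id <- if (length(m) == 2) m[2] else NA_character_   # NA if no number is found
  # Literal, case-insensitive keyword search.
  flags <- vapply(keywords,
                  function(k) as.integer(grepl(tolower(k), tolower(txt), fixed = TRUE)),
                  integer(1))
  data.frame(id = id, t(flags), check.names = FALSE, stringsAsFactors = FALSE)
}

# Apply scan_text() to every .txt file in `dir` and write one
# semicolon-separated line per file, without a header line.
scan_folder <- function(dir, keywords, out_file) {
  files <- list.files(dir, pattern = "\\.txt$", full.names = TRUE)
  stopifnot(length(files) > 0)
  rows <- lapply(files, function(f) scan_text(readLines(f, warn = FALSE), keywords))
  result <- do.call(rbind, rows)
  write.table(result, out_file, sep = ";", quote = FALSE,
              row.names = FALSE, col.names = FALSE)
  invisible(result)
}

# Demo with two made-up files in a temporary folder.
dir <- file.path(tempdir(), "reports")
dir.create(dir, showWarnings = FALSE)
writeLines(c("Personnel number: 0123456789",
             "The employee has surpassed the set targets and was also otherwise always",
             "very motivated (...)"),
           file.path(dir, "report1.txt"))
writeLines(c("Personnel number: 9876543210",
             "All goals were achieved."),
           file.path(dir, "report2.txt"))

out <- file.path(tempdir(), "result.txt")    # kept outside `dir` on purpose
res <- scan_folder(dir, keywords, out)
print(readLines(out))
```

The ID is kept as a character string so that the leading zero survives. The output file is written outside the scanned folder so that a second run does not pick it up as an input file.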
______________________________________________
R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.