Dear Community,

I hope I have selected the right category, as I am relatively new to the R world, and I come with a relatively challenging problem.

I would like R to read text files sequentially (there are several hundred in my folder) and screen them for specific terms. If a term is found, the program should write a 1; if not, a 0. Another task is to extract a ten-digit number that follows a particular keyword in each file, so that I can map the results. Ideally, the program should create a .txt file.

A brief example:

Keywords: "surpassed", "achieved", "very motivated"

Text1:
"Personnel number: 0123456789

The employee has exceeded the set targets and was also otherwise always motivated (...)"

For this case, I would like the program to produce the following (in rows and columns):

Personnel number;surpassed;achieved;very motivated (do not write)
0123456789;1;0;1

The following files should be handled analogously in lines 2, 3, 4 and so on.

Could you give a brief assessment of how to implement something like this? How do I best get started, and have you perhaps already come across something similar in R? I am grateful for any suggestions.

Thank you in advance,
Alex
I suggest you go through some R tutorials to learn about R's capabilities. Some recommendations can be found here: https://www.rstudio.com/online-learning/#R

To answer your specific query:

?scan    ## because you do not specify the file format
?grep
?regexp  ## to use regular expressions to find text

R may not be the best tool for this task, however, or certain R packages may be better than the basic R tools. Try searching on the rseek.org site to see what might be available if you do not receive suggestions here.

Cheers,
Bert

Bert Gunter

"The trouble with having an open mind is that people keep coming along and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip)

On Wed, Apr 20, 2016 at 9:07 AM, Alexander Nikles <24790 at novasbe.pt> wrote:
> [...]
> ______________________________________________
> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
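As a rough illustration of the ?grep / ?regexp route, here is a minimal base-R sketch of the task described in the question. It is only one possible approach, not a definitive implementation. The function names (extract_id, flag_keywords, scan_folder), the demo file names, and the two sample texts are made up for illustration; only the keywords and the "Personnel number:" label are taken from the question. Matching is plain case-insensitive substring search, so a text that says "exceeded" will not count as a hit for "surpassed".

```r
# Sketch only: function names, file names and sample texts are hypothetical.
keywords <- c("surpassed", "achieved", "very motivated")

# Return the ten-digit number that follows the label, or NA if none is found.
# Note: the label is used inside a regular expression.
extract_id <- function(txt, label = "Personnel number:") {
  pattern <- paste0(label, "[[:space:]]*([0-9]{10})")
  m <- regmatches(txt, regexec(pattern, txt))[[1]]
  if (length(m) == 2) m[2] else NA_character_
}

# Return a named 0/1 vector: 1 if the keyword occurs in the text, else 0.
flag_keywords <- function(txt, keywords) {
  hits <- vapply(tolower(keywords), grepl, logical(1),
                 x = tolower(txt), fixed = TRUE)
  setNames(as.integer(hits), keywords)
}

# Read every matching file in a folder and build one row per file.
scan_folder <- function(dir, keywords, pattern = "\\.txt$") {
  files <- list.files(dir, pattern = pattern, full.names = TRUE)
  ids <- character(length(files))
  flags <- matrix(0L, nrow = length(files), ncol = length(keywords),
                  dimnames = list(NULL, keywords))
  for (i in seq_along(files)) {
    # Collapse the lines so that phrases split across lines still match.
    txt <- paste(readLines(files[i], warn = FALSE), collapse = " ")
    ids[i] <- extract_id(txt)
    flags[i, ] <- flag_keywords(txt, keywords)
  }
  # IDs stay character so that leading zeros are preserved.
  data.frame(personnel_number = ids, flags,
             check.names = FALSE, stringsAsFactors = FALSE)
}

# Demo with two made-up files in a temporary folder.
demo_dir <- file.path(tempdir(), "reports_demo")
dir.create(demo_dir, showWarnings = FALSE)
writeLines(c("Personnel number: 0123456789", "",
             "The employee has surpassed the set targets and was always very motivated."),
           file.path(demo_dir, "text1.txt"))
writeLines(c("Personnel number: 9876543210", "",
             "All targets were achieved."),
           file.path(demo_dir, "text2.txt"))

result <- scan_folder(demo_dir, keywords)

# Write a semicolon-separated text file outside the scanned folder.
out_file <- file.path(tempdir(), "keyword_result.txt")
write.table(result, out_file, sep = ";", quote = FALSE, row.names = FALSE)
```

If the header line should not be written, pass col.names = FALSE to write.table(). For anything beyond literal substring matching (word stems, synonyms, negation), the basic tools above are probably not enough.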
Also check out this CRAN task view: https://cran.r-project.org/web/views/NaturalLanguageProcessing.html

Cheers,
Bert

Bert Gunter