Joseph Sorell
2011-Aug-14 04:34 UTC
[R] Using get() or similar function to access more than one element in a vector
Dear R-users,

I've written a script that produces a frequency table for a group of texts. The table has a total frequency for each word type and individual frequency counts for each of the files. (I have not included the code for creating the column headers.) Below is a sample:

Word    Total   01.txt  02.txt  03.txt  04.txt  05.txt
the     22442   2667    3651    1579    2132    3097
I       18377   3407     454     824     449    3746
and     15521   2377    2174     891    1006    2450
to      13598   1716    1395     905    1021    1983
of      12834   1647    1557     941    1127    1887
it      12440   2160     916     497     493    2449
you     12036   2283     356     293     106    2435

I've encountered two problems when I try to construct and save the file.

The "combined.sorted.freq.list" is a named integer vector in which the integers are the total frequency counts for each word. The names are the words. For each of the individual files I've created frequency lists that are sorted in the order of the combined list. (NAs have been replaced with "0".) These are called "combined." plus the number of the file.

If I were to write the line to save the file manually, it would look like this:

#creates a table with columns for the combined and all of the component lists
combined.table<-paste(names(combined.sorted.freq.list), combined.sorted.freq.list, combined.01, combined.02, combined.03, combined.04, combined.05, combined.06, combined.07, combined.08, combined.09, combined.10, combined.11, combined.12, sep="\t")

However, each time I run the script, there may be a different number of text files. I created a vector holding the names of the individual frequency lists, called "combined.file.list":

#counts number of files originally selected
combined.file.count<-1:length(selected.files)
#creates the names of the combined lists by concatenating "combined" with each file number, separated by a period; the string "combined" is recycled for each number
combined.file.list<-paste("combined", combined.file.count, sep=".")

I then tried to include it as one of the elements to be pasted by using get().

#intended to create a table with columns for the combined and all of the component lists
combined.table<-paste(names(combined.sorted.freq.list), combined.sorted.freq.list, get(combined.file.list[]), sep="\t")

Unfortunately, get() only retrieves the first component list, since it can apparently only access one object.

This results in a table with only the total frequency and the count for the first text:

Word    Total   01.txt
the     22442   2667
I       18377   3407
and     15521   2377
to      13598   1716
of      12834   1647
it      12440   2160
you     12036   2283

If I try to construct the file "piece by piece" as the lists are created, I get an error message that a vector of more than 1.3 Gb cannot be created. Does anyone know how I could use get() or some other method to access all of the objects named in a vector?

Many thanks for any help you can offer!

Joseph
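One possible way to fetch every object named in a character vector is base R's mget(), which returns a list, combined with do.call() to hand that list to paste(). The sketch below is only illustrative: the three small vectors are toy stand-ins built from the sample table, and n.files is a hypothetical replacement for length(selected.files). It also assumes the objects really are named with zero-padded numbers ("combined.01"), as in the manual paste() call; paste("combined", 1:3, sep=".") would produce "combined.1" instead, so the names are built with sprintf() here.

```r
# Toy stand-ins for the objects described in the post (values from the sample table).
combined.sorted.freq.list <- c(the = 22442L, I = 18377L, and = 15521L)
combined.01 <- c(2667L, 3407L, 2377L)
combined.02 <- c(3651L, 454L, 2174L)
combined.03 <- c(1579L, 824L, 891L)

# Hypothetical: in the original script this would be length(selected.files).
n.files <- 3

# Assumption: object names are zero-padded, e.g. "combined.01".
combined.file.list <- sprintf("combined.%02d", seq_len(n.files))

# mget() looks up every name in the vector and returns the objects as a list.
component.lists <- mget(combined.file.list, envir = environment())

# do.call() passes each list element to paste() as a separate argument,
# so the number of component lists no longer has to be known in advance.
combined.table <- do.call(
  paste,
  c(list(names(combined.sorted.freq.list), combined.sorted.freq.list),
    unname(component.lists),
    list(sep = "\t"))
)

print(combined.table)
```

Each element of combined.table is then one tab-separated row, with as many count columns as there are names in combined.file.list.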
Joshua Wiley
2011-Aug-14 05:57 UTC
[R] Using get() or similar function to access more than one element in a vector
Hi Joseph,

Without a reproducible example, you probably will not get the precise code for a solution, but look at ?list.

Rather than doing what you are doing now, put everything into a list, and then you will not need to use get() at all. You will just work with the whole list. It can take a bit to get used to working that way, but it is worth it.

Cheers,

Josh

--
Joshua Wiley
Ph.D. Student, Health Psychology
Programmer Analyst II, ATS Statistical Consulting Group
University of California, Los Angeles
https://joshuawiley.com/
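One way to read the list suggestion is sketched below. It is not code from the thread: file.counts, per.file and out.file are hypothetical names, and the counts are made-up toy data standing in for the per-file frequency vectors. The idea is to keep the per-file counts in a single named list from the start, align them to a common word list, and bind them into a data frame whose width follows the number of files.

```r
# Hypothetical per-file word counts; in a real script these would come from
# tabulating each text file and storing the result in one list.
file.counts <- list(
  "01.txt" = c(the = 5L, and = 2L, of = 1L),
  "02.txt" = c(the = 3L, to = 4L),
  "03.txt" = c(and = 6L, to = 1L, the = 2L)
)

# Every word type that occurs in any file.
words <- unique(unlist(lapply(file.counts, names)))

# One column per file, aligned to the common word list;
# words missing from a file get a count of 0.
per.file <- sapply(file.counts, function(counts) {
  aligned <- counts[words]
  aligned[is.na(aligned)] <- 0L
  unname(aligned)
})

# Total frequency per word, and the row order from most to least frequent.
totals <- rowSums(per.file)
ord <- order(totals, decreasing = TRUE)

combined.table <- data.frame(
  Word = words[ord],
  Total = unname(totals[ord]),
  per.file[ord, , drop = FALSE],
  check.names = FALSE  # keep column names such as "01.txt" unchanged
)
rownames(combined.table) <- NULL

# Hypothetical output path; write.table() produces the tab-separated file.
out.file <- tempfile(fileext = ".txt")
write.table(combined.table, file = out.file, sep = "\t",
            quote = FALSE, row.names = FALSE)

print(combined.table)
```

Because the files live in one list, adding or removing a text only changes the length of file.counts; no object names have to be constructed or looked up with get().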