I need to automate a process in R. Basically I have an R script (I will
call it R1) that needs three separate files to run. These three files are
the results output of one trial in my study.

So from each run in R I obtain the summary results for one trial, in a csv
file, plus 32 graphs for each decision point in the trial.

One subject goes through nine trials. I was thinking about putting all
the files generated by one subject in one big folder, so I will have 27
files (three files times nine trials). This way I won't have to change
working directory multiple times (I wonder if there is a way to have R open
a folder with a certain name as directory, run a script, move the
directory to the next folder, run the script again...)

The trials are specified by the labels: AA AB AM BA BB BM MA MB MM. So for
subject 1, trial 1, I will have three files with the ending "mov1_AA".
For subject 1, trial 2, R should choose the three files with the ending
"mov1_AB", and so on.

At each run, R should save the csv summary output in a folder called
"summary_mov1" and name the files summary_mov1_AA, summary_mov1_AB etc. R
should save the 32 graphs in a different folder, named mov1_graphs, as
graph1_mov1_AA, graph1_mov1_AB and so on (ideally, at this point another R
script (R2) should take these nine csv files and build some graphs out of
them).

Once R has run the R1 script nine times, I would proceed to a new subject.

So basically: use R1 with three mov1 files, get a summary csv file in a
summary folder (plus 32 graphs in a different folder).

Do this nine times. Once I have the nine csv summary files in the same
folder, use R2 to average them and build a graph.

Then do this for each subject (right now I have 7).

I have never done this type of automation so I am a bit lost. Any
suggestions? Any examples you can point me to? Which would be the best
workflow? At which level should I automate, and which things should I rather
do by hand?

Thank you,
Serena DeStefani

Hi Serena,
I think the directory structure you have described is something like this:

mov_study___________________________
    |                               |
   mov1            ...             mov9
   /  \                            /  \
mov1_csv mov1_graphs        mov9_csv mov9_graphs

If so, you can put your R scripts in the mov_study directory and change
directories like this:

for(movdir in paste0("mov",1:9)) {
 setwd(movdir)
 # the scripts live one level up, in mov_study
 source("../R1")
 source("../R2")
 setwd("..")
}

In R1 and R2 add a "movdir" argument and set the target directories for
your output like this:

# signatures only; "..." stands for the arguments R1 and R2 already take
R1<-function(...,movdir)
R2<-function(...,movdir)
path_to_csv<-paste(movdir,"csv",sep="_")
path_to_graph<-paste(movdir,"graphs",sep="_")

and when you write an output file:

# for CSV files
filename<-paste(path_to_csv,csvfilename,sep="/")
# for graph files
filename<-paste(path_to_graph,graphfilename,sep="/")

Obviously I can't test this on your directory structure, but I think it
will do what you want.

Jim

On Mon, Jul 23, 2018 at 6:26 AM, Serena De Stefani wrote:
> (I wonder if there is a way to have R open
> a folder with a certain name as directory, run a script, move the
> directory to the next folder, run the script again...)
>
> ______________________________________________
> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

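The loop Jim describes can also be written without setwd(), by building
full paths with file.path() and picking each trial's three inputs with
list.files(). What follows is only a minimal sketch under stated
assumptions: the function names run_trial and process_subject are made up
for illustration, run_trial is a stand-in for whatever R1 really does, and
the input files are assumed to be csv files whose names end in
mov1_AA.csv, mov1_AB.csv and so on. The demo works on throwaway files in a
temporary directory.

```r
trial_labels <- c("AA", "AB", "AM", "BA", "BB", "BM", "MA", "MB", "MM")

# Stand-in for R1 (hypothetical): takes the three input paths for one trial
# and writes one summary csv. Here the "summary" is just the row counts.
run_trial <- function(input_files, summary_file) {
  stopifnot(length(input_files) == 3)
  n_rows <- vapply(input_files, function(f) nrow(read.csv(f)), integer(1),
                   USE.NAMES = FALSE)
  write.csv(data.frame(file = basename(input_files), n_rows = n_rows),
            summary_file, row.names = FALSE)
}

# Process all nine trials of one subject, e.g. subject = "mov1".
process_subject <- function(root, subject) {
  subject_dir <- file.path(root, subject)
  summary_dir <- file.path(subject_dir, paste0("summary_", subject))
  graph_dir   <- file.path(subject_dir, paste0(subject, "_graphs"))
  dir.create(summary_dir, showWarnings = FALSE)
  dir.create(graph_dir, showWarnings = FALSE)  # graphs would be saved here
  for (label in trial_labels) {
    tag <- paste0(subject, "_", label)
    # the three input files whose names end in e.g. "mov1_AA.csv"
    inputs <- list.files(subject_dir, pattern = paste0(tag, "\\.csv$"),
                         full.names = TRUE)
    run_trial(inputs, file.path(summary_dir, paste0("summary_", tag, ".csv")))
  }
  invisible(summary_dir)
}

# Demo: create 27 small input files for one subject, then process them.
root <- file.path(tempdir(), "mov_study")
dir.create(file.path(root, "mov1"), recursive = TRUE, showWarnings = FALSE)
for (label in trial_labels) {
  for (kind in c("a", "b", "c")) {
    write.csv(data.frame(x = 1:3),
              file.path(root, "mov1", paste0(kind, "_mov1_", label, ".csv")),
              row.names = FALSE)
  }
}
out_dir <- process_subject(root, "mov1")
list.files(out_dir)
```

With real data, the same function could be called once per subject, for
example inside for (subject in paste0("mov", 1:7)), provided each subject
folder follows the same naming pattern.
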
On Sun, 22 Jul 2018, Serena De Stefani wrote:

> I need to automate a process in R. Basically I have an R script (I will
> call it R1) that needs three separate files to run. These three files are
> the results output of one trial in my study.

> The trials are specified by the labels: AA AB AM BA BB BM MA MB MM. So for
> subject 1, trial 1, I will have three files with the ending
> "mov1_AA"
>
> For subject one, trial 2, R should choose the three files with the ending
> "mov1_AB" and so on.

Serena,

   In addition to Jim's advice about your directory structure you should
seriously consider your file naming convention. Just like variable names in
a program, you're almost guaranteed to not remember what each two-character
name means within six months of creating them. Spend a little more time
typing and use descriptive names ... and think of using a .dat extension and
reading the files with read.table("filename.dat").

   You can name your files, for example, input_1.R, input_2.R, and input_3.R
for your run sources. And, for (e.g.,) subject 1, trial 1, name the file
sub1_trial1. This might produce output called sub1_trial1_input1,
sub1_trial1_input2, and sub1_trial1_input3.

   Now when you look at data.frames or output you and everyone else will know
just what each contains.

Have fun,

Rich

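One advantage of a regular, descriptive naming scheme like the one Rich
suggests is that file names can be generated and parsed by code instead of
typed by hand. Here is a small sketch, assuming a scheme of the form
sub<subject>_trial<trial>_input<k>.dat; the helper names make_name and
parse_name are invented for this example.

```r
# Build a file name from its parts (vectorised over all three arguments).
make_name <- function(subject, trial, input) {
  sprintf("sub%d_trial%d_input%d.dat", subject, trial, input)
}

# Recover subject, trial and input number from such file names.
parse_name <- function(filename) {
  pattern <- "^sub([0-9]+)_trial([0-9]+)_input([0-9]+)\\.dat$"
  stopifnot(all(grepl(pattern, filename)))
  data.frame(
    subject = as.integer(sub(pattern, "\\1", filename)),
    trial   = as.integer(sub(pattern, "\\2", filename)),
    input   = as.integer(sub(pattern, "\\3", filename)),
    stringsAsFactors = FALSE
  )
}

files <- make_name(1, 1:2, 3)
info  <- parse_name(files)
print(files)
print(info)
```

The parsed columns can then be used to select the three inputs of a given
trial, or to label the rows of a combined data frame.
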
Hi Serena

I'll add one more "in addition" to this list of suggestions. It may not be
what you were thinking of, but may be far simpler in the long run.

The complexity of your approach comes from having separate data files for
each subject and trial, for which you have to have a convention for naming
files and organizing them into coherently named directories.

The ideal solution would be to write your data into a single file, in which
subjects and trials would just be separate columns. More generally, anything
you can do to change separate files into lines/records in a data frame will
ease your task.

-Michael

On 7/22/18 6:40 PM, Rich Shepard wrote:
> In addition to Jim's advice about your directory structure you should
> seriously consider your file naming convention.
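
Michael's suggestion of one data frame with subject and trial as columns
might look roughly like the sketch below. It assumes summary files named
summary_<subject>_<trial>.csv, as in the original question; the columns
"decision" and "score" are invented purely for illustration, and the demo
files are written to a temporary directory.

```r
# Demo data: four small summary files for two subjects and two trials.
demo_dir <- file.path(tempdir(), "summaries")
dir.create(demo_dir, showWarnings = FALSE)
for (subject in c("mov1", "mov2")) {
  for (trial in c("AA", "AB")) {
    write.csv(data.frame(decision = 1:2, score = c(10, 20)),
              file.path(demo_dir, sprintf("summary_%s_%s.csv", subject, trial)),
              row.names = FALSE)
  }
}

# Read one summary file and tag its rows with subject and trial,
# both taken from the file name.
read_summary <- function(path) {
  parts <- strsplit(sub("\\.csv$", "", basename(path)), "_")[[1]]
  data.frame(subject = parts[2], trial = parts[3], read.csv(path))
}

paths <- list.files(demo_dir, pattern = "^summary_.*\\.csv$",
                    full.names = TRUE)
all_data <- do.call(rbind, lapply(paths, read_summary))

# Averaging per subject (the job described for R2) is then one call.
subject_means <- aggregate(score ~ subject, data = all_data, FUN = mean)
print(subject_means)
```

Once everything is in all_data, averaging or plotting by subject or by
trial no longer depends on how the folders are organized.
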