The Platform I am using R on is RHEL3. I run a bash script that collects data into many CSV files and have been processing them one at a time on my local machine with an excel macro. I would like to use R to take data points from each of the CSV files and create line graphs in PDF format because it will save me ALOT of time. I am able to successfully do this when I call the file name directly...however my script bombs when I try to do multiple files. I would like the created pdf's to have the same filename as the original csv files. I have looked quite a bit and not found much help on "batch processing" an entire directory. My current code is as follows:

list <- dir("/tmp/data")
for(x in list){
d <- read.table(x, sep="\t", header=TRUE)  # read data
pdf("/tmp/graph/x.pdf")                    # file for graph
plot(d$BlockSeqNum, d$MBs,                 # Blocks as x, MB/s as y
     type="l",                             # plot lines, not points
     xlab="Blocks",                        # label x axis
     ylab="MB/s",                          # label y axis
     main=x)                               # add title
dev.off()                                  # close file
q()                                        # quit

ERROR: Error in plot.window(xlim, ylim, log, asp, ...) :
        need finite 'xlim' values
In addition: Warning messages:
1: no non-missing arguments to min; returning Inf
2: no non-missing arguments to max; returning -Inf
3: no non-missing arguments to min; returning Inf
4: no non-missing arguments to max; returning -Inf
Execution halted

Thank you for your wisdom!

--
View this message in context: http://www.nabble.com/Using-R-to-create-pdf%27s-from-each-file-in-a-directory-tf3621434.html#a10112832
Sent from the R help mailing list archive at Nabble.com.
Duncan Murdoch
2007-Apr-21 02:09 UTC
[R] Using R to create pdf's from each file in a directory
On 4/20/2007 9:40 PM, gecko951 wrote:
> The Platform I am using R on is RHEL3. I run a bash script that collects
> data into many CSV files and have been processing them one at a time on my
> local machine with an excel macro. I would like to use R to take data
> points from each of the CSV files and create line graphs in PDF format
> because it will save me ALOT of time. I am able to successfully do this
> when I call the file name directly...however my script bombs when I try to
> do multiple files. I would like the created pdf's to have the same filename
> as the original csv files. I have looked quite a bit and not found much
> help on "batch processing" an entire directory. My current code is as
> follows:
>
> list <- dir("/tmp/data")
> for(x in list){
> d <- read.table(x, sep="\t", header=TRUE)  # read data
> pdf("/tmp/graph/x.pdf")                    # file for graph
> plot(d$BlockSeqNum, d$MBs,                 # Blocks as x, MB/s as y
>      type="l",                             # plot lines, not points
>      xlab="Blocks",                        # label x axis
>      ylab="MB/s",                          # label y axis
>      main=x)                               # add title
> dev.off()                                  # close file
> q()                                        # quit
>
> ERROR: Error in plot.window(xlim, ylim, log, asp, ...) :
>         need finite 'xlim' values
> In addition: Warning messages:
> 1: no non-missing arguments to min; returning Inf
> 2: no non-missing arguments to max; returning -Inf
> 3: no non-missing arguments to min; returning Inf
> 4: no non-missing arguments to max; returning -Inf

What you're doing appears to be a reasonable approach; all you need are some checks that the data is reasonable before calling plot and possibly bombing. I'd suggest adding "print(x); str(d)" statements after you read d; then you'll be able to see which files leave d without plottable data.

One other problem you'll find: you're writing all plots to "x.pdf". I think you really want to construct that name differently for each x. I'd suggest using paste() and basename() (and perhaps some regular expressions and gsub()) to construct the output filename.
Duncan Murdoch
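
The two suggestions above (check the data before plotting, and build a distinct output name for each input) could be sketched roughly as follows. This is only an illustration: pdf_name_for() and is_plottable() are hypothetical helper names that do not appear in the thread, and the column names are taken from the original script.

```r
# Hypothetical helper: map an input CSV path to a PDF path in out_dir.
pdf_name_for <- function(csv_path, out_dir) {
  # basename() drops the directory; sub() strips a trailing ".csv"
  stem <- sub("\\.csv$", "", basename(csv_path))
  paste(out_dir, "/", stem, ".pdf", sep = "")
}

# Hypothetical helper: TRUE only when both columns exist, are numeric,
# and each holds at least one finite value.
is_plottable <- function(d, xcol = "BlockSeqNum", ycol = "MBs") {
  all(c(xcol, ycol) %in% names(d)) &&
    is.numeric(d[[xcol]]) && is.numeric(d[[ycol]]) &&
    any(is.finite(d[[xcol]])) && any(is.finite(d[[ycol]]))
}
```

Inside the loop, something like "if (!is_plottable(d)) { print(x); str(d); next }" would then report the offending file and move on instead of halting the whole run.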
Robert A LaBudde
2007-Apr-21 03:06 UTC
[R] Using R to create pdf's from each file in a directory
At 09:40 PM 4/20/2007, gecko951 wrote:
><snip>
>list <- dir("/tmp/data")
>for(x in list){
>d <- read.table(x, sep="\t", header=TRUE) # read data
>pdf("/tmp/graph/x.pdf") # file for graph
><snip>

I'm a tyro at R, but it's obvious here that the line

pdf("/tmp/graph/x.pdf")

has the intended file name variable 'x' embedded in the literal string enclosed in quotation marks, so every plot goes to a file literally named "x.pdf".

Obviously he needs the "/tmp/graph/" string concatenated with x and then ".pdf".

I would suggest a fix, but I am unable to find in the documentation how to concatenate two strings into a single string.

So I will amplify gecko951's question to include "How do you concatenate two strings in R?". I.e.,

x<-"abc"
y<-"def"

What operator or function returns "abcdef" as a result?

Thanks.

===============================================================
Robert A. LaBudde, PhD, PAS, Dpl. ACAFS  e-mail: ral at lcfltd.com
Least Cost Formulations, Ltd.            URL: http://lcfltd.com/
824 Timberlake Drive                     Tel: 757-467-0954
Virginia Beach, VA 23464-3239            Fax: 757-467-2947

"Vere scire est per causas scire"
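
For the concatenation question above, base R's paste() is the usual answer. A minimal sketch follows; the file stem "run1" is made up purely for illustration.

```r
x <- "abc"
y <- "def"

# paste() joins its arguments; sep = "" suppresses the default single space.
xy <- paste(x, y, sep = "")

# The same call builds a per-file output name of the kind the loop needs.
# "run1" is an invented file stem, not one from the thread.
out <- paste("/tmp/graph/", "run1", ".pdf", sep = "")
```

More recent R releases also provide paste0(), which behaves like paste(..., sep = "").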
Jeffrey Horner
2007-Apr-21 04:10 UTC
[R] Using R to create pdf's from each file in a directory
gecko951 wrote:
> The Platform I am using R on is RHEL3. I run a bash script that collects
> data into many CSV files and have been processing them one at a time on my
> local machine with an excel macro. I would like to use R to take data
> points from each of the CSV files and create line graphs in PDF format
> because it will save me ALOT of time. I am able to successfully do this
> when I call the file name directly...however my script bombs when I try to
> do multiple files. I would like the created pdf's to have the same filename
> as the original csv files. I have looked quite a bit and not found much
> help on "batch processing" an entire directory. My current code is as
> follows:
>
> list <- dir("/tmp/data")
> for(x in list){
> d <- read.table(x, sep="\t", header=TRUE)  # read data
> pdf("/tmp/graph/x.pdf")                    # file for graph
> plot(d$BlockSeqNum, d$MBs,                 # Blocks as x, MB/s as y
>      type="l",                             # plot lines, not points
>      xlab="Blocks",                        # label x axis
>      ylab="MB/s",                          # label y axis
>      main=x)                               # add title
> dev.off()                                  # close file
> q()                                        # quit

The code below will get you closer to what you want, assuming that your files end in .csv, which they should, especially if you'll be creating new files in the same directory with a different extension. You certainly don't want to re-run your R code and call read.table on a pdf.

Another point is that your current working directory for R, returned by getwd(), must already be '/tmp/data'; otherwise read.table wouldn't work, because dir() returns bare file names. A more portable solution is to use a variable to hold the directory name:

workdir <- '/tmp/data'
# match only names ending in ".csv" (the dot is escaped so it is literal)
for (x in dir(workdir,pattern='\\.csv$')){
    d <- read.table(paste(workdir,'/',x,sep=''), sep="\t", header=TRUE)
    # same file name, with .pdf in place of .csv
    pdf(paste(workdir,'/',sub('\\.csv$','.pdf',x),sep=''))
    plot(d$BlockSeqNum, d$MBs,
        type="l",
        xlab="Blocks",
        ylab="MB/s",
        main=x)
    dev.off()
}
q()

Best,

Jeff
---
http://biostat.mc.vanderbilt.edu/JeffreyHorner
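
Pulling the replies in this thread together, a batch run might be sketched as below. Treat it as a sketch under stated assumptions, not a tested fix for the original data: plot_dir() is a hypothetical name, the sep default of "\t" simply mirrors the original script, and the guard reflects one plausible (unconfirmed) cause of the "need finite 'xlim' values" error, namely that a file read with the wrong separator has no BlockSeqNum or MBs column, so d$BlockSeqNum is NULL.

```r
# Hypothetical helper: plot every .csv file in in_dir to a same-named PDF
# in out_dir, returning the paths of the PDFs that were written.
# sep = "\t" mirrors the original script; pass "," for comma-separated files.
plot_dir <- function(in_dir, out_dir, sep = "\t") {
  files <- list.files(in_dir, pattern = "\\.csv$", full.names = TRUE)
  made <- character(0)
  for (f in files) {
    d <- read.table(f, sep = sep, header = TRUE)
    # Skip files whose expected columns are missing or non-numeric
    # instead of letting plot() halt the whole run.
    if (!is.numeric(d$BlockSeqNum) || !is.numeric(d$MBs) || nrow(d) == 0) {
      message("skipping ", f)
      next
    }
    # Same base name as the input, with .pdf in place of .csv
    out <- file.path(out_dir, sub("\\.csv$", ".pdf", basename(f)))
    pdf(out)
    plot(d$BlockSeqNum, d$MBs, type = "l",
         xlab = "Blocks", ylab = "MB/s", main = basename(f))
    dev.off()
    made <- c(made, out)
  }
  made
}
```

A call such as plot_dir("/tmp/data", "/tmp/graph") would match the directories in the original post. Because list.files() is asked for full paths, the result does not depend on the current working directory.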