Hello,

I am working on a project in which new data files arrive as the data collectors gather data; the collectors then put these new files in a folder. I need to read each new data file once it is in the folder. So far I have done this manually: each time, I go to the folder, find the new data files, and use my R program to read them. I am wondering if anyone knows how to perform this job automatically in R.

Thanks,

jlm
You can read the status of every file in a directory and decide whether to process it. One technique is to create a 'flag' file in the directory each time you process information from it. You could schedule an R script that first checks the modification date of the flag file and then processes all the files in the directory that are later than that date. The script would then rewrite the flag file to update its modification date for the next round.

Does this do what you want?

On Sun, Jan 24, 2010 at 3:05 PM, jlfmssm <jlfmssm at gmail.com> wrote:
> I am wondering if anyone knows how to perform this job automatically in R.

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
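The flag-file technique described above might be sketched roughly as follows. This is only an illustration under assumptions: the function name process_new_files(), the flag file name ".last_processed", and the handler argument are all hypothetical and do not come from the thread. The sketch records the scan time before listing the directory, so that files arriving while a run is in progress are picked up on the next run.

```r
# Sketch of the flag-file approach. process_new_files(), flag_name and
# handler are hypothetical names chosen for this example.
process_new_files <- function(data_dir, handler, flag_name = ".last_processed") {
  flag <- file.path(data_dir, flag_name)

  # With no flag file yet, treat every file as new.
  last_run <- if (file.exists(flag)) {
    file.info(flag)$mtime
  } else {
    as.POSIXct(0, origin = "1970-01-01", tz = "UTC")
  }

  # Note the time before scanning; the flag is stamped with this time later.
  scan_time <- Sys.time()

  # list.files() skips dot-files by default, so the flag itself is not listed.
  files <- list.files(data_dir, full.names = TRUE)
  files <- files[file.info(files)$isdir %in% FALSE]
  new_files <- files[file.info(files)$mtime > last_run]

  for (f in new_files) handler(f)

  # Rewrite the flag and set its modification time to the scan time.
  writeLines(format(scan_time), flag)
  Sys.setFileTime(flag, scan_time)

  invisible(new_files)
}
```

A scheduled script could then call something like process_new_files("incoming", function(f) print(read.csv(f))), where the directory name and the handler are placeholders for your own.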
Carlos J. Gil Bellosta
2010-Jan-24 20:13 UTC
[R] Read files in a folder when new data files come
Hello,

Could you tell us something more about your infrastructure? Windows? Linux?

On Unix/Linux you could use cron to run an R process that reads all the files in the given directory, processes them one by one, and archives them in another place. On Windows, no idea.

Alternatively, you could perhaps ask your users to upload the data through some kind of web interface. This interface could then trigger an R process.

Best regards,

Carlos J. Gil Bellosta
http://www.datanalytics.com
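A minimal sketch of the process-then-archive idea is given below, assuming the incoming and archive directories are on the same filesystem (file.rename() may fail across filesystems). The names process_and_archive(), incoming, archive and handler are hypothetical, and the crontab line in the comment uses placeholder paths.

```r
# Hypothetical script body. A crontab entry along the lines of
#   */10 * * * * Rscript /path/to/process_and_archive.R
# (placeholder path) would run a script calling this function every 10 minutes.
process_and_archive <- function(incoming, archive, handler) {
  dir.create(archive, showWarnings = FALSE, recursive = TRUE)

  files <- list.files(incoming, full.names = TRUE)
  files <- files[file.info(files)$isdir %in% FALSE]

  done <- character(0)
  for (f in files) {
    # A file that fails to process is left in place for inspection or retry.
    ok <- tryCatch({
      handler(f)
      TRUE
    }, error = function(e) {
      message("Skipping ", f, ": ", conditionMessage(e))
      FALSE
    })
    # Move the file only after it has been processed successfully.
    if (ok && file.rename(f, file.path(archive, basename(f)))) {
      done <- c(done, basename(f))
    }
  }
  invisible(done)
}
```

Because processed files leave the incoming directory, each run only ever sees files that have not yet been handled, so no record of earlier runs is needed.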
Barry Rowlingson
2010-Jan-24 20:21 UTC
[R] Read files in a folder when new data files come
On Sun, Jan 24, 2010 at 8:05 PM, jlfmssm <jlfmssm at gmail.com> wrote:
> I am wondering if anyone knows how to perform this job automatically in R.

Without resorting to operating-system-specific hackery, the easiest way would be to use 'list.files()' and look for new files every so many minutes or seconds (depending on how urgent it is), or to check file.info() on your directory and test the modification time.

You'd then write that into a .R file and run it in the background using your operating system's background job functionality (as a 'service' in Windows, or as a background process in Unix). Use Sys.sleep(seconds) to wait in your loop. Something like (totally untested):

  # dumpLocation is the path of the directory being watched
  lastChange = file.info(dumpLocation)$mtime
  while(TRUE){
    # a directory's mtime changes when files are added to or removed from it
    currentM = file.info(dumpLocation)$mtime
    if(currentM != lastChange){
      lastChange = currentM
      doSomethingWithStuffIn(dumpLocation)
    }
    # try again in 10 minutes
    Sys.sleep(600)
  }

There are ways for programs to get directory content change events when files appear in directories, but they will probably be very operating system specific.

There's also the problem of your code firing up when a file is only half-uploaded: what do you do then? Does your data format have an 'end of data' marker?

Barry

--
blog: http://geospaced.blogspot.com/
web: http://www.maths.lancs.ac.uk/~rowlings
web: http://www.rowlingson.com/
twitter: http://twitter.com/geospacedman
pics: http://www.flickr.com/photos/spacedman
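For the list.files() variant and the half-uploaded file question, one possible heuristic (not something proposed in the thread, and no substitute for a real 'end of data' marker) is to treat a new file as ready only if its size and modification time stay unchanged over a short wait. The sketch below assumes that heuristic; settled_new_files(), seen and wait are hypothetical names.

```r
# Return files in `dir` that are not in `seen` and whose size and mtime
# did not change during `wait` seconds. This is only a heuristic: a stalled
# upload would also look "settled".
settled_new_files <- function(dir, seen = character(0), wait = 5) {
  candidates <- setdiff(list.files(dir, full.names = TRUE), seen)
  candidates <- candidates[file.info(candidates)$isdir %in% FALSE]

  before <- file.info(candidates)[, c("size", "mtime")]
  Sys.sleep(wait)
  after <- file.info(candidates)[, c("size", "mtime")]

  # Files that vanished in the meantime have NA size and are dropped.
  stable <- !is.na(after$size) &
    before$size == after$size &
    before$mtime == after$mtime
  candidates[stable]
}

# Possible use inside a polling loop (the directory name and
# doSomethingWith() are placeholders):
# seen <- character(0)
# repeat {
#   ready <- settled_new_files("incoming", seen)
#   for (f in ready) doSomethingWith(f)
#   seen <- c(seen, ready)
#   Sys.sleep(600)
# }
```

Files that are still growing are simply left out of the result and re-examined on the next pass, since they are not added to seen.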