Dear R helpers,

This is a question related to the subject of reading large data sets. I have a large db (MySQL), ~80,000 records for each year / 120 variates, from which I extract the relevant info to be treated in R (no problem). Each record in my db represents an experimental unit. Moreover, I will get - in a few days - a set of separate files (one for each record, so ~80,000 files per year), named key.txt, in which other info (measurements of pressure and velocity at each second) is stored; typically each of these separate files will hold something between 200 and 1500 records. I'm supposed to do statistical analysis on this dataset, on the yearly information. I have NO influence on the format in which I'm getting all this information.

Well, as you can suspect, my problem is thus:
- batch processing of these individual files: reading each file (automatically!), probably pattern analysis, then estimation of the relevant physical entity; for this I already need to make the connection with the info from the database (geometrical parameters, nb in cycle, ...)
- input of these estimated physical values and relevant info into the statistical analysis of the process

What is the best way to go about such a task in R?

Many thanks
Anne

----------------------------------------------------
Anne Piotet
Tel: +41 79 359 83 32 (mobile)
Email: anne.piotet@m-td.com
---------------------------------------------------
M-TD Modelling and Technology Development
PSE-C
CH-1015 Lausanne
Switzerland
Tel: +41 21 693 83 98
Fax: +41 21 646 41 33
--------------------------------------------------
> I will get - in a few days - a set of separate files (one for each
> record, ~80,000 files per year) named key.txt in which other info
> (measurements of pressure and velocity at each second) is stored;
> typically each of these separate files will hold something between
> 200 and 1500 records. I'm supposed to do statistical analysis on
> this dataset, on the yearly information.
> Well, as you can suspect my problem is thus:
> - batch processing of these individual files: reading each file
> (automatically!),

I suggest you put all the files in a single directory, then use list.files() or dir() to make a list of these files:

myfiles <- list.files("c:\\mydir")  # on Windows

Then you loop over these files with read.table(). You can create a column named "file" holding the name of the file each set of rows came from. All of this can be done with:

data <- NULL
for (i in dir("c:\\mydir"))  # same directory as above
    data <- rbind(data,
                  cbind(file = i,
                        read.table(paste("c:\\mydir\\", i, sep = ""))))

Hope that helps

Mayeul KAUFFMANN
Univ. Pierre Mendes France
Grenoble - France
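P.S. For the link back to the database: if each file name encodes the key of the matching db record, you can pull the per-unit variables out of MySQL and merge() them onto the assembled measurements. Here is a minimal sketch, assuming the RMySQL package, a db table called "experiments" with a key column "unit_id", and files named <unit_id>.txt - all of these names are made up, so adapt them to your schema:

library(RMySQL)

## read every file, tagging each row with the key taken from
## the file name (e.g. "12345.txt" gives unit_id "12345")
files <- list.files("c:\\mydir", full.names = TRUE)
pieces <- lapply(files, function(f) {
    d <- read.table(f)
    d$unit_id <- sub("\\.txt$", "", basename(f))
    d
})
measurements <- do.call(rbind, pieces)  # one rbind instead of ~80,000

## fetch the per-unit info (geometrical parameters etc.) from the db
## (connection details are placeholders)
con <- dbConnect(MySQL(), dbname = "mydb", user = "me", password = "secret")
units <- dbGetQuery(con, "SELECT unit_id, param1, param2 FROM experiments")
dbDisconnect(con)

## attach the db variables to every measurement row
full <- merge(measurements, units, by = "unit_id")

With ~80,000 files per year, building the data frame with lapply() and a single do.call(rbind, ...) should also be noticeably faster than growing it with rbind() inside a loop, since the loop version copies the whole data frame at every step.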