I have a question about stacking datasets.

I have 40 Stata datasets that have exactly the same number of variables, with the same names (~420k rows, 8 columns).

The datasets are relatively large, ~15 MB each.

If they were text files, a Linux "cat file1 file2 >> combo" sort of strategy would work.

I've considered using a merge command, but I don't want any records merged, only appended. Also, I don't want any variables to be renamed. Given their unique nature, using a simple merge(read.dta('file1'),read.dta('file2')) would give me (I think) what I am looking for, but it seems incredibly inefficient.

There are a number of ways I could approach this, all of which involve a non-R solution.

Does somebody have an R solution in mind?

Thanks in advance,

Debra Taylor
"Debra Taylor" <debrat at bestweb.net> writes:

> I have a question about stacking datasets.
>
> I have 40 stata datasets that have exactly the same number of variables,
> with the same names (~420k rows, 8 columns).
>
> The datasets are relatively large ~ 15 megs.
>
> If they were text files a linux "cat file1 file2 >> combo" sort of
> strategy would work.
>
> I've considered using a merge command, but I don't want any records
> merged, only appended. Also, I don't want any variables to be renamed.
> Given their unique nature, using a simple
> merge(read.dta('file1'),read.dta('file2')) would give me (I think) what
> I am looking for but seems incredibly inefficient.
>
> There are a number of ways I could approach this all of which involve a
> non-R solution.
>
> Does somebody have an R solution in mind?

rbind?

--
   O__  ---- Peter Dalgaard             Blegdamsvej 3
  c/ /'_ --- Dept. of Biostatistics     2200 Cph. N
 (*) \(*) -- University of Copenhagen   Denmark      Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)             FAX: (+45) 35327907
-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-help-request at stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
Debra Taylor <debrat at bestweb.net> writes:

> I have a question about stacking datasets.
>
> I have 40 stata datasets that have exactly the same number of
> variables, with the same names (~420k rows, 8 columns).
>
> The datasets are relatively large ~ 15 megs.
>
> If they were text files a linux "cat file1 file2 >> combo" sort of
> strategy would work.

[Snip]

> Does somebody have an R solution in mind?

Read each data file in as a data frame, then use the rbind() function. Something like:

    a <- read.table("file1", header = TRUE)
    b <- read.table("file2", header = TRUE)
    my.data <- rbind(a, b)
    #
    # remove the excess 15MB ...
    #
    rm(b)
    a <- read.table("file3", header = TRUE)
    # append the new file to the rows accumulated so far
    my.data <- rbind(my.data, a)
    #
    # and so on ... finally removing the excess 15MB ...
    #
    rm(a)

Mark

--
Mark Myatt
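For 40 files, the read-then-rbind() idea suggested above can be wrapped in a loop rather than written out by hand. What follows is only a minimal sketch, not anything posted in the thread. It assumes the files are Stata files readable with read.dta() from the foreign package (the function named in the original question). The helper name stack_dta and the two small demo files are hypothetical, created only so the example runs on its own.

```r
library(foreign)  # provides read.dta() and write.dta()

# Hypothetical helper: read every Stata file in `paths` and append the rows.
stack_dta <- function(paths) {
  frames <- lapply(paths, read.dta)  # one data frame per file
  do.call(rbind, frames)             # append rows; nothing is merged or renamed
}

# Demo: two tiny made-up files standing in for file1 ... file40.
paths <- file.path(tempdir(), c("demo1.dta", "demo2.dta"))
write.dta(data.frame(id = 1:3, x = c(0.5, 1.5, 2.5)), paths[1])
write.dta(data.frame(id = 4:5, x = c(3.5, 4.5)), paths[2])

combo <- stack_dta(paths)
dim(combo)    # rows from both files, same columns
names(combo)
```

With real data, `paths` could come from something like list.files(pattern = "\\.dta$"). Note that this version holds all the individual data frames in memory before combining them, so the incremental rbind()/rm() approach may still be preferable if memory is tight.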