I have a question about stacking datasets.
I have 40 Stata datasets that have exactly the same number of variables,
with the same names (~420k rows, 8 columns).
The datasets are relatively large (~15 MB each).
If they were text files, a Linux "cat file1 file2 >> combo" sort of
strategy would work.
I've considered using a merge command, but I don't want any records
merged, only appended. Also, I don't want any variables to be renamed.
Given their unique nature, using a simple
merge(read.dta('file1'),read.dta('file2')) would give me (I think) what
I am looking for but seems incredibly inefficient.
There are a number of ways I could approach this, all of which involve a
non-R solution.
Does somebody have an R solution in mind?
Thanks in advance,
Debra Taylor

"Debra Taylor" <debrat at bestweb.net> writes:

> I have a question about stacking datasets.
>
> I have 40 stata datasets that have exactly the same number of variables,
> with the same names (~420k rows, 8 columns).
>
> The datasets are relatively large ~ 15 megs.
>
> If they were text files a linux "cat file1 file2 >> combo" sort of
> strategy would work.
>
> I've considered using a merge command, but I don't want any records
> merged, only appended. Also, I don't want any variables to be renamed.
> Given their unique nature, using a simple
> merge(read.dta('file1'),read.dta('file2')) would give me (I think) what
> I am looking for but seems incredibly inefficient.
>
> There are a number of ways I could approach this all of which involve a
> non-R solution.
>
> Does somebody have an R solution in mind?

rbind?

-- 
   O__  ---- Peter Dalgaard             Blegdamsvej 3
  c/ /'_ --- Dept. of Biostatistics     2200 Cph. N
 (*) \(*) -- University of Copenhagen   Denmark      Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)             FAX: (+45) 35327907
-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-help-request at stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._

Debra Taylor <debrat at bestweb.net> writes:

> I have a question about stacking datasets.
>
> I have 40 stata datasets that have exactly the same number of
> variables, with the same names (~420k rows, 8 columns).
>
> The datasets are relatively large ~ 15 megs.
>
> If they were text files a linux "cat file1 file2 >> combo" sort of
> strategy would work.

[Snip]

> Does somebody have an R solution in mind?

Read each data file in as a data frame, then use the rbind() function.
Something like:

    a <- read.table("file1", header = TRUE)
    b <- read.table("file2", header = TRUE)
    my.data <- rbind(a, b)
    #
    # remove the excess 15MB ...
    #
    rm(b)
    a <- read.table("file3", header = TRUE)
    my.data <- rbind(my.data, a)
    #
    # and so on ... finally removing the excess 15MB ...
    #
    rm(a)

Mark

--
Mark Myatt
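
With 40 files, the read-then-rbind pattern above can be wrapped in a loop rather than typed out by hand. The following is only a sketch under some assumptions: the files are Stata .dta files readable by read.dta() from the foreign package, they all sit in one directory (the "data" path below is made up), and stack_dta() is a hypothetical helper name, not an existing function.

```r
library(foreign)  # provides read.dta() for Stata files

# Hypothetical helper: read each Stata file in `paths` and append the rows.
# rbind() on data frames matches columns by name, so nothing is renamed
# and no records are merged.
stack_dta <- function(paths) {
  frames <- lapply(paths, read.dta)    # one data frame per file
  combined <- do.call(rbind, frames)   # stack all of them in a single call
  rownames(combined) <- NULL           # plain 1..n row names
  combined
}

# Assumed layout: every file lives in "data" and ends in ".dta".
# files <- list.files("data", pattern = "\\.dta$", full.names = TRUE)
# my.data <- stack_dta(files)
```

Calling rbind() once on the whole list avoids re-copying the growing result 39 times, at the cost of holding all the individual data frames in memory until the call returns.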