Dear all,

I imported a Stata .dta file with the read.dta function from the foreign package. The data frame's dimensions are

> dim(d.apc)
[1] 15806 1300

Importing takes up to 15 minutes, and calculations with these data are rather slow (although I subset the data before starting analyses).

My questions are:
1. Does anyone have experience importing Stata files (alternatives to read.dta)?
2. To my knowledge, R should not have problems handling data frames of this size. Is there something I can do after importing that makes data handling faster?

My hardware is up to date (Intel P4, 3 GHz, 1 GB RAM), and I am working on a Windows XP platform with R version 2.1 (all packages updated).

Thanks for your answers.
Christian

--
Christian Bieli, project assistant
Institute of Social and Preventive Medicine
University of Basel, Switzerland
Steinengraben 49
CH-4051 Basel
Tel.: +41 61 270 22 12
Fax: +41 61 270 22 25
christian.bieli at unibas.ch
www.unibas.ch/ispmbs
I think it is well worth the effort to start using a database system such as MySQL for this kind of work. At http://gbi.agrsci.dk/~sorenh/misc/R-SAS-MySql/R-SAS-MySql.html you'll find a short and rudimentary description of how to use MySQL in connection with R and SAS (on Windows). The time it takes to get it up and running (about 30 minutes) is well spent.

I suppose you can take your Stata data and save them as a comma-separated file. Such a file is easy to load into a MySQL database (although I haven't written up how). Perhaps Stata can connect directly to MySQL?

Best regards
Søren

________________________________
From: r-help-bounces at stat.math.ethz.ch on behalf of Christian Bieli
Sent: Tue 14-02-2006 15:24
To: R help list
Subject: [R] How to handle large dataframes?

______________________________________________
R-help at stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Do you want to use the dataset as it is? If not, it might be better to save the data into a database, such as SQLite, and then connect to it through ODBC. Then you can use SQL to fetch only the data you need for analysis.

--
WenSui Liu (http://statcompute.blogspot.com)
Senior Decision Support Analyst
Health Policy and Clinical Effectiveness
Cincinnati Children Hospital Medical Center
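
The database approach suggested in this thread could look roughly like the sketch below. It is only an illustration under several assumptions: it uses the DBI and RSQLite packages instead of the ODBC route mentioned in the reply, the small data frame stands in for the imported Stata data, and the table name `apc` and the columns `id`, `age`, `sex` and `bmi` are hypothetical. Only the object name `d.apc` comes from the original post.

```r
library(DBI)      # generic database interface
library(RSQLite)  # SQLite driver; assumed to be installed

# Toy stand-in for the imported Stata data (hypothetical columns).
d.apc <- data.frame(
  id  = 1:6,
  age = c(34, 51, 47, 29, 62, 40),
  sex = c("f", "m", "f", "m", "f", "m"),
  bmi = c(22.1, 27.4, 24.8, 30.2, 26.0, 23.5),
  stringsAsFactors = FALSE
)

# Store the full table once in an on-disk SQLite file.
db_file <- tempfile(fileext = ".sqlite")
con <- dbConnect(RSQLite::SQLite(), dbname = db_file)
dbWriteTable(con, "apc", d.apc, overwrite = TRUE)

# Fetch only the columns and rows that a given analysis needs.
d.sub <- dbGetQuery(con, "SELECT id, age, bmi FROM apc WHERE age >= 40")

dbDisconnect(con)
```

With the real data, the slow import would be paid once when the table is written. Later sessions would open the same file (a fixed path rather than `tempfile()`) and pull only the variables of interest, so R never has to hold all 1300 columns at once.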