I'm attaching a small text file on "Getting your stuff organized in R". (Sorry if sending an attachment is not considered correct etiquette on r-help, but it is only 7911 bytes of plain ASCII text and I cannot post it on a web page at the moment.)

Probably all the information in this document is scattered across one or more R introduction guides, but I think it is useful to have it gathered under this title. The number of R objects created by the user grows fast, and the way R stores them is rather particular (most other packages create a separate disk file for each object). Therefore, it is important for anyone starting with R to learn how to organize his/her R objects and to avoid piling everything into one single, often large, .RData file.

I send this document to the list in the hope that people will correct errors and suggest alternative, better methods. Please do so directly to alobo at ija.csic.es, not the list. After your feedback, I'll format it as pdf or html and send it to the Contributed Documentation section of the R-CRAN pages.

Thanks

Agus

Dr. Agustin Lobo
Instituto de Ciencias de la Tierra (CSIC)
Lluis Sole Sabaris s/n
08028 Barcelona SPAIN
tel 34 93409 5410
fax 34 93411 0012
alobo at ija.csic.es

-------------- next part --------------

Getting your stuff organized in R

Probably all this information is scattered across one or more R introduction guides, but I think it is useful to have it gathered under this title.

If, after a first contact with R, you have decided to use it, you will want to start working with your own data as soon as possible. R does not create a separate disk file for each object, which is the most common arrangement in other packages, and you are probably a bit confused by this. Also, the number of data and function objects can grow really fast in your R sessions.
Therefore, since the number of your R objects grows and the way R stores them is rather idiosyncratic, it is important to learn how to organize your R objects before you start working with your own data.

1. As you know from R-start.pdf, R keeps everything in memory. Therefore, it is wise to type often

> save.image()

which will save to disk everything that is listed by

> ls()

into a file named .RData, located in the directory from which you started R. Remember that on unix systems you must use ls -a to list this file along with any other file whose name starts with ".". Therefore, the first "organizing" rule is simply: keep your projects in separate directories and launch R from the appropriate directory.

2. You can save to a different file and/or another directory with:

> save(object1,object2,file="myobjects1&2")

3. It is useful to take advantage of the capabilities of ls() to select what you want, e.g.:

> ls(pat="liss")
[1] "lissNPC100" "lissNPC100.ady" "lissNPC100.stat" "lissNPC1100.ref"
> save(list=ls(pat="liss"),file="lissobjects")

4. You will normally need functions that are not in the base package and that are not available after a default R start. You normally don't want these functions in your workspace, as they would get saved by save.image() into .RData and mixed with your objects (which probably also include "immature" functions). If you require functions from a CRAN package, you just use:

> library(Rstreams)

If you type ls() afterwards, you won't see the Rstreams functions. For the sake of organization, R does not load the package into your workspace, although its functions are available for you to use.
If you type

> search()

you will get something like:

> search()
[1] ".GlobalEnv" "package:Rstreams" "package:ctest" "Autoloads"
[5] "package:base"

which lists your workspace (named ".GlobalEnv"), the package you just attached (which goes, by default, to position 2), and "package:ctest", "Autoloads" and "package:base", which were attached automatically when R started. Now, if you type

> ls(2)

you will get the listing of the Rstreams package.

5. As you develop your project, you transform your original data and often create new data frames and data matrices. To keep the original data safe, it's a good idea to keep them in a separate file. Another reason to separate the original data is that they might be large, while you most often work with data that have been selected or sampled from the original. As R automatically loads your .RData into memory, it's more efficient not to load any large object unless you really need it. You can save the original data to a different file with:

> save(data1.ori,file="data1ori.rda")

(note that the file name must be given as file=, otherwise save() takes it for another object to be saved). Then you can delete the object from your workspace with rm(data1.ori): the next .RData file that you make, by using save.image() or by quitting R and saving the workspace, will not include data1.ori.

6. If it happens that you need data1.ori afterwards, you should use

> attach("data1ori.rda")

rather than

> load("data1ori.rda")

Using attach("data1ori.rda"), your object data1.ori will be loaded into a different environment (pos=2 by default), which means that you will be able to use it but it will not get mixed up with your "everyday work" when you use save.image() and/or quit R. You can type

> search()

before and after attach("data1ori.rda") to see the result.

7. As R integrates a large number of statistical methods and graphics with a high-level language, your work will involve creating a number of functions of your own.
As soon as your functions attain a certain "maturity" and you consider them of general use for your own work, you should organize them as packages (see "Creating R packages" in R-exts.pdf).

8. Meanwhile, it's also a good idea to save your functions into a different file, or to use that file as an intermediate step between the workspace and the library. A good reason to separate functions from other objects is that you might want to use a function that you developed for another project. Keeping functions and data objects in different files will let you attach the functions while leaving out the data objects. Remember that you do not want to attach anything that you do not need, because it costs you memory. The following function will let you list only the functions present in a given environment (your workspace by default):

> lsf
function (pos=1)
{
    a <- b <- ls(pos=pos)
    # seq_along() copes with an empty environment; get() looks only in pos,
    # so the local variables a, b and i cannot mask objects of the same name
    for (i in seq_along(a)) {
        b[i] <- mode(get(a[i], pos=pos))
    }
    a[b == "function"]
}
> lsf()
[1] "disc.qda" "edges" "ima.explore2"
[4] "imagen" "imagenrgb" "lsf"
[7] "mat.select" "no.na.mat" "no.rep.mat"
[10] "parcelas.lda" "parcelas.liss.func" "reclas"
[13] "rescale" "utm2lincol"

You can use lsf() to save your functions to a file:

> save(list=lsf(), file="Rfunctions.rda")

9. Actually, it's more usual to save functions in text format, which you can do with:

> dump(list=lsf(),file="testdump.R")

But you cannot use either load() or attach() with files created by dump(). Instead, you must use source()

> source("testdump.R")

but beware that source() will create the functions in your workspace. I've not found any way to direct source() to another position.

10. Sometimes you will want to add an object from your workspace to an existing R disk file. For example, you'll want to add a new function developed in your workspace to the functions file of your project.
You just need the append option of dump() for this purpose:

> dump("mynewfunc",file="proj1funcs.R",append=TRUE)

It's a bit more complicated to add a data object to an R binary file, because save() has no "append" option. But you can use ls() in the following way:

> search()
[1] ".GlobalEnv" "package:ctest" "Autoloads" "package:base"
> attach("lissN543cod.R")
> search()
[1] ".GlobalEnv" "file:lissN543cod.R" "package:ctest"
[4] "Autoloads" "package:base"
> ls(2)
[1] "lissN543.cod" "lissN543E.cod" "lissN543W.cod"

Now, assuming we want to add an object "a" to lissN543cod, we would type:

> save(list=c("a",ls(2)),file="lissN543cod_v2.rda")

Note the quotes around "a" in the list argument: list takes object names as character strings. Once lissN543cod_v2 is checked, we can delete lissN543cod.

11. In order to copy an object from the workspace to another environment, you can use assign(), giving first the name (as a character string) and then the value:

> search()
[1] ".GlobalEnv" "file:lissN543cod.R" "package:ctest"
[4] "Autoloads" "package:base"
> ls(2)
[1] "lissN543.cod" "lissN543E.cod" "lissN543W.cod"
> assign("a",a,pos=2)
> ls(2)
[1] "a" "lissN543.cod" "lissN543E.cod" "lissN543W.cod"

You can then delete a from the workspace, but beware that in that case a will not be saved by save.image() or when quitting R. You would need to use:

> save(list=ls(2),file="newfile.rda")

12. If you have several projects, you might forget which objects were in a given R binary file created with save(). Unfortunately, I've not found any way to list the contents of such a file unless it is attached or loaded. Also, selecting objects for loading from an R binary file does not seem possible.

I hope these notes are useful. Please send your comments, corrections etc. to alobo at ija.csic.es

Note that R is a collaborative project, which also applies to documentation and guides!
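A minimal sketch related to points 9 and 12 above, assuming a current version of R in which load() accepts an envir argument and sys.source() is available. It restores a save() file into a fresh environment, so that its contents can be listed and single objects picked out without touching the workspace, and it evaluates a dump()ed file in an environment other than the workspace. The object names (x1, x2, double.it) and the temporary files are hypothetical and only serve the illustration.

```r
# Hypothetical objects standing in for project data.
x1 <- 1:10
x2 <- letters[1:3]

# Write them to a temporary binary file so no project file is touched.
rda.file <- tempfile(fileext = ".rda")
save(x1, x2, file = rda.file)

# Point 12: restore the file into a fresh environment, not the workspace.
e <- new.env()
contents <- load(rda.file, envir = e)  # load() returns the restored object names
print(contents)

# Pick out one object from the file without copying the others.
only.x2 <- get("x2", envir = e)

# Point 9: write a (hypothetical) function with dump() ...
double.it <- function(x) 2 * x
r.file <- tempfile(fileext = ".R")
dump("double.it", file = r.file)

# ... and evaluate the dumped file in an environment of your choice.
funs <- new.env()
sys.source(r.file, envir = funs)
print(ls(funs))
```

If the functions should then be reachable from the search path, attach() also accepts an environment, so attach(funs) would place a copy of them at position 2.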
Dear list,

Here is some feedback on "Getting your stuff organized in R". I find the paper generally useful. However, I would not encourage anybody to use .RData for the storage of data, objects and results, because .RData is overwritten so often that a little carelessness (R started from the wrong directory, etc.) may easily cause the loss of important data. It is better to choose clear project-related filenames for save.image() if there is important stuff (and if there is not, why use save.image() at all?).

Best,
Christian

***********************************************************************
Christian Hennig
University of Hamburg, Faculty of Mathematics - SPST/ZMS
(Schwerpunkt Mathematische Statistik und Stochastische Prozesse, Zentrum fuer Modellierung und Simulation)
Bundesstrasse 55, D-20146 Hamburg, Germany
Tel: x40/42838 4907, private x40/631 62 79
hennig at math.uni-hamburg.de, http://www.math.uni-hamburg.de/home/hennig/
#######################################################################
I recommend www.boag.de

-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe" (in the "body", not the subject !) To: r-help-request at stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
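As a small illustration of the suggestion above, the following sketch saves the workspace under an explicit, project-related file name instead of the default .RData. The project name, the file name and the object some.result are hypothetical, and the file is written to a temporary directory only so that the example does not touch any real project.

```r
# Hypothetical project name, used to build an explicit workspace file name.
project <- "proj1"
ws.file <- file.path(tempdir(), paste0(project, "-workspace.RData"))

# Hypothetical object worth keeping.
some.result <- c(a = 1, b = 2)

# Save the whole workspace under the project-specific name rather than .RData.
save.image(file = ws.file)

# Later, or in a new session, restore it explicitly by name.
restored <- new.env()
load(ws.file, envir = restored)
print(ls(restored))
```

Because the file name is given explicitly, starting R from the wrong directory cannot silently overwrite the saved workspace of another project.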
Ko-Kang Wang
2001-Sep-27 19:21 UTC
R Introductory Slides (was: Re: [R] Getting your stuff organized in R)
The "Getting your stuff organised in R" document is very useful.

I wrote two sets of PowerPoint slides (and more are being written) introducing R. They were used to teach R to my fellow Software Developers' Klub (SDK) members, who have lots of experience in programming but not in statistics, as well as to some first-year statistics students.

The slides can be obtained from http://www.stat.auckland.ac.nz/~kwan022/pub/R/ , under the names:

R_Tut_00.ppt
R_Tut_01.ppt

In R_Tut_01.ppt I mention a Word document with information on how one can, in my opinion, use R (Windows version) more efficiently. This Word file has been zipped and is available as RTricks.zip on the same page.

Cheers,
Ko-Kang Wang

------------------------------------------------------------------------------
Ko-Kang Kevin Wang
Statistical Analysis Division Leader
Software Developers' Klub (SDK)
University of Auckland
New Zealand