Hi there,

I would like some advice, not so much about how to use R, but about software that I need to complement R. I've rooted around in the FAQ's and done a few searches on this mailing list but haven't quite found the perspective I need.

I am an experienced data analyst in my field (forest ecology and ecological monitoring) but new to R. I am a long time user of SPSS and have gotten pretty handy with it. However, I am frustrated with SPSS for several reasons: There's the cost (I'm a freelancer; I pay for my software myself); the Windows dependence (I use Kubuntu as my usual OS now, and switching back and forth is a pain); the horrible inefficiency when I do certain types of file manipulations; and the inability to do the kind of publication-quality graphs I want... I've usually ended up using a commercial graphing program (another source of expense and limitation).

I'd like to switch to using R on Kubuntu, for all those reasons. In addition I think the mathematical formality that R encourages might be good for me.

However, reviewing the FAQ's on the R project web site makes me realize that I've been using SPSS as three kinds of software really: a DBMS; a statistical analysis package; and a graphing package. It looks like moving to R might involve learning three kinds of software, not just one. I wonder:

1) What open-source DBMS works most seamlessly with R? I have seen MySQL recommended but wonder if there are alternatives. I sometimes need to handle big data files. In fact a lot of my work involves exploratory and descriptive analyses of rather large and messy databases from ecological monitoring, rather than statistical tests per se. In SPSS the data files I have been generating have dozens of columns and thousands of rows, often with value and variable labels helpful for documenting my work.
2) For the purpose of creating publication-quality graphs, do R users typically need to go outside of the R system? If so, what open-source programs would you all recommend?
3) Any other software I need to learn that would make my work in R more productive? (for example, a code editor).

Thank you for your time,

Martin J. Brown
Portland, Oregon
I'm just starting to get a grasp on how R works, so don't take my words too seriously, but have a look at http://addictedtor.free.fr/graphiques/ for some idea of what R can do for publication-quality graphics. It is always possible that you might need another graphics package as well, but I think it unlikely.

About the databases I don't really know; however, you might want to have a look at Frank Harrell's Hmisc package for things like labels. It also includes SAS and SPSS import functions, as does the foreign package.

I'd say you definitely need a code editor. I'm on Windows and happy with Tinn-R, but for Linux something like http://ess.r-project.org/ seems to be recommended.

If you have not already found it, Bob Muenchen's R for SAS and SPSS Users http://oit.utk.edu/scc/RforSAS&SPSSusers.pdf may be very helpful.

______________________________________________
R-help at stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
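To make the point about labels concrete, here is a minimal sketch, assuming the Hmisc package is installed from CRAN. The data frame `plots` and its columns are invented for the example. SPSS-style value labels are approximated with an ordinary R factor, and variable labels are attached with Hmisc's label().

```r
library(Hmisc)  # assumption: Hmisc is installed from CRAN

# Hypothetical monitoring data; names and values are made up for illustration.
plots <- data.frame(
  plot_id = 1:4,
  cover   = c(1, 3, 2, 3),
  dbh     = c(12.5, 40.2, 33.1, 8.9)
)

# Value labels: map numeric codes to descriptive levels with a factor.
plots$cover <- factor(plots$cover,
                      levels = c(1, 2, 3),
                      labels = c("open", "partial", "closed"))

# Variable labels: Hmisc stores a descriptive label as an attribute.
label(plots$dbh)   <- "Diameter at breast height (cm)"
label(plots$cover) <- "Canopy cover class"

label(plots$dbh)    # retrieve a single variable label
table(plots$cover)  # counts are reported by value label
```

For existing SPSS files, read.spss() in the foreign package (with to.data.frame = TRUE) is one import route; whether it preserves every label you rely on is worth checking against your own files.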
On 8/18/07, Martin Brown <mjb2000 at gmail.com> wrote:
> Hi there,
>
> I would like some advice, not so much about how to use R, but about software
> that I need to complement R. I've rooted around in the FAQ's and done a few
> searches on this mailing list but haven't quite found the perspective I
> need.
>
> I am an experienced data analyst in my field (forest ecology and ecological
> monitoring) but new to R. I am a long time user of SPSS and have gotten
> pretty handy with it. However, I am frustrated with SPSS for several
> reasons: There's the cost (I'm a freelancer; I pay for my software
> myself); the Windows dependence (I use Kubuntu as my usual OS now, and
> switching back and forth is a pain); the horrible inefficiency when I do
> certain types of file manipulations; and the inability to do the kind of
> publication-quality graphs I want... I've usually ended up using a
> commercial graphing program (another source of expense and limitation).
>
> I'd like to switch to using R on Kubuntu, for all those reasons. In
> addition I think the mathematical formality that R encourages might be good
> for me.

From a strictly language perspective, mathematical formality is pretty far from R. It's actually quite loose. Underneath there are some Lisp/Scheme ideas, but you are not very close to that as a user.

> However, reviewing the FAQ's on the R project web site makes me realize that
> I've been using SPSS as three kinds of software really: a DBMS; a
> statistical analysis package; and a graphing package. It looks like moving
> to R might involve learning three kinds of software, not just one. I
> wonder:
>
> 1) What open-source DBMS works most seamlessly with R? I have seen MySQL
> recommended but wonder if there are alternatives. I sometimes need to
> handle big data files. In fact a lot of my work involves exploratory and
> descriptive analyses of rather large and messy databases from ecological
> monitoring, rather than statistical tests per se. In SPSS the data files I
> have been generating have dozens of columns and thousands of rows, often
> with value and variable labels helpful for documenting my work.

Databases. SQLite is the easiest to install since it's embedded rather than client/server, so I would use that unless your application requires client/server or other features of MySQL. MySQL is probably the most popular of the free databases, so that would be the next one to go with. If you intend to create a commercial application you might want to consider Postgres instead of MySQL, as the latter charges for commercial implementations but Postgres does not. Some heavy Postgres users might feel that it should be considered after SQLite rather than MySQL, and there is a certain amount of arbitrariness here. See the R packages RSQLite, RMySQL and DBI. The R packages sqldf and SQLiteDF are beginning to blur the boundary between R and the database.

> 2) For the purpose of creating publication-quality graphs, do R users
> typically need to go outside of the R system? If so, what open-source
> programs would you all recommend?

Graphics. R should be OK. Check out:
http://cran.r-project.org/src/contrib/Views/Graphics.html
and also google for "R Graphics Gallery".

> 3) Any other software I need to learn that would make my work in R more
> productive? (for example, a code editor).

Other. You need to know a text editor. I use vim, but there are many good choices here, with ESS being one that is often mentioned.
http://www.sciviews.org/_rgui/projects/Editors.html
http://ess.r-project.org/

If you intend to write C routines to run with R then, of course, you need to know C.

For certain R packages that interface with outside software (tcltk, Rgraphviz, Ryacas, XML, etc.) you will need to know something about the interfaced-to software if you intend to use those packages.

For package development you will need to know LaTeX and possibly Subversion (i.e. svn), the UNIX screen program, tar and various other UNIX commands.

Certain auxiliary programs that come with and are used with R are written in Perl, although it's unlikely you will need to know it.
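As a rough illustration of the SQLite route described above, the following sketch uses the DBI and RSQLite packages, assuming both are installed from CRAN. The `trees` data frame, its column names, and the species codes are invented for the example. It is only meant to show the general pattern of copying a data frame into a database and letting SQL do the aggregation.

```r
library(DBI)  # assumption: DBI and RSQLite are installed from CRAN

# Hypothetical monitoring data; names and values are made up for illustration.
trees <- data.frame(
  plot_id = c(1, 1, 2, 2, 3),
  species = c("PSME", "TSHE", "PSME", "PSME", "THPL"),
  dbh_cm  = c(45.0, 22.5, 60.1, 38.4, 71.3)
)

# An in-memory SQLite database; pass a file path instead to persist it.
con <- dbConnect(RSQLite::SQLite(), ":memory:")

# Copy the data frame into a database table.
dbWriteTable(con, "trees", trees)

# Let the database aggregate and return an ordinary data frame.
summary_by_species <- dbGetQuery(con, "
  SELECT species, COUNT(*) AS n, AVG(dbh_cm) AS mean_dbh
  FROM trees
  GROUP BY species
  ORDER BY species
")

dbDisconnect(con)
print(summary_by_species)
```

Because SQLite is embedded, there is no server to install or configure. Swapping in a client/server database should mostly be a matter of changing the dbConnect() call, though SQL dialects differ in places.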
> In fact a lot of my work involves exploratory and
> descriptive analyses of rather large and messy databases from
> ecological monitoring, rather than statistical tests per se.

For the exploratory part of your work you should consider the iPlots package, which provides interactive graphics for R.

Antony Unwin
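A minimal sketch of what such an exploratory session might look like, assuming the package is installed under its CRAN name iplots along with a working Java/rJava setup. The simulated `trees` data and its column names are invented for the example, and the plotting calls are guarded so the script still runs in a non-interactive session.

```r
# Hypothetical tree measurements, simulated for illustration only.
set.seed(1)
dbh_cm   <- runif(200, min = 5, max = 90)
height_m <- 1.3 + 0.45 * dbh_cm + rnorm(200, sd = 3)
trees    <- data.frame(dbh_cm = dbh_cm, height_m = height_m)

# iplots needs Java (via rJava) and a graphical display, so only open
# windows in an interactive session where the package is available.
if (interactive() && requireNamespace("iplots", quietly = TRUE)) {
  library(iplots)
  ihist(trees$dbh_cm)                  # interactive histogram
  iplot(trees$dbh_cm, trees$height_m)  # interactive scatterplot
}
```

With both windows open, selecting cases in one plot should highlight the same cases in the other, which is the main attraction for messy monitoring data.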