Michael Friendly
2010-Oct-06 21:05 UTC
[R] R: Tools for thinking about data analysis and graphics
I'm giving a talk about some aspects of language and conceptual tools for thinking about how to solve problems in several programming languages for statistical computing and graphics. I'm particularly interested in language features that relate to:

o expressive power: ease of translating what you want to do into the results you want
o elegance: how well does the code provide a simple human-readable description of what is done?
o extensibility: ease of generalizing a method to wider scope
o learnability: your learning curve (rate, asymptote)

For R, some things to cite are (a) data and function objects, (b) object-oriented methods (S3 & S4); (c) function mapping over data with *apply methods and plyr.

What other language features of R should be on this list? I would welcome suggestions (and brief illustrative examples).

-Michael

--
Michael Friendly     Email: friendly AT yorku DOT ca
Professor, Psychology Dept.
York University      Voice: 416 736-5115 x66249 Fax: 416 736-5814
4700 Keele Street    Web: http://www.datavis.ca
Toronto, ONT  M3J 1P3 CANADA
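As one possible illustration of items (a) to (c) in the post above, here is a minimal sketch in base R. The class name "temperature" and the sample vector are hypothetical and not taken from the thread, and base sapply() stands in for plyr, which is not part of a default R installation.

```r
# (a) Functions are ordinary objects: keep them in a list like any other data.
summaries <- list(mean = mean, median = median, sd = sd)
x <- c(2, 4, 4, 4, 5, 5, 7, 9)

# (c) Map over the list of functions, applying each one to x.
stats <- sapply(summaries, function(f) f(x))
stats

# (b) S3: a constructor for a hypothetical class plus format/print methods.
temperature <- function(celsius) {
  structure(list(celsius = celsius), class = "temperature")
}
format.temperature <- function(x, ...) paste0(x$celsius, " C")
print.temperature <- function(x, ...) {
  cat(format(x), "\n", sep = "")
  invisible(x)
}

t1 <- temperature(21.5)
print(t1)   # dispatches to print.temperature
```

The S3 part relies only on naming conventions (generic.class), which is arguably part of why it is quick to learn and easy to extend.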
Steve Lianoglou
2010-Oct-06 23:47 UTC
[R] R: Tools for thinking about data analysis and graphics
Hi,

On Wed, Oct 6, 2010 at 5:05 PM, Michael Friendly <friendly at yorku.ca> wrote:
> I'm giving a talk about some aspects of language and conceptual tools for
> thinking about how to solve problems in several programming languages for
> statistical computing and graphics.

For graphics, I'm guessing you'd mention something about ggplot's grammar of graphics.

Also, although I don't use it all that much, I think the "formula" methods (the y ~ a + b type of stuff) are probably handy for certain audiences, as they somehow translate to a "natural language" when doing modeling.

--
Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact
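A minimal sketch of the formula interface mentioned above, assuming the built-in mtcars data set. The choice of variables is purely illustrative and does not come from the thread.

```r
# The formula reads roughly as "mpg is modeled by wt plus hp".
fit <- lm(mpg ~ wt + hp, data = mtcars)
coef(fit)

# Extending the model means editing the formula, not rebuilding a design matrix.
# Here "." stands for "whatever was on that side before".
fit2 <- update(fit, . ~ . + factor(cyl))

# Compare the nested models.
anova(fit, fit2)
```

The same object (the formula) carries the response, the predictors, and any transformations, which is what makes it read close to the statistical notation.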
Michael Friendly
2010-Oct-07 00:21 UTC
[R] R: Tools for thinking about data analysis and graphics
On 10/6/2010 6:14 PM, David Winsemius wrote:
>
> Graphics: I realize you were focusing on "language" but the graphical
> tools are extremely important if one is describing how data
> exploration and summarization is done effectively.

Well, I'm focusing on the language features that make it easy or hard to draw a standard sort of graph (controlling details) or to design a new kind of graph. Base graphics, grid + lattice, ggplot all have strengths (and weaknesses) I'm thinking about.

Replying to Bert Gunter: Yes, I agree that the formula language for lm/glm and friends, plus extensions to other aspects (plots, tables, etc.) is another important R feature of the type I'm considering. I would put this in the top 5!

-Michael

--
Michael Friendly     Email: friendly at yorku.ca
Professor, Psychology Dept.
York University      Voice: 416 736-5115 x66249 Fax: 416 736-5814
4700 Keele Street    http://www.math.yorku.ca/SCS/friendly.html
Toronto, ONT  M3J 1P3 CANADA
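To illustrate the point about formulas extending beyond lm/glm, here is a small sketch using only base R and the built-in mtcars data (an assumption for illustration, not an example from the thread). The same notation drives a contingency table and a grouped summary.

```r
# Tables: a one-sided formula names the classifying variables.
counts <- xtabs(~ cyl + gear, data = mtcars)
counts

# Grouped summaries: response on the left, grouping variable on the right.
means <- aggregate(mpg ~ cyl, data = mtcars, FUN = mean)
means
```

Plotting functions such as boxplot() accept the same kind of formula (for example mpg ~ cyl), so one piece of notation carries across models, tables, and graphs.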
Hadley Wickham
2010-Oct-07 02:18 UTC
[R] R: Tools for thinking about data analysis and graphics
On Wed, Oct 6, 2010 at 4:05 PM, Michael Friendly <friendly at yorku.ca> wrote:
> What other language features of R should be on this list? I would welcome
> suggestions (and brief illustrative examples).

* missing values
* subsetting
* lexical scope and closures (goes along with first class functions)
* built-in documentation
* CRAN (not exactly a language feature, but important part of ecosystem)
* thoughtful interactive features - e.g. a <- 10 doesn't print 10.

Hadley

--
Assistant Professor / Dobelman Family Junior Chair
Department of Statistics / Rice University
http://had.co.nz/
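Several items in the list above can be sketched in a few lines of base R. The vector x and the function make_counter are made up for illustration; mtcars is a data set that ships with R.

```r
# Missing values: NA is a first-class value and propagates through computations.
x <- c(3, NA, 7, 1)
mean(x)                  # NA, because one value is unknown
mean(x, na.rm = TRUE)    # drop the NA explicitly
is.na(x)

# Subsetting: by logical condition, by name, or both at once.
big <- x[!is.na(x) & x > 2]
heavy <- mtcars[mtcars$wt > 5, c("mpg", "wt")]

# Lexical scope and closures: the inner function remembers `count`.
make_counter <- function() {
  count <- 0
  function() {
    count <<- count + 1
    count
  }
}
counter <- make_counter()
counter()
counter()   # 2: state persists between calls

# Interactive behaviour: assignment is silent, naming the object prints it.
a <- 10
a
```

Built-in documentation is available for each function used here via help(), for example ?mean.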
Jeffrey Spies
2010-Oct-07 04:04 UTC
[R] R: Tools for thinking about data analysis and graphics
Hi, Michael,

When I teach/preach on R, I emphasize the language's focus on data, both in its objects and operations. It might seem basic, but it's fundamental to most of the features you and others have mentioned.

As a statistical programming language, what we intend to do with R is often very naturally accomplished using vector operations on tabular data, where columns represent variables of the same data type and rows represent observations of these variables for a given member of the dataset. Fortunately, these are core components of R. For instance, we can easily perform complex selections of variables and/or members, which, more often than not, serve as input to or power the functions that generate the statistics and graphics we care about.

Unfortunately, vector operations seem to be difficult for people to learn how to use properly, and there are penalties for not using them, but as they say: no pain, no gain. :)

If you'd be willing to share the materials you create for your talk, I'd be interested in seeing them.

Cheers,

Jeff.
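A minimal sketch of the vector operations on tabular data described above. The data frame `people` and its columns are hypothetical, invented only to show the idea.

```r
# Hypothetical data: one row per member, one column per variable.
people <- data.frame(
  id        = 1:5,
  height_cm = c(160, 172, 181, 168, 190),
  weight_kg = c(55, 70, 92, 61, 85)
)

# Vectorized: one expression computes the new variable for every row.
people$bmi <- people$weight_kg / (people$height_cm / 100)^2

# Selecting members (rows) and variables (columns) in a single step.
tall <- people[people$height_cm > 170 & people$bmi < 28, c("id", "bmi")]
tall

# The element-by-element loop gives the same values but is longer to write
# and read, and is typically slower on large data.
bmi_loop <- numeric(nrow(people))
for (i in seq_len(nrow(people))) {
  bmi_loop[i] <- people$weight_kg[i] / (people$height_cm[i] / 100)^2
}
all.equal(people$bmi, bmi_loop)
```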
Greg Snow
2010-Oct-07 17:00 UTC
[R] R: Tools for thinking about data analysis and graphics
I think that R/S's biggest advantage is in the ways you can store data. It does not force you to fit your data to a single spreadsheet-like table, but rather encourages you to think about your data and find the correct way to store it. Lists and objects are a great advantage for keeping related things together. I can have multiple data sets available at the same time, but still in separate objects.

Also, the results of routines can be kept in a way that makes working with them easy. I remember working with programs that had just one big spreadsheet and ending up with 3 different columns of residuals from 3 different models, and then forgetting which residuals matched which model. With R/S each lm object has the residuals stored with it, including the call to remind us what model and options were used.

One plot that I like to make when exploring different models is:

> plot( fitted(model1), fitted(model2) ); abline(0,1)

That is simple and straightforward in R/S, but much more difficult in other programs.

I also like the fact that the graphics system will let me create anything I want. There are tools to create the standard plots (and I really like the simplicity of calling plot on an lm object and getting a standard set of diagnostics), but there are also the tools to create any plot I can imagine, or add any information I feel useful to an existing plot.

--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.snow at imail.org
801.408.8111
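A small sketch of the point about model objects keeping their own residuals and call. The two models below are assumptions for illustration, fitted to the built-in mtcars data; the names model1 and model2 follow the plot call quoted in the message.

```r
# Two candidate models, each stored as its own object.
model1 <- lm(mpg ~ wt, data = mtcars)
model2 <- lm(mpg ~ wt + hp, data = mtcars)

# A list keeps related fits together without mixing their contents.
models <- list(simple = model1, with_hp = model2)

# Residuals travel with the model they came from, and the call records
# which formula and data produced them.
head(residuals(model1))
model2$call

# Apply the same summary to every stored model.
r2 <- sapply(models, function(m) summary(m)$r.squared)
r2

# The comparison plot from the message; drawn only in an interactive session.
if (interactive()) {
  plot(fitted(model1), fitted(model2)); abline(0, 1)
}
```

Calling plot(model1) in an interactive session produces the standard diagnostic plots mentioned above.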