* What were your biggest misconceptions or stumbling blocks to getting up and running with R? * What documents helped you the most in this initial phase? I especially want to hear from people who are lazy and impatient. Feel free to write to me off-list. Definitely write off-list if you are just confirming what has been said on-list. -- Patrick Burns pburns at pburns.seanet.com http://www.burns-stat.com (home of 'The R Inferno' and 'A Guide for the Unwilling S User')
My biggest stumbling blocks to getting up and running with R came whenever I was lazy and impatient. The more you love R, the more it loves you back.
Tal
----------------Contact Details:-------------------------------------------------------
Contact me: Tal.Galili@gmail.com | 972-52-7275845
Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) | www.r-statistics.com (English)
----------------------------------------------------------------------------------------------
On Thu, Feb 25, 2010 at 7:31 PM, Patrick Burns <pburns@pburns.seanet.com> wrote:
> * What were your biggest misconceptions or
> stumbling blocks to getting up and running
> with R?
>
> * What documents helped you the most in this
> initial phase?
>
> I especially want to hear from people who are
> lazy and impatient.
>
> Feel free to write to me off-list. Definitely
> write off-list if you are just confirming what
> has been said on-list.
>
> --
> Patrick Burns
> pburns@pburns.seanet.com
> http://www.burns-stat.com
> (home of 'The R Inferno' and 'A Guide for the Unwilling S User')
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
Apparently I need to explain the "lazy and impatient" comment. No offence was intended (quite the contrary). The meaning of it is that the higher your level of frustration, the more valuable your comments are likely to be to me. -- Patrick Burns pburns at pburns.seanet.com http://www.burns-stat.com (home of 'The R Inferno' and 'A Guide for the Unwilling S User')
On 2/25/10, Patrick Burns <pburns at pburns.seanet.com> wrote:
> * What were your biggest misconceptions or
> stumbling blocks to getting up and running
> with R?
>
> * What documents helped you the most in this
> initial phase?
>
> I especially want to hear from people who are
> lazy and impatient.
>
I'm quite resilient so I don't think I got to the point of frustration, but getting up to speed was a lengthy process. The biggest stumbler was getting onto the console, and not knowing what to do next. (My first encounter with stats was SPSS, so it was similar to getting onto a UNIX virtual console after a life-long experience with point-and-click windows: it's not very reassuring to know that there are man pages.) I stayed in the what-do-I-do-next state of mind for about 6-12 months (I learned R by myself, and my professors were quite reticent when I first introduced them to R).
Of particular help in making progress were JGR (argument suggestions, editor with syntax highlighting, object browser, etc.), Rcmdr (quick access to examples for performing specific tasks, etc.) and Sweave + LyX (for easy results transfer and report creation, without the burden of learning LaTeX). For graphics, playwith, latticist and rggobi come in very handy.
From the documentation, right now I can recall Quick-R and "R for SAS and SPSS users". And of course, RSiteSearch (also via the sos package), Rseek and the vignettes are a must.
Regards
Liviu
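For readers stuck in the same what-do-I-do-next state at the console, here is a minimal sketch of the built-in search tools that sit alongside RSiteSearch. It uses only functions from base R and utils; the search terms ("^read\\.", "linear model") are arbitrary examples chosen for illustration, not something taken from the post above.

```r
# Find functions on the search path whose *name* matches a pattern.
read_fns <- apropos("^read\\.")
print(read_fns)            # includes read.csv, read.table, ...

# Search the help-page titles and aliases of an installed package.
hits <- help.search("linear model", package = "stats")
print(nrow(hits$matches))  # number of matching help entries

# The following are interactive or need a network connection,
# so they are left commented out in this sketch:
# vignette()                  # list vignettes of installed packages
# RSiteSearch("mixed model")  # search the online archives in a browser
# help(package = "stats")     # index of one package's help pages
```

apropos() answers "what is the function called?", while help.search() answers "which help page talks about this?"; both work offline.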
Patrick, I would add one more question: * where did you look for help expecting answers, but did not find them? If you add hubris to laziness and impatience, you have Larry Wall's 3 virtues of a programmer. To new users of R who may not understand why Patrick is asking: Patrick Burns is the author of some great tutorials/references on S/R and is probably looking for questions to answer in his next contribution. Lately there have been a large number of questions on some fairly basic issues (and some rather complex issues that people expected to be simple/basic). My initial response (and probably that of others as well) to some of these requests was to quickly think that the answer is obvious and that the obvious place to look is ..., but then I realize that I am a high school dropout who has been using S/R for over 20 years, majored in statistics but reads Shakespeare for fun, and has been known to saw people in half for the entertainment of others; so I am probably not representative of most beginners. Fortune(89) probably applies here. If R beginners will share their frustrations, where they looked but did not find answers (and why they looked there), what would have helped them, etc., then we (well, probably Patrick mostly) can do more to help the next set of beginners. It does not matter how good our answers are if they answer the wrong questions or are in places where the questioner never sees them. "The best way to spread information is to tell someone that it is a secret, the best way to keep it secret is to put it in a manual." -- Gregory (Greg) L. Snow Ph.D.
Statistical Data Center Intermountain Healthcare greg.snow at imail.org 801.408.8111
> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Patrick Burns
> Sent: Thursday, February 25, 2010 10:31 AM
> To: r-help at r-project.org
> Subject: [R] two questions for R beginners
Patrick Burns wrote:
> * What were your biggest misconceptions or
> stumbling blocks to getting up and running
> with R?
>
> * What documents helped you the most in this
> initial phase?
>
> I especially want to hear from people who are
> lazy and impatient.
Can't be bothered with questionnaires and can't wait to see your next book... ;-)
-pd
--
   O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark      Ph:  (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)              FAX: (+45) 35327907
Well, here goes... I still wish there were a really good monograph on the use and implementation of factors. I had to do a certain amount of digging to learn that {assign, get, eval, expression, call, parse, deparse} all existed and how they play together. Sometimes they look like the C language's indirect addressing, *foo and &foo, and sometimes they don't. :-) Remembering exactly what " y~x " can do and what it can't took a while. Learning about, and watching for, 'lazy evaluation', especially in variables passed to a function, was a bit of a surprise. And to echo others, "R-inferno" has been invaluable, along with the Zoonek manual. Carl
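As a rough illustration of how the functions listed above fit together, and of the lazy-evaluation surprise, here is a minimal sketch in base R. The names foo, f and label_of are made up for the example; nothing here is taken from a particular manual.

```r
# --- Indirection: refer to an object by its *name* ------------------------
assign("foo", 42)                    # like foo <- 42, but the name is a string
get("foo")                           # look the object up by name

# --- Code as data: build, inspect and run unevaluated calls ---------------
e1 <- parse(text = "foo + 1")[[1]]   # string -> unevaluated call
e2 <- quote(foo + 1)                 # the same call, written directly
e3 <- call("+", as.name("foo"), 1)   # the same call, built piece by piece
eval(e2)                             # evaluate the call in the current environment
deparse(e2)                          # call -> string

# --- Lazy evaluation: arguments are only evaluated when first used --------
f <- function(x, y) {
  if (x > 0) "y was never evaluated" else y
}
lazy_result <- f(1, stop("boom"))    # no error: y is never touched

# substitute() sees the expression the caller typed, not its value
label_of <- function(arg) deparse(substitute(arg))
label_of(foo * 2)
```

The loose analogy with C is that get() dereferences a name, while quote() and substitute() hold on to the expression itself instead of its value.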
> "The best way to spread information is to tell someone that it is a secret, the best way to keep it secret is to put it in
> a manual."
==> Nice quote. ;-) The problem is not that there's too little information; rather, there's so much. That's probably because R is so powerful, but it makes it tough to sieve out the relevant bits. Some of the info is way too technical to be practical. If I want to drive a car I do not necessarily need to know all the nitty gritty about engine technology.
> * What were your biggest misconceptions or
> stumbling blocks to getting up and running
> with R?
==> That R can't deal very well with large data, which is not entirely untrue. Also, I was learning another language (Python) and I didn't want R to interfere with that. Finally, in a working environment, it's almost impossible to justify the time 'lost' learning a new language. Managers generally don't give a %$# about the beauty and robustness of a language. They just want to get the job done asap.
>
> * What documents helped you the most in this
> initial phase?
>
==> Many docs. CRAN documents (pdfs), other tutorials, Bob Muenchen's book. Many docs == many angles == a good way to learn things.
> I especially want to hear from people who are
> lazy and impatient.
>
==> Lazy? n/a. Impatient? Yup, guilty as charged.
> Feel free to write to me off-list. Definitely
> write off-list if you are just confirming what
> has been said on-list.
Cheers!!
Albert-Jan
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In the face of ambiguity, refuse the temptation to guess.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
--- On Thu, 2/25/10, Greg Snow <Greg.Snow@imail.org> wrote:
From: Greg Snow <Greg.Snow@imail.org>
Subject: Re: [R] two questions for R beginners
To: "Patrick Burns" <pburns@pburns.seanet.com>, "r-help@r-project.org" <r-help@r-project.org>
Date: Thursday, February 25, 2010, 9:42 PM
Patrick, I would add one more question: * where did you look for help expecting answers, but did not find them? [...]
Patrick Burns wrote:
>
> * What were your biggest misconceptions or
> stumbling blocks to getting up and running
> with R?
>
R was the first scripting language that I *really* invested time in learning. Prior to R I had a few years' experience programming in Fortran and had worked on a few projects using Matlab. Because most of my programming experience was with Fortran, the toughest things to get my head around were definitely lexical scoping and the fact that, unlike Fortran subroutines, R function results have to be assigned to something in order to persist outside of the function.
Patrick Burns wrote:
>
> * What documents helped you the most in this
> initial phase?
>
Definitely the "An Introduction to R" manual that ships with the core distribution. It helped me translate my knowledge of programming concepts to the R language very quickly.
-- View this message in context: http://n4.nabble.com/two-questions-for-R-beginners-tp1569384p1569901.html Sent from the R help mailing list archive at Nabble.com.
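The two Fortran-to-R surprises described above can be sketched in a few lines. This is only an illustration in base R; the names total, add_local and make_counter are invented for the example.

```r
total <- 0

add_local <- function(x) {
  total <- total + x   # creates a *local* total; the global one is untouched
  total                # the value of the last expression is returned
}

res <- add_local(5)        # the function returns 5 ...
total_before <- total      # ... but the global total is still 0

total <- add_local(5)      # assign the result to make it persist

# Lexical scoping: a function sees the variables of the environment
# in which it was *defined*, not the one from which it is called.
make_counter <- function() {
  n <- 0
  function() {
    n <<- n + 1            # modifies n in the enclosing environment
    n
  }
}
counter <- make_counter()
first  <- counter()
second <- counter()
```

Unlike a Fortran subroutine, add_local() cannot change its caller's variables as a side effect; the counter shows the one common idiom (<<-) for keeping state, and that state lives in the function's own enclosing environment.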
On Thu, Feb 25, 2010 at 5:39 PM, Carl Witthoft <carl at witthoft.com> wrote:
> Well, here goes...
>
> I still wish there were a really good monograph on the use and
> implementation of factors.
To get a good handle on factors, and the sets of contrasts they encode, it is really necessary to study a good statistics book. I recommend mine:
  Statistical Analysis and Data Display: An Intermediate Course with Examples in S-Plus, R, and SAS, Richard M. Heiberger and Burt Holland, Springer 2004
But I will acknowledge that other books are available.
Rich
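Without replacing a statistics text, a small sketch can at least show what "the contrasts a factor encodes" looks like at the console. The factor dose and its levels are invented for illustration; the default treatment contrasts for unordered factors are assumed.

```r
# A factor is a vector of integer codes plus a table of level labels.
dose <- factor(c("low", "high", "medium", "low", "high"),
               levels = c("low", "medium", "high"))
levels(dose)        # the labels, in the order given above
as.integer(dose)    # the underlying codes (positions in levels)

# The contrast matrix says how the levels become model columns.
# With default treatment contrasts the first level is the baseline.
contrasts(dose)

# model.matrix() shows the columns a formula would actually use.
mm <- model.matrix(~ dose)
colnames(mm)

# Changing the baseline changes the meaning of every coefficient.
dose2 <- relevel(dose, ref = "high")
levels(dose2)
```

Inspecting model.matrix() output like this is often the quickest way to see which comparison each coefficient of a fitted model represents.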
On Thu, Feb 25, 2010 at 9:31 AM, Patrick Burns <pburns at pburns.seanet.com> wrote:
> * What were your biggest misconceptions or
> stumbling blocks to getting up and running
> with R?
1- Compared to other programming languages, it is hard to learn R by example, because it is hard to find code on the web that does the exact thing you are looking for, although sometimes you might get lucky. Perl, by contrast, is an easy language to learn by example.
2- The R mailing list. Beginners get frustrated after they struggle for a long time to solve a problem, and the easiest thing then is to send an email to the R mailing list. I did this in the past. The best thing that happened was that my request was neglected and I had to spend more time on the problem and eventually find a solution by myself. Do not get me wrong, I am not saying that the mailing list is bad, but it should be more organized. Maybe it could be broken down into a couple of other mailing lists. This might bring up a good discussion thread.
>
> * What documents helped you the most in this
> initial phase?
An Introduction to R by Venables
simpleR - Using R for Introductory Statistics by Verzani
Patrick Burns wrote:
>
> * What were your biggest misconceptions or
> stumbling blocks to getting up and running
> with R?
>
(This derives partly from teaching)
The fact that this xapply-stuff was not idempotent (worse: not always) and that you need a monster like do.call() to straighten this out. Nowadays, plyr comes close.
The concept of environment. With S it was worse, though.
That you cannot change values "passed by reference". I noted that the latter is no problem for students who have not worked with C(++/#) before. That there is only one return-result in functions.
"[" and the likes as an operator.
10 years ago, when I started, the message was: S4 is the future, S3 is legacy. So I learned S4. Only to never use it in self-written code later. Might be different for BioConductor people.
That sometimes you can use vectors not in data= (lattice), and sometimes not (ggplot2). Still a VERY confusing inconsistency.
The "why-does-this-not-print" FAQ.
Why does par(oma..) not work with lattice?
Dieter
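The first and third complaints above can be made concrete with a short sketch in base R. The toy functions and the helper bump are assumptions made for illustration only.

```r
# sapply() changes its return type depending on what the function returns.
r1 <- sapply(1:3, function(i) i^2)          # one value each  -> vector
r2 <- sapply(1:3, function(i) c(i, i^2))    # two values each -> 2 x 3 matrix
r3 <- sapply(1:3, function(i) seq_len(i))   # varying lengths -> list

# lapply() always returns a list; do.call() then passes the list
# elements as separate arguments to a combining function.
pieces   <- lapply(1:3, function(i) data.frame(id = i, sq = i^2))
combined <- do.call(rbind, pieces)   # same as rbind(pieces[[1]], pieces[[2]], pieces[[3]])

# vapply() makes the expected shape explicit and fails loudly otherwise.
r4 <- vapply(1:3, function(i) i^2, numeric(1))

# Arguments behave as copies: modifying one inside a function
# does not change the caller's object.
bump <- function(v) {
  v[1] <- 99
  v
}
x <- c(1, 2, 3)
y <- bump(x)    # x is unchanged; only the returned copy differs
```

Where the shape of the result matters, lapply() plus an explicit combining step, or vapply(), tends to be more predictable than relying on sapply() simplification.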
Hi,
It was the class, mode or typeof of imported data: I usually believed they were one thing when they were actually something else, until I learned that when something does not look as I would expect, I should blame myself for the wrong expectation.
For the time being I would say that reshape and factor ordering, especially with groupedData objects from nlme, are still giving me a headache and take much trial and error to get the desired result. And of course regular expressions, but that is not related to R but to my laziness in learning them, due to the fact that on this list there are many clever experts who can solve the problem in a fraction of the time it would take me.
For documents in the beginning I would vote for Paul Johnson's Rtips. About 10 years ago it was a nice collection of several useful how-tos.
Regards
Petr
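To show why class, mode and typeof are easy to confuse, here is a minimal sketch that tabulates all three for a few common objects. The helper describe and the chosen objects are hypothetical, made up for this illustration.

```r
# Report the three "what is this?" answers side by side.
describe <- function(obj) {
  c(class = class(obj)[1], mode = mode(obj), typeof = typeof(obj))
}

objs <- list(
  int = 1:3,
  dbl = c(1.5, 2),
  chr = "a",
  fct = factor("a"),          # looks like text, stored as integer codes
  df  = data.frame(x = 1),    # a data frame is a list of columns
  mat = matrix(1:4, nrow = 2),
  fun = mean
)

tab <- t(sapply(objs, describe))
print(tab)
```

The factor and data frame rows are usually the surprising ones: class() reports how the object behaves, while typeof() reports how it is stored.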
Honestly what I remember as the most difficult thing when I 'first' started using R was figuring out how to read in my own datasets. I eventually discovered the R import/export manual, but somehow this eluded me initially. All the R "tutorials" I was working from simply "generated" data or used the built-in datasets, and "I" was ready to work on my own datasets. What led from "frustration" to "independence" was understanding the difference between data types like matrix and dataframe, and learning there were commands to tell what you were working with at any given time: did the data read in as character, numeric, or factor, etc.? Commands like str, class, mode, ls, search, help, help.search, etc. can help you figure out what you are doing. Rob
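A minimal sketch of the read-then-inspect routine described above follows. The file contents are invented, and a temporary file stands in for "my own dataset" so that the example is self-contained.

```r
# Write a tiny CSV to a temporary file to stand in for real data.
path <- tempfile(fileext = ".csv")
writeLines(c("id,group,weight",
             "1,a,10.5",
             "2,b,NA",
             "3,a,12.1"), path)

# Read it in; stringsAsFactors = FALSE keeps text columns as character.
dat <- read.csv(path, stringsAsFactors = FALSE)

str(dat)                          # one-screen summary of every column
class(dat)                        # what kind of object did we get?
col_classes <- sapply(dat, class) # how was each column read?
print(col_classes)

# A data frame can mix column types; a matrix cannot.
m <- as.matrix(dat)
typeof(m)                         # everything was coerced to one type

unlink(path)                      # clean up the temporary file
```

Running str() immediately after every import is a cheap habit that catches most "it read in as the wrong type" problems before they cause confusing errors later.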
Dear Patrick (and all),
I've now been working with R for a couple of years; before that I worked mostly in Matlab. Lazy & impatient are both true for me :-)
Stumbling:
* It took me long to remember getwd () and setwd () (instead of pwd and cd / chdir or the like).
* I still discover very useful functions that I would have needed for a long time. Latest discoveries: mapply and ave. I knew aggregate, and was always a little angry that it needs a grouping list. I even decided that the aggregate method for my hyperSpec class should work with factors as well as with lists. Some day I read in this mailing list that ave does what I need... I like the crosslinks in the help (see also) very much. Maybe I rely too much on them. So: not lazy today, I attach a patch for aggregate.Rd that adds the seealso to ave. Reading this mailing list once in a while gives me nice new ideas. However, > 50 emails / d is somewhat scary for me, so I read only occasionally.
* Vectorization: I like the *apply functions, but I'd really appreciate a comprehensive page/vignette here. I remember that it took me a while to realize that the rule for MARGIN in sweep is "use the same number as in the apply that created the STATS".
* I never found the pdf manuals helpful (help pages are easier to access, and there is nothing in the pdf that the help doesn't have). At the beginning I expected the pdf manual to be something like what the vignettes are.
* I did not arrive at a comfortable debugging cycle for a long time. But now there's the debug package and setBreakpoint and I'm happy....
* As I now start teaching I notice that many students react to error messages with "uhh! an error!" (panic). Few realize that the error message actually gives information on what went wrong. A list with common causes of different error messages would be helpful here, I think. In case someone agrees: I started one at the Wiki: http://rwiki.sciviews.org/doku.php?id=tips:errormessages
Cheers,
Claudia
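The functions singled out in the list above (aggregate, ave, mapply, and the MARGIN rule for sweep) can be compared in a short sketch. The toy data frame d and matrix m are invented for illustration.

```r
d <- data.frame(g = c("a", "a", "b", "b", "b"),
                v = c(1, 3, 10, 20, 30))

# aggregate(): one row per group.
agg <- aggregate(v ~ g, data = d, FUN = mean)

# ave(): the group statistic repeated for every row, with the grouping
# variable passed directly (FUN defaults to mean).
d$group_mean <- ave(d$v, d$g)

# mapply(): apply a function over several vectors in parallel.
powers <- mapply(function(x, n) x^n, 1:3, c(2, 2, 3))

# sweep(): MARGIN is the same number used in the apply() call
# that produced the statistics being swept out.
m         <- matrix(1:6, nrow = 2)
col_means <- apply(m, 2, mean)       # MARGIN = 2: one value per column
centered  <- sweep(m, 2, col_means)  # MARGIN = 2 again; default FUN is "-"
```

aggregate() shrinks the data to one row per group, whereas ave() keeps the original length, which is convenient when the group statistic is needed as a new column.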
Patrick Burns <pburns@pburns.seanet.com> wrote:
> * What were your biggest misconceptions or
> stumbling blocks to getting up and running
> with R?
I came into R from SAS, with its powerful data step language and very simplified data types. Most of my work is data manipulation prior to a variety of univariate statistical calculations. The vector-based nature of R, and thus the variety of indexing schemes used, was a big conceptual hurdle.
The often unhelpful attitude of several list respondents, while not unique to this list, was and continues to be another block to advancement. This does not occur on the list for SAS, in which asking 'dumb' questions is generally supported as an inevitable part of learning. Having aggregate() pointed out to me by one kind soul, hidden amidst the assortment of by()/apply() functions, became the basis for much success.
I am currently trying to wrap my mind around how missing values are handled; the defaults are quite different than SAS, and mostly in a good way. However, the handling of NA values in slicing statements does not seem quite proper, even if it is addressed in the R documents.
    aa <- data.frame('id'=letters[1:5], 'x'=1:5, stringsAsFactors=FALSE)
    aa[aa$x == 3,]$x <- NA
    aa[aa$x == '4',]     # 2 rows instead of 1.
    aa[aa$x %in% '4',]   # 1 row as expected.
I am also looking for concise methods for building up dataframes for our unit tests. While there are several ways to accomplish this, depending on what is needed, none are elegant, though expand.grid() comes close.
Next: The R Inferno. I *will* understand more than the first few pages. And all those apply()-ish functions, as I'm already good friends with aggregate().
> * What documents helped you the most in this
> initial phase?
RSeek.org was and continues to be a big source of help. I've looked at several texts aimed at beginners, and all provided simple examples that were useful.
The most consistent source of instruction has been to make up my own small projects that were either fun or slightly relevant to my job. The ability to make up toy problems, or simplify a complex process have been unexpectedly important skills. Developing unit tests for functions, initially seen as an irritant by some, has become an important tool for honing our advances.
> I especially want to hear from people who are
> lazy and impatient.
And, I hope, incompetent. I've found incompetence to be as professionally important as hubris. I wouldn't want one without the other.
cur
--
Curt Seeliger, Data Ranger
Raytheon Information Services - Contractor to ORD
seeliger.curt@epa.gov 541/754-4638
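The NA-in-slicing behaviour described in this message follows from how a logical index containing NA is treated. Here is a minimal sketch using the same toy data frame as the message; the result names r_eq, r_in, r_which and r_sub are invented for the example.

```r
aa <- data.frame(id = letters[1:5], x = 1:5, stringsAsFactors = FALSE)
aa$x[aa$x == 3] <- NA

# Comparing with NA gives NA, not FALSE.
cmp <- aa$x == 4
print(cmp)

# An NA in a logical row index selects a row filled with NA.
r_eq <- aa[aa$x == 4, ]

# Three common ways to drop the NA comparisons instead:
r_in    <- aa[aa$x %in% 4, ]         # %in% never returns NA
r_which <- aa[which(aa$x == 4), ]    # which() keeps only the TRUE positions
r_sub   <- subset(aa, x == 4)        # subset() treats NA as FALSE
```

The extra row is R signalling "I cannot tell whether this row matches"; which() and subset() are the usual choices when NA should simply count as no match.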
My biggest impediment, as a scientist without previous programming experience, is that the R help is not beginner-friendly. I think it is probably great for experienced programmers and for the people who helped to create the software, to help them remember what they did, but I think it is very difficult for a newcomer without a strong programming background to learn about a new function or to discover the name of a function that you are pretty sure should already exist. Maybe this wouldn't matter for most programming languages, but as free statistics software R is obviously going to attract many scientists who want to get an analysis done and have varying levels of experience with programming.
I found it much easier to learn how to use Mathematica, using only the online help. With R I had to buy several books to get a handle on it, which is fine, but even the books that I have found to be most useful tend to be didactically lacking: either too cursory or mired in unexplained programming jargon. They are OK, just not great. What I think would be very helpful is an introduction to programming using R, preferably a big thick college textbook that takes at least a semester to go through, which should be a prerequisite for going through the Introduction to R available on CRAN.
Also, to do any analysis on real data you have to use the apply family of functions to perform different functions by groups. A long introduction to these functions, with lots of comparisons and contrasts between them, would be very helpful.
A few random examples concerning the R help: In my version of R (2.7.0 on Windows XP) typing
    > ?+
doesn't do anything, but then if you type in the next line
    + ?sum
you get the "Arithmetic Operators" help page. If you had just typed
    > ?sum
in the first place you get the "Sum of Vector Elements" help page.
Most examples in the R help pages use way too many other functions to be useful to a beginner.
If an example uses 10 other functions besides the one being described, chances are a beginner won't know what one of them does, which can set off a chain of having to look up other irrelevant functions.
Some function names in the base package are goofy, such as "rowsum", which is used to "compute column sums across rows", not to be confused with "rowSums", which computes row sums.
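The rowsum/rowSums confusion, and the by-group use of the apply family mentioned in this message, are easiest to see on a tiny matrix. This is only a sketch; the matrix m and the grouping vector grp are made up for the example.

```r
m <- matrix(1:6, nrow = 3, dimnames = list(NULL, c("u", "v")))
#      u v
# [1,] 1 4
# [2,] 2 5
# [3,] 3 6
grp <- c("a", "b", "a")     # group membership of each row

# rowSums(): one total per row, adding across the columns.
row_totals <- rowSums(m)

# rowsum(): one row per *group*, adding up the rows that share a group.
by_group <- rowsum(m, grp)
print(by_group)

# tapply(): apply any function to a vector split by a grouping variable.
u_by_group <- tapply(m[, "u"], grp, sum)

# Help on an operator needs quotes so the parser sees a complete
# expression (interactive, so left commented out here):
# ?"+"
# help("+")
```

The quoted form in the last lines avoids the continuation prompt described above, because a bare ?+ is an unfinished expression as far as the parser is concerned.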
Lazy and impatient? That's me! I find it hard to say what my biggest misconceptions were. Here's one thing:
What I realized very early on:
- many data analysis functions return a bunch of stuff, not all of which you see when you print() it
What I *failed* to realize:
- the bunch of stuff such functions return is just a *list*
That has follow-on implications:
- even if you're just doing some simple analysis like a linear regression, if you want to be able to see/get all the information, you really need to learn how to examine what's in a list and how to operate on the list.
I had seen lists as "potentially useful but not something I need to worry about right now, since I'm having enough trouble just grokking why dataframes look different to matrices", whereas I needed to know that lists were absolutely central to what I was trying to achieve. While I have no doubt this information can be found in a dozen places, I read a bunch of introductory documents at the time, and I don't recall it being stated explicitly like that in any of the places I looked. It made a big difference to me when I realized that so many functions just return a list. I mean, it's obvious, and I should have seen that's all it was the first time, but I didn't.
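The point about model functions returning lists can be checked directly at the console. Here is a minimal sketch using lm() on the built-in cars dataset; the choice of dataset and model is an assumption made purely for illustration.

```r
fit <- lm(dist ~ speed, data = cars)

print(fit)                 # the printout shows only the call and coefficients

is.list(fit)               # ... but the object itself is a list
names(fit)                 # everything that was actually returned
str(fit, max.level = 1)    # one line per component

# Components can be pulled out like any list element,
# or through the accessor functions that exist for most of them.
fit$coefficients
coef(fit)
head(residuals(fit))

# summary() returns yet another list with more results.
s <- summary(fit)
names(s)
s$r.squared
```

names() and str() are the two commands that make "what else is in there?" answerable for almost any fitted-model object.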
I don't think I am a tyro but neither am I a wizard. This being said R has a number of aspects that make it difficult. Error messages that are not helpful Manual pages that are written in Martin. Lack of examples on some manual pages Lack of comments in code There are other hurdles. The concept of vectorization and its related syntax took a long time to understand. John John Sorkin JSorkin at grecc.umaryland.edu -----Original Message----- From: Saeed Abu Nimeh <sabunime at gmail.com> Cc: <r-help at r-project.org> To: <ivan.calandra at uni-hamburg.de> Sent: 2/26/2010 11:36:38 PM Subject: Re: [R] two questions for R beginners Hi Ivan, On 2/26/10 6:30 AM, Ivan Calandra wrote:> You are definitely right... > What to do with bad beginner's questions is not a simple issue. > > If a "beginner's mailing list" is created, who will answer to such > questions?If I subscribe to the beginners mailing list, then I have to expect novice questions and I should be willing to help. Otherwise, I should not be there. And moreover, the beginners won't take advantage of the other> questions (I've personally learned a lot trying to understand the > questions and answers to other's problems).They can still subscribe to the advanced, but they will know that they are here to observe and learn, not to ask novice questions. You want to ask basic stuff, go to the beginners list :) Not sure if you guys have been on some of the linux mailing lists out there, but man let me tell you, some of these lists have a RTFM attitude and they will fry you if you ask novice questions. Frankly, that is understandable, as most of the members are geeks and they have higher expectations. This mailing list is different, I have seen posts from different disciplines; biology, biostats, stats, computer science, oceanography, etc. So, IMO, there should be a beginners list to cope with such broad committee. Thanks, Saeed And also, as you said, the> problems might persist. 
> The beginner's mailing list might be good in one aspect though: the > "experts" who subscribe to it would be willing to help the beginners to > get started with R, knowing that the questions might not be clearly stated. > > As you pointed out, the mailing list is not the best for basic stuff > (the question is of course "what is basic?"). Not everybody knows some > colleagues who work with R (I'm personally the 1st one to use R in my lab). > I think, somehow and I have no idea how, documentation and guidance to > search for help should be more accessible as soon as you start with R. > Maybe a _*clear*_ section on the R homepage or in the "introduction to > R" manual like "where to find help", including all of the most common > and useful resources available (from "?" and RSiteSearch() to R Wiki and > Crantastic). > > I hope that this whole discussion might help to make the R world better. > Thank you Patrick for initiating it! > Regards, > Ivan > > Le 2/26/2010 15:09, Paul Hiemstra a ?crit : >> Ivan Calandra wrote: >>> Since you want input from beginners, here are some thoughts >>> >>> I had and still have two big problems with R: >>> - this vectorization thing. I've read many manuals (including R >>> inferno), but I'm still not completely clear about it. In simple >>> examples, it's fine. But when it gets a bit more complex, then... >>> Related to it, the *apply functions are still a bit difficult to >>> understand. When I have to use them, I just try one and see what >>> happens. I don't understand them well enough to know which one I need. >>> - the second problem is where to find the functions/packages I need. >>> There are many options, and that's actually the problem. R Wiki, >>> Rseek, RSiteSearch, Crantastic, etc... When you start with R, you >>> discover that the capabilities of R are almost unlimited and you >>> don't really know where to start, where to find what you need. 
>>>
>>> As noted in earlier posts, the mailing list is really great, but some people are really hard with beginners. It was noted in a discussion a few days ago, but it looks like some don't realize how difficult it is at the beginning to formulate a good question, clear, with self-contained example and so on. Moreover, not everybody speaks English natively. I don't mean that you must help, even when the question is really vague and not clear and whatever. I'm just saying that if you don't want to help (whatever the reason), you don't have to say it badly. But in any case, the mailing list is still really helpful. As someone noted (sorry I erased the email so I don't remember who), it might be a good idea to split it.
>> Hi everyone,
>>
>> My 2ct about the mailing list :). I understand that beginners have a hard time formulating a good question. But the problem is that we can't answer the question when it is unclear. So either I:
>>
>> - Don't bother answering
>> - Try to discuss with the author of the question, taking lots of time to find out what exactly is the question.
>> - Send a "read the posting guide" answer
>>
>> I mostly do the first, as I have to get things done during my PhD :). So this leaves us with kind of a problem: the person mailing the list doesn't have the knowledge to ask the right question, the list can't answer properly and consequently, the person mailing the list still doesn't get the information he/she needs. We could start an R-beginner mailing list, but this would also suffer from this problem. What do you guys think?
>>
>> Maybe the mailing list is not the right medium for really basic stuff. For that I would recommend a good R-book or (better) a course in R or (even better) some colleagues who work with R that you can ask questions to.
>>
>> cheers,
>> Paul
>>>
>>> Hope that's what you wanted
>>> Ivan
>>>
>>> On 2/26/2010 08:39, Dieter Menne wrote:
>>>> Patrick Burns wrote:
>>>>> * What were your biggest misconceptions or stumbling blocks to getting up and running with R?
>>>>
>>>> (This derives partly from teaching)
>>>>
>>>> The fact that this xapply-stuff was not idempotent (worse: not always) and that you need a monster like do.call() to straighten this out. Nowadays, plyr comes close.
>>>>
>>>> The concept of environment. With S it was worse, though.
>>>>
>>>> That you cannot change values "passed by reference". I noted that the latter is no problem for students who have not worked with c(++/#) before. That there is only one return-result in functions.
>>>>
>>>> "[" and the likes as an operator.
>>>>
>>>> 10 years ago, when I started, the message was: S4 is the future, S3 is legacy. So I learned S4. Only to never use it in self-written code later. Might be different for BioConductor people.
>>>>
>>>> That sometimes you can use vectors not in data= (lattice), and sometimes not (ggplot2). Still a VERY confusing inconsistency.
>>>>
>>>> The "why-does-this-not-print" FAQ.
>>>>
>>>> Why does par(oma..) not work with lattice?
>>>>
>>>> Dieter
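The vectorization and do.call() difficulties raised in this message can be sketched in a few lines of base R. This is only an illustration under stated assumptions: the sample data and the helper summarise_one are hypothetical and do not come from any of the posters.

```r
# Vectorized arithmetic: sqrt() is applied to every element at once,
# so no explicit loop is needed.
x <- c(1, 4, 9, 16)
roots_vec <- sqrt(x)

# The same computation written as an explicit loop, for comparison.
roots_loop <- numeric(length(x))
for (i in seq_along(x)) {
  roots_loop[i] <- sqrt(x[i])
}

# lapply() always returns a list, one element per input.
# summarise_one is a hypothetical helper used only for this sketch.
summarise_one <- function(v) {
  data.frame(n = length(v), mean = mean(v))
}
groups <- list(a = 1:3, b = c(10, 20))
pieces <- lapply(groups, summarise_one)

# do.call() hands the list elements to rbind() as separate arguments,
# i.e. rbind(a = pieces$a, b = pieces$b), giving a single data frame.
combined <- do.call(rbind, pieces)
print(combined)
```

The do.call(rbind, ...) step is the "straightening out" that turns a list of per-group results back into one rectangular object.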
On Thu, 25 Feb 2010 17:31:19 +0000 Patrick Burns <pburns at pburns.seanet.com> wrote:
> * What were your biggest misconceptions or
> stumbling blocks to getting up and running
> with R?

I didn't have any major stumbling blocks, but even after years of using R I didn't have a clear concept of what exactly a vector, a list and a data frame were, and what the differences and similarities between them were (and stuff like why x[i] returns a different result than x[[i]]).

One thing that has tripped me up is reassigning the value of T or F and getting very strange results afterwards (I now use only TRUE and FALSE). FAQ 7.31 and 7.22 have also been troublesome at times, especially 7.31 when used in 'for' loops.

Also, I found it quite confusing that ?ifelse works, but not ?if (you have to type ?"if").

Also, it was rather confusing that ?plot didn't give me the information I was looking for but ?plot.default did. I still experience similar problems with other functions. Usually 'methods' helps, but some packages use S4 methods, which makes finding the correct help page quite challenging at times.

> * What documents helped you the most in this
> initial phase?

In the initial phase I found the Rtips "http://pj.freefaculty.org/R/Rtips.html" extremely useful.

For understanding the difference between the various data types in R, Phil Spector's wonderful book 'Data Manipulation with R' was a great help. When reading it I finally understood things I had been wondering about for years. I really like the book. It's short, crystal clear and immensely useful.

Another very useful document of a more advanced nature is the R Inferno. Best read after you've been using R for some time, though.

I'm over the initial phase now, but two resources which continue to be of great help are http://www.rseek.org/ (mainly for searching the mailing list) and the 'sos' package (for finding the functions and packages I need). 'sos' really is great.
There have been other packages/functions trying to do the same thing, but they have been too time-consuming and difficult to use (and learn), typically requiring you to first do a search, and then do some advanced subsetting to get useful results. This is similar to older search engines requiring many boolean terms to give the needed search results. With 'sos' I just choose some simple search terms describing what I'm looking for, and immediately get relevant results. 'sos' really is the Google of the R world. It has made a great impact on the discoverability of the various R functions and packages.

Lastly, the 'demo' function is seldom mentioned, and easy to overlook, but gives a nice (and sometimes impressive) overview of what types of graphics are possible to create with a given package. I wish more packages had well-written demos. Also, I think some of the examples from the 'example' sections of help pages for functions could very well be copied to the demo of the corresponding package, e.g. a few of the examples of the 'xyplot' function in 'lattice'.

--
Karl Ove Hufthammer
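Several of the stumbling blocks listed in this message can be reproduced in a short base R session. The following is a minimal sketch with made-up values; it covers x[i] versus x[[i]], reassigning T, and the floating-point comparison issue described in FAQ 7.31.

```r
# On a list, `[` returns a sub-list while `[[` returns the element itself.
x <- list(a = 1:3, b = "text")
sub_list <- x[1]     # still a list, of length 1
element  <- x[[1]]   # the integer vector 1:3

# T and F are ordinary variables that can be reassigned;
# TRUE and FALSE are reserved words and cannot.
T <- 0
t_is_true <- isTRUE(T)   # T now holds 0, so this is FALSE
rm(T)                    # remove the masking variable again

# FAQ 7.31: floating-point arithmetic is not exact, so `==` can surprise.
exact_equal <- (0.1 + 0.2 == 0.3)
near_equal  <- isTRUE(all.equal(0.1 + 0.2, 0.3))

# Help on reserved words needs quoting, for example:
#   ?"if"    or    help("if")
```

Comparing with all.equal() (wrapped in isTRUE()) is the usual way to test for approximate equality, which is why it is shown next to the exact comparison.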
On Fri, 26 Feb 2010 11:56:10 -0800 (PST) Jack Siegrist <jacksie at eden.rutgers.edu> wrote:
> What I think would be very helpful is an introduction to programming using R

Here you are:

A First Course in Statistical Programming with R
http://www.cambridge.org/uk/catalogue/catalogue.asp?isbn=9780521694247

--
Karl Ove Hufthammer
Background: During my uni days, I was taught to use MAPLE, MATLAB, SPSS, SAS, C++ and Java. Then after uni, several years went by without me ever using any of them again, and I was told to just use Excel. Then I started my PhD and was told I should start using R instead (something I'd never even heard of before). I would class myself as being just above a beginner, but not by very much. Probably within walking distance.

> * What were your biggest misconceptions or
> stumbling blocks to getting up and running
> with R?

(1) I read a lot about R having awesome graphics capabilities, but when I looked at the graphs on the R home page I was a little underwhelmed. I thought Excel graphs looked better (though, to be fair, since that first time, I have seen some pretty awesome graphics produced with R, and even a cool animation someone posted on YouTube synchronised to music).

(2) The whole *apply family of functions just confused me, and looking at the examples didn't really help me, to be honest. I understood the idea of vectorisation but I couldn't work out how to get what I wanted as the end result. The plyr package has solved that issue for me though, and I now appreciate how cool these functions are.

(3) There are a lot of cool sounding packages on CRAN. Sometimes I can read the ref manual and still have no idea how they work. A short tutorial on how the author sees the package being used would be helpful.

(4) Also, trivial examples are great for conveying the basics of how a function works. Complicated examples give me a headache.

(5) I used to have issues trying to find R related material on the web (then I discovered rseek etc).

(6) "cannot allocate vector of size..." -- I think this has to be the most asked question ever on r-help.
Not so much of a stumbling block for me anymore, but I always got annoyed whenever I saw it.

> * What documents helped you the most in this
> initial phase?

(1) R cheat sheets are fantastic because I can never remember most of the functions off the top of my head.

(2) Rwiki has saved me many precious hours by having easy to follow examples.

(3) r-help is great for trying to find answers to questions for the most part. I've learned loads just reading responses people have kindly contributed. Some threads can get long, and it would be nice if the original author would summarise at the end once a suitable solution is found (some other lists do this).

(4) Random little blog posts that describe how to do a fun* task in R. These short posts are usually the best way for me to learn, because they don't require too much effort, are sort of easy to understand and follow through from beginning to end, and give you a cool** end result.

(5) I prefer 'cookbooks' that show you how to do fun stuff (and hence learn from it) as opposed to looking at the official R guides (confession time: I haven't looked at the intro to R guide since my 1st month of using R... which was a couple of years ago now).

> * where did you look for help expecting answers, but did not find them?

Often, the ?[function name] help pages just didn't make sense to me, even after trying to understand the examples. Sometimes it'd be nice to have something like they do on thottbot for World of Warcraft, where each quest has a thread for people to discuss how it works and little tricks. So the R equivalent, I guess, would be to have a link at the bottom of each help page which links to a thread dedicated to a specific function, where users discuss it and offer their own examples and points of view about it. Of course that is probably overkill. I just wanted to see if I could mention WoW in my post.

> I especially want to hear from people who are
> lazy

I did a degree in Maths.

> and impatient.

I sometimes produce graphics in Excel.
Cheers,
Tony

* = fun is a relative term. I still get a buzz out of seeing ASCII art.
** = cool is also a relative term. I still think Babylon 5 was cooler than Star Trek DS9. Though nowhere near as cool as Doctor Who.
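On the question of which *apply function to reach for, a rough side-by-side in base R may help. This is a sketch with made-up data, not a complete guide; the variable names are hypothetical.

```r
m <- matrix(1:6, nrow = 2)   # 2 rows, 3 columns, filled column by column

# apply(): over the rows (1) or the columns (2) of a matrix.
row_sums <- apply(m, 1, sum)
col_sums <- apply(m, 2, sum)

inputs <- list(a = 1:3, b = 1:5)

# lapply(): over the elements of a list or vector; always returns a list.
lens_list <- lapply(inputs, length)

# sapply(): like lapply(), but simplifies the result when it can.
lens_vec <- sapply(inputs, length)

# vapply(): like sapply(), but the type of each result is stated up front,
# so an unexpected result is an error instead of a silently different shape.
lens_chk <- vapply(inputs, length, integer(1))

# tapply(): applies a function to groups of values defined by a second vector.
values <- c(10, 20, 30, 40)
group  <- c("x", "y", "x", "y")
group_means <- tapply(values, group, mean)
```

The main distinction is the shape of the input (matrix margins, list elements, or grouped values) and how much simplification of the output is wanted.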
Patrick Burns wrote:
> * What were your biggest misconceptions or
> stumbling blocks to getting up and running
> with R?
>
> * What documents helped you the most in this
> initial phase?
>
> I especially want to hear from people who are
> lazy and impatient.

Perhaps my biggest problem was that I couldn't find (and still haven't seen) documents for *absolute beginners*. It took me about six months to start using R for my everyday data analysis, and now I can't imagine life without it!

My problem was that I knew some programming (Java) and had never used a command line for statistics. All my statistical needs had been met through the graphical interface of SPSS or similar software (even Excel!). I have a feeling that almost all "Introduction to R" documents are written for people making the switch from SPSS and SAS scripting to R. But I have had a very difficult time using R as an *entry level* statistical scripting language to help my colleagues (none of us are programmers or statisticians; we are mostly biology PhDs and a couple of MDs).

I would love to see a text oriented towards someone who has never used anything but Excel, but realizes that to do science today you have to go beyond the "Data analysis" toolbar in Excel. (Please tell me if you know of any.)

Best to all,
Keo.
William,

I agree that changing syntax can lead to problems. I don't, however, think extending the language will break existing code. Providing a common syntax for accessing matrices and dataframes will not change the way things have been done to date, but rather how things will be done in the future.

John

John Sorkin
JSorkin at grecc.umaryland.edu

-----Original Message-----
From: "William Dunlap" <wdunlap at tibco.com>
To: John Sorkin <jsorkin at grecc.umaryland.edu>
To: Karl Ove Hufthammer <karl at huftis.org>
To: <r-help at stat.math.ethz.ch>
Sent: 3/2/2010 11:53:45 AM
Subject: RE: [R] two questions for R beginners

> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of John Sorkin
> Sent: Tuesday, March 02, 2010 3:46 AM
> To: Karl Ove Hufthammer; r-help at stat.math.ethz.ch
> Subject: Re: [R] two questions for R beginners
>
> Please take what follows not as an ad hominem statement, but rather as an attempt to improve what is already an excellent program, that has been built as a result of many, many hours of dedicated work by many, many unpaid, unsung volunteers.
>
> It troubles me a bit that when a confusing aspect of R is pointed out the response is not to try to improve the language so as to avoid the confusion, but rather to state that the confusion is inherent in the language. I understand that to make changes that would avoid the confusing aspect of the language that has been discussed in this thread would take time and effort by an R wizard (which I am not), time and effort that would not be compensated in the traditional sense. This does not mean that we should not acknowledge the confusion. If we want R to be the de facto lingua franca of statistical analysis, doesn't it make sense to strive for syntax that is as straightforward and consistent as possible?

Whenever one changes the language that way old code will break.
The developers can, with a lot of effort, fix their own code, and perhaps even user-written code on CRAN, but code that thousands of users have written will break. There is a lot of code out there that was written by trial and error and by folks who no longer work at an institution: the code works but no one knows exactly why it works. Telling folks they need to change that code because we have a cleaner but different syntax now is not good. Why would one spend time writing a package that might stop working when R is "upgraded"?

I think the solution is not to change current semantics but to write functions that behave better and encourage users to use them, gradually abandoning the old constructs.

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

> Again, please understand that my comment is made with deepest respect for the many people who have unselfishly contributed to the R project. Many thanks to each and every one of you.
>
> John
>
> >>> Karl Ove Hufthammer <karl at huftis.org> 3/2/2010 4:00 AM >>>
> On Mon, 01 Mar 2010 10:00:07 -0500 Duncan Murdoch <murdoch at stats.uwo.ca> wrote:
> > Suppose X is a dataframe or a matrix. What would you expect to get from X[1]? What about as.vector(X), or as.numeric(X)?
>
> All this of course depends on the type of object one is speaking of. There are plenty of surprises available, and it's best to use the most logical way of extracting. E.g., to extract the top-left element of a 2D structure (data frame or matrix), use 'X[1,1]'.
>
> Luckily, R provides some shortcuts. For example, you can write 'X[2,3]' on a data frame, just as if it was a matrix, even though the underlying structure is completely different. (This doesn't work on a normal list; there you have to type the whole 'X[[2]][3]'.)
>
> The behaviour of the 'as.' functions may sometimes be surprising, at least for me.
> For example, 'as.data.frame' on a named vector gives a single-column data frame, instead of a single-row data frame.
>
> (I'm not sure what's the recommended way of converting a named vector to a row data frame, but 'as.data.frame(t(X))' works, even though both 'X' and 't(X)' look like a row of numbers.)
>
> > The point is that a dataframe is a list, and a matrix isn't. If users don't understand that, then they'll be confused somewhere. Making matrices more list-like in one respect will just move the confusion elsewhere. The solution is to understand the difference.
>
> My main problem is not understanding the difference, which is easy, but knowing which type of object I have when I get the output of a function in a package. If I know the object is a named vector or a matrix with column names, it's easy enough to type 'X[,"colname"]', and if it's a data frame one may use the shortcut 'X$colname'.
>
> Usually, it *is* documented what the return value of a function is, but just looking at the output is much faster, and *usually* gives the correct answer.
>
> For example, 'mean' applied on a data frame gives a named vector, not a data frame, which is somewhat surprising (given that the columns of a data frame may be of different types, while the elements of a vector may not). (And yes, I know that it's *documented* that it returns a named vector.) On the other hand, perhaps it is surprising that 'mean' works on data frames at all. :-)
>
> --
> Karl Ove Hufthammer
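The matrix versus data frame differences discussed in this exchange can be checked directly. The following base R sketch uses a small made-up object; the names m, df and v are hypothetical.

```r
m  <- matrix(1:6, nrow = 2, dimnames = list(NULL, c("a", "b", "c")))
df <- as.data.frame(m)

# Two-index extraction works the same way on both structures.
top_left_m  <- m[1, 1]
top_left_df <- df[1, 1]

# Single-index extraction differs: a matrix is a vector with dimensions,
# a data frame is a list of columns.
m1  <- m[1]    # first element of the underlying vector
df1 <- df[1]   # a one-column data frame

# Column access by name: `$` is the data frame shortcut,
# while the matrix needs the [, "name"] form.
col_m  <- m[, "b"]
col_df <- df$b

# A named vector becomes a single-column data frame ...
v <- c(a = 1, b = 2, c = 3)
as_col <- as.data.frame(v)
# ... while transposing it first gives a single row.
as_row <- as.data.frame(t(v))

# Asking for the type avoids guessing from the printed output.
kinds <- c(matrix = is.matrix(m), data_frame = is.data.frame(df),
           df_is_list = is.list(df), m_is_list = is.list(m))
print(kinds)
```

Checking is.matrix(), is.data.frame() or str() on the return value of an unfamiliar function is a quick alternative to inferring the type from how the result prints.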
Patrick,

1. Implicit intercepts. Implicit intercepts are not too bad for the main model, but they creep in occasionally in strange places where they might not be expected. For example, in some of the variance structures specified in lme, (~x) automatically expands to (~1+x). Venables said in the "Exegeses" paper: "For teaching purposes it would be useful to have a switch that required users to include the intercept term in formulae if it is needed. This would definitely help more students than it would hinder. In other words it should be possible to override the automatic intercept term."

2. Working with colors. There are a number of functions in R for working with colors, and since colors can be specified by palette number, name, hexadecimal string, values between 0 and 1, or values between 0 and 255, things can be confusing. One problem is that not all functions accept the same type of arguments or produce the same type of return values. For example, the awkward need for "t" and conversion to [0,255] when adding alpha levels to a color:

    rgb(t(col2rgb(c("navy","maroon"))), alpha=120, maxColorValue=255)

3. Factors. R tries to convert everything that it possibly can into a factor. Except, occasionally, it doesn't try. Further, after sub-setting data so that some factor levels have no data, too many functions fail. I shouldn't need to use "drop.levels" from the gdata package all over the place to keep automated scripts running smoothly. Let's not forget:

    R> as.numeric(factor(c(NA,0,1)))
    [1] NA  1  2

4. Dropped dimensions.

    R> is.list(list(1)[1])
    [1] TRUE
    R> is.matrix(matrix(1)[1,])
    [1] FALSE

Ouch. Ouch. Ouch.

5. Most useful: "apropos" and Rseek.

Best,

Kevin

--
Kevin Wright
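The points in this message about intercepts, factors, dropped dimensions and colors can be sketched in base R as follows. The data are made up, and droplevels() is the base R function available in current versions of R, used here as an assumption in place of the gdata function named in the message.

```r
# Implicit intercept: a formula gets an intercept unless it is removed.
x <- c(1, 2, 3)
with_int    <- colnames(model.matrix(~ x))
without_int <- colnames(model.matrix(~ 0 + x))

# Factors store integer codes; as.numeric() returns the codes, not the labels.
f <- factor(c(NA, 0, 1))
codes  <- as.numeric(f)                 # the internal codes
values <- as.numeric(as.character(f))   # the original numbers

# Subsetting keeps unused levels unless they are dropped explicitly.
g   <- factor(c("a", "b", "c"))
sub <- g[g != "c"]
kept    <- levels(sub)
dropped <- levels(droplevels(sub))

# `[` on a matrix drops dimensions by default; drop = FALSE keeps them.
m <- matrix(1)
dims_lost <- is.matrix(m[1, ])
dims_kept <- is.matrix(m[1, , drop = FALSE])

# Adding an alpha level to named colors, with the argument name spelled out.
cols <- rgb(t(col2rgb(c("navy", "maroon"))), alpha = 120, maxColorValue = 255)
```

Writing drop = FALSE in scripts that must always get a matrix back, and converting factors through as.character() before as.numeric(), are the usual defensive habits for these two cases.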