Hi All,

Can anyone tell me why the length function does not use na.rm? I know how to work around it, I'm just curious to know why such a useful option was left out.

I'm also interested in the logic of setting na.rm=TRUE as the default on mean, sd, etc. This is the opposite of the many other stat packages I have used, so I assume it provides some programming benefit that is not obvious to me.

Thanks,
Bob

========================================================
Bob Muenchen (pronounced Min'-chen), Manager
Statistical Consulting Center
U of TN Office of Information Technology
200 Stokely Management Center, Knoxville, TN 37996-0520
Voice: (865) 974-5230
FAX: (865) 974-4810
Email: muenchen at utk.edu
Web: http://oit.utk.edu/scc
News: http://listserv.utk.edu/archives/statnews.html
On 5/18/2007 10:32 AM, Muenchen, Robert A (Bob) wrote:
> Hi All,
>
> Can anyone tell me why the length function does not use na.rm? I know
> how to work around it, I'm just curious to know why such a useful option
> was left out.

length() is used very frequently in other functions, so it is encoded as a primitive for speed. Adding an optional argument to it would slow it down.

> I'm also interested in the logic of setting na.rm=TRUE as the default on
> mean, sd, etc. This is the opposite of the many other stat packages I
> have used, so I assume it provides some programming benefit that is not
> obvious to me.

That's also the opposite of what R does. Did you mean to ask why na.rm=FALSE is the default? I think it follows from thinking of NA as meaning "not known", rather than "missing at random". If you don't know why values are missing, you may get biased results by calculating the mean of the others: and R would rather not give you biased results.

Duncan Murdoch
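
For readers following the thread, the behaviour discussed above can be sketched in a few lines of R. The vector x and the helper n_obs() are made up for illustration and are not part of base R; the helper is only one possible version of the workaround the original poster alludes to, not something either author proposed.

```r
# Hypothetical example vector with two missing values.
x <- c(4, 8, NA, 15, NA, 16)

# length() counts every element, including the NAs.
length(x)

# A common way to count only the non-missing elements.
sum(!is.na(x))

# na.rm defaults to FALSE, so a single NA makes the result NA.
mean(x)
sd(x)

# Dropping the NAs has to be requested explicitly.
mean(x, na.rm = TRUE)

# Hypothetical helper (not in base R): length() with an na.rm argument.
n_obs <- function(x, na.rm = FALSE) {
  if (na.rm) sum(!is.na(x)) else length(x)
}

n_obs(x)
n_obs(x, na.rm = TRUE)
```

Keeping the helper's default at na.rm = FALSE mirrors mean() and sd(), so missing values are never dropped silently.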