Faraway's book "Practical Regression and Anova using R", with full text
available online at:

  http://cran.r-project.org/doc/contrib/Faraway-PRA.pdf

refers to a data set, stat500, which compares midterm and final grades.
It can be used to illustrate similar concepts. A Google search for
faraway.zip will locate the actual data.

---
Date: Sun, 15 Feb 2004 10:37:08 -0800
From: Ann Loraine <loraine at loraine.net>
To: rhelp <r-help at stat.math.ethz.ch>
Subject: Re: [R] father and son heights

Actually, no. It's a data set that is used to teach Pearson's
correlation coefficient in a popular statistics textbook - "Statistics"
by Freedman, Pisani, et al.

It contains over a thousand measurements of sons' and their fathers'
heights.

I would like to find it in electronic form so that I can use it to
prepare figures and examples for a lecture.

If anyone knows where I could find it, please let me know. I've done a
few Google searches but haven't had any luck so far. I also used the
data() command to look through R's built-in data sets and couldn't find
it. Any suggestions would be most welcome!

Yours,

Ann Loraine

On Feb 15, 2004, at 5:46 AM, Spencer Graves wrote:

> Do you have this data set in any form? If yes, do you have it in
> any electronic form? If yes, have you tried following relevant
> suggestions in the manual on "R Data Import/Export"? [I got this as a
> hot link from within R 1.8.1 "help.start()".]
>
> hope this helps.
> spencer graves
>
> Ann Loraine wrote:
>
>>
>> Hello,
>>
>> I'm looking for Pearson's father and son height data.
>>
>> Is this data set available in R?
>>
>> Thanks!
>>
>> Ann Loraine
>>
>> ______________________________________________
>> R-help at stat.math.ethz.ch mailing list
>> https://www.stat.math.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide!
>> http://www.R-project.org/posting-guide.html
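The data() search that Ann describes can be narrowed by filtering the listing on a keyword rather than reading it by eye. The following is only a sketch: find_datasets is a hypothetical helper name, not part of R, and by default it looks only at the base "datasets" package.

```r
# Hypothetical helper (not part of R): list data sets whose name or title
# matches `pattern`. `pkgs` is a character vector of package names.
find_datasets <- function(pattern, pkgs = "datasets") {
  res <- data(package = pkgs)$results   # character matrix of data set listings
  keep <- grepl(pattern, res[, "Title"], ignore.case = TRUE) |
          grepl(pattern, res[, "Item"], ignore.case = TRUE)
  res[keep, c("Package", "Item", "Title"), drop = FALSE]
}

# Search the base "datasets" package for anything mentioning heights.
hits <- find_datasets("height")
print(hits)
```

Passing pkgs = .packages(all.available = TRUE) should extend the search to every installed package, at the cost of a slower listing.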
Ann Loraine wrote:

> I'm looking for Pearson's father and son height data.
> ........... It's a data set that is used to teach Pearson's
> correlation coefficient in a popular statistics textbook - "Statistics"
> by Freedman, Pisani, et al.
>
> It contains over a thousand measurements of sons' and their fathers'
> heights.
>
> I would like to find it in electronic form so that I can use it to
> prepare figures and examples for a lecture.
>
> If anyone knows where I could find it, please let me know. I've done a
> few Google searches but haven't had any luck so far. I also used the
> data() command to look through R's built-in data sets and couldn't find
> it. Any suggestions would be most welcome!

I believe that you have been searching under the wrong name. The data
are most closely associated with Galton (the bloke to whom the word
``regression'' is due) rather than with Pearson. A search on

  Galton height

led me immediately to

  http://wiener.math.csi.cuny.edu/UsingR/Data/galton.html

where the data appear to be readily available.

I ***presume*** that these are the data you seek, although there are
only 930 observations, not ``over a thousand''. (Close, but!)

The data are given to a limited accuracy, which induces a strangely
grid-like appearance when they are plotted, but that is presumably the
nature of this data set. They were apparently taken from a table
prepared by Galton. Values which were originally given in Galton's
table as ``>= 73.7'' or ``<= 61.7'' are truncated to their respective
bounds.

One thing that puzzles me: The documentation says that the data pertain
to 928 children, yet there are 930 data points. (????) I can't find an
explanation in the documentation. Maybe I'm just blind. Or thick.

cheers,

Rolf Turner
rolf at math.unb.ca
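The grid-like appearance Rolf mentions comes from many observations sharing the same rounded pair of values, so they plot on top of each other. A minimal sketch of how one might check this and spread the points with jitter() follows. It assumes the UsingR package is installed and that its galton data frame has columns parent and child; count_overplotting is a hypothetical helper, and the three-row data frame is a toy example, not real measurements.

```r
# Hypothetical helper: compare the number of rows with the number of
# distinct (x, y) pairs; a large gap means heavy overplotting.
count_overplotting <- function(d) {
  c(rows = nrow(d), distinct = nrow(unique(d)))
}

# Toy illustration (made-up numbers): two of the three rows coincide.
toy <- data.frame(parent = c(70, 70, 68), child = c(66, 66, 66))
toy_counts <- count_overplotting(toy)
print(toy_counts)

# Assumption: UsingR is installed and provides `galton` with columns
# `parent` and `child`. The block is skipped otherwise.
if (requireNamespace("UsingR", quietly = TRUE)) {
  data("galton", package = "UsingR")
  print(count_overplotting(galton[, c("parent", "child")]))
  # jitter() adds a little random noise so coincident points become visible.
  plot(jitter(galton$parent, factor = 2), jitter(galton$child, factor = 2),
       xlab = "parent height (jittered)", ylab = "child height (jittered)")
} else {
  message("Package 'UsingR' is not installed; skipping the galton plot.")
}
```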
Marc Schwartz wrote:

> Found it here:
>
> http://stat-www.berkeley.edu/users/juliab/141C/pearson.dat
>
> It consists of 1,078 pairs of father (col 1) and son (col 2) paired
> data. Used to not only show Pearson correlation, but also used to
> demonstrate Regression to the Mean, or as Galton called it, Regression
> Towards Mediocrity.

Whoops. I guess I jumped to an unwarranted conclusion when I found the
Galton data set. Sorry 'bout that folks!

cheers,

Rolf Turner
rolf at math.unb.ca
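The use Marc describes (Pearson correlation plus regression to the mean) can be sketched in a few lines of R. This is an illustration only: read_heights and summarise_heights are hypothetical helper names, the file is assumed to be whitespace-separated with no header row, and the numbers below are synthetic stand-ins, NOT the Pearson measurements.

```r
# Hypothetical helper: read a whitespace-separated file whose first column
# is the father's height and whose second column is the son's height.
# Assumption: no header row; pass header = TRUE if the file has one.
read_heights <- function(path, header = FALSE) {
  d <- read.table(path, header = header)
  names(d)[1:2] <- c("father", "son")
  d
}

# Sample size, Pearson correlation, and least-squares slope of son on father.
summarise_heights <- function(d) {
  fit <- lm(son ~ father, data = d)
  list(n = nrow(d),
       r = cor(d$father, d$son),
       slope = unname(coef(fit)["father"]))
}

# Synthetic stand-in data (NOT the Pearson data), written to a temporary
# file so that the example runs without network access.
set.seed(1)
father <- rnorm(200, mean = 68, sd = 2.7)
son <- 34 + 0.5 * father + rnorm(200, sd = 2.4)
tmp <- tempfile(fileext = ".dat")
write.table(data.frame(father, son), tmp,
            row.names = FALSE, col.names = FALSE)

d <- read_heights(tmp)
res <- summarise_heights(d)
print(res)
```

Since read.table() also accepts a URL, the address Marc quotes could be passed to read_heights directly, provided it still resolves. The fitted slope equals r * sd(son) / sd(father), so whenever r is below 1 a father who is one standard deviation above average is predicted to have a son less than one standard deviation above average, which is the regression to the mean that Marc refers to.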
According to:

  http://www.spss.com/research/wilkinson/Publications/galton.pdf

there are actually two father/son height datasets. One was collected by
Galton. Apparently Pearson used that data but also collected and used a
second dataset, together with Alice Lee, in roughly the same time frame.

---
Date: Sun, 15 Feb 2004 15:30:43 -0400 (AST)
From: Rolf Turner <rolf at math.unb.ca>
To: <loraine at loraine.net>
Cc: <r-help at stat.math.ethz.ch>
Subject: Re: [R] father and son heights

Ann Loraine wrote:

> I'm looking for Pearson's father and son height data.
> ........... It's a data set that is used to teach Pearson's
> correlation coefficient in a popular statistics textbook - "Statistics"
> by Freedman, Pisani, et al.
>
> It contains over a thousand measurements of sons' and their fathers'
> heights.
>
> I would like to find it in electronic form so that I can use it to
> prepare figures and examples for a lecture.
>
> If anyone knows where I could find it, please let me know. I've done a
> few Google searches but haven't had any luck so far. I also used the
> data() command to look through R's built-in data sets and couldn't find
> it. Any suggestions would be most welcome!

I believe that you have been searching under the wrong name. The data
are most closely associated with Galton (the bloke to whom the word
``regression'' is due) rather than with Pearson. A search on

  Galton height

led me immediately to

  http://wiener.math.csi.cuny.edu/UsingR/Data/galton.html

where the data appear to be readily available.

I ***presume*** that these are the data you seek, although there are
only 930 observations, not ``over a thousand''. (Close, but!)

The data are given to a limited accuracy, which induces a strangely
grid-like appearance when they are plotted, but that is presumably the
nature of this data set. They were apparently taken from a table
prepared by Galton. Values which were originally given in Galton's
table as ``>= 73.7'' or ``<= 61.7'' are truncated to their respective
bounds.

One thing that puzzles me: The documentation says that the data pertain
to 928 children, yet there are 930 data points. (????) I can't find an
explanation in the documentation. Maybe I'm just blind. Or thick.

cheers,

Rolf Turner
rolf at math.unb.ca
I just noticed that the faraway package on CRAN contains the stat500
dataset, so you can just install that in the usual way rather than
googling around for faraway.zip.

---
Date: Sun, 15 Feb 2004 14:27:52 -0500 (EST)
From: Gabor Grothendieck <ggrothendieck at myway.com>
To: <loraine at loraine.net>, <r-help at stat.math.ethz.ch>
Subject: Re: [R] father and son heights

Faraway's book "Practical Regression and Anova using R", with full text
available online at:

  http://cran.r-project.org/doc/contrib/Faraway-PRA.pdf

refers to a data set, stat500, which compares midterm and final grades.
It can be used to illustrate similar concepts. A Google search for
faraway.zip will locate the actual data.

---
Date: Sun, 15 Feb 2004 10:37:08 -0800
From: Ann Loraine <loraine at loraine.net>
To: rhelp <r-help at stat.math.ethz.ch>
Subject: Re: [R] father and son heights

Actually, no. It's a data set that is used to teach Pearson's
correlation coefficient in a popular statistics textbook - "Statistics"
by Freedman, Pisani, et al.

It contains over a thousand measurements of sons' and their fathers'
heights.

I would like to find it in electronic form so that I can use it to
prepare figures and examples for a lecture.

If anyone knows where I could find it, please let me know. I've done a
few Google searches but haven't had any luck so far. I also used the
data() command to look through R's built-in data sets and couldn't find
it. Any suggestions would be most welcome!

Yours,

Ann Loraine

On Feb 15, 2004, at 5:46 AM, Spencer Graves wrote:

> Do you have this data set in any form? If yes, do you have it in
> any electronic form? If yes, have you tried following relevant
> suggestions in the manual on "R Data Import/Export"? [I got this as a
> hot link from within R 1.8.1 "help.start()".]
>
> hope this helps.
> spencer graves
>
> Ann Loraine wrote:
>
>>
>> Hello,
>>
>> I'm looking for Pearson's father and son height data.
>>
>> Is this data set available in R?
>>
>> Thanks!
>>
>> Ann Loraine
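Installing the faraway package "in the usual way", as Gabor suggests, and looking at stat500 might go roughly as follows. This is a sketch under assumptions: it presumes the package can be installed from CRAN and that stat500 has columns named midterm and final; the install line is left commented out because it needs network access and a configured CRAN mirror.

```r
# install.packages("faraway")   # one-time install; needs network access

# Assumption: stat500 in the faraway package has columns `midterm` and `final`.
has_faraway <- requireNamespace("faraway", quietly = TRUE)

if (has_faraway) {
  data("stat500", package = "faraway")
  # Pearson correlation between midterm and final scores.
  print(cor(stat500$midterm, stat500$final))
  # Scatterplot with the least-squares line of final on midterm.
  plot(final ~ midterm, data = stat500)
  abline(lm(final ~ midterm, data = stat500))
} else {
  message("Package 'faraway' is not installed; see the commented install line.")
}
```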