thr3ads.net - R help - [R] Datasets for "The Statistical Sleuth" [Oct 2009]

Yihui Xie

2009-Oct-25 04:48 UTC

[R] Datasets for "The Statistical Sleuth"

Hi everyone,

I wonder if there already exist any R packages containing all the
data sets for the book "The Statistical Sleuth"
(http://www.proaxis.com/~panorama/home.htm; also available at StatLib
http://lib.stat.cmu.edu/datasets/sleuth).

I'm writing an R package with a friend for one of our stat courses
where SAS is the main tool being used. As the time is limited and half
of the semester has gone, we want to finish the package ASAP before
the biased (my personal feeling) impression towards R comes up. It
will save us some time (especially the time on writing R
documentation) if anyone has already done the work of packing up all
the data sets. Thanks a lot!

Regards,
Yihui
--
Yihui Xie <xieyihui at gmail.com>
Phone: 515-294-6609 Web: http://yihui.name
Department of Statistics, Iowa State University
3211 Snedecor Hall, Ames, IA

Barry Rowlingson

2009-Oct-25 08:24 UTC

[R] Datasets for "The Statistical Sleuth"

On Sun, Oct 25, 2009 at 5:48 AM, Yihui Xie <xieyihui at gmail.com> wrote:
> Hi everyone,
>
> I wonder if there already exist any R packages containing all the
> data sets for the book "The Statistical Sleuth"
> (http://www.proaxis.com/~panorama/home.htm; also available at StatLib
> http://lib.stat.cmu.edu/datasets/sleuth).
>
> I'm writing an R package with a friend for one of our stat courses
> where SAS is the main tool being used. As the time is limited and half
> of the semester has gone, we want to finish the package ASAP before
> the biased (my personal feeling) impression towards R comes up. It
> will save us some time (especially the time on writing R
> documentation) if anyone has already done the work of packing up all
> the data sets. Thanks a lot!
 You should be able to read the spss versions of the data files using
'read.spss' from the "foreign" package. I've just read in all the .sav
files from the 2nd edition data sets with no errors.

 Probably all you then need to do is convert them to data frames and
save them as a .RData file which your students can "attach". Actually
it's turning out quicker for me to do this than to tell you how :)

 Get the spss.exe, unzip it to create a load of .sav files, install
the 'foreign' package if you don't have it already, then do this in
R:

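library(foreign)  # provides read.spss()
e <- new.env()
# 'pattern' is a regular expression: escape the dot and anchor at the end
for (f in list.files(pattern = "\\.sav$")) {
  name <- sub("\\.sav$", "", f)
  # to.data.frame = TRUE returns a data frame instead of a list
  data <- read.spss(f, to.data.frame = TRUE)
  assign(name, data, envir = e)
}
save(list = ls(e), envir = e, file = "statsleuth.RData")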

Then to test start a new R session and do:

 > attach("statsleuth.RData")
 > summary(ex1611)
          COUNTRY      PCTCATH         P2PRATIO       PCTINDIG
 Argentina    : 1   Min.   : 1.20   Min.   : 0.9   Min.   : 13.00
 Australia    : 1   1st Qu.:28.60   1st Qu.: 1.8   1st Qu.: 58.50
 Bolivia      : 1   Median :82.10   Median : 3.8   Median : 76.00
 Brazil       : 1   Mean   :63.74   Mean   : 5.1   Mean   : 70.53
 Chile        : 1   3rd Qu.:95.50   3rd Qu.: 8.3   3rd Qu.: 92.00
 Ecuador      : 1   Max.   :97.60   Max.   :11.9   Max.   :100.00
 (Other)      :15                                  NA's   :  2.00

 > ls("file:statsleuth.RData")
  [1] "case0101" "case0102" "case0201" "case0202" "case0301" "case0302"
  [7] "case0401" "case0402" "case0501" "case0502" "case0601" "case0602"
 [13] "case0701" "case0702" "case0801" "case0802" "case0901" "case0902"
[etc etc etc etc]

 My only worry is whether all the data sets convert to data frames
okay, and nothing is lost in the conversion. It's possible that SPSS
has all sorts of other metadata that is dropped, or something. I'd
suggest you check all 140 data sets first...
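
 One way to do that check in bulk is sketched below. This is only a
rough illustration, not something run against the real Sleuth files:
the two toy data frames, the file name "toy_sleuth.RData" and the
helper check_one() are all made up for the example. The idea is to
load the saved file into a fresh environment and tabulate, per object,
whether it is a data frame, its dimensions, and how many missing
values it holds.

```r
# Toy stand-in for statsleuth.RData (hypothetical data, not the real sets).
e <- new.env()
assign("case0101",
       data.frame(SCORE = c(5.0, 12.3, 19.1),
                  TREATMT = factor(c("a", "b", "a"))),
       envir = e)
assign("ex1611", data.frame(PCTCATH = c(1.2, 82.1, NA)), envir = e)
rdata <- file.path(tempdir(), "toy_sleuth.RData")
save(list = ls(e), envir = e, file = rdata)

# Load into a fresh environment so nothing in the workspace is overwritten.
loaded <- new.env()
load(rdata, envir = loaded)

# check_one() is a hypothetical helper: one summary row per object.
check_one <- function(name) {
  x <- get(name, envir = loaded)
  data.frame(name = name,
             is_df = is.data.frame(x),
             rows = NROW(x),
             cols = NCOL(x),
             n_missing = sum(is.na(x)),
             stringsAsFactors = FALSE)
}

report <- do.call(rbind, lapply(ls(loaded), check_one))
print(report)
```

 Anything with is_df FALSE, zero rows, or an unexpected number of
missing values would be worth opening by hand. This does not tell you
whether SPSS metadata was dropped, only whether the basic shape
survived.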

Barry
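
On the documentation side raised in the original question, promptData()
from the "utils" package can write a skeleton .Rd file for each data
frame, which may cut down the typing. A minimal sketch, assuming the
data sets already sit in an environment as in the loop above; the toy
data frame and the output directory here are made up for illustration:

```r
library(utils)  # promptData() lives here; attached by default in most sessions

# Hypothetical toy data set standing in for one of the converted files.
e <- new.env()
assign("case0101",
       data.frame(SCORE = c(5.0, 12.3),
                  TREATMT = factor(c("a", "b"))),
       envir = e)

# Write the skeletons somewhere harmless (a temporary "man" directory).
out_dir <- file.path(tempdir(), "man")
dir.create(out_dir, showWarnings = FALSE)

for (nm in ls(e)) {
  promptData(get(nm, envir = e),
             filename = file.path(out_dir, paste0(nm, ".Rd")),
             name = nm)
}

rd_files <- list.files(out_dir, pattern = "\\.Rd$")
print(rd_files)
```

The generated files are only templates: the title, description and
source fields still have to be filled in by hand before the package
will pass R CMD check.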
