One of the products of Project MOSAIC (funded by an NSF CCLI grant)
has been the development of an R package with the goal of making it
easier to use R, especially in teaching situations. We're not quite
ready to declare that we've reached version 1.0, but version 0.4 does
represent a fairly large step in that direction. You can find out
more about the package on CRAN or by installing it, but here are some
of the highlights (some example code appears at the end of this
message):
* extensions of syntax to promote consistency across functions and
make wider use of the formula interface
* simplified ways of creating and plotting functions, including
extracting model fits as functions
* a tally() function that combines features of table() and xtabs()
and more in a common syntax
* expanded syntax for summary functions like mean(), median(),
max(), sd(), var(), etc. that accepts formulas and data frames
* a do() function that simplifies resampling-based statistical
analysis
* numerical integration and differentiation to support using
calculus techniques in R
* first drafts of vignettes on teaching resampling and calculus in R
* some functions that add extra features to familiar functions
(e.g., xchisq.test(), xhistogram(), xpnorm(), ...)
* some data sets
If you are using mosaic and discover bugs, or have suggestions for
future development, consider submitting an issue on our GitHub
development site:
http://github.com/rpruim/mosaic/issues/
You can also look there to see what's already on our to-do list.
---rjp (on behalf of the development team that includes Danny Kaplan
and Nick Horton)
=======================================================================
Randall Pruim phone: 616.526.7113
Dept. of Mathematics and Statistics email: rpruim at calvin.edu
Calvin College office: NH 284
1740 Knollcrest Circle SE URL: http://www.calvin.edu/~rpruim/
Grand Rapids, MI 49546-4403 FAX: 616.526.6501
---------------------------------------------
Here are the promised code examples to give you a feel for what mosaic
makes possible:
> mean(age, data=HELPrct)
[1] 35.65342
> mean(~age, data=HELPrct)
[1] 35.65342
> mean(age ~ sex, data=HELPrct)
female male
36.25234 35.46821
> mean(age ~ sex & treat, data=HELPrct)
female.no male.no female.yes male.yes
37.56364 35.90173 34.86538 35.03468
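For comparison, here is a rough base R sketch of the same kind of grouped mean, without mosaic. It assumes the built-in mtcars data as a stand-in for HELPrct (which ships with mosaic), so the variable names are illustrative only.

```r
# Base R only; mtcars stands in for HELPrct (an assumption for illustration).
overall <- mean(mtcars$mpg)                                      # one overall mean
by_cyl  <- aggregate(mpg ~ cyl, data = mtcars, FUN = mean)       # mean within one grouping variable
by_both <- aggregate(mpg ~ cyl + am, data = mtcars, FUN = mean)  # mean within two grouping variables
print(by_cyl)
print(by_both)
```

The mosaic versions above get the same kind of result from a single call to mean() with a formula, which is the consistency the package is aiming for.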
> interval(binom.test( ~ eruptions > 3, faithful))
probability of success lower upper
0.6433824 0.5832982 0.7003038
> pval(binom.test( ~ eruptions > 3, faithful))
p.value
2.608528e-06
> xchisq.test(phs) # physicians health study example (data entry omitted)
Pearson's Chi-squared test with Yates' continuity correction
data: phs
X-squared = 24.4291, df = 1, p-value = 7.71e-07
104.00 10933.00
( 146.52) (10890.48)
[12.34] [ 0.17]
<-3.51> < 0.41>
189.00 10845.00
( 146.48) (10887.52)
[12.34] [ 0.17]
< 3.51> <-0.41>
key:
observed
(expected)
[contribution to X-squared]
<residual>
> model <- lm(length ~ width + sex, KidsFeet)
> L <- makeFun(model)
> L( 9.0, 'B')
1
24.80017
> L( 9.0, 'B', interval='confidence')
fit lwr upr
1 24.80017 24.30979 25.29055
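To show the idea behind turning a model fit into a function, here is a hand-rolled base R sketch that wraps predict(). It is not how makeFun() is implemented. The mtcars data and the names cars and fit_fun are assumptions made for illustration.

```r
# Base R only; mtcars stands in for KidsFeet (an assumption for illustration).
cars <- transform(mtcars, am = factor(am, labels = c("auto", "manual")))
model <- lm(mpg ~ wt + am, data = cars)

# Hypothetical helper (not part of mosaic): evaluate the fitted model
# at given predictor values, passing extra arguments on to predict().
fit_fun <- function(wt, am, ...) {
  predict(model, newdata = data.frame(wt = wt, am = am), ...)
}

fit_fun(3, "auto")                           # point prediction
fit_fun(3, "auto", interval = "confidence")  # fit with lower and upper bounds
```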
> xyplot( length ~ width, groups= sex, KidsFeet ) # scatter plot with different symbols for boys and girls
> plotFun(L(x,'B') ~ x, add=TRUE) # add model fit (for boys) to plot
> plotFun(L(x,'G') ~ x, add=TRUE, lty=2) # add model fit (for girls) to plot
> rflip(10) # flip a coin 10 times
Flipping 10 coins [ Prob(Heads) = 0.5 ] ...
T H T H H H T H H T
Result: 6 heads.
> do(2) * rflip(10) # do that twice; notice that do() extracts interesting info
n heads tails
1 10 4 6
2 10 6 4
> ladyTastingTea <- do(5000) * rflip(10) # simulate 5000 ladies tasting tea
> tally(~heads, ladyTastingTea)
0 1 2 3 4 5 6 7 8 9 10 Total
5 52 221 573 1032 1227 1027 606 198 52 7 5000
> tally(~heads, ladyTastingTea, format='proportion')
     0      1      2      3      4      5      6      7      8      9     10  Total
0.0010 0.0104 0.0442 0.1146 0.2064 0.2454 0.2054 0.1212 0.0396 0.0104 0.0014 1.0000
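The same simulation can be sketched in base R, which may help make clear what the do() and rflip() combination is computing. This is an illustration under assumptions: the seed is arbitrary, and the counts will differ from the run above.

```r
set.seed(1)  # arbitrary seed, only so the sketch is reproducible

# Each draw is the number of heads in 10 fair coin flips; repeat 5000 times.
heads <- rbinom(5000, size = 10, prob = 0.5)

# Tabulate over all possible outcomes 0..10, including any that never occur.
counts <- table(factor(heads, levels = 0:10))
print(counts)
print(round(prop.table(counts), 4))  # same table as proportions
```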
# do() extracts useful information from lm objects so that randomization tests are easy.
> do(2) * lm( length ~ width + shuffle(sex), data=KidsFeet )
Intercept width sexG sigma r-squared
1 9.646822 1.693137 -0.3057453 1.026824 0.4246224
2 11.416739 1.453416 0.4860068 1.013323 0.4396534
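For readers new to randomization tests, here is a minimal base R sketch of the same idea: refit the model many times with one predictor shuffled, then compare the observed coefficient to the shuffled ones. It assumes mtcars in place of KidsFeet, and the helper name shuffled_fit and the choice of 1000 repetitions are made up for illustration.

```r
set.seed(1)  # arbitrary seed, only so the sketch is reproducible

# Hypothetical helper: refit the model with the `am` column randomly
# permuted, which breaks any real association between am and mpg.
shuffled_fit <- function() {
  coef(lm(mpg ~ wt + sample(am), data = mtcars))
}

# One row of coefficients per shuffled refit (1000 rows, 3 columns).
null_coefs <- t(replicate(1000, shuffled_fit()))

# Observed coefficient for am, and a two-sided randomization p-value.
observed <- coef(lm(mpg ~ wt + am, data = mtcars))["am"]
p_value <- mean(abs(null_coefs[, 3]) >= abs(observed))
p_value
```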
> tally( ~ sex & substance, HELPrct )
substance
sex alcohol cocaine heroin Total
female 36 41 30 107
male 141 111 94 346
Total 177 152 124 453
> tally( ~ sex | substance, HELPrct ) # auto switch to proportions for conditional distributions
substance
sex alcohol cocaine heroin
female 0.2033898 0.2697368 0.2419355
male 0.7966102 0.7302632 0.7580645
Total 1.0000000 1.0000000 1.0000000
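For comparison, roughly the same two tables can be built in base R with xtabs(), addmargins(), and prop.table(). This sketch assumes mtcars as a stand-in for HELPrct, so the variables are illustrative only.

```r
# Base R only; mtcars stands in for HELPrct (an assumption for illustration).
tab <- xtabs(~ am + cyl, data = mtcars)  # joint counts
print(addmargins(tab))                   # counts with row and column totals

# Distribution of am within each level of cyl: each column sums to 1.
cond <- prop.table(tab, margin = 2)
print(addmargins(cond, margin = 1))      # append a row of column totals
```

Here the switch from counts to conditional proportions takes a separate function call, whereas tally() handles it through the formula.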
> favstats(age ~ sex & substance, data=HELPrct)
min Q1 median Q3 max mean sd n missing
female.alcohol 23 33 37.0 45 58 39.16667 7.980333 36 0
male.alcohol 20 32 38.0 42 58 37.95035 7.575644 141 0
female.cocaine 24 31 34.0 38 49 34.85366 6.195002 41 0
male.cocaine 23 30 33.0 37 60 34.36036 6.889772 111 0
female.heroin 21 29 34.0 39 55 34.66667 8.035839 30 0
male.heroin 19 27 32.5 39 53 33.05319 7.973568 94 0
> D(sin(a*x) ~ x) # return derivative as a function with parameter a
function (x, a)
cos(a * x) * a
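Base R can produce the same symbolic derivative as an expression via stats::D(). The sketch below wraps that expression in a function by hand. The name dsin is hypothetical, and stats::D is called explicitly because loading mosaic provides its own D().

```r
# Symbolic derivative of sin(a * x) with respect to x, as an unevaluated expression.
dexpr <- stats::D(quote(sin(a * x)), "x")
print(dexpr)

# Hypothetical wrapper: evaluate the derivative expression using
# this function's own arguments x and a.
dsin <- function(x, a) eval(dexpr)

dsin(0, a = 2)  # cos(0) * 2
```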