Rory Campbell-Lange
2004-Apr-06 14:07 UTC
[R] Ignorant lack of bliss : summarise table by column attribute
Having read the list posting guidelines I fear my first post is about to break the rules. Apologies in advance.

We have been asked to produce some graphs of relative performance of 3 groups of people in relation to the trend of their previous performance. I am neither a mathematician nor a statistician, but wondered if R (which I have been using as a desktop calculator!) and some knowledge from this list may be able to help.

We have a dataset something like this:

    group | previousavg | lastreading | finalreading
    ------------------------------------------------
    1     | 9.5         | 10          | 12
    1     | 7           | 9           | 11
    1     | 12          | 11          | 12
    2     | 13          | 14          | 16
    2     | 11          | 10          | 9
    3     | 10          | 10          | 10.5
    3     | 8.5         | 10          | 12

I need to produce some graphs typifying the change for each group between a _projected_ final reading and the final reading given. The time difference between previousavg and lastreading is 1/2 that between lastreading and finalreading.

Where I have got to so far:

I have read the result set (less than 200 rows) into a table 'results', attached it and then rather crudely constructed projected figures:

    results$projected = ((lastreading - previousavg) * 2) + lastreading

then I can see the differentials between projected and finalreading:

    > results$projected - results$finalreading
     [1] -1.4  6.9  1.1  3.4  0.0  3.6 -3.8  0.1 -0.1  0.9  1.2 -3.4 -1.5  0.1  5.6
    [16] -3.3 -1.9  0.9 -3.1  1.5  0.7 -1.6 -0.3  1.1 -0.1 -0.6  1.5  0.2  0.8 -1.0
    [31]  0.8 -0.5  1.9 -4.0 -3.3  3.1  2.8 -0.6  1.2  2.0 -1.9 -1.6 -1.1 -3.9   NA
    ...

Aims:

- Summarise these by groups (I can't work out how to use tapply...)
- Produce a sensible 'typification' of each group's change in relation to the projected figure. I assume this would use a statistical algorithm to exclude exceptions.
- Plot the 3 'typifications' in sensible relation to each other, possibly with data points showing the source of these lines.

My sincere apologies if this is completely off-topic for this list.
I'm hoping to learn a little by understanding how certain functions are used (approaching this like a programmer rather than a statistician). If I need to learn more, is the book "Introductory Statistics with R" a good place to start?

Thanks
Rory

--
Rory Campbell-Lange
<rory at campbell-lange.net>
<www.campbell-lange.net>
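The projection described in this post is a linear extrapolation: the interval from previousavg to lastreading is half the interval from lastreading to finalreading, so the last observed change is doubled and added to lastreading. A minimal sketch in R, using only the seven sample rows quoted in the post (the full data set is not shown). Building the column with with() rather than attach(), and the column name resid, are choices made for this sketch.

```r
# The seven sample rows quoted in the post; the full data set is not shown.
results <- data.frame(
  group        = c(1, 1, 1, 2, 2, 3, 3),
  previousavg  = c(9.5, 7, 12, 13, 11, 10, 8.5),
  lastreading  = c(10, 9, 11, 14, 10, 10, 10),
  finalreading = c(12, 11, 12, 16, 9, 10.5, 12)
)

# Linear extrapolation: the second interval is twice as long as the first,
# so the change seen over the first interval is doubled.
results$projected <- with(results, (lastreading - previousavg) * 2 + lastreading)

# Difference between the projected and the observed final reading
# (the name 'resid' is an assumption made for this sketch).
results$resid <- results$projected - results$finalreading

print(results)
```

For the first sample row this gives a projection of (10 - 9.5) * 2 + 10 = 11, against an observed final reading of 12.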
Jason Turner
2004-Apr-06 20:19 UTC
[R] Ignorant lack of bliss : summarise table by column attribute
Using the data you supplied, and the work you've done so far, I also did this:

    ## make sure R knows that "group" isn't numeric
    ## might not be necessary, depending how you imported your data.
    results$group <- factor(results$group)

Since you attached the data frame, then altered it, a warning is in order: as the help page for "attach" says, the altered copy and the attached copy are *two different objects*. Beware. See the help page for "with" for a less confusing way to save typing.

    results$resid <- results$projected - results$finalreading

Rory Campbell-Lange wrote:
> Aims:
>
> - Summarise these by groups (I can't work out how to use tapply...)

    ## some fun stats
    tapply(results$resid, results$group, fivenum)
    tapply(results$resid, results$group, mean)
    tapply(results$resid, results$group, sd)
    ...

    ## nice plots:
    library(lattice)
    dotplot(group ~ resid, data=results)
    bwplot(group ~ resid, data=results)

> - Produce a sensible 'typification' of each group's change in
>   relation to the projected figure. I assume this would use a
>   statistical algorithm to exclude exceptions.

You'll want input from a Real Statistician for that. This will get you both started:

    res.mod <- lm(resid ~ group, data=results)
    summary(res.mod)
    plot(res.mod)
    ## before proceeding, find someone who understands the
    ## following help page.
    help(p.adjust)

> If I needed to learn more is the book "Introductory Statistics with R" a
> good place to start?

Definitely. Among other things, it'll explain the cryptic remark above about the help page. After you've munched through that book, get "Modern Applied Statistics with S", by Venables and Ripley, published by Springer-Verlag, preferably the 4th edition (2002). They're both great texts for their jobs.

Cheers

Jason
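As a sketch of how tapply() works: it takes a vector of values, a grouping vector and a function, and applies the function to the values within each group. The example below uses only the seven sample rows from the original post, so the numbers are illustrative. The median is shown next to the mean as one simple summary that is less sensitive to extreme values; that is an assumption made for this sketch, not a method the thread settles on.

```r
# The seven sample rows quoted in the original post; the full data set is not shown.
results <- data.frame(
  group        = c(1, 1, 1, 2, 2, 3, 3),
  previousavg  = c(9.5, 7, 12, 13, 11, 10, 8.5),
  lastreading  = c(10, 9, 11, 14, 10, 10, 10),
  finalreading = c(12, 11, 12, 16, 9, 10.5, 12)
)

# Treat group as a label rather than a number.
results$group <- factor(results$group)

# with() evaluates the expression inside the data frame, so no attach() is needed.
results$projected <- with(results, (lastreading - previousavg) * 2 + lastreading)
results$resid     <- results$projected - results$finalreading

# tapply(values, grouping, function): one result per level of 'group'.
group_mean   <- with(results, tapply(resid, group, mean))
group_median <- with(results, tapply(resid, group, median))

print(group_mean)
print(group_median)
```

With base graphics, boxplot(resid ~ group, data = results) draws a per-group box plot comparable to the lattice bwplot() call above.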