Not exactly sure what you mean, but here is something that might be close.
I used only a subset of your data to see if this is what you want. This
computes, for each row, the mean of all the other hsgpa values in the subset:
> data[x[['2005.e']],] # subset of your data for yr=2005, conf='e'
case hsgpa yr conf
73 3442 3.406104 2005 e
216 3017 4.071830 2005 e
284 3626 3.418870 2005 e
797 2184 3.459729 2005 e
881 3030 3.147831 2005 e
1030 9600 4.140025 2005 e
1071 1972 3.423202 2005 e
1100 8293 3.880199 2005 e
1219 5162 3.470179 2005 e
1276 5905 3.533801 2005 e
1312 3785 3.521670 2005 e
1363 8880 2.975047 2005 e
1426 123 3.070349 2005 e
1427 947 NA 2005 e
1475 3592 3.955794 2005 e
1635 366 3.172360 2005 e
1708 5257 3.612822 2005 e
1736 6256 NA 2005 e
1831 2112 3.719371 2005 e
1943 6528 3.322816 2005 e
1997 553 NA 2005 e
2208 2849 3.657016 2005 e
2240 6543 NA 2005 e
2360 9360 NA 2005 e
2611 4354 3.123671 2005 e
2659 1444 4.080455 2005 e
2704 9502 NA 2005 e
2714 8594 3.657861 2005 e
2732 4453 2.251620 2005 e
2778 875 3.913294 2005 e
2802 4022 3.970620 2005 e
2884 4473 3.650706 2005 e
2945 181 3.777851 2005 e
3059 6755 3.809683 2005 e
3327 8153 NA 2005 e
3380 3737 3.676996 2005 e
3404 4419 2.306697 2005 e
3577 3577 4.196025 2005 e
3608 457 4.150389 2005 e
3857 8642 3.220720 2005 e
3967 482 2.147233 2005 e
4122 4363 NA 2005 e
4185 651 4.087515 2005 e
4226 544 4.153056 2005 e
4362 1496 3.835143 2005 e
4475 1614 3.978524 2005 e
4680 6883 3.633342 2005 e
4739 5212 NA 2005 e
4843 3515 3.020855 2005 e
4867 2580 3.814048 2005 e
4887 7937 3.797753 2005 e
> y <- data[x[['2005.e']],]
> str(y)
`data.frame': 51 obs. of 4 variables:
$ case : num 3442 3017 3626 2184 3030 ...
$ hsgpa: num 3.41 4.07 3.42 3.46 3.15 ...
$ yr : num 2005 2005 2005 2005 2005 ...
$ conf : chr "e" "e" "e" "e" ...
> # compute the mean of all except the given row
> sapply(seq(nrow(y)), function(x) mean(y$hsgpa[-x],na.rm=TRUE))
[1] 3.556268 3.540030 3.555956 3.554960 3.562567 3.538367 3.555851 3.544704 3.554705 3.553153
[11] 3.553449 3.566781 3.564457 3.552692 3.542861 3.561969 3.551226 3.552692 3.548627 3.558299
[21] 3.552692 3.550148 3.552692 3.552692 3.563156 3.539820 3.552692 3.550127 3.584426 3.543897
[31] 3.542499 3.550302 3.547201 3.546424 3.552692 3.549660 3.583082 3.537001 3.538114 3.560789
[41] 3.586972 3.552692 3.539648 3.538049 3.545803 3.542306 3.550725 3.552692 3.565664 3.546318
[51] 3.546715
> y$mean <- sapply(seq(nrow(y)), function(x) mean(y$hsgpa[-x],na.rm=TRUE))
> y
case hsgpa yr conf mean
73 3442 3.406104 2005 e 3.556268
216 3017 4.071830 2005 e 3.540030
284 3626 3.418870 2005 e 3.555956
797 2184 3.459729 2005 e 3.554960
881 3030 3.147831 2005 e 3.562567
1030 9600 4.140025 2005 e 3.538367
1071 1972 3.423202 2005 e 3.555851
1100 8293 3.880199 2005 e 3.544704
1219 5162 3.470179 2005 e 3.554705
1276 5905 3.533801 2005 e 3.553153
1312 3785 3.521670 2005 e 3.553449
1363 8880 2.975047 2005 e 3.566781
1426 123 3.070349 2005 e 3.564457
1427 947 NA 2005 e 3.552692
1475 3592 3.955794 2005 e 3.542861
1635 366 3.172360 2005 e 3.561969
1708 5257 3.612822 2005 e 3.551226
1736 6256 NA 2005 e 3.552692
1831 2112 3.719371 2005 e 3.548627
1943 6528 3.322816 2005 e 3.558299
1997 553 NA 2005 e 3.552692
2208 2849 3.657016 2005 e 3.550148
2240 6543 NA 2005 e 3.552692
2360 9360 NA 2005 e 3.552692
2611 4354 3.123671 2005 e 3.563156
2659 1444 4.080455 2005 e 3.539820
2704 9502 NA 2005 e 3.552692
2714 8594 3.657861 2005 e 3.550127
2732 4453 2.251620 2005 e 3.584426
2778 875 3.913294 2005 e 3.543897
2802 4022 3.970620 2005 e 3.542499
2884 4473 3.650706 2005 e 3.550302
2945 181 3.777851 2005 e 3.547201
3059 6755 3.809683 2005 e 3.546424
3327 8153 NA 2005 e 3.552692
3380 3737 3.676996 2005 e 3.549660
3404 4419 2.306697 2005 e 3.583082
3577 3577 4.196025 2005 e 3.537001
3608 457 4.150389 2005 e 3.538114
3857 8642 3.220720 2005 e 3.560789
3967 482 2.147233 2005 e 3.586972
4122 4363 NA 2005 e 3.552692
4185 651 4.087515 2005 e 3.539648
4226 544 4.153056 2005 e 3.538049
4362 1496 3.835143 2005 e 3.545803
4475 1614 3.978524 2005 e 3.542306
4680 6883 3.633342 2005 e 3.550725
4739 5212 NA 2005 e 3.552692
4843 3515 3.020855 2005 e 3.565664
4867 2580 3.814048 2005 e 3.546318
4887 7937 3.797753 2005 e 3.546715
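
The session above covers one group and the mean only, and it does not show how the index list `x` was built. As a rough sketch of how the same idea might extend to every yr/conf group and to the variance: the code below assumes `x` came from something like split() on the row numbers (which yields names such as "2005.e"). The helper name `loo_stat`, the column names `loo_mean` and `loo_var`, and the small data frame `d` are all made up for illustration and are not from the thread.

```r
# Leave-one-out statistic: element i gets stat() of all the OTHER
# elements of v, with NAs ignored (same idea as y$hsgpa[-x] above).
loo_stat <- function(v, stat) {
  vapply(seq_along(v), function(i) stat(v[-i], na.rm = TRUE), numeric(1))
}

# Tiny hypothetical data set with the same column names as the question
d <- data.frame(
  hsgpa = c(3.0, 3.5, 4.0, NA, 2.0, 3.0, 4.0, 3.2),
  yr    = c(2005, 2005, 2005, 2005, 2004, 2004, 2004, 2004),
  conf  = c("e", "e", "e", "e", "a", "a", "a", NA)
)

# Assumed construction of the index list: row numbers split by yr and conf.
# Rows with a missing yr or conf belong to no group and are left out.
idx <- split(seq_len(nrow(d)), list(d$yr, d$conf), drop = TRUE)

# Start with NA so rows outside every group stay NA
d$loo_mean <- NA_real_
d$loo_var  <- NA_real_
for (rows in idx) {
  d$loo_mean[rows] <- loo_stat(d$hsgpa[rows], mean)
  d$loo_var[rows]  <- loo_stat(d$hsgpa[rows], var)
}
d
```

A row whose own hsgpa is NA still gets the mean and variance of the other members of its group, which matches the repeated 3.552692 entries in the output above. On the full 5000-row data set the loop does one pass per group, so it should stay reasonably quick, though I have not timed it.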

On 6/12/06, David Kling <klingd@reed.edu> wrote:
> Hello:
>
> I hope none of you will mind helping a newbie. I'm a student research
> assistant working with a large data set in which observations are
> categorized according to two factors. I'm trying to calculate the group
> mean and variance of a variable (called 'hsgpa' in the example data
> presented below) for each observation, excluding that observation. For
> example, if there are 20 observations with the same value of the two
> factors, for each of the 20 I'd like to generate the mean and variance
> of the 'hsgpa' values of the other 19 group members. This must be done
> for every observation in the data set.
>
> I've searched the R mail archives, read the manuals, and read
> documentation for tapply() and by() as well as summaryBy() in the 'doBy'
> package and with() from 'Hmisc.' It may be that since I'm new to
> writing functions and R is the first language I've ever worked with, I'm
> less able to come up with a solution than some other new R users. None
> of the functions I have tried have been successful, and it doesn't seem
> worth it to reproduce and explain my best effort. I hope someone has
> some ideas! Looking at what an experienced user would try should help
> me with my present task as well as future problems.
>
> Below I've included some lines that will generate a sample data set
> similar to the one I'm working with:
>
> #
> #Example data:
> #
> case <- sample(seq(1,10000,1),5000,replace=FALSE)
> hsgpa <- rbeta(5000,7,1.5)*4.25
> yr <- sample(seq(1993,2005,1),5000,replace=TRUE)
> conf <- sample(letters[1:5],5000,replace=TRUE)
> data <- data.frame(case=case,hsgpa=hsgpa,yr=yr,conf=conf)
> data$conf <- as.character(data$conf)
> s1 <- sample(seq(1,5000,1),500,replace=FALSE)
> k <- data$hsgpa
> k[row.names(data) %in% s1] <- NA
> data$hsgpa <- k
> s2 <- sample(seq(1,5000,1),100,replace=FALSE)
> k <- data$yr
> k[row.names(data) %in% s2] <- NA
> data$yr <- k
> k <- data$conf
> k[row.names(data) %in% s2] <- NA
> data$conf <- k
> remove(case,hsgpa,yr,conf,s1,s2,k)
> #
>
> ______________________________________________
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide!
> http://www.R-project.org/posting-guide.html
>
--
Jim Holtman
Cincinnati, OH
+1 513 646 9390 (Cell)
+1 513 247 0281 (Home)
What is the problem you are trying to solve?