?aggregate says:
"... the result is reformatted into a data frame containing the variables
in by and x. The ones arising from by contain the unique combinations of
grouping values used for determining the subsets, and the ones arising from
x the corresponding summary statistics for the subset of the respective
variables in x."
So meanbymsa has one row per MSA, not one row per observation. It therefore
does not have the same number of rows as your original data frame, which it
must for subsetting to work properly. (meanbymsa[,2] was recycled to the
right length by default, which produces the nonsense you got. See ?xyplot.)
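
One possible way around this, offered as a sketch rather than as the only
fix, is to compute the group mean once per row with ave(), so that the
subset condition has the same length as the data. The toy data frame below
only reuses the eight rows shown in the question, and the names msamean and
cutoff, as well as the cutoff value, are made up for illustration.

```r
library(lattice)

# Toy stand-in for the poster's `hivest` data (only the rows shown in the post)
hivest <- data.frame(
  MSA    = c("0200", "0520", "0720", "0720", "0720", "0720", "0720", "0875"),
  HIVEST = c(0.50, 13.00, 29.10, 13.00, 3.68, 9.00, 11.00, 51.80),
  YEAR   = c(1996, 1997, 1994, 1995, 1996, 1997, 1998, 1990)
)

# aggregate() returns one row per group, so it is shorter than the data
meanbymsa <- aggregate(hivest$HIVEST, by = list(msa = hivest$MSA),
                       FUN = mean, na.rm = TRUE)
nrow(meanbymsa)  # 4 (one row per MSA)
nrow(hivest)     # 8 (one row per observation)

# ave() returns the group mean repeated for every row of its group,
# so the result lines up with the original data frame
hivest$msamean <- ave(hivest$HIVEST, hivest$MSA,
                      FUN = function(v) mean(v, na.rm = TRUE))

# Hypothetical cutoff, chosen only for this toy example
cutoff <- 13.1

# The subset condition is now evaluated row by row inside `data`
plot1 <- xyplot(HIVEST ~ YEAR | factor(MSA), data = hivest,
                subset = msamean < cutoff)
```

Printing plot1 then draws panels only for those MSAs whose mean HIVEST is
below the cutoff. merge()-ing meanbymsa back onto the data by MSA would be
another way to get a per-row column.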
Bert Gunter
Genentech Nonclinical Statistics
South San Francisco, CA 94404
650-467-7374
-----Original Message-----
From: r-help-bounces at stat.math.ethz.ch
[mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of Peter Flom
Sent: Wednesday, February 07, 2007 12:10 PM
To: r-help at r-project.org
Subject: [R] Problem with subsets and xyplot
Hello
I have a dataframe that looks like this
   MSA  CITY           HIVEST YEAR YR   CAT
1  0200 Albuquerque      0.50 1996 1996   5
2  0520 Atlanta         13.00 1997 1997   5
3  0720 Baltimore       29.10 1994 1994   1
4  0720 Baltimore       13.00 1995 1995   5
5  0720 Baltimore        3.68 1996 1996   3
6  0720 Baltimore        9.00 1997 1997   5
7  0720 Baltimore       11.00 1998 1998   5
8  0875 Bergen-Passaic  51.80 1990 1990   5
many more rows....
I would like to create some xyplots, but separately for MSAs that are
high, moderate or low on HIVEST. Here's what I tried
#### READ IN DATA AND RECODE SOME VARIABLES
attach(hivest)
cat <- CAT
cat[cat > 5] <- 6
msa <- as.numeric(MSA)
msa[msa == 7361] <- 7360
msa[msa == 7362] <- 7360
msa[msa == 7363] <- 7360
msa[msa == 5601] <- 5600
msa[msa == 5602] <- 5600
msa[msa == 6484] <- 6483
#### FIND MEANS FOR EACH MSA, FOR SUBSETTING LATER
meanbymsa <- aggregate(HIVEST, by = list(msa), FUN = mean, na.rm = TRUE)
#### meanbymsa[,2] gives me the column I want; the 25%tile of this column
#### is about 3.1.
but when I try
plot1 <- xyplot(HIVEST~YEAR|as.factor(msa), pch = LETTERS[cat],
                subset = (meanbymsa[,2] < 3.1))
plot1
I don't get what I expect. No errors, and it is a subset, but the
subset is NOT MSAs with low values of HIVEST.
Any help appreciated.
Peter
Peter L. Flom, PhD
Assistant Director, Statistics and Data Analysis Core
Center for Drug Use and HIV Research
National Development and Research Institutes
71 W. 23rd St
http://cduhr.ndri.org
www.peterflom.com
New York, NY 10010
(212) 845-4485 (voice)
(917) 438-0894 (fax)