Dan E. Kelley
2000-Mar-14 23:29 UTC
[R] boxplots of 1 datum AND comparing rank and boolean
Q: When R does 'plot()' in a context that yields boxplots, is there a
way to force it to draw something even if there are only one or two data
points in the category? I'd like it to draw the data, perhaps using the
outlier symbols. My code (*** marks the line in question) is the
following, for R-1.0.0:
d <- read.table("nserc-results-pgsb", header=FALSE,
col.names=c("name","dept","rank","accept"))
# These data look like:
# First.Student Some.Department 1 1
# Second.Student Another.Department 2 1
# Third.Student Another.Department 3 0
attach(d)
rank.inv <- 1/rank
ll <- lm(accept ~ rank.inv + dept, data=d)
print(summary(ll))
print(anova(ll))
plot(dept,resid(ll)) # makes boxplots ***
Actually, if anybody has a bright idea how I should analyse such data,
I'd love to hear it. As you can see in the above, I transformed to
1/rank since our committee recorded high 'rank' values for students we
favoured. It's not clear to me how to compare rankings to boolean
(accept/deny) results, so the 'lm()' above might be silly.
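
One common way to relate a 0/1 outcome to a ranking is logistic
regression, fitted with glm() rather than lm(). The following is only a
minimal sketch under stated assumptions: the data are simulated (the
real file is not public), and the seed, department labels, and the
coefficients used to generate 'accept' are all hypothetical. It compares
a model with 1/rank alone against one that adds 'dept', which is one
possible way to ask whether department matters beyond rank.

```r
# Simulated stand-in for the real data; every value here is hypothetical.
set.seed(1)
n <- 100
d <- data.frame(
  rank = sample(1:n),                           # one committee rank per student
  dept = gl(5, 20, labels = paste0("D", 1:5))   # five departments, 20 students each
)
# Assumption for the simulation only: acceptance depends on 1/rank, not on dept.
d$accept <- rbinom(n, size = 1, prob = plogis(-1 + 5 / d$rank))

# Binary response, so use a binomial GLM (logistic regression) instead of lm().
fit0 <- glm(accept ~ I(1 / rank), family = binomial, data = d)
fit1 <- glm(accept ~ I(1 / rank) + dept, family = binomial, data = d)

# Likelihood-ratio (chi-squared) test: does dept add anything beyond rank?
cmp <- anova(fit0, fit1, test = "Chisq")
print(cmp)
```

Whether 1/rank is the right scale for the predictor is a separate
question; the same comparison could be run with 'rank' itself.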
Thanks in advance for any advice. This group is so generous, it
amazes me.
PS: just because I think it's fun to read what sort of work folks are
doing, the above is work I'm doing in trying to analyze the patterns
in the granting of scholarships by NSERC, the science granting agency
in Canada. I chair a committee at my university that ranks
postgraduate students and sends the files to NSERC. While NSERC
nearly obeys our rankings, it seems to me that it favours some
departments. I'd like to test that (hence "accept ~ rank.inv + dept"
in the above).
Dan E. Kelley internet: mailto:Dan.Kelley at Dal.CA
Oceanography Department phone: (902)494-1694
Dalhousie University fax: (902)494-2885
Halifax, NS, CANADA, B3H 4J1 http://www.phys.ocean.dal.ca/~kelley
-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !) To: r-help-request at
stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
>>>>> "Dan" == Dan E Kelley <kelley at Phys.Ocean.Dal.CA> writes:

    Dan> Q: When R does 'plot()' in a context that yields boxplots, is there a
    Dan> way to force it to draw something even if there are only 1 or two data
    Dan> in the category? I'd like for it to draw the data, perhaps using the
    Dan> outlier symbols. My code is (*** marks the line in question) is the
    Dan> following, for R-1.0.0:

    Dan> d <- read.table("nserc-results-pgsb", header=FALSE,
    Dan> col.names=c("name","dept","rank","accept"))
    Dan> # These data look like:
    Dan> # First.Student Some.Department 1 1
    Dan> # Second.Student Another.Department 2 1
    Dan> # Third.Student Another.Department 3 0

but contain more than just three observations, right ?

    Dan> attach(d)
    Dan> rank.inv <- 1/rank
    Dan> ll <- lm(accept ~ rank.inv + dept, data=d)
    Dan> print(summary(ll))
    Dan> print(anova(ll))
    Dan> plot(dept,resid(ll)) # makes boxplots ***

    Dan> Actually, if anybody has a bright idea how I should analyse such data,
    Dan> I'd love to hear it. As you can see in the above, I transformed to
    Dan> 1/rank since our committee recorded high 'rank' values for students we
    Dan> favoured. It's not clear to me how to compare rankings to boolean
    Dan> (accept/deny) results, so the 'lm()' above might be silly.

I have misunderstood you completely..
Problem is I cannot repeat your example, since you didn't use "public" data.
(Usually, you'd construct data, something like

  d <- data.frame(accept = rbinom(100, size=1, pr = .4),
                  rank = sample(1:100),
                  dept = gl(5, 20))
)

Are you discussing the boxplots that are produced with only 1 or 2
observations per group?

Here are boxplots for n=1, 2, 3, and 4 obs. per group.
What's wrong with these ?

  do.call("boxplot", lapply(1:4,seq))
  title("Boxplot()s of very few points")

*Or* are you suggesting that for n=1, n=2 (and maybe n=3) per group
plot(factor, continuous)
shouldn't use boxplot()s but rather dot plots ?

This is a suggestion that I've heard and had myself before,
very well worth discussing.

- How should the decision boxplot / dotplot be made, just depend on n?
  Wouldn't one want the box + the single observations, e.g. when
  in one group n = 3, but in all other groups n ~= 20
  (which would make boxplots there in any case)?

- (When) should jittering be used ?

Regards,

Martin Maechler <maechler at stat.math.ethz.ch> http://stat.ethz.ch/~maechler/
Seminar fuer Statistik, ETH-Zentrum LEO D10 Leonhardstr. 27
ETH (Federal Inst. Technology) 8092 Zurich SWITZERLAND
phone: x-41-1-632-3408 fax: ...-1228 <><
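
The "box + single observations" idea raised above can be tried by hand
with base graphics. This is a rough sketch under assumptions, not a
proposal for what plot() should do by default: the group sizes, the
cutoff of 3 observations, and the jitter amount are all arbitrary
choices made for illustration, and the data are simulated.

```r
# Hypothetical data: four groups with 1, 2, 3 and 20 observations.
set.seed(42)
sizes <- c(A = 1, B = 2, C = 3, D = 20)
g <- factor(rep(names(sizes), times = sizes))
y <- rnorm(length(g))

# Groups considered "small"; the cutoff of 3 is an arbitrary assumption.
n.per.group <- table(g)
small <- names(n.per.group)[n.per.group <= 3]
keep <- g %in% small

# Draw the usual boxplots; boxplot() places group i at x = i.
bp <- boxplot(y ~ g, main = "Boxplots with raw points for small groups")

# Overlay the raw observations of the small groups, jittered horizontally
# so that coincident values remain distinguishable.
points(jitter(as.integer(g[keep]), amount = 0.05), y[keep], pch = 1)
```

Dropping the 'keep' subsetting overlays the points for every group, which
corresponds to the case where the box and the observations are both
wanted regardless of n.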