thr3ads.net - R help - [R] boxplots of 1 datum AND comparing rank and boolean [Mar 2000]

Dan E. Kelley

2000-Mar-14 23:29 UTC

[R] boxplots of 1 datum AND comparing rank and boolean

Q: When R does 'plot()' in a context that yields boxplots, is there a
way to force it to draw something even if there are only 1 or 2 data
in the category?  I'd like it to draw the data, perhaps using the
outlier symbols.  My code (*** marks the line in question) is the
following, for R-1.0.0:

	d <- read.table("nserc-results-pgsb", header=FALSE,
	                col.names=c("name","dept","rank","accept"))
	# These data look like:
	#   First.Student   Some.Department     1  1
	#   Second.Student  Another.Department  2  1
	#   Third.Student   Another.Department  3  0
	attach(d)
	rank.inv <- 1/rank
	ll <- lm(accept ~ rank.inv + dept, data=d)
	print(summary(ll))
	print(anova(ll))
	plot(dept,resid(ll))	# makes boxplots ***

Actually, if anybody has a bright idea how I should analyse such data,
I'd love to hear it.  As you can see in the above, I transformed to
1/rank since our committee recorded high 'rank' values for students we
favoured.  It's not clear to me how to compare rankings to boolean
(accept/deny) results, so the 'lm()' above might be silly.
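
A possible starting point for the rank-versus-accept question, offered
as a sketch and not as something proposed in this thread: a binary
outcome is commonly modelled with a binomial GLM (logistic regression)
instead of 'lm()'.  The data below are simulated stand-ins; the column
names follow the post, but the values, the seed, the department labels
and the assumed link between rank and acceptance are all invented.

```r
# Simulated stand-in data; values are invented purely for illustration.
set.seed(1)
d <- data.frame(rank = sample(1:100),
                dept = gl(5, 20, labels = paste0("Dept", 1:5)))
# Assumption: a higher rank means "more favoured", so the odds of
# acceptance rise with rank.
d$accept <- rbinom(100, size = 1, prob = plogis(-2 + 0.04 * d$rank))

# Binomial GLM: log-odds of acceptance as a function of rank and department.
fit <- glm(accept ~ rank + dept, family = binomial, data = d)
print(summary(fit))

# Likelihood-ratio test for a department effect after adjusting for rank.
fit0 <- update(fit, . ~ . - dept)
print(anova(fit0, fit, test = "Chisq"))
```

Comparing the model with and without 'dept' asks directly whether
departments differ once rank is accounted for, which is the question
behind "accept ~ rank.inv + dept".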

Thanks in advance for any advice.  This group is so generous, it
amazes me.

PS: just because I think it's fun to read what sort of work folks are
doing, the above is work I'm doing in trying to analyze the patterns
in the granting of scholarships by NSERC, the science granting agency
in Canada.  I chair a committee at my university that ranks
postgraduate students and sends the files to NSERC.  While NSERC
nearly obeys our rankings, it seems to me that they favour some
departments.  I'd like to test that (hence "accept ~ rank.inv + dept"
in the above).

Dan E. Kelley                   internet:   mailto:Dan.Kelley at Dal.CA
Oceanography Department         phone:                 (902)494-1694
Dalhousie University            fax:                   (902)494-2885
Halifax, NS, CANADA, B3H 4J1    http://www.phys.ocean.dal.ca/~kelley

-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-help-request at stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._

Martin Maechler

2000-Mar-15 11:26 UTC

[R] boxplots of 1 (or 2 or 3) "datum" ..

>>>>> "Dan" == Dan E Kelley <kelley at Phys.Ocean.Dal.CA> writes:

    Dan> Q: When R does 'plot()' in a context that yields boxplots, is there a
    Dan> way to force it to draw something even if there are only 1 or 2 data
    Dan> in the category?  I'd like it to draw the data, perhaps using the
    Dan> outlier symbols.  My code (*** marks the line in question) is the
    Dan> following, for R-1.0.0:

    Dan> d <- read.table("nserc-results-pgsb", header=FALSE,
    Dan>                 col.names=c("name","dept","rank","accept"))
    Dan> # These data look like:
    Dan> #   First.Student   Some.Department     1  1
    Dan> #   Second.Student  Another.Department  2  1
    Dan> #   Third.Student   Another.Department  3  0
but they contain more than just three observations, right?

    Dan> attach(d)
    Dan> rank.inv <- 1/rank
    Dan> ll <- lm(accept ~ rank.inv + dept, data=d)
    Dan> print(summary(ll))
    Dan> print(anova(ll))
    Dan> plot(dept,resid(ll))	# makes boxplots ***

    Dan> Actually, if anybody has a bright idea how I should analyse such data,
    Dan> I'd love to hear it.  As you can see in the above, I transformed to
    Dan> 1/rank since our committee recorded high 'rank' values for students we
    Dan> favoured.  It's not clear to me how to compare rankings to boolean
    Dan> (accept/deny) results, so the 'lm()' above might be silly.

I may have misunderstood you completely.
The problem is that I cannot reproduce your example, since you didn't use
"public" data.
(Usually, you'd construct data, something like

	d <- data.frame(accept = rbinom(100, size = 1, prob = 0.4),
	                rank   = sample(1:100),
	                dept   = gl(5, 20))
)
Are you discussing the boxplots that are produced with only 1 or 2
observations per group?

Here are boxplots for n=1, 2, 3, and 4 obs. per group.
What's wrong with these ?

   do.call("boxplot", lapply(1:4,seq))
   title("Boxplot()s of very few points")
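
To see numerically what those boxes are built from, the following
sketch prints the five-number summary that 'boxplot.stats()' computes
for the same four groups.  The variable names 'groups' and 'summaries'
are illustrative only.

```r
# The same groups as in the do.call() above: 1, 1:2, 1:3, 1:4.
groups <- lapply(1:4, seq)

# boxplot.stats() returns the statistics a boxplot is drawn from:
# lower whisker, lower hinge, median, upper hinge, upper whisker.
summaries <- lapply(groups, function(g) boxplot.stats(g)$stats)

for (i in seq_along(groups)) {
  cat("n =", length(groups[[i]]), " stats:", summaries[[i]], "\n")
}
```

For n = 1 all five statistics coincide, so the box collapses to a
single horizontal line at the datum.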

*Or* are you suggesting that for n=1, n=2 (and maybe n=3) per group
        plot(factor, continuous)
shouldn't use boxplot()s but rather dot plots?
This is a suggestion that I've heard before and have had myself;
it is very well worth discussing.

- How should the decision boxplot / dotplot be made? Should it depend only on n?
  Wouldn't one want the box + the single observations, e.g. when
  one group has n = 3 but all other groups have n ~= 20 (which would
  warrant boxplots there in any case)?
- (When) should jittering be used?
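
One way to experiment with these questions is sketched below: draw
boxplots for every group, then overlay the jittered raw observations
only for groups below a size threshold.  The data, the seed, the
threshold 'n.min' and the jitter amount are all invented for
illustration; none of them comes from this thread.

```r
set.seed(2)
# Invented data: four groups of 20 observations and one group of only 3.
y <- c(rnorm(80), rnorm(3, mean = 1))
g <- factor(rep(c("A", "B", "C", "D", "E"), times = c(20, 20, 20, 20, 3)))

n.min <- 5                    # hypothetical threshold for a "small" group
small <- table(g) < n.min     # TRUE for groups with fewer than n.min points

# Boxes for all groups; boxplot() places them at x = 1, 2, ... in level order.
boxplot(y ~ g, main = "Boxes plus raw points for small groups")

# Overlay the raw observations of the small groups, jittered horizontally
# around the x position of their box.
keep <- g %in% names(small)[small]
points(jitter(as.integer(g[keep]), amount = 0.1), y[keep], pch = 1)
```

Changing 'n.min' switches between the two extremes discussed above:
0 gives plain boxplots, and a value larger than every group size
overlays all observations on all boxes.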

Regards,
Martin Maechler <maechler at stat.math.ethz.ch>
http://stat.ethz.ch/~maechler/
Seminar fuer Statistik, ETH-Zentrum  LEO D10	Leonhardstr. 27
ETH (Federal Inst. Technology)	8092 Zurich	SWITZERLAND
phone: x-41-1-632-3408		fax: ...-1228			<><
