thr3ads.net - R help - [R] comparing 3 datasets [Jun 2001]

pauljohn@ukans.edu

2001-Jun-19 23:59 UTC

[R] comparing 3 datasets

I have 3 datasets with the same variables.  I want to find out what
differences there are between the three, to know if an experimental
condition has an effect.  So I decided first to make histograms.  So I
created this handy "histomatic" function that creates a picture with
the 3 histograms on a single image:

I thought I was being clever, but in the end, no!

#read in 3 tables worth
NoFlagMod0<-read.table("NoFlagMod0.txt",header=TRUE);
RandMastMod0<-read.table("RandMastMod0.txt",header=TRUE);
NoMastMod0<-read.table("NoMastMod0.txt",header=TRUE);

#here's my magical function
histomatic <- function (s1,s2,s3,var){
  if (is.numeric (s2[[var]])) {
    par(mfrow=c(3,1));
    hist(s1[[var]],breaks=40,xlab=var);
    hist(s2[[var]], breaks=40,xlab=var);
    hist(s3[[var]], breaks=40,xlab=var);
  }
}
#cycle through all the variables, just grab names from first set.
nameList<-names(RandMastMod0);
par(ask=TRUE)
for (var in nameList) histomatic(NoFlagMod0,RandMastMod0,NoMastMod0,var)

I knew I wanted a pretty fine-grained display, so I set breaks at 40.
Other than that, I don't know for sure what else I want.

Here's the problem:
The histograms shown do not have the same ranges.  Since the datasets
are slightly different, the ranges displayed are different, so they are
difficult to compare visually.  Is there a solution?

Other than that, if you have other ideas about comparing 3 datasets,
I'm glad to hear them.  I'm especially curious to know if there is a
significance test of the hypothesis that 3 samples are drawn from a
common distribution (apart from testing the means with an F test, that is).

-- 
Paul E. Johnson                       email: pauljohn at ukans.edu
Dept. of Political Science            http://lark.cc.ukans.edu/~pauljohn
University of Kansas                  Office: (785) 864-9086
Lawrence, Kansas 66045                FAX: (785) 864-5700
-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-help-request at
stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._

Deepayan Sarkar

2001-Jun-20 00:56 UTC

[R] comparing 3 datasets

--- pauljohn at ukans.edu wrote:
> I have 3 datasets with the same variables.  I want
> to find out what
> differences there are between the three, to know if
> an experimental
> condition has an effect.  So I decided first to make
> histograms.  So I
> created this handy "histomatic" function that
> creates a picture with the
> 3 histograms on a single image:
> 
> I thought I was being clever, but in the end, no!
> 
> #read in 3 tables worth
> NoFlagMod0<-read.table("NoFlagMod0.txt",header=TRUE);
> RandMastMod0<-read.table("RandMastMod0.txt",header=TRUE);
> NoMastMod0<-read.table("NoMastMod0.txt",header=TRUE);
> 
> #here's my magical function
> histomatic <- function (s1,s2,s3,var){
>   if (is.numeric (s2[[var]])) {     
>   par(mfrow=c(3,1));
>   hist(s1[[var]],breaks=40,xlab=var);
>   
>   hist(s2[[var]], breaks=40,xlab=var);
>   hist(s3[[var]], breaks=40,xlab=var);
>   }
> }
> #cycle through all the variables, just grab names from first set.
> nameList<-names(RandMastMod0);
> par(ask=TRUE)
> for (var in nameList)
> histomatic(NoFlagMod0,RandMastMod0,NoMastMod0,var)
> 
> I knew I wanted a pretty fine grained display, so I
> set breaks at 40.
> Other than that, I don't know for sure what else I
> want.
> 
> Here's the problem:
> The histograms shown do not have the same ranges. 
> Since the datasets
> are slightly different, the ranges displayed are
> different, so they are
> difficult to compare visually.  Is there a solution?

An xlim argument should set the same limits for all the
histograms. Change your function to:


histomatic <- function (s1,s2,s3,var){
   if (is.numeric (s2[[var]])) {
      par(mfrow=c(3,1));
      xlim <- range(s1[[var]], s2[[var]], s3[[var]]);
      hist(s1[[var]], breaks=40,xlab=var, xlim = xlim);
      hist(s2[[var]], breaks=40,xlab=var, xlim = xlim);
      hist(s3[[var]], breaks=40,xlab=var, xlim = xlim);
   }
}


This should work.

(You might also consider using the histogram() function from the
lattice package, which is in the Devel section.)
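
As a rough illustration of the lattice suggestion, here is a minimal sketch that stacks the three samples into one data frame and conditions on a grouping factor. The data are simulated and the names `d`, `value` and `group` are made up for the example; they are not from the original post. As far as I know, histogram() computes its default breaks from the pooled variable, so the panels should share both the x axis and the bins.

```r
library(lattice)

# Simulated stand-ins for the three datasets (hypothetical data).
set.seed(4)
d <- data.frame(
  value = c(rnorm(100), rnorm(100, mean = 1), rnorm(100, mean = -1)),
  group = factor(rep(c("NoFlag", "RandMast", "NoMast"), each = 100))
)

# One panel per group, stacked vertically; nint is the number of bins.
p <- histogram(~ value | group, data = d, nint = 40, layout = c(1, 3))
print(p)  # lattice objects must be printed explicitly inside scripts
```

With the real data, each table would get a `group` column before being combined with rbind().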



Jonathan Baron

2001-Jun-20 01:25 UTC

[R] comparing 3 datasets

>I have 3 datasets with the same variables.  I want to find out what
>differences there are between the three, to know if an experimental
>condition has an effect.  So I decided first to make histograms.  So I
>created this handy "histomatic" function that creates a picture with the
>3 histograms on a single image:
>
>I thought I was being clever, but in the end, no!
>
>#read in 3 tables worth
>NoFlagMod0<-read.table("NoFlagMod0.txt",header=TRUE);
>RandMastMod0<-read.table("RandMastMod0.txt",header=TRUE);
>NoMastMod0<-read.table("NoMastMod0.txt",header=TRUE);
>
>#here's my magical function
>histomatic <- function (s1,s2,s3,var){
>  if (is.numeric (s2[[var]])) {     
>  par(mfrow=c(3,1));
>  hist(s1[[var]],breaks=40,xlab=var);
>  
>  hist(s2[[var]], breaks=40,xlab=var);
>  hist(s3[[var]], breaks=40,xlab=var);
>  }
>}
>#cycle through all the variables, just grab names from first set.
>nameList<-names(RandMastMod0);
>par(ask=TRUE)
>for (var in nameList) histomatic(NoFlagMod0,RandMastMod0,NoMastMod0,var)
>
>I knew I wanted a pretty fine grained display, so I set breaks at 40.
>Other than that, I don't know for sure what else I want.
>
>Here's the problem:
>The histograms shown do not have the same ranges.  Since the datasets
>are slightly different, the ranges displayed are different, so they are
>difficult to compare visually.  Is there a solution?
>Paul E. Johnson                       email: pauljohn at ukans.edu

I can solve this problem, not the other one.  Here is an example,
the results of which are about half way down the page on:
http://www.psych.upenn.edu/courses/psych1_Fall2000/a4help.htm

The critical thing is using the different format for "breaks,"
which specifies the cutoffs rather than the number of them.  This
way you can use the same cutoffs for all the histograms.

par(mfrow=c(4,1))
hist(female$current,breaks=10*1:10,freq=FALSE,ylim=c(0,.1))
hist(female$ideal,breaks=10*1:10,freq=FALSE,ylim=c(0,.1))
hist(female$attract,breaks=10*1:10,freq=FALSE,ylim=c(0,.1))
hist(male$otherattract,breaks=10*1:10,freq=FALSE,ylim=c(0,.1))
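
The same idea on data anyone can run, as a minimal sketch: the two samples are simulated, and the bin width of 2.5 and the names `a`, `b` and `cuts` are arbitrary choices for the illustration. One vector of cutoffs is built to cover both samples and is then passed to every call.

```r
# Simulated stand-ins for two of the datasets (hypothetical data).
set.seed(1)
a <- rnorm(200, mean = 50, sd = 10)
b <- rnorm(200, mean = 55, sd = 12)

# One vector of cutoffs, in steps of 2.5, that covers both samples.
lo <- floor(min(a, b) / 2.5) * 2.5
hi <- ceiling(max(a, b) / 2.5) * 2.5
cuts <- seq(lo, hi, by = 2.5)

# Both histograms use exactly the same bins.
par(mfrow = c(2, 1))
ha <- hist(a, breaks = cuts, freq = FALSE, main = "a")
hb <- hist(b, breaks = cuts, freq = FALSE, main = "b")
```

The cutoffs have to span all of the data, otherwise hist() stops with an error. Adding a shared ylim, as in the calls above, makes the bar heights comparable as well.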

Jon Baron


Prof Brian D Ripley

2001-Jun-20 06:36 UTC

[R] comparing 3 datasets

On Tue, 19 Jun 2001, Deepayan Sarkar wrote:

[...]
> An xlim argument should set the same limits for all
> the
> histograms. Change your function to:

Unfortunately not so.  The critical lines in hist.default are

        rx <- range(x)
        breaks <- pretty(rx, n = nnb, min.n = 1)

so the breaks depend on the range of the data and not on xlim.
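
A quick way to see this for yourself, sketched on two simulated samples (the data and the names `a`, `b`, `ha`, `hb` are made up for the illustration): ask hist() for the breaks it chose without drawing anything. xlim is left out here because it only affects the plotted axis.

```r
# Two simulated samples with different ranges (hypothetical data).
set.seed(2)
a <- rnorm(200, mean = 0)
b <- rnorm(200, mean = 3)

# plot = FALSE returns the histogram object without drawing it.
ha <- hist(a, breaks = 40, plot = FALSE)
hb <- hist(b, breaks = 40, plot = FALSE)

# The breaks follow each sample's own range, so they do not match.
range(ha$breaks)
range(hb$breaks)
identical(ha$breaks, hb$breaks)
```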

Jon Baron had the better idea: use a grid for breaks.  So something like

rz <- range(s1[[var]], s2[[var]], s3[[var]])
breaks <- pretty(rz, 40)

...
hist(s1[[var]], breaks, xlab=var)
>
>
> histomatic <- function (s1,s2,s3,var){
>    if (is.numeric (s2[[var]])) {
>    par(mfrow=c(3,1));
>    xlim <- range(s1[[var]], s2[[var]], s3[[var]]);
>    hist(s1[[var]], breaks=40,xlab=var, xlim = xlim);
>    hist(s2[[var]], breaks=40,xlab=var, xlim = xlim);
>    hist(s3[[var]], breaks=40,xlab=var, xlim = xlim);
> }
>
[...]
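
Putting the two-line sketch above into the whole function might look like the following. This is only one possible arrangement, not tested on the original files: the argument `n`, the na.rm = TRUE, the par() restore and the simulated data frames `d1`, `d2`, `d3` are additions for the example.

```r
histomatic <- function(s1, s2, s3, var, n = 40) {
  if (!is.numeric(s2[[var]])) return(invisible(NULL))
  # Compute the breaks once from the pooled range, then reuse them.
  rz <- range(s1[[var]], s2[[var]], s3[[var]], na.rm = TRUE)
  breaks <- pretty(rz, n)
  old <- par(mfrow = c(3, 1))
  on.exit(par(old))  # restore the previous layout on exit
  for (s in list(s1, s2, s3)) {
    hist(s[[var]], breaks = breaks, xlab = var, main = "")
  }
  invisible(breaks)
}

# Simulated stand-ins for the three tables (hypothetical data).
set.seed(3)
d1 <- data.frame(x = rnorm(100))
d2 <- data.frame(x = rnorm(100, mean = 1))
d3 <- data.frame(x = rnorm(100, mean = -1))
b <- histomatic(d1, d2, d3, "x")
```

Because the breaks are fixed, the x axes come out the same without any xlim. Note that pretty() treats 40 as a target, so the actual number of bins can differ somewhat.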

Another idea: use truehist() in package MASS, which wants the grid spacing,
not the number of breaks.
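
A minimal sketch of the truehist() route, on simulated data with an arbitrary bin width of 0.25: giving every call the same spacing h and the same origin x0 should put the bins on one common grid, and a shared xlim lines up the axes. The data and the particular values of `h` and `x0` are assumptions for the example.

```r
library(MASS)

# Two simulated samples (hypothetical data).
set.seed(5)
a <- rnorm(200)
b <- rnorm(200, mean = 1)

xlim <- range(a, b)

# Same bin width (h) and same origin (x0) in both calls.
par(mfrow = c(2, 1))
truehist(a, h = 0.25, x0 = 0, xlim = xlim)
truehist(b, h = 0.25, x0 = 0, xlim = xlim)
```

truehist() plots densities by default (prob = TRUE), so the heights are on the same scale even when the sample sizes differ.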

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272860 (secr)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
