I am exploring the result of clustering a large multivariate data set 
into a number of groups, represented, say, by a factor G.
I wrote a function to see how categorical variables vary between groups:
 >  ddisp <- function(dvar) {
+  csqt <- chisq.test(G,dvar)
+  print(csqt$statistic)
+  print(csqt$observed)
+  print(round(csqt$expected))
+  round(csqt$residuals)
+  }
 >
 >  x <- ceiling(4*runif(100))
 >  G <- gl(4,1,100)
 >  ddisp(x)
X-squared
  6.235645
    dvar
G    1  2  3  4
   1 10  5  5  5
   2  6  9  5  5
   3  8  6  5  6
   4  7  4  4 10
    dvar
G   1 2 3 4
   1 8 6 5 6
   2 8 6 5 6
   3 8 6 5 6
   4 8 6 5 6
    dvar
G    1  2  3  4
   1  1  0  0 -1
   2 -1  1  0 -1
   3  0  0  0  0
   4  0 -1  0  1
Warning message:
Chi-squared approximation may be incorrect in: chisq.test(G, dvar)
As I need to apply this function to a large number of variables x it 
would be helpful if the function printed "x" rather than the formal 
argument "dvar". I have a vague idea that things like deparse() and 
substitute() will come into the solution but I have not yet come up with 
the right incantation. Any help appreciated!
Murray Jorgensen
-- 
Dr Murray Jorgensen      http://www.stats.waikato.ac.nz/Staff/maj.html
Department of Statistics, University of Waikato, Hamilton, New Zealand
Email: maj at waikato.ac.nz                                Fax 7 838 4155
Phone  +64 7 838 4773 wk    Home +64 7 825 0441    Mobile 021 1395 862
ddisp <- function(dvar) {
     yn <- substitute(dvar)
     csqt <- eval.parent(substitute(chisq.test(G,dvar), list(dvar=yn)))
     ....
}
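
A minimal sketch of how the function above might look when filled in. It assumes the "...." stands for the print statements of the original ddisp, which the reply does not actually state; the set.seed() call is likewise an addition, only there to make the run reproducible. The idea is that the call chisq.test(G, x) is built with the caller's own expression and evaluated in the caller's frame, so chisq.test labels the table dimension with "x".

```r
# Assumes a grouping factor G exists where ddisp() is called from,
# as in the original post.
ddisp <- function(dvar) {
    # The unevaluated expression the caller supplied, e.g. the symbol x
    yn <- substitute(dvar)
    # Build the call chisq.test(G, x) and evaluate it in the caller's frame,
    # so chisq.test deparses "x" rather than "dvar" for the table labels
    csqt <- eval.parent(substitute(chisq.test(G, dvar), list(dvar = yn)))
    # Assumed body: the same output as the original function
    print(csqt$statistic)
    print(csqt$observed)
    print(round(csqt$expected))
    round(csqt$residuals)
}

set.seed(1)                      # arbitrary seed (an assumption, for reproducibility)
x <- ceiling(4 * runif(100))
G <- gl(4, 1, 100)
res <- ddisp(x)                  # tables are now headed "x" instead of "dvar"
names(dimnames(res))
```

Because the call is evaluated in the parent frame, both G and the variable passed in must be visible there.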
There are other ways, such as forming the cross-classification table, 
setting its dimnames and passing that to chisq.test.
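
The table-based alternative could be sketched roughly as follows. The name ddisp2 and the local variable vname are hypothetical, chosen only for illustration, and the printing lines are carried over from the original ddisp. The label is captured with deparse(substitute()) and written into the names of the table's dimnames before the table is handed to chisq.test.

```r
# Assumes a grouping factor G is visible from where ddisp2 is defined,
# as in the original post. ddisp2 and vname are illustrative names.
ddisp2 <- function(dvar) {
    # Text of the expression the caller passed, e.g. "x"
    vname <- deparse(substitute(dvar))
    # Cross-classify, then relabel the second dimension
    tab <- table(G, dvar)
    names(dimnames(tab)) <- c("G", vname)
    # chisq.test on a two-way table keeps the table's dimnames
    csqt <- chisq.test(tab)
    print(csqt$statistic)
    print(csqt$observed)
    print(round(csqt$expected))
    round(csqt$residuals)
}

set.seed(1)                      # arbitrary seed (an assumption, for reproducibility)
x <- ceiling(4 * runif(100))
G <- gl(4, 1, 100)
res2 <- ddisp2(x)
names(dimnames(res2))
```

This version avoids evaluating a constructed call in the parent frame, at the cost of building the table by hand.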
On Mon, 13 Nov 2006, Murray Jorgensen wrote:
> As I need to apply this function to a large number of variables x it
> would be helpful if the function printed "x" rather than the formal
> argument "dvar". I have a vague idea that things like deparse() and
> substitute() will come into the solution but I have not yet come up with
> the right incantation. Any help appreciated!
>
> Murray Jorgensen
>
>
-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
Thanks for these suggestions, Professor Ripley. It's interesting that the
function parameters in R are not truly "dummy", as they can affect the
result of a function.

Murray

Prof Brian Ripley wrote:
> ddisp <- function(dvar) {
>      yn <- substitute(dvar)
>      csqt <- eval.parent(substitute(chisq.test(G,dvar), list(dvar=yn)))
>      ....
> }
>
> There are other ways, such as forming the cross-classification table,
> setting its dimnames and passing that to chisq.test.
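
The point about parameters not being truly "dummy" can be seen in a very small sketch. The function name show_arg is hypothetical, used only for illustration: substitute() returns the expression the caller wrote for the argument, and deparse() turns it into text, so the result depends on how the function was called and not only on the value passed.

```r
# show_arg is an illustrative name, not part of the thread
show_arg <- function(dvar) {
    # The caller's expression for dvar, as a character string
    deparse(substitute(dvar))
}

x <- 1:3
a <- show_arg(x)        # "x"
b <- show_arg(x + 1)    # "x + 1"
print(c(a, b))
```

This is the same mechanism chisq.test uses to label the dimensions of the table it builds from two vectors.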