thr3ads.net - R help - [R] strange behavior of names<- [May 2004]

Liaw, Andy

2004-May-10 02:41 UTC

[R] strange behavior of names<-

Dear R-help,

I've encountered what seems to me a strange problem with "names<-".  Suppose I
define the function:

fun <- function(x, f) {
    m <- tapply(x, f, mean)
    ans <- x - m[match(f, unique(f))]
    names(ans) <- names(x)
    ans
}

which subtracts out the means of `x' grouped by `f' (which is the same as,
e.g., resid(lm(x~f)) if `f' is a factor).  If `x' does not have names, then
I'd expect the output of the function not to have names, as names(x) would
be NULL, and assigning NULL to names(ans) should wipe out the names of
`ans'.  However, I get:
> x = rnorm(20)
> f = factor(sample(rep(letters[1:4], 5)))
> fun(x, f)
          a           b           c           b           c           c           d 
-0.53791639  1.03704065  0.95727411  0.89219177 -0.04218746  0.57976675 -2.15799919 
          a           c           d           a           d           b           d 
 1.28422452 -0.92881186  0.40526262 -0.13471983 -0.72599709  1.68726680 -0.95420354 
          a           c           a           b           b           d 
-2.28013373  1.02522037  0.07728352  0.54321899  0.95742354 -1.68420455 

What am I missing?

[BTW, this is using the tip that Thomas Lumley posted about forming the
group means.  I've wanted to write a `tsweep' function that's sort of the
cross of tapply() and sweep().]

Best,
Andy Liaw, PhD
Biometrics Research      PO Box 2000, RY33-300     
Merck Research Labs           Rahway, NJ 07065
mailto:andy_liaw at merck.com        732-594-0820

Gabor Grothendieck

2004-May-10 04:58 UTC

[R] strange behavior of names<-

Execute these two commands:
   ans <- fun(x,f)
   attributes(ans)

and you get this:

   $dim
   [1] 20

   $dimnames
   $dimnames[[1]]
    [1] "a" "a" "b" "c" "a" "d" "a" "b" "d" "d" "a" "b" "d" "c" "c" "c" "b" "c" "b"
   [20] "d"

so ans does not have names, it has dimnames.  If you try dimnames(ans) <- NULL
then its dimnames do get nulled out.
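
As a minimal sketch of the check described above (the seed is arbitrary and not from the thread, so the values and labels will differ from those printed here), one can look at what tapply() returns and at which attributes survive the subtraction. Whether names(ans) <- NULL also clears these labels may depend on the R version, so the sketch only uses dimnames(), the explicit route.

```r
set.seed(1)                                 # arbitrary seed, not from the thread
x <- rnorm(20)
f <- factor(sample(rep(letters[1:4], 5)))

m <- tapply(x, f, mean)                     # group means
is.array(m)                                 # a 1-d array, not a plain vector
dim(m)
dimnames(m)                                 # the group labels are stored here

ans <- x - m[match(f, unique(f))]           # same expression as in fun()
str(attributes(ans))                        # shows dim and dimnames

dimnames(ans) <- NULL                       # removes the labels
str(attributes(ans))                        # only dim is left
```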



Prof Brian Ripley

2004-May-10 06:21 UTC

[R] strange behavior of names<-

Remember tapply with a single factor in R returns a 1D array.  What you
are seeing are the dimnames, not the names: look at attributes() on your
return value (or even names() or str() on it).

I suspect you intended an as.vector() call in the formation of m.
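
A sketch of that suggestion, assuming f is a factor: as.vector() strips dim and dimnames from the tapply() result, so everything computed from it is a plain vector. The name fun2 is hypothetical, and it indexes by as.integer(f) (the level codes) rather than match(f, unique(f)); that is a substitution made here for illustration, not part of the original function.

```r
# Hypothetical variant of fun(); assumes f is a factor.
fun2 <- function(x, f) {
    stopifnot(is.factor(f))
    m <- as.vector(tapply(x, f, mean))  # plain numeric vector, in level order
    ans <- x - m[as.integer(f)]         # level codes pick each element's group mean
    names(ans) <- names(x)              # NULL when x has no names
    ans
}

set.seed(2)                             # arbitrary seed, not from the thread
x <- rnorm(20)
f <- factor(sample(rep(letters[1:4], 5)))

res <- fun2(x, f)
attributes(res)                         # no dim, dimnames or names
all.equal(res, x - ave(x, f))           # ave() gives the per-element group means
```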

Brian

On Sun, 9 May 2004, Liaw, Andy wrote:
> I've encountered what seems to me a strange problem with "names<-".  Suppose I
> define the function:
> 
> fun <- function(x, f) {
>     m <- tapply(x, f, mean)
>     ans <- x - m[match(f, unique(f))]
>     names(ans) <- names(x)
>     ans
> }
> 
> which subtracts out the means of `x' grouped by `f' (which is the same as,
> e.g., resid(lm(x~f)) if `f' is a factor).  If `x' does not have names, then
> I'd expect the output of the function not to have names, as names(x) would
> be NULL, and assigning NULL to names(ans) should wipe out the names of
> `ans'.  However, I get:
> 
> > x = rnorm(20)
> > f = factor(sample(rep(letters[1:4], 5)))
> > fun(x, f)
>           a           b           c           b           c           c           d 
> -0.53791639  1.03704065  0.95727411  0.89219177 -0.04218746  0.57976675 -2.15799919 
>           a           c           d           a           d           b           d 
>  1.28422452 -0.92881186  0.40526262 -0.13471983 -0.72599709  1.68726680 -0.95420354 
>           a           c           a           b           b           d 
> -2.28013373  1.02522037  0.07728352  0.54321899  0.95742354 -1.68420455 
> 
> What am I missing?

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

Peter Dalgaard

2004-May-10 07:33 UTC

[R] strange behavior of names<-

"Liaw, Andy" <andy_liaw at merck.com> writes:
> [BTW, this is using the tip that Thomas Lumley posted about forming the
> group means.  I've wanted to write a `tsweep' function that's sort of the
> cross of tapply() and sweep().]

Also notice that this is

unsplit(lapply(split(x, g), scale, scale=FALSE), g)

and the generalized sweep might be written along the lines of

unsplit(mapply("-", split(x,g), tapply(x,g,mean), SIMPLIFY=FALSE), g)

Can't vouch for the speed, though.
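
A small check of the two split()/unsplit() forms, offered as a sketch with made-up data (the seed and the names centred1 and centred2 are illustrative only). By default mapply() simplifies equal-length results into a matrix, which unsplit() does not expect, so the sketch passes SIMPLIFY = FALSE to keep a list with one component per group.

```r
set.seed(3)                                  # arbitrary seed, not from the thread
x <- rnorm(20)
g <- factor(sample(rep(letters[1:4], 5)))    # four groups of equal size

# Centre each group with scale(), then put the pieces back in original order.
centred1 <- unsplit(lapply(split(x, g), scale, scale = FALSE), g)

# Same idea with an explicit per-group statistic; SIMPLIFY = FALSE keeps a list.
centred2 <- unsplit(mapply("-", split(x, g), tapply(x, g, mean),
                           SIMPLIFY = FALSE), g)

all.equal(centred1, centred2)
all.equal(centred1, x - ave(x, g))           # cross-check against ave()
```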

-- 
   O__  ---- Peter Dalgaard             Blegdamsvej 3  
  c/ /'_ --- Dept. of Biostatistics     2200 Cph. N   
 (*) \(*) -- University of Copenhagen   Denmark      Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)             FAX: (+45) 35327907

Christophe Pallier

2004-May-10 08:09 UTC

[R] Getting the groupmean for each person

Liaw, Andy wrote:
>Suppose I
>define the function:
>
>fun <- function(x, f) {
>    m <- tapply(x, f, mean)
>    ans <- x - m[match(f, unique(f))]
>    names(ans) <- names(x)
>    ans
>}
>
>  
>
May I ask what is the purpose of match(f,unique(f)) ?

To remove the group means, I have been using:

x-tapply(x,f,mean)[f]

for a while, (and I am now changing to 
x-tapply(x,f,mean)[as.character(f)] because of the peculiarities of 
indexing named vectors with factors )

The use of tapply(x,f,mean)[match(f,unique(f))] assumes a particular 
order in the result of tapply, no? It seems a bit dangerous to me.
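
The ordering concern can be illustrated with a tiny made-up example in which the groups do not first appear in level order (the data and the names wrong and right are hypothetical). tapply() returns the means in the order of levels(f), while unique(f) lists the groups in order of first appearance, so the two indexings disagree:

```r
x <- c(10, 11, 1, 2, 5, 6)
f <- factor(c("c", "c", "a", "a", "b", "b"))   # first appearance: c, a, b

m <- tapply(x, f, mean)            # ordered by levels: a, b, c
levels(f)
unique(f)

wrong <- m[match(f, unique(f))]    # treats m as if it were in c, a, b order
right <- m[as.character(f)]        # looks each mean up by its label

as.vector(wrong)                   # 1.5 1.5 5.5 5.5 10.5 10.5
as.vector(right)                   # 10.5 10.5 1.5 1.5 5.5 5.5
```

When the levels happen to appear in sorted order the two agree, which is why the problem can go unnoticed.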


Christophe Pallier

kjetil@acelerate.com

2004-May-10 10:41 UTC

[R] Getting the groupmean for each person

On 10 May 2004 at 10:09, Christophe Pallier wrote:
> 
> 
> Liaw, Andy wrote:
> 
> >Suppose I
> >define the function:
> >
> >fun <- function(x, f) {
> >    m <- tapply(x, f, mean)
> >    ans <- x - m[match(f, unique(f))]
> >    names(ans) <- names(x)
> >    ans
> >}
> >
> >  
> >
> 
> May I ask what is the purpose of match(f,unique(f)) ?
> 
> To remove the group means, I have been using:
> 
> x-tapply(x,f,mean)[f]
> 
> for a while, (and I am now changing to 
> x-tapply(x,f,mean)[as.character(f)] because of the peculiarities of
> indexing named vectors with factors )

Wouldn't

 sweep(as.array(x), 1, tapply(x,f,mean)[as.character(f)] , "-")

be more natural?

Kjetil Halvorsen

> 
> The use of tapply(x,f,mean)[match(f,unique(f))] assumes a particular
> order in the result of tapply, no? It seems a bit dangerous to me.
> 
> 
> Christophe Pallier

Thomas Lumley

2004-May-10 13:56 UTC

[R] strange behavior of names<-

On Sun, 9 May 2004, Liaw, Andy wrote:
> Dear R-help,
>
> I've encountered what seems to me a strange problem with "names<-".  Suppose I
> define the function:
>
> fun <- function(x, f) {
>     m <- tapply(x, f, mean)
>     ans <- x - m[match(f, unique(f))]
>     names(ans) <- names(x)
>     ans
> }
>
> which subtracts out the means of `x' grouped by `f' (which is the same as,
> e.g., resid(lm(x~f)) if `f' is a factor).  If `x' does not have names, then
> I'd expect the output of the function not to have names, as names(x) would
> be NULL, and assigning NULL to names(ans) should wipe out the names of
> `ans'.  However, I get:

That's because ans is a 1-d array, not a vector. If you want ans to be a
vector you need
	ans <- as.vector(x-m[match(f, unique(f))])
	names(ans)<-names(x)

	-thomas

Thomas Lumley

2004-May-10 13:59 UTC

[R] Getting the groupmean for each person

On Mon, 10 May 2004, Christophe Pallier wrote:
>
> The use of tapply(x,f,mean)[match(f,unique(f))] assumes a particular
> order in the result of tapply, no? It seems a bit dangerous to me.
>
My original code for the group means problem used rowsum(,reorder=FALSE)
rather than tapply(), and we do know that this produces the same order as
unique().
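
The original group-means code is not quoted in this thread, so the following is only a sketch of the rowsum() idea on made-up data (the names sums, counts and centred are hypothetical). With reorder = FALSE the rows come out in order of first appearance, which is the order unique() uses, so match(f, unique(f)) indexes them correctly:

```r
x <- c(10, 11, 1, 2, 5, 6)
f <- factor(c("c", "c", "a", "a", "b", "b"))   # first appearance: c, a, b

sums   <- rowsum(x, f, reorder = FALSE)                  # one-column matrix of group sums
counts <- rowsum(rep(1, length(x)), f, reorder = FALSE)  # group sizes, same row order
rownames(sums)                                           # "c" "a" "b"

m <- as.vector(sums / counts)                  # group means, in unique(f) order
centred <- x - m[match(f, unique(f))]
centred                                        # -0.5 0.5 -0.5 0.5 -0.5 0.5
```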

	-thomas
