thr3ads.net - R help - [R] Data manipulation question [Oct 2007]

Julien Barnier

2007-Oct-10 10:10 UTC

[R] Data manipulation question

Hi all,

Suppose I have the following data.frame, with an id column and two
variable columns:

id        X       Y
0001      NA      21
0002      NA      13
0003      0001    45
0004      NA      71
0005      0003    20

What I would like to do is create a new variable Z whose value in each
row is the Y value of the row whose id equals that row's X, that is:

id        X       Y          Z
0001      NA      21         NA
0002      NA      13         NA
0003      0001    45         21
0004      NA      71         NA
0005      0003    20         45

Do you have an idea of how to obtain that without using a for loop?

Thanks in advance for any help,

Julien



Here is the R code to reproduce the first data.frame:

id <- c("0001","0002","0003","0004","0005")
x <- c(NA, NA, "0001", NA, "0003")
y <- c(21,13,45,71,20)
d <- data.frame(id,x,y)



-- 
Julien Barnier
Groupe de recherche sur la socialisation
ENS-LSH - Lyon, France

Petr PIKAL

2007-Oct-10 11:07 UTC

[R] Re: Data manipulation question

Hi

r-help-bounces at r-project.org wrote on 10.10.2007 12:10:29:
> Hi all,
> 
> Suppose I have the following data.frame, with an id column and two
> variables columns :
> 
> id        X       Y
> 0001      NA      21
> 0002      NA      13
> 0003      0001    45
> 0004      NA      71
> 0005      0003    20
> 
> What I would like to do is to create a new variable Z whose values are
> the Y value for the id value in X, that is :
> 
> id        X       Y          Z
> 0001      NA      21         NA
> 0002      NA      13         NA
> 0003      0001    45         21
> 0004      NA      71         NA
> 0005      0003    20         45
> 
> Do you have an idea on how to obtain that without using a for loop ?

d$z <- NA
d$z[d$x %in% d$id] <- d$y[d$id %in% d$x]

This works in this particular case, but only as long as neither id nor X
contains repeated values.

Regards
Petr
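
A small sketch of why the %in% approach is fragile, using hypothetical data that is not from the thread (the names d2, z_in and z_expected are made up for illustration). Both logical masks select rows in their original order, so the values are paired by position rather than by id. When the references do not appear in the same order as the ids, the values end up in the wrong rows:

```r
# Hypothetical data (not from the thread): row 1 refers to id 0003,
# and row 3 refers to id 0001
d2 <- data.frame(id = c("0001", "0002", "0003"),
                 x  = c("0003", NA, "0001"),
                 y  = c(10, 20, 30),
                 stringsAsFactors = FALSE)

# The %in% approach pairs the matched rows by position, not by id
z_in <- rep(NA_real_, nrow(d2))
z_in[d2$x %in% d2$id] <- d2$y[d2$id %in% d2$x]

# What the lookup is meant to give: y of the row whose id equals x
z_expected <- c(30, NA, 10)

print(data.frame(d2, z_in, z_expected))
```

Here z_in and z_expected disagree in rows 1 and 3, so the approach should probably be restricted to data where the order of the references happens to follow the order of the ids.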
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

Julien Barnier

2007-Oct-10 11:55 UTC

[R] Re: Data manipulation question

Hi Petr,
> d$z<-NA
> d$z[d$x %in% d$id] <- d$y[d$id %in% d$x]
>
> works in this particular case but it means you do not have multiple same 
> ids and X
Thanks for the idea. But the problem is that I can have multiple
ids...

In fact, in the meantime I found a solution using row names:

R> d
    id    x  y
1 0001 <NA> 21
2 0002 <NA> 13
3 0003 0001 45
4 0004 <NA> 71
5 0005 0003 20

R> rownames(d) <- d$id
R> d$z <- NA
R> d$z <- d[d$x,"y"]
R> d
       id    x  y  z
0001 0001 <NA> 21 NA
0002 0002 <NA> 13 NA
0003 0003 0001 45 21
0004 0004 <NA> 71 NA
0005 0005 0003 20 13


Thanks for your help,

Julien

-- 
Julien Barnier
Groupe de recherche sur la socialisation
ENS-LSH - Lyon, France

Gabor Grothendieck

2007-Oct-10 12:38 UTC

[R] Data manipulation question

Try this:

transform(d, z = y[match(x, id)])
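
To make the suggestion above concrete, here is a minimal sketch of the match() lookup run on the example data from the original post. The stringsAsFactors = FALSE argument and the names pos and res are additions for illustration only. match(x, id) returns, for each element of x, the position of its first match in id (NA when there is none), so y[match(x, id)] picks the y value of the referenced row:

```r
id <- c("0001", "0002", "0003", "0004", "0005")
x  <- c(NA, NA, "0001", NA, "0003")
y  <- c(21, 13, 45, 71, 20)
d  <- data.frame(id, x, y, stringsAsFactors = FALSE)

# Position in id of each value of x (NA where x is NA or has no match)
pos <- match(d$x, d$id)

# Look up y at those positions; same expression as the transform() call above
res <- transform(d, z = y[match(x, id)])
print(res)
```

Each row is looked up independently, so repeated values in x are not a problem; if id itself contains repeats, match() uses the first matching row. match() converts factors to character before comparing, so the same call should also work when id and x are stored as factors.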



Petr PIKAL

2007-Oct-10 13:03 UTC

[R] Re: Data manipulation question

r-help-bounces at r-project.org wrote on 10.10.2007 13:55:32:
> Hi Petr,
> 
> > d$z<-NA
> > d$z[d$x %in% d$id] <- d$y[d$id %in% d$x]
> >
> > works in this particular case but it means you do not have multiple same
> > ids and X
> 
> Thanks for the idea. But the problem is that I can have multiple
> ids...
> 
> In fact in the meantime I found a solution by using row names :

Are you sure?

> 
> R> d
>     id    x  y
> 1 0001 <NA> 21
> 2 0002 <NA> 13
> 3 0003 0001 45
> 4 0004 <NA> 71
> 5 0005 0003 20
> 
> R> rownames(d) <- d$id
> R> d$z <- NA
> R> d$z <- d[d$x,"y"]
> R> d
>        id    x  y  z
> 0001 0001 <NA> 21 NA
> 0002 0002 <NA> 13 NA
> 0003 0003 0001 45 21
> 0004 0004 <NA> 71 NA
> 0005 0005 0003 20 13

Why 13 in row 5? And using your code, my result is:

> d
    id    x  y
1 0001 <NA> 21
2 0002 <NA> 13
3 0003 0001 45
4 0004 <NA> 71
5 0005 0003 20
> d$z <- NA
> rownames(d) <- d$id
> d$z <- d[d$x,"y"]
> d
       id    x  y  z
0001 0001 <NA> 21 21
0002 0002 <NA> 13 21
0003 0003 0001 45 13
0004 0004 <NA> 71 21
0005 0005 0003 20 45

Regards
Petr
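
A possible explanation for the 13 in row 5 of Julien's output, offered as a sketch rather than a confirmed diagnosis: if x is stored as a factor (the data.frame() default before R 4.0.0, made explicit below with stringsAsFactors = TRUE), then d[d$x, "y"] indexes rows by the factor's integer codes instead of by row name. The names codes, z_by_factor and z_by_name are made up for illustration, and the sketch does not try to account for the different result in the second transcript above:

```r
id <- c("0001", "0002", "0003", "0004", "0005")
x  <- c(NA, NA, "0001", NA, "0003")
y  <- c(21, 13, 45, 71, 20)

# Assumption: x is a factor, as data.frame() produced by default before R 4.0.0
d <- data.frame(id, x, y, stringsAsFactors = TRUE)
rownames(d) <- as.character(d$id)

# The levels of x are "0001" and "0003", so its integer codes are 1 and 2
codes <- as.integer(d$x)

# A factor used as a row index is treated as those codes, so row 5 gets
# the y of row 2 instead of the y of the row named "0003"
z_by_factor <- d[d$x, "y"]

# Converting to character makes the index refer to row names instead
z_by_name <- d[as.character(d$x), "y"]

print(data.frame(d, z_by_factor, z_by_name))
```

Under that assumption, wrapping the index in as.character() is what makes the row-name approach return the intended values.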
