thr3ads.net - R help - [R] creating a new variable, conditional on the value of an existing variable, selected conditionally [Jun 2010]

Malcolm Fairbrother

2010-Jun-09 14:03 UTC

[R] creating a new variable, conditional on the value of an existing variable, selected conditionally

Dear all,

I have a data frame f, with four variables:

f <- data.frame(A=c(0,0,1,1), B=c(0,1,0,1), C=c(1,1,0,1), D=c(3,1,2,3))
f
  A B C D
1 0 0 1 3
2 0 1 1 1
3 1 0 0 2
4 1 1 1 3

I want to create a new variable (f$E), such that each of its elements is drawn
from either f$A, f$B, or f$C, according to the value (for each row) of f$D
(values of which range from 1 to 3).

In the first row, D is 3, so I want the value from the third variable (C), which
for the first row is 1. In the second row, D is 1, so I want the value from the
first variable (A), which for the second row is 0. And so forth, such that in
the end my new data frame looks like:

  A B C D E
1 0 0 1 3 1
2 0 1 1 1 0
3 1 0 0 2 0
4 1 1 1 3 1

My question is: How do I do this for a much larger dataset, where my "index
variable" (f$D in this example) actually indexes a much larger number of
variables (not just three)?

I know that in principle I could do this with a long series of nested ifelse
statements (as below), but I assume there is some less cumbersome option, and
I'd like to know what it is. Any help would be much appreciated. Apologies
if I'm missing something obvious.

f$E <- ifelse(f$D==3, f$C, ifelse(f$D==2, f$B, f$A))

Thanks,
Malcolm

Erik Iverson

2010-Jun-09 16:55 UTC

Can your data.frame be properly coerced to a matrix like your example?

If so,

apply(f, 1, function(x) x[eval(x)["D"]])
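
A note on how this one-liner works, as a minimal sketch on the example data from the original post. `apply()` coerces the data frame to a matrix and hands each row to the function as a named vector, which is why the question about coercion matters: if `f` also held a character column, the matrix would be character and a positional lookup would no longer behave this way. Calling `eval()` on a plain vector simply returns the vector, so the version below drops it. The name `E_apply` is only illustrative.

```r
f <- data.frame(A = c(0, 0, 1, 1), B = c(0, 1, 0, 1),
                C = c(1, 1, 0, 1), D = c(3, 1, 2, 3))

# Each row arrives as a named numeric vector x.
# x["D"] is that row's index value; x[x["D"]] picks the element at that position.
E_apply <- apply(f, 1, function(x) x[x["D"]])

# Drop any names so only the selected values remain.
E_apply <- unname(E_apply)
E_apply
```

This assumes every column of `f` is numeric and that `D` counts columns from the left, as in the example.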

Doran, Harold

2010-Jun-09 17:15 UTC

How about this:

f <- data.frame(A=c(0,0,1,1), B=c(0,1,0,1), C=c(1,1,0,1), D=c(3,1,2,3))

N <- nrow(f)

mat <- cbind(1:N,f$D)

f$E <- f[mat]

f
  A B C D E
1 0 0 1 3 1
2 0 1 1 1 0
3 1 0 0 2 0
4 1 1 1 3 1
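
For readers wondering why this works: indexing with a two-column numeric matrix treats each row of that matrix as a (row, column) pair and returns one element per pair. The sketch below spells that out and, as an assumption that is not part of the original reply, restricts the lookup to an explicit set of candidate columns (the hypothetical vector `candidates`), so that `D` can never point at itself or at an unrelated column. The same code applies unchanged when there are many more than three candidates.

```r
f <- data.frame(A = c(0, 0, 1, 1), B = c(0, 1, 0, 1),
                C = c(1, 1, 0, 1), D = c(3, 1, 2, 3))

# Hypothetical helper: the columns that D is allowed to point at, in order.
candidates <- c("A", "B", "C")

# One (row, column) pair per row of f.
idx <- cbind(seq_len(nrow(f)), f$D)

# Matrix indexing returns exactly one element for each pair in idx.
f$E <- as.matrix(f[candidates])[idx]
f
```

Here `D` is read as a position within `candidates`, which matches the example only because A, B and C are the first three columns.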

Henrique Dallazuanna

2010-Jun-09 20:49 UTC

Try this:

 f$E <- diag(as.matrix(f[f$D]))
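
A short sketch of what this expression does, again on the example data. `f[f$D]` selects one whole column per row of `f`, so the intermediate matrix has as many columns as `f` has rows, and the wanted values sit on its diagonal. The names `wide` and `E_diag` are illustrative only.

```r
f <- data.frame(A = c(0, 0, 1, 1), B = c(0, 1, 0, 1),
                C = c(1, 1, 0, 1), D = c(3, 1, 2, 3))

# Columns picked in the order given by D: C, A, B, C (one per row of f).
wide <- as.matrix(f[f$D])
dim(wide)

# Element [i, i] is row i of the column that D chose for row i.
E_diag <- unname(diag(wide))
E_diag
```

Because the intermediate matrix grows with the square of the number of rows, this approach is likely to use far more memory and time on large data than direct matrix indexing.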

-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O

Dennis Murphy

2010-Jun-10 11:02 UTC

Hi:

I had Harold's idea (matrix indexing), but I was curious to see which of
these ran fastest. I simulated 1000 rows and three columns of binary data,
along with a fourth column that sampled the values 1:3 1000 times. Here are
the timings:

> f <- as.data.frame(matrix(rbinom(3000, 1, 0.4), nrow = 1000))
> names(f) <- LETTERS[1:3]
> f$D <- sample(1:3, 1000, replace = TRUE)
> system.time(E1 <- f[cbind(1:nrow(f), f$D)])
   user  system elapsed
      0       0       0
> system.time(E2 <- apply(f, 1, function(x) x[eval(x)["D"]]))
   user  system elapsed
   0.03    0.00    0.03
> system.time(E3 <- diag(as.matrix(f[f$D])))
   user  system elapsed
   0.26    0.03    0.30
> identical(E1, E2)
[1] TRUE
> identical(E2, E3)
[1] TRUE
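
For anyone who wants to rerun the comparison, here is a script version of the same experiment as a rough sketch. The seed value and the names `n`, `t1`, `t2`, `t3` are assumptions added for repeatability and are not from the session above. Timings depend on the machine and R version, so no expected numbers are shown.

```r
set.seed(1)  # arbitrary seed, only so the simulated data are repeatable
n <- 1000

# Three binary columns plus an index column D with values in 1:3.
f <- as.data.frame(matrix(rbinom(3 * n, 1, 0.4), nrow = n))
names(f) <- LETTERS[1:3]
f$D <- sample(1:3, n, replace = TRUE)

t1 <- system.time(E1 <- f[cbind(seq_len(n), f$D)])           # matrix indexing
t2 <- system.time(E2 <- apply(f, 1, function(x) x[x["D"]]))  # row-wise apply
t3 <- system.time(E3 <- diag(as.matrix(f[f$D])))             # diagonal of n x n matrix

# Compare values only; unname() guards against differing names attributes.
all.equal(unname(E1), unname(E2))
all.equal(unname(E2), unname(E3))

# User, system and elapsed times side by side.
rbind(matrix_index = t1, apply = t2, diag = t3)[, 1:3]
```

Raising `n` should make the gap between the approaches easier to see, since the diagonal method builds an n by n matrix.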


HTH,
Dennis