Hi everyone,
I've got a dataset with 12,000 observations. One of the variables
(cleary$D1) is for an individual's country, coded 1 - 15. I'd like to create
a dummy variable for the Baltic states, which are coded 4, 6, and 7. In other
words, as a dummy variable Baltic states would be coded 1, else 0. I've
attempted the following for loop:
dummy <- matrix(NA, nrow=nrow(cleary), ncol=1)
for (i in 1:length(cleary$D1)){
if (cleary$D1 == 4){dummy[i] = 1}
else {dummy[i] = 0}
}
Unfortunately it generates the following error:
1: In if (cleary$D1 == 4) { ... :
the condition has length > 1 and only the first element will be used
Another option I've tried is the following:
binary <- vector(length=length(cleary$D1))
for (i in 1:length(cleary$D1)) {
if (cleary$D1 == 4 | cleary$D1 == 6 | cleary$D1 == 7 ) {binary[i] = 1}
else {binary[i] = 0}
}
Unfortunately it simply responds with "syntax error".
Any thoughts would be greatly appreciated!
--
View this message in context:
http://r.789695.n4.nabble.com/For-loop-dummy-variables-tp3001396p3001396.html
Sent from the R help mailing list archive at Nabble.com.
I should have noted that the first attempt listed above was only a practice run for the case cleary$D1 == 4. To reiterate, this still didn't work.
--
View this message in context:
http://r.789695.n4.nabble.com/For-loop-dummy-variables-tp3001396p3001398.html
Sent from the R help mailing list archive at Nabble.com.
you might try
dummy <- with(cleary,
cbind(B4 = as.numeric(D1 == 4),
B6 = as.numeric(D1 == 6),
B7 = as.numeric(D1 == 7)))
and do it all in one go.
___
to fix up your approach you need to use
if(cleary$D1[i] == 4) dummy[i] <- 1 else dummy[i] <- 0
but this is a very clumsy and slow way of going about it.
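
A minimal sketch of how the three indicator columns in this reply relate to the single Baltic dummy the original post asks for. The data frame is a small made-up stand-in for cleary (the real data are not shown in the thread), and the name baltic is hypothetical.

```r
# Toy stand-in for the poster's data; the real 'cleary' is not shown in the thread.
cleary <- data.frame(D1 = c(1, 4, 6, 7, 15, 4, 2))

# One indicator column per Baltic country code, as in the reply.
dummy <- with(cleary,
              cbind(B4 = as.numeric(D1 == 4),
                    B6 = as.numeric(D1 == 6),
                    B7 = as.numeric(D1 == 7)))

# Each row can match at most one code, so the row sums form a single 0/1 dummy.
baltic <- rowSums(dummy)   # 'baltic' is an illustrative name

# The same result in one step with %in%.
stopifnot(all(baltic == as.numeric(cleary$D1 %in% c(4, 6, 7))))
```

Keeping the three columns separate is useful if each country needs its own indicator; summing them is only one way to collapse them into one variable.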
-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org]
On Behalf Of gravityflyer
Sent: Tuesday, 19 October 2010 1:24 PM
To: r-help at r-project.org
Subject: [R] For-loop dummy variables?
______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
gravityflyer <gravityflyer <at> yahoo.com> writes:
> Unfortunately it simply responds with "syntax error".
>
> Any thoughts would be greatly appreciated!

Be aware that R is a vectorised programming language, so your for loop is
completely unnecessary. This is what I'd do:

dummy <- rep(0, nrow(cleary))
dummy[cleary$D1 %in% c(4,6,7)] <- 1

This is your dummy variable. Below is a working (though VERY inefficient)
version of your for loop:

binary <- vector(length=length(cleary$D1))
for (i in 1:length(cleary$D1)) {
  if (cleary$D1[i] == 4 | cleary$D1[i] == 6 | cleary$D1[i] == 7 ) {
    binary[i] = 1
  } else {
    binary[i] = 0
  }
}

Now try to figure out:
- what is the difference between your for() loop and mine?
- which code is simpler (and better), the vectorised or the for() loop?

I hope it helps,
Adrian
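
A small sketch of why the original loop tripped over if(), and a check that the vectorised and loop versions agree. The data values are invented for illustration. Note that in recent R versions (4.2.0 and later, as far as I know) a condition of length greater than one is an error rather than the warning quoted in the original post.

```r
# Toy stand-in for the poster's data (hypothetical values).
cleary <- data.frame(D1 = c(1, 4, 6, 7, 15, 4, 2))

# Comparing the whole column gives one TRUE/FALSE per row, which if() cannot use.
n_whole <- length(cleary$D1 == 4)      # one result per row
# Indexing with a single position gives the single value that if() expects.
n_single <- length(cleary$D1[2] == 4)  # exactly one result

# Vectorised version from the reply.
dummy <- rep(0, nrow(cleary))
dummy[cleary$D1 %in% c(4, 6, 7)] <- 1

# Loop version, indexing with i in every comparison.
binary <- numeric(nrow(cleary))
for (i in seq_along(cleary$D1)) {
  if (cleary$D1[i] == 4 || cleary$D1[i] == 6 || cleary$D1[i] == 7) {
    binary[i] <- 1
  } else {
    binary[i] <- 0
  }
}

# Both approaches produce the same vector.
stopifnot(identical(dummy, binary))
```

The loop here uses || and seq_along() instead of | and 1:length(); both are small stylistic substitutions, not what the reply wrote.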
I always find R useful to solve problems like this:
dummy = as.numeric(cleary$D1 %in% c(4,6,7))
If, for some reason you want to use a loop, try
dummy <- matrix(NA, nrow=nrow(cleary), ncol=1)
for (i in 1:length(cleary$D1)){
if (cleary$D1[i] %in% c(4,6,7)){dummy[i] = 1}
else {dummy[i] = 0}
}
When you write a loop, you need to use the loop index
to select the individual value you're working with.
- Phil Spector
Statistical Computing Facility
Department of Statistics
UC Berkeley
spector at stat.berkeley.edu
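
One difference between the == and %in% forms used across the replies shows up only if D1 contains missing values. The thread does not say whether it does, so this is a sketch on made-up codes rather than a statement about the poster's data.

```r
# Hypothetical country codes with one missing value.
x <- c(4, NA, 7, 2)

# == propagates NA, so if (x[i] == 4) would fail on the second element.
eq <- x == 4

# %in% never returns NA; a missing code is simply "not in the set".
member <- x %in% c(4, 6, 7)

dummy <- as.numeric(member)
```

Whether a missing country should become 0 or stay NA is a modelling decision; %in% silently picks 0.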
On Mon, 18 Oct 2010, gravityflyer wrote:
On Tuesday 19 October 2010, Phil Spector wrote:
> I always find R useful to solve problems like this:
>
> dummy = as.numeric(cleary$D1 %in% c(4,6,7))

Indeed, and this works too:

dummy <- 1*(cleary$D1 %in% c(4,6,7))

Adrian

--
Adrian Dusa
Romanian Social Data Archive
1, Schitu Magureanu Bd.
050025 Bucharest sector 5
Romania
Tel.:+40 21 3126618 \
+40 21 3120210 / int.101
Fax: +40 21 3158391
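
A quick check, on invented data, that the one-liners suggested in this thread are interchangeable. The ifelse() variant and the column name baltic are additions for illustration and do not appear in the replies.

```r
# Toy stand-in for the poster's data (hypothetical values).
cleary <- data.frame(D1 = c(1, 4, 6, 7, 15, 4, 2))

# Logical vector: TRUE where the code is one of the Baltic states.
is_baltic <- cleary$D1 %in% c(4, 6, 7)

a <- as.numeric(is_baltic)      # Phil Spector's version
b <- 1 * is_baltic              # Adrian Dusa's version
c3 <- ifelse(is_baltic, 1, 0)   # another common idiom (not from the thread)

stopifnot(identical(a, b), identical(a, c3))

# Storing the dummy alongside the original variable keeps the rows aligned.
cleary$baltic <- a
```

All three coerce the logical vector to 0/1; the choice among them is mostly a matter of taste.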