Muhammad Subianto
2005-Apr-29 11:57 UTC
[R] How to change variables in datasets automatically
Dear R-helpers,

Suppose I have a dataset,

data(iris)
a <- data.frame(Sepal.Length=c(1:4), Sepal.Width=c(2:5),
  Petal.Length=c(3:6), Petal.Width=c(4:7), Species=rep("rosa",4))
b <- iris[1:10,]
newtest.iris <- rbind(a,b)

> newtest.iris
   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1           1.0         2.0          3.0         4.0    rosa
2           2.0         3.0          4.0         5.0    rosa
3           3.0         4.0          5.0         6.0    rosa
4           4.0         5.0          6.0         7.0    rosa
11          5.1         3.5          1.4         0.2  setosa
21          4.9         3.0          1.4         0.2  setosa
31          4.7         3.2          1.3         0.2  setosa
41          4.6         3.1          1.5         0.2  setosa
5           5.0         3.6          1.4         0.2  setosa
6           5.4         3.9          1.7         0.4  setosa
7           4.6         3.4          1.4         0.3  setosa
8           5.0         3.4          1.5         0.2  setosa
9           4.4         2.9          1.4         0.2  setosa
10          4.9         3.1          1.5         0.1  setosa

I want to change each label (variable name) like this:
Sepal.Length=SL, Sepal.Width=SW, Petal.Length=PL, Petal.Width=PW, and
Species=Class. Then I want to change each cell in the Species variable
so that rosa=0 and setosa=1. The result should look something like this,

> NewIris
    SL  SW  PL  PW Class
1  1.0 2.0 3.0 4.0     0
2  2.0 3.0 4.0 5.0     0
3  3.0 4.0 5.0 6.0     0
4  4.0 5.0 6.0 7.0     0
5  5.1 3.5 1.4 0.2     1
6  4.9 3.0 1.4 0.2     1
7  4.7 3.2 1.3 0.2     1
8  4.6 3.1 1.5 0.2     1
9  5.0 3.6 1.4 0.2     1
10 5.4 3.9 1.7 0.4     1
11 4.6 3.4 1.4 0.3     1
12 5.0 3.4 1.5 0.2     1
13 4.4 2.9 1.4 0.2     1
14 4.9 3.1 1.5 0.1     1

I can get the result above like this,

> Class <- ifelse(newtest.iris$Species=="rosa", 0, 1)
> NewIris <- data.frame(SL = newtest.iris$Sepal.Length,
+                       SW = newtest.iris$Sepal.Width,
+                       PL = newtest.iris$Petal.Length,
+                       PW = newtest.iris$Petal.Width,
+                       Class)

But I have many more variables in my datasets that I must change.
Is there any way to change them automatically, and which library contains a
function to do that?
I would be very happy if anyone could help me.
Thank you very much in advance.

Kind regards,
Muhammad Subianto
Try:

a <- data.frame(Sepal.Length=1:4, Sepal.Width=2:5,
                Petal.Length=3:6, Petal.Width=4:7,
                Species=rep("rosa",4))
b <- iris[1:10,]
newtest.iris <- rbind(a,b)
names(newtest.iris) <- c("SL", "SW", "PL", "PW", "Class")
newtest.iris$Class <- as.numeric(newtest.iris$Class) - 1

HTH,
Andy
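A note for readers on current R versions: the `as.numeric(...) - 1` step relies on Species being a factor whose first level is "rosa". Since R 4.0.0, data.frame() no longer converts strings to factors by default, so depending on the R version the column may end up as character, in which case as.numeric() would give NA rather than codes. The following is only a sketch of a version-independent variant using explicit lookup tables; `name_map` and `class_codes` are illustrative names, not something from the posts above.

```r
# Sketch only: name_map and class_codes are hypothetical lookup tables.
a <- data.frame(Sepal.Length = 1:4, Sepal.Width = 2:5,
                Petal.Length = 3:6, Petal.Width = 4:7,
                Species = rep("rosa", 4))
b <- iris[1:10, ]
b$Species <- as.character(b$Species)  # same column type in both frames
newtest.iris <- rbind(a, b)

# old name -> new name; extend this vector for more columns
name_map <- c(Sepal.Length = "SL", Sepal.Width = "SW",
              Petal.Length = "PL", Petal.Width = "PW",
              Species = "Class")
hit <- names(newtest.iris) %in% names(name_map)
names(newtest.iris)[hit] <- unname(name_map[names(newtest.iris)[hit]])

# label -> numeric code; labels missing from the table become NA
class_codes <- c(rosa = 0, setosa = 1)
newtest.iris$Class <- unname(class_codes[as.character(newtest.iris$Class)])
rownames(newtest.iris) <- NULL        # renumber rows 1..14
```

Because both mappings are plain named vectors, adding more variables or more class labels only means adding entries to `name_map` or `class_codes`; columns not listed in `name_map` keep their names.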
Muhammad Subianto
2005-Apr-29 12:47 UTC
[R] How to change variables in datasets automatically
Excellent, this is exactly what I was looking for.
Many thanks and best regards,
Muhammad Subianto

On 4/29/05, Liaw, Andy <andy_liaw at merck.com> wrote:
> Try:
>
> a <- data.frame(Sepal.Length=1:4, Sepal.Width=2:5,
>                 Petal.Length=3:6, Petal.Width=4:7,
>                 Species=rep("rosa",4))
> b <- iris[1:10,]
> newtest.iris <- rbind(a,b)
> names(newtest.iris) <- c("SL", "SW", "PL", "PW", "Class")
> newtest.iris$Class <- as.numeric(newtest.iris$Class) - 1
>
> HTH,
> Andy
Gabor Grothendieck
2005-Apr-29 14:18 UTC
[R] How to change variables in datasets automatically
On 4/29/05, Muhammad Subianto <subianto at gmail.com> wrote:
> But I have many more variables in my datasets that I must change.
> Is there any way to change them automatically, and which library contains a
> function to do that?

Someone else has already indicated how to do this, but as you say you
have a large number of columns, you might want an automated way as well.
For example, the following removes lower case letters and dots from the
names and then changes Species to Class. Note that there is a dot
after a-z in the regular expression.

# remove lower case letters and dots from column names and
# change name of col5 to Class
data(iris)
names(iris) <- gsub("[a-z.]", "", names(iris))
names(iris)[5] <- "Class"

Another possibility might be to use abbreviate. This does not give
the exact result you are looking for, but it's close and it's very easy:

data(iris)
names(iris) <- abbreviate(names(iris))
names(iris)[5] <- "Class"
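With many columns, stripping lower case letters can map two different names to the same abbreviation. As a rough sketch of one way to guard against that, the hypothetical helper below (`short_names` is not from the thread) uses a slightly different rule from the gsub() call above: it keeps only capital letters and digits, falls back to the original name when nothing is left, and then makes the results unique with make.unique().

```r
# Hypothetical helper, shown as a sketch: keep only capitals and digits,
# then disambiguate any duplicates.
short_names <- function(x) {
  out <- gsub("[^A-Z0-9]", "", x)   # drop everything except A-Z and 0-9
  empty <- out == ""
  out[empty] <- x[empty]            # nothing left: keep the original name
  make.unique(out, sep = "")        # a repeated "PL" becomes "PL1"
}

short_names(c("Sepal.Length", "Petal.Length", "Plant.Label", "species"))
# "SL" "PL" "PL1" "species"
```

It could be applied in the same way as above, for example `names(iris) <- short_names(names(iris))` followed by `names(iris)[5] <- "Class"`.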