Hello,

When I use "apply" on a data frame, it seems like I get an error when I have a column that is not numeric. Via trial and error I realized that if I remove that column, I can get it to run. Is there a better way to tell the function not to worry about the character columns, especially since I am not even trying to do any calculations on them?

R_Cat is the character column that is causing the error and I am trying to do calculations on t5R.

Data_F_Cat$t5R_Cat = apply(Data_F[,names(Data_F) != "R_Cat"],1,function(row) {
  ifelse( abs(row["t5R"]) <Thresh1, 0,
  ifelse( abs(row["t5R"]) <Thresh2, ifelse( row["t5R"] <0, -1, 1),
  ifelse( abs(row["t5R"]) <Thresh3, ifelse( row["t5R"] <0, -2, 2),
  ifelse( abs(row["t5R"]) <Thresh4, ifelse( row["t5R"] <0, -3, 3),
  ifelse( abs(row["t5R"]) <0, -4, 4)))))
})

Thank you,
Pooya.
The apply() function works on an array or a matrix. There is no need to guess; just read the manual page:

?apply

A data frame passed to apply() is first converted to a matrix, and a matrix can hold only one type, so including a character variable forces the entire matrix to character. Excluding character variables will let you operate on the numeric values, but your code suggests you are trying to create a new variable based on a single column, so apply() is the wrong tool.

Without a working example, it is not possible to be more specific. Including a sample of your data.frame with dput() would make that possible.

-------------------------------------
David L Carlson
Associate Professor of Anthropology
Texas A&M University
College Station, TX 77840-4352

-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Pooya Lalehzari
Sent: Thursday, May 9, 2013 10:51 AM
To: r-help at r-project.org
Subject: [R] Question with "apply"
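The coercion described in the reply above can be seen in a minimal sketch. The data frame `df` and its values are invented for illustration; only the column names `t5R` and `R_Cat` come from the question.

```r
# Hypothetical data: column names borrowed from the question, values made up.
df <- data.frame(t5R   = c(-0.5, 1.2, 3.4),
                 R_Cat = c("a", "b", "c"),
                 stringsAsFactors = FALSE)

# apply() converts a data frame with as.matrix(); a single character
# column turns every cell of the resulting matrix into a string.
m <- as.matrix(df)
typeof(m)

# Inside apply(), row["t5R"] is therefore a string, and abs() fails on it.
res <- try(apply(df, 1, function(row) abs(row["t5R"])), silent = TRUE)
inherits(res, "try-error")

# Dropping the character column keeps the matrix numeric.
num <- as.matrix(df[, names(df) != "R_Cat", drop = FALSE])
typeof(num)

# For a calculation on a single column, no apply() is needed at all.
abs(df$t5R)
```

Working on `df$t5R` directly sidesteps the matrix conversion entirely, which is presumably why the reply calls apply() the wrong tool here.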
On May 9, 2013, at 8:50 AM, Pooya Lalehzari wrote:

> Hello,
> When I use "apply" on a data frame, it seems like I get an error when I have a column that is not numeric. Via trial and error I realized that if I remove that column, I can get it to run. Is there a better way to tell the function not to worry about the character columns, especially since I am not even trying to do any calculations on it?

Not really.

> R_Cat is the character column that is causing error and I am trying to do calculations on t5R.
>
> apply(Data_F[,names(Data_F) != "R_Cat"],1,function(row) {
>   ifelse( abs(row["t5R"]) <Thresh1, 0,
>   ifelse( abs(row["t5R"]) <Thresh2, ifelse( row["t5R"] <0, -1, 1),
>   ifelse( abs(row["t5R"]) <Thresh3, ifelse( row["t5R"] <0, -2, 2),
>   ifelse( abs(row["t5R"]) <Thresh4, ifelse( row["t5R"] <0, -3, 3),
>   ifelse( abs(row["t5R"]) <0, -4, 4)))))

There is a better way of writing that code, one that avoids all those ugly and inefficient nested 'ifelse' calls by replacing them with vectorized operations:

Data_F_Cat$t5R_Cat <- findInterval( abs( Data_F_Cat$t5R ) , c(Thresh1, Thresh2, Thresh3, Thresh4) )
Data_F_Cat$t5R_Cat <- sign(Data_F_Cat$t5R) * Data_F_Cat$t5R_Cat

The last clause is clearly not coded correctly, since abs( anything) is never less than 0. Note that there could be a re-coding difference if t5R is >= Thresh4, which was not a condition you were testing for. I suspect you wanted them coded 4 or -4, and that is what my code should accomplish.

-- 
David Winsemius
Alameda, CA, USA
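To check the findInterval() suggestion on concrete numbers, here is a small sketch. The threshold values and the `t5R` vector are assumptions chosen only for illustration, and the nested version replaces the original's final `abs(...) < 0` test with a plain sign test, which is the presumed intent rather than what was posted.

```r
# Hypothetical thresholds and values, chosen only for illustration.
Thresh1 <- 1; Thresh2 <- 2; Thresh3 <- 3; Thresh4 <- 4
t5R <- c(-4.5, -2.5, -0.3, 0.9, 1, 3.2, 7)

# findInterval() counts how many thresholds are <= abs(t5R):
# 0 below Thresh1, up to 4 at or above Thresh4.
mag <- findInterval(abs(t5R), c(Thresh1, Thresh2, Thresh3, Thresh4))

# Restore the sign of the original value.
t5R_Cat <- sign(t5R) * mag
t5R_Cat

# Vectorized form of the nested ifelse() logic, with the last clause
# assumed to mean "code everything else as -4 or 4".
nested <- ifelse(abs(t5R) < Thresh1, 0,
          ifelse(abs(t5R) < Thresh2, ifelse(t5R < 0, -1, 1),
          ifelse(abs(t5R) < Thresh3, ifelse(t5R < 0, -2, 2),
          ifelse(abs(t5R) < Thresh4, ifelse(t5R < 0, -3, 3),
                                     ifelse(t5R < 0, -4, 4)))))

# Both approaches agree on this sample.
all(nested == t5R_Cat)
```

Because findInterval() uses intervals closed on the left, a value exactly equal to a threshold falls into the higher category, which matches the strict `<` comparisons in the original code.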