suparna mitra
2009-Jun-17 11:14 UTC
[R] Problem in 'Apply' function: does anybody have other solution
Dear All,

I am having a problem with the apply function. I have some data like the sample below, and I want to get a range vector (max minus min for each row, ignoring NA values).

    > Species.all[1:10,]
          V2    V3    V4    V5    V6   V7    V8   V9
    1  57543 55938 47175 54922 36032 5785 29497 7286
    2  42364 40472 29887 40107 19723 2691 14445 3258
    3  19461 19646 18538 22392  6744  794  4919 1024
    4     45    41    28    34    33   NA    26   NA
    5     45    41    28    34    33   NA    26   NA
    6     45    41    28    34    33   NA    26   NA
    7     14     9    14    14     7   NA    10   NA
    8     20    25    10    15    21   NA    10   NA
    9     20    25    10    15    21   NA    10   NA
    10   578   566   478   753   361  150   262  170
    > dim(Species.all)
    [1] 1862    8

I used apply as shown below. The same call worked for some other data, but here it fails with an error message.

    > Range.j=apply(Species.all,1,max,na.rm = TRUE)-apply(Species.all,1,min,na.rm = TRUE)
    Error in apply(Species.all, 1, max, na.rm = TRUE) - apply(Species.all, :
      non-numeric argument to binary operator

When I checked the individual steps, they gave completely wrong results:

    > apply(Species.all[1:10,],1,max)
         1      2      3      4      5      6      7      8      9     10
    "7286" "3258" "1024"     NA     NA     NA     NA     NA     NA " 753"
    > apply(Species.all[1:10,],1,min)
           1        2        3        4        5        6        7        8        9       10
    " 47175" " 29887" " 18538"       NA       NA       NA       NA       NA       NA   " 262"

The main problem is that this code works for some datasets but not for all. Does anybody have an idea why? Or can anyone show me another way to do the same thing?

Thanks in advance,
With best regards,
Suparna
suparna mitra
2009-Jun-17 11:41 UTC
[R] Problem in 'Apply' function: does anybody have other solution
Dear All,

I am writing to add a few lines to my previous query. I checked several datasets. In the cases where the apply function works, part of the result looks like this:

    > apply(Species.all[1:10,],1,max,na.rm=TRUE)
        1     2     3     4     5     6     7     8     9    10
    22392    45    45    45    14    25    25   753   101    10

With the problematic data it looks like this:

    > apply(Species.all[1:10,],1,max,na.rm=TRUE)
         1      2      3      4      5      6      7      8      9     10
    "7286" "3258" "1024"  " 45"  " 45"  " 45"   " 9"  " 25"  " 25" " 753"

All of my datasets are in CSV format, and I read them with read.csv or read.delim. Can anybody please suggest how to solve this problem?

Thanks and regards,
Suparna
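The quoted strings in the output above suggest that apply() is working on a character matrix. apply() converts a data frame with as.matrix(), and if any column is non-numeric the whole matrix becomes character, so max() and min() compare strings rather than numbers. A minimal sketch of how one might check for this, using a small made-up data frame (the name `toy` and its values are illustrative assumptions, not the poster's data):

```r
# Hypothetical stand-in for Species.all: V3 has been read as character
toy <- data.frame(V2 = c(57543, 45, 14),
                  V3 = c("55938", "41", "9"),
                  stringsAsFactors = FALSE)

# Column classes reveal the non-numeric column
col_classes <- sapply(toy, class)
print(col_classes)

# as.matrix() (used internally by apply) falls back to character
m <- as.matrix(toy)
print(typeof(m))

# max() on character values compares them as strings, not numbers
row_max <- apply(toy, 1, max)
print(row_max)
```

Running str() on the real data frame gives the same information as the sapply() call and is usually the quickest first check.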
jude.ryan at ubs.com
2009-Jun-18 14:59 UTC
[R] Problem in 'Apply' function: does anybody have other solution
David Winsemius' solution:

    > apply(data.matrix(df), 1, I)
      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
    x    1    2    3    4    5    6    7    8    9    10
    y    1    3    4    5    6    7    8    9   10     2

For y and [,2] above the value is 3. Why is the value not 2? It looks like the value is 2 for y and [,10] (this should be 10, right?) and the values 3 to 10 are shifted one position to the left for "y". I got the same results when I ran this code.

Thanks,
Jude

David Winsemius wrote:

On Jun 17, 2009, at 9:27 AM, jim holtman wrote:

> Do an 'str' of your object. It looks like one of the columns is probably
> character/factor since there are quotes around the 'numbers'. You can also
> explicitly convert the offending columns to numeric if you want to. Also use
> colClasses on the read.csv to define the class of the data in each column.
> This will show you where the error is.

One function that might be of use is data.matrix, which will attempt to convert character vectors to numeric vectors across an entire dataframe. I hope this is not beating a dead horse, but see if these examples are helpful in any way:

    > ?data.matrix
    > df <- data.frame(x=1:10,y=as.character(1:10))
    > df
        x  y
    1   1  1
    2   2  2
    3   3  3
    4   4  4
    5   5  5
    6   6  6
    7   7  7
    8   8  8
    9   9  9
    10 10 10
    # .... not all is as it seems
    > apply(df,1,I)
      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
    x " 1" " 2" " 3" " 4" " 5" " 6" " 7" " 8" " 9" "10"
    y "1"  "2"  "3"  "4"  "5"  "6"  "7"  "8"  "9"  "10"
    > df2 <- data.frame(x=1:10,y=1:10)
    > apply(df2,1,I)
      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
    x    1    2    3    4    5    6    7    8    9    10
    y    1    2    3    4    5    6    7    8    9    10
    > str(df)
    'data.frame': 10 obs. of 2 variables:
     $ x: int 1 2 3 4 5 6 7 8 9 10
     $ y: Factor w/ 10 levels "1","10","2","3",..: 1 3 4 5 6 7 8 9 10 2
    # so that's weird. y isn't even a character vector !?!? Such are the strange beasts called factors.
    # solution? or at least one strategy
    > apply(data.matrix(df), 1, I)
      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
    x    1    2    3    4    5    6    7    8    9    10
    y    1    3    4    5    6    7    8    9   10     2
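On the question of why y shows 3 rather than 2: data.matrix() replaces a factor with its internal integer codes, and the factor levels are sorted as strings ("1", "10", "2", ...), so "2" is the third level and "10" is the second. The str() output in the quoted example shows exactly these codes. A short sketch of the difference, and of the usual detour through as.character(), assuming a factor like the y column in that example:

```r
# Factor built from the strings "1".."10", as in the quoted example
y <- factor(as.character(1:10))

# Levels are sorted as strings, so "10" comes right after "1"
print(levels(y))

# Integer codes (what data.matrix() returns for a factor column)
codes <- as.integer(y)
print(codes)

# Going through as.character() recovers the numbers that were printed
values <- as.numeric(as.character(y))
print(values)
```

So data.matrix() is only a safe shortcut when the columns are already numeric; for factor columns that hold numbers, converting via as.character() first is the more reliable route.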
Carl Witthoft
2009-Jun-18 22:05 UTC
[R] Problem in 'Apply' function: does anybody have other solution
Several folks pointed out the problem is most likely that a column of data is being read in as a factor. I prefer to solve this by setting as.is=TRUE as one of the arguments to read.table() (unless I am mis-remembering and it needs to be set to FALSE :-) ). The point is to tell the read function to read character strings in as character strings, rather than converting them into factors. After that you can convert the offending column with as.numeric or one of the date-conversion functions (if the input is date/time strings).

Carl
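The approach described above can be sketched as follows. This is only an illustration under stated assumptions: the CSV text, the stray "n/a" entry, and the names `csv_text`, `species`, and `range_j` are made up for the example and are not the original poster's data. It uses as.is = TRUE, which is the setting that keeps strings as character rather than factors.

```r
# Hypothetical CSV content; "n/a" is a stray non-numeric entry
csv_text <- "V2,V3,V4
57543,55938,47175
45,n/a,28
14,9,14"

# as.is = TRUE keeps strings as character instead of factors
species <- read.csv(text = csv_text, as.is = TRUE)

# Convert every column to numeric; unparseable entries become NA
species[] <- lapply(species, function(col) suppressWarnings(as.numeric(col)))

# Row-wise range (max - min), ignoring NA values
m <- as.matrix(species)
range_j <- apply(m, 1, max, na.rm = TRUE) - apply(m, 1, min, na.rm = TRUE)
print(range_j)
```

Suppressing the coercion warning hides which entries failed to parse, so on real data it may be worth inspecting the offending column first (for example with unique()) before converting.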