Hello everyone,

I'm a beginner in stats and R, and I'm using R 2.10.1. I need to create a multivariate QQ plot. There are 8 variable groups, each with 55 observations. Here is an example of what I have done so far, just to get my point across:

> data=read.csv(file.choose(),header=T)
> data
             country     village group    av_expen  P2ary_ed  no_fisher
1       Cook Islands    Aitutaki     D  5239.12747 0.6666667  666.99986
2       Cook Islands     Mangaia     C  4587.36188 0.6021505  207.69228
3       Cook Islands  Palmerston     B  7784.31874 0.1666667   24.00000
...
53 Wallis And Futuna  All Futuna     D 11023.30674 0.2789855 1056.63143
54 Wallis And Futuna      Halalo     B  8783.54979 0.2794118  153.51715
55 Wallis And Futuna     Vailala     A 12231.95400 0.2395833  100.00000

The problem I'm having starts now. I used the following command to work out the Mahalanobis distances before plotting the QQ plot, but it produces the following error:

> mah=mahalanobis(data,apply(data,2,mean),var(data))
Error in FUN(x, aperm(array(STATS, dims[perm]), order(perm)), ...) :
  non-numeric argument to binary operator
In addition: Warning messages:
1: In mean.default(newX[, i], ...) : argument is not numeric or logical: returning NA
2: In mean.default(newX[, i], ...) : argument is not numeric or logical: returning NA
3: In mean.default(newX[, i], ...) : argument is not numeric or logical: returning NA
4: In mean.default(newX[, i], ...) : argument is not numeric or logical: returning NA
5: In mean.default(newX[, i], ...) : argument is not numeric or logical: returning NA
6: In mean.default(newX[, i], ...) : argument is not numeric or logical: returning NA
7: In mean.default(newX[, i], ...) : argument is not numeric or logical: returning NA
8: In mean.default(newX[, i], ...) : argument is not numeric or logical: returning NA
9: In mean.default(newX[, i], ...) : argument is not numeric or logical: returning NA
10: In mean.default(newX[, i], ...) : argument is not numeric or logical: returning NA

Then I thought that maybe I had given the wrong input for the variable. So I changed the variable, as in the command below, to see whether my assumption was correct, but I still got the same error:

> mah=mahalanobis(data,apply(data,2,mean),var(no_fisher))

I have absolutely no clue where I went wrong and don't know how to fix it. Anyway, I thought to myself: no worries, I'll skip mahalanobis and still try the QQ plot to see what happens. This is the command and the error I got from it:

qqplot(qchisq(ppoints(data),ncol(data),data)
Error in qchisq(p, df, lower.tail, log.p) :
  Non-numeric argument to mathematical function

--
View this message in context: http://n4.nabble.com/producing-a-QQ-plot-tp1693228p1693228.html
Sent from the R help mailing list archive at Nabble.com.
Dear Philip,

It is difficult to tell what is wrong without a reproducible example, so it would be very helpful if you provided sample data. That said, the most obvious issue from what you have provided is that some of your data are character. mean() and var() will not work with character data; the input needs to be numeric or coercible to numeric. I would try specifically excluding the character data (e.g., data[,3:5] from what I can make out).

HTH,

Josh

--
Joshua Wiley
Senior in Psychology
University of California, Riverside
http://www.joshuawiley.com/
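A minimal sketch of the suggestion above, assuming a data frame shaped like the one in the question. The object names dat, num and d2 and all of the values are made up for illustration; they are not the poster's survey data. Selecting the numeric columns by type avoids having to guess their positions, and colMeans() replaces apply(data, 2, mean):

```r
# Toy stand-in for the poster's data: three label columns plus three
# numeric columns. All values are simulated, not the real survey data.
set.seed(1)
n <- 55
dat <- data.frame(
  country   = sample(c("Cook Islands", "French Polynesia"), n, replace = TRUE),
  village   = paste("village", seq_len(n)),   # hypothetical village labels
  group     = sample(c("A", "B", "C", "D"), n, replace = TRUE),
  av_expen  = rnorm(n, mean = 8000, sd = 2500),
  P2ary_ed  = runif(n),
  no_fisher = rexp(n, rate = 1 / 300)
)

# Keep only the numeric columns, selected by type rather than by position.
num <- dat[, sapply(dat, is.numeric), drop = FALSE]

# Squared Mahalanobis distance of every row from the vector of column means.
d2 <- mahalanobis(num, center = colMeans(num), cov = var(num))

# Chi-square QQ plot: under multivariate normality the squared distances
# are approximately chi-square with ncol(num) degrees of freedom.
qqplot(qchisq(ppoints(nrow(num)), df = ncol(num)), d2,
       xlab = "Chi-square quantiles",
       ylab = "Squared Mahalanobis distance")
abline(0, 1)  # reference line y = x
```

Points that fall far above the reference line at the right-hand end would be candidates for multivariate outliers. With the real data, dat would simply be the result of read.csv().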
Hello, these are the first 10 lines of the data:

country           village     group  av_expen     P2ary_ed     no_fisher    B_Leth       B_Lutjan     Wt_Leth      Wt_Lutjan
Cook Islands      Aitutaki    D      5239.127472  0.666666667  666.9998558  3.286283997  1.971519001  520.6454552  126.2441843
Cook Islands      Mangaia     C      4587.361877  0.602150538  207.69228    0.330248     1.846795     0            0
Cook Islands      Palmerston  B      7784.318736  0.166666667  24.00000002  1.384456001  0.233746     0            57.76351477
Cook Islands      Rarotonga   A      8793.256543  0.764285714  223.8639163  6.790178998  0.751358     51.51418019  30.5970125
French Polynesia  Fakarava    B      7937.3952    0.36         255.3600002  7.485009002  6.282185007  62.28921398  60.39332797
French Polynesia  Maatea      D      12135.84     0.316455696  293.7499998  1.270781     0.526468     1002.39553   648.4578044
French Polynesia  Mataiea     D      12718.57548  0.341880342  2082.386008  2.117207998  0.340852     1830.16527   4239.861263
French Polynesia  Raivavae    B      8741.5104    0.285714286  325.0665956  20.121207    4.458011998  63.49777279  0
French Polynesia  Tikehau     D      6295.66      0.240384615  114.0832839  5.183129001  7.178272997  900.4192224  935.3617853
On 27.03.2010 10:45, Philip Wong wrote:
>> mah=mahalanobis(data,apply(data,2,mean),var(data))
>> mah=mahalanobis(data,apply(data,2,mean),var(no_fisher))
> qqplot(qchisq(ppoints(data),ncol(data),data)

As Joshua already pointed out, you are applying mathematical functions to names. Your data contain, for example, village names. You can exclude those with subset(), which generates a new data.frame. If you want averages by category, e.g. by village, have a look at the doBy package: its summaryBy() function is well suited to generating aggregated data with multiple functions.

hth
Stefan
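A rough sketch of both ideas, under stated assumptions: the data frame dat and its values are invented for illustration, and base R's aggregate() is used here as a stand-in for doBy's summaryBy() so that the example runs without extra packages.

```r
# Small made-up data frame with the same kind of columns as in the thread.
dat <- data.frame(
  country  = c("Cook Islands", "Cook Islands",
               "French Polynesia", "French Polynesia"),
  village  = c("v1", "v2", "v3", "v4"),   # hypothetical village names
  group    = c("A", "B", "A", "B"),
  av_expen = c(5000, 7000, 9000, 11000),
  P2ary_ed = c(0.6, 0.2, 0.4, 0.3)
)

# subset() with a negative 'select' drops the name columns and returns
# a new data frame that holds only the numeric variables.
num <- subset(dat, select = -c(country, village, group))

# Per-group means of both numeric variables, using base R aggregate()
# in place of doBy::summaryBy().
by_group <- aggregate(cbind(av_expen, P2ary_ed) ~ group,
                      data = dat, FUN = mean)
print(by_group)
```

The data frame num can then be passed to functions such as mahalanobis() that require numeric input, while by_group holds one row per level of group.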
Aha! I see now! thanks guys! really helpful!
It is a bit of a side note really, but a convenient way to provide data (particularly when it is complex) is via dput(). Not only is the result easier to read in, it also preserves classes and other handy information. For instance, once I had played around to get "Cook" and "Islands" into one column (since there was a space), I could use:

dput(data, file="clipboard") # 'data' is the object being written; the output goes to the clipboard (works decently in Windows at least)

to get:

######################################################
structure(list(
  country = structure(c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L),
    .Label = c("Cook Islands", "French Polynesia"), class = "factor"),
  village = structure(c(1L, 4L, 6L, 8L, 2L, 3L, 5L, 7L, 9L),
    .Label = c("Aitutaki", "Fakarava", "Maatea", "Mangaia", "Mataiea",
      "Palmerston", "Raivavae", "Rarotonga", "Tikehau"), class = "factor"),
  group = structure(c(4L, 3L, 2L, 1L, 2L, 4L, 4L, 2L, 4L),
    .Label = c("A", "B", "C", "D"), class = "factor"),
  av_expen = c(5239.127472, 4587.361877, 7784.318736, 8793.256543,
    7937.3952, 12135.84, 12718.57548, 8741.5104, 6295.66),
  P2ary_ed = c(0.666666667, 0.602150538, 0.166666667, 0.764285714,
    0.36, 0.316455696, 0.341880342, 0.285714286, 0.240384615),
  no_fisher = c(666.9998558, 207.69228, 24.00000002, 223.8639163,
    255.3600002, 293.7499998, 2082.386008, 325.0665956, 114.0832839),
  B_Leth = c(3.286283997, 0.330248, 1.384456001, 6.790178998,
    7.485009002, 1.270781, 2.117207998, 20.121207, 5.183129001),
  B_Lutjan = c(1.971519001, 1.846795, 0.233746, 0.751358,
    6.282185007, 0.526468, 0.340852, 4.458011998, 7.178272997),
  Wt_Leth = c(520.6454552, 0, 0, 51.51418019, 62.28921398,
    1002.39553, 1830.16527, 63.49777279, 900.4192224),
  Wt_Lutjan = c(126.2441843, 0, 57.76351477, 30.5970125, 60.39332797,
    648.4578044, 4239.861263, 0, 935.3617853)),
  .Names = c("country", "village", "group", "av_expen", "P2ary_ed",
    "no_fisher", "B_Leth", "B_Lutjan", "Wt_Leth", "Wt_Lutjan"),
  class = "data.frame", row.names = c(NA, -9L))
###########################################

This is easily retrievable by copying the entire block of text and using:

dget("clipboard") # read the data into R

Best regards,

Josh
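For readers who are not on Windows, the same round trip can be sketched with a temporary file in place of "clipboard". This is only an illustration: the data frame dat, the file tmp and the rounded values below are made up for the example.

```r
# Small example data frame with factor and numeric columns
# (values are illustrative, rounded from the thread's data).
dat <- data.frame(
  country  = factor(c("Cook Islands", "Cook Islands", "French Polynesia")),
  group    = factor(c("D", "C", "B")),
  av_expen = c(5239.13, 4587.36, 7937.40)
)

tmp <- tempfile(fileext = ".R")  # portable stand-in for "clipboard"
dput(dat, file = tmp)            # write the R representation to the file
restored <- dget(tmp)            # parse the file back into an R object

str(restored)                    # column classes survive the round trip
print(all.equal(dat, restored))  # compare the original and restored objects
unlink(tmp)                      # remove the temporary file
```

Calling dput(dat) with no file argument prints the same text to the console, which can then be pasted directly into a mailing-list message.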