dilshan benaragama
2011-Oct-03 16:55 UTC
[R] distance coefficient for a matrix with negative values
Hi,
I need to run a PCoA (PCO) for a data set which has both positive and negative values for its variables. I could not find any distance coefficient other than Euclidean distance that runs on this data set. Are there any other coefficients that work with negative values? Also, I cannot get summary output (the eigenvalues) for PCO as I can for PCA.

Thanks.
Dilshan
R. Michael Weylandt
2011-Oct-03 20:27 UTC
[R] distance coefficient for a matrix with negative values
One order of the usual coming right up!

1 course of "Why does XXX not work for you?" a la francaise, where XXX is, in your case, the Euclidean distance. Specifically, any metric worth its salt (in a normed space) satisfies dist(a,b) = dist(a+c,b+c), so why are negative values a problem?

2 sides: a "Minimal Working Example" with a light buttery sauce and a fried "what package/code are you using"

and, for dessert, a Winsemian special of: "read the posting guide!"

Michael Weylandt, who is putting together a menu for a fancy dinner even as he types

______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
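The translation-invariance property dist(a,b) = dist(a+c,b+c) mentioned in this reply can be checked numerically. The following is only a minimal sketch using base R's dist(); the data are random and purely illustrative, and the names x, shift and x_shifted are made up for the example rather than taken from the thread.

```r
# Illustrative data only: 5 observations of 4 variables, some of them negative.
set.seed(1)
x <- matrix(rnorm(20), nrow = 5)

# Shift every value by the same constant so that nothing is negative any more.
shift <- abs(min(x)) + 1
x_shifted <- x + shift

# Euclidean distances before and after the shift.
d_raw     <- dist(x, method = "euclidean")
d_shifted <- dist(x_shifted, method = "euclidean")
print(all.equal(as.numeric(d_raw), as.numeric(d_shifted)))

# The same check for two other norm-based metrics offered by dist().
d_man_raw     <- dist(x, method = "manhattan")
d_man_shifted <- dist(x_shifted, method = "manhattan")
print(all.equal(as.numeric(d_man_raw), as.numeric(d_man_shifted)))

d_p3_raw     <- dist(x, method = "minkowski", p = 3)
d_p3_shifted <- dist(x_shifted, method = "minkowski", p = 3)
print(all.equal(as.numeric(d_p3_raw), as.numeric(d_p3_shifted)))
```

Because these metrics depend only on the differences between observations, the sign of the raw values does not matter to them.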
R. Michael Weylandt
2011-Oct-04 03:00 UTC
[R] distance coefficient for a matrix with negative values
You still haven't explained what's wrong with *almost every metric there is*, but if you want other distance metrics, have you considered those in the package you are using, via the function dsvdis()? Consider, for example:

library(labdsv)

X <- get(data(bryceveg))

# Put some negative values in all willy nilly like....
X[, sample(NROW(X))] <- (-1)*X[, sample(NROW(X))]

Y <- pco( dsvdis(X, index="bray/curtis") )
print(any(X < 0))

If you want more explanation, please provide actual details of what you are asking, as requested in my first email.

Michael Weylandt

On Mon, Oct 3, 2011 at 9:23 PM, dilshan benaragama <benaragamad at yahoo.com> wrote:
> I am using (labdsv). If I can use Euclidean distance I can do it with PCA
> instead of PCO, so I am trying an alternative to PCA, but I cannot find a
> dissimilarity coefficient for that.
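The poster's remark that PCO on Euclidean distances gives the same result as PCA can be illustrated with base R alone. This is a hedged sketch, not the code used in the thread: it calls stats::cmdscale() directly (a later message in this thread describes pco() as a wrapper around it), and the random matrix x is a stand-in for real data.

```r
# Illustrative data only: 10 sites, 4 variables, values of both signs.
set.seed(42)
x <- matrix(rnorm(40), nrow = 10)

# PCA on the centred (unscaled) data.
pca <- prcomp(x, center = TRUE, scale. = FALSE)

# Classical PCO (metric MDS) on the Euclidean distance matrix.
pco_points <- cmdscale(dist(x, method = "euclidean"), k = 2)

# Ordination axes are only defined up to sign, so compare absolute values.
max_diff <- max(abs(abs(pca$x[, 1:2]) - abs(pco_points)))
print(max_diff)
```

The largest difference should be at the level of floating-point noise, which is why PCO only adds something over PCA when a non-Euclidean dissimilarity is used.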
R. Michael Weylandt
2011-Oct-04 04:05 UTC
[R] distance coefficient for a matrix with negative values
Comments inline:

On Mon, Oct 3, 2011 at 11:27 PM, dilshan benaragama <benaragamad at yahoo.com> wrote:
> Yes I think you did not get my problem.

No, you did not state your problem. I have replied to everything you have actually included to this point. Admittedly, I have failed to reply to things you did not say...

> Actually I want to run PCO with (labdsv). To do that I am trying to get the
> distance matrix using the following functions with library (vegan).

This is now the 7th email in this chain. You should mention the packages and functions you are using in the FIRST email of the chain. This is mentioned in the posting guide, which you apparently have still not read.

> pca.gower<- vegdist(envt[,2:9],method="gower")
> pca.eucl<-vegdist(envt[,2:9],method="euclidean")
> pca.chi<-vegdist(envt[,2:9],method="chi.square")
> pca.mahal<-vegdist(envt[,2:9],method="mahal")
> pca.bray<-vegdist(envt,method="bray")
>
> However none of the functions work

They all work for any data I put in. This is perhaps where that minimal working example, which you also should have included, is necessary. The footer appended to each of the 7 emails in this chain that tells you to read the posting guide also asks for this, as did I explicitly.

> (gives an error saying that it is not working due to negative values)

No, they each give warnings. Warnings are not errors. They are warnings and they say "warning". Perhaps unsurprisingly, errors say "error". If you are using an old version of vegan that throws an error, you should always update before seeking help. Not surprisingly, a certain document suggests this.

> except Euclidean distance for the raw data set, as the raw data has negative
> values for some variables. There is no point in using the Euclidean metric
> with PCO as we can do the same thing with PCA. So I need to find a way I can
> run PCO with a different dissimilarity metric for this data. It will be a
> great help if you can help me on this

Actually read the warning message: it warns you that you have given negative data to an ecological function and suggests this might be a point you look into, as this usually indicates a user-end problem. It does not fail to work in any sense of the word, as evidenced by the output of distances. If negative data is nonsense, you should heed this warning; if you know it's not, disregard it.

More importantly, as I said in my initial response, any distance metric worth its salt is translation invariant. To wit,

library(vegan)

x <- matrix(rnorm(50),5)

d1 = vegdist(x, method="gower")
d2 = vegdist(x + abs(min(x))*3, method="gower")

all.equal(as.numeric(d1), as.numeric(d2))
# [1] TRUE

In fairness, I'll admit this does not seem to work for the Bray distance. I am not an ecologist and I do not know why this would be (it does leave me somewhat confused as to what sort of space motivates the Bray metric, but that's a discussion for another time and place), but the function still returns a valid dist object for both d1 and d2.

> Thanks,
> From: R. Michael Weylandt <michael.weylandt at gmail.com>
> To: dilshan benaragama <benaragamad at yahoo.com>; r-help
> <r-help at r-project.org>

You will note that I include the r-help list on each email in this chain while you have not; this is mentioned in the posting guide.

Would you care to elaborate further as to what the actual problem entails, with a minimal working example?

More generally, might I suggest you learn how these metrics work and then apply the most appropriate one, rather than groping blindly after something solely on the criterion of it being non-Euclidean. If you need other metrics, look into the various p-norms, all of which are implemented directly in R by way of the dist() function, as are a few other norms with which I am not immediately familiar.

Regards,

Michael Weylandt
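One way to see why the Bray distance behaves differently from the Gower example in this message is to write the coefficient out by hand. The sketch below assumes the usual definition of Bray-Curtis dissimilarity, sum(|a - b|) / sum(a + b); the helper bray_curtis() and the vectors are hypothetical and are not part of vegan or labdsv.

```r
# Hand-written Bray-Curtis dissimilarity for two vectors (illustrative helper,
# assuming the usual definition: sum of absolute differences over sum of totals).
bray_curtis <- function(a, b) sum(abs(a - b)) / sum(a + b)

a <- c(1, 4, 0, 2)
b <- c(3, 1, 2, 2)

# Raw data: numerator 7, denominator 15.
bc_raw <- bray_curtis(a, b)

# Adding 10 to every value leaves the numerator at 7 but inflates the
# denominator to 15 + 2 * 4 * 10 = 95, so the dissimilarity shrinks.
bc_shifted <- bray_curtis(a + 10, b + 10)

print(c(raw = bc_raw, shifted = bc_shifted))

# With mixed signs the denominator can even cancel to zero.
bc_zero <- bray_curtis(c(-1, 2), c(1, -2))
print(bc_zero)
```

Only the numerator depends on differences between observations; the denominator depends on the absolute level of the data, so the coefficient is not translation invariant, and negative values can make the denominator small, zero or negative. That is consistent with ecological distance functions warning about negative input.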
R. Michael Weylandt
2011-Oct-06 14:16 UTC
[R] distance coefficient for a matrix with negative values
Did you read any of the comments I made regarding working examples, meaningful question asking, or replying to the entire list?

If you look at the code, you'll see pco is just a very elementary wrapper for cmdscale, the author of which is active on this list and could have seen your question and replied to it with a hundredfold more speed and knowledge than myself, ... had you replied to the entire list.

Looking further into cmdscale, you can see that it is not designed to return variable loadings directly (to be honest, I'm not particularly familiar with PCO and I'm not sure that the method provides such loadings, but I'm assuming you have reason to at least think they exist), so you'll have to calculate them directly. Read the documentation of cmdscale to understand what the various things being returned are. If another function in the labdsv package seems to calculate them, perhaps you can pilfer some code from there.

Michael

On Tue, Oct 4, 2011 at 1:58 PM, dilshan benaragama <benaragamad at yahoo.com> wrote:
> Hi,
> As you mentioned I was able to run the pco ignoring the warning message about
> negative values. The next problem I have is how to get the loadings for each
> variable, as it will not give the summary output or loadings as we get for
> PCA.
> Thanks.
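As a rough illustration of "calculating them directly", the sketch below pulls the eigenvalues out of cmdscale() and correlates each original variable with the ordination axes. This is one common workaround and an assumption on my part, not something prescribed in the thread: the correlations are not loadings in the PCA sense. The matrix envt is a random stand-in for the poster's data, and the Manhattan distance is chosen only as an example of a non-Euclidean input.

```r
# Hypothetical stand-in for the poster's environmental data: 15 sites, 4 variables.
set.seed(7)
envt <- matrix(rnorm(60), nrow = 15,
               dimnames = list(NULL, paste0("var", 1:4)))

# Any dissimilarity can be used here; Manhattan is just an example.
d <- dist(envt, method = "manhattan")

# eig = TRUE makes cmdscale() return a list that includes the eigenvalues.
fit <- cmdscale(d, k = 2, eig = TRUE)
eig <- fit$eig

# Share of each axis among the positive eigenvalues (non-Euclidean
# dissimilarities can produce negative ones, which are left out here).
pos  <- eig[eig > 0]
prop <- pos / sum(pos)
print(round(prop[1:2], 3))

# Correlation of every original variable with each PCO axis.
axis_cor <- cor(envt, fit$points)
print(round(axis_cor, 2))
```

The eigenvalue shares play the role of the "proportion of variance" line in a PCA summary, while the correlation table indicates which variables are most strongly associated with each axis.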