Dear group,

My question, perhaps is more of a statistical question using R.
I have a data matrix (400 x 400, normally distributed) with data points ranging from -1 to +1.
For certain clustering algorithms, I suspect the tight data range is not helping to resolve the clusters.

Is there a way to transform the data something similar to logit, where I dont lose normality of the data and yet I can better expand the data ranges.

Thanks
Adrian
I apologize, I forgot to mention another key detail. In my matrix, values from -1 to <0 have one meaning, while values from >0 to 1 have a different meaning. So if I apply a logit transformation, some of the positives become negative (values < 0.5 etc.). In that case, the resulting transformed matrix is incorrect.

I want to transform numbers ranging from -1 to <0 and numbers between >0 and 1 independently.

Thanks
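The sign problem described above can be reproduced with a small sketch. It assumes "logit" means the usual log(p / (1 - p)), which base R provides as `qlogis()`; the vector `pos` is made-up illustration data, not taken from the poster's matrix.

```r
# qlogis() is the logit function in base R's stats package: log(p / (1 - p)).
# 'pos' is a hypothetical set of positive matrix entries used only for illustration.
pos <- c(0.1, 0.4, 0.5, 0.6, 0.9)
logit_pos <- qlogis(pos)

# Inputs below 0.5 come out negative, so the sign of the transformed value
# no longer tracks the sign of the original value.
print(data.frame(original = pos, logit = round(logit_pos, 3)))
```

Applied directly to the negative half of the matrix, `qlogis()` would return `NaN`, since the logit is only defined on (0, 1).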
I don't think you have given us enough information. For example, is the 400x400 matrix a distance matrix, or does it represent 400 columns of information about 400 rows of observations? If it is a distance matrix, how is distance being measured? Your clarification suggests it may be a distance matrix of correlation coefficients. If distance has different meanings between -1 and 0 and between 0 and +1, getting interpretable results from cluster analysis will be difficult, but it is not clear what you mean by that.

-------------------------------------------------
David L. Carlson
Department of Anthropology
Texas A&M University

-----Original Message-----
From: R-help [mailto:r-help-bounces at r-project.org] On Behalf Of Adrian Johnson
Sent: Sunday, January 20, 2019 8:02 AM
To: r-help <r-help at r-project.org>
Subject: [R] data transformation

For certain clustering algorithms, I suspect the tight data range is not helping resolving the clusters.

-----Original Message-----
From: R-help [mailto:r-help-bounces at r-project.org] On Behalf Of Adrian Johnson
Sent: Sunday, January 20, 2019 10:08 AM
To: r-help <r-help at r-project.org>
Subject: Re: [R] data transformation

I want to transform numbers ranging from -1 to <0 and numbers between >0 and 1 independently.

______________________________________________
R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
This might work for you:

    newy <- sign(oldy)*f(abs(oldy))

where f() is a monotonic transformation, perhaps a power function.

On Sun, Jan 20, 2019 at 11:08 AM Adrian Johnson <oriolebaltimore at gmail.com> wrote:
> I want to transform numbers ranging from -1 to <0 and numbers
> between >0 and 1 independently.
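A minimal sketch of the sign(oldy)*f(abs(oldy)) suggestion, assuming f() is a power function. The helper name `signed_power`, the exponent `p = 0.5`, and the small random matrix are all illustrative choices, not something specified in the thread.

```r
# Sign-preserving transform: apply f() to the magnitude, then restore the sign.
# Here f(x) = x^p; 'signed_power' and the default p are hypothetical choices.
signed_power <- function(y, p = 0.5) {
  sign(y) * abs(y)^p
}

set.seed(1)
# Toy stand-in for the 400 x 400 matrix, kept small for illustration
oldy <- matrix(runif(16, min = -1, max = 1), nrow = 4)
newy <- signed_power(oldy, p = 0.5)

# Every entry keeps its original sign, so negatives and positives
# are transformed independently of each other.
print(all(sign(newy) == sign(oldy)))
print(range(newy))
```

With 0 < p < 1 and inputs in [-1, 1], magnitudes near zero are pushed away from zero while the overall range stays within [-1, 1]. Whether this helps the clustering, and what it does to the distribution of the data, would need to be checked on the real matrix.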
There is no "perhaps" about it. Nonsense phrases like "similar to logit, where I dont [sic] lose normality of the data" that lead into off-topic discussions of why one introduces transformations in the first place are perfect examples of why questions like this belong on a statistical theory discussion forum like StackExchange rather than here, where the topic is the R language.

On January 20, 2019 6:02:15 AM PST, Adrian Johnson <oriolebaltimore at gmail.com> wrote:
>My question, perhaps is more of a statistical question using R
>
>Is there a way to transform the data something similar to logit, where
>I dont lose normality of the data and yet I can better expand the data
>ranges.