Displaying 20 results from an estimated 1000 matches similar to: "what command to use for standization"
2009 Dec 10
question about centroid-linkage (cluster analysis)
Dear R community,
I would be grateful if somebody could shed light on the following.
I have created a set of 6 points to check how centroid
agglomeration works in cluster analysis:
> Y <- data.frame(x=c(-1,1,1,-1,10,12),y=c(1,1,-1,-1,0,0))
It is quite intuitive to understand that the last clusters to be joined will be
{1,2,3,4} with {5,6}. Now, the centroid for the first cluster has
2008 Jul 03
Optimal initial centroid in kmeans
Hello there. I am using kmeans of base package to cluster my customers. As
the results of kmeans are dependent on the initial centroid, may I know:
1) how can we specify the centroid in the R function? (I don't want random
starting pt)
2) how to determine the optimal (if not, a good) centroid to start with? (I
am not after the fixed seed solution as it only ensure that the
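
For the first question, a minimal sketch: `kmeans` in the stats package accepts a matrix for its `centers` argument, one row per starting centroid. The data and the starting points below are made up for illustration and are not from the post; choosing good starting points (the second question) is not addressed here.

```r
# Hypothetical customer data: two groups in two variables
set.seed(7)
customers <- rbind(matrix(rnorm(40, mean = 0), ncol = 2),
                   matrix(rnorm(40, mean = 6), ncol = 2))

# Explicit starting centroids: one row per cluster, one column per variable
start <- rbind(c(0, 0), c(6, 6))

# With a matrix of centers there is no random start, so the fit is repeatable
fit <- kmeans(customers, centers = start)
fit$centers
```

Because the starting centroids are fixed, rerunning the call on the same data gives the same clustering without relying on a seed.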
2012 Jul 04
Error in hclust?
Dear R users,
I have noted a difference in the merge distances given by hclust using
centroid method.
For the following data:
and using Euclidean distance, hclust using centroid method gives the
following results:
> x.dist<-dist(x)
> x.aah<-hclust(x.dist,method="centroid")
> x.aah$merge
[,1] [,2]
[1,] -3 -6
2012 Nov 18
centroid of hclust
Dear UseRs, I want to find the centroids of clusters that I generated with hclust. Is there a way of doing that? I took the mean of the elements in each cluster to get the centroid, but I am not sure if I am right.
Thanks in advance, eliza
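
The approach described in the post (averaging the members of each cluster) can be sketched as follows. The data, the linkage method, and the choice of `k = 2` are assumptions for illustration only; `cutree` assigns each row to a cluster and `aggregate` takes the column means per cluster.

```r
# Toy data (hypothetical, not from the post): two groups in 2-D
set.seed(1)
x <- rbind(matrix(rnorm(20, mean = 0), ncol = 2),
           matrix(rnorm(20, mean = 5), ncol = 2))
colnames(x) <- c("v1", "v2")

hc <- hclust(dist(x), method = "average")
groups <- cutree(hc, k = 2)   # cluster label for each row of x

# Centroid of each cluster = column means of its member rows
centroids <- aggregate(as.data.frame(x),
                       by = list(cluster = groups), FUN = mean)
print(centroids)
```

Note that these are centroids in the original variable space; hclust itself only sees the distance matrix, so it does not store them.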
2011 Jul 09
SpatialPolygonsDataFrame holes problem
I have obtained shapefiles for Indian states from here:
Problem: I want to extract centroid coordinates for each State, but there is some coding problem with the shapefiles that prevents this.
#After extracting the shapefiles from the india_state.zip file, then:
2008 Sep 17
ANOVA contrast matrix vs. TukeyHSD?
Dear Help List,
Thanks in advance for reading...I hope my questions are not too ignorant.
I have an experiment looking at evolution of wing size [centroid] in
fruitflies and the effect of 6 different experimental treatments
[treatment]. I have five replicate populations [replic] in each
treatment and have reared the flies in two different temperatures [cond]
to assay the wing size, making
2009 Feb 05
Does the "labpt" object in the Polygons-class represent the centroid of the polygon
I need to calculate the centroids of some spatial polygons that I have
placed into a Polygons-class object. Is the labeling point in the
Polygons-class the centroid of the polygon?
Thank you for your help.
2012 Nov 22
Partial dependence plot in randomForest package (all flat responses)
I'm trying to make a partial plot with package randomForest in R. After I
perform my random forest object I type
partialPlot(data.rforest, pred.data=act2, x.var=centroid, "C")
where data.rforest is my randomforest object, act2 is the original dataset,
centroid is one of the predictors and C is one of the classes in my response
Whatever predictor or response class I
2016 Jul 27
K MEANS clustering
Hey Parth,
Thanks for the reply.
I am considering implementing a cosine distance metric too, along with
Euclidean distance because of the dimensionality issue that comes in with
K-Means and the Euclidean distance metric.
That does help when we deal with sparse vectors for documents. The
particular problem I'm having is representing centroids in an efficient way.
For example, when we find the mean
2016 Jul 26
K MEANS clustering
I've been working on the KMeans clustering algorithm recently and since the
past week, I have been stuck on a problem which I'm not able to find a
solution to.
Since we are representing documents as Tf-idf vectors, they are really
sparse vectors (a usual corpus can have around 5000 terms). So it gets
really difficult to represent these sparse vectors in a way that would be
2012 Oct 21
Linear discriminant function analysis based median as group centroid and nonparametric scale estimators???
Dear All,
I am using a specific approach for my master thesis. In essence, a
supervised reclassification is used as an intermediate step to find chemical
parameters which are able to reclassify defined groups. These variables will
be used in a next step where location and scale estimators of the groups are
important. Traditionally linear discriminant analysis is used for
reclassification which
2011 Apr 27
centroid representation and MANOVA
hi all.
I have a matrix of data with 5 different groups and 20 individual
response per group, and about 12 variables collected for each. I want to
represent the result in a 2D plot. PCA is not so good because the
difference between the groups is not obvious. I have seen, in a recent
paper, people doing a MANOVA and representing it in a centroid plot
(they used Matlab to do it).
I would like
2001 Nov 19
Hi list!
I'm computing multivar. distances from a set of centroids
to a (large) set of individuals. I'm now just using rbind
to create a matrix (x) with the centroid and the individuals,
then run as.matrix(dist(x)) and finally select the appropriate columns,
as I'm not interested in the distances among individuals.
Therefore, this procedure implies a waste of computing time.
Is there
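
A minimal sketch of the procedure described in this post, next to one possible direct computation that skips the individual-to-individual distances. The matrix sizes and the names `centroids`, `individuals`, `d_full`, and `d_direct` are assumptions for illustration; Euclidean distance (the `dist` default) is assumed.

```r
set.seed(42)
centroids   <- matrix(rnorm(3 * 4), nrow = 3)        # 3 centroids, 4 variables
individuals <- matrix(rnorm(1000 * 4), nrow = 1000)  # 1000 individuals

# Procedure from the post: full distance matrix, then keep only the
# individual-by-centroid block
x <- rbind(centroids, individuals)
d_full <- as.matrix(dist(x))[-(1:3), 1:3]

# Direct alternative: for each centroid, subtract it from every individual
# and take the row-wise Euclidean norm
d_direct <- sapply(seq_len(nrow(centroids)), function(k) {
  sqrt(rowSums(sweep(individuals, 2, centroids[k, ])^2))
})

all.equal(unname(d_full), d_direct)
```

The direct version does work proportional to (number of individuals) x (number of centroids) rather than to the square of the total number of rows.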
2007 Feb 20
Mahalanobis distance and probability of group membership using Hotelling's T2 distribution
I want to calculate the probability that a group will include a particular
point using the squared Mahalanobis distance to the centroid. I understand
that the squared Mahalanobis distance is distributed as chi-squared but that
for a small number of random samples from a multivariate normal population
the Hotelling's T2 (T squared) distribution should be used.
I cannot find a function for
2013 Jan 01
translate grouped data to their centroid
Given a data set with a group factor, I want to translate the numeric
variables to their centroid, by subtracting out the group means (adding
back the grand means). The following gives what I want, but there must be
an easier way using sweep or apply or some such.
iris2 <- iris[,c(1,2,5)]
means <- colMeans(iris2[,1:2])
pooled <- lm(cbind(Sepal.Length, Sepal.Width) ~ Species,
2017 Mar 16
GSoC-2017 Introduction and Project Discussion
I'm Shivang Bansal, a 3rd year Computer Science Engineering undergraduate
at Institute of Engineering & Technology in Lucknow, India. This mail is an
expression of my interest for Google Summer of Code program of this year. I
want to apologize for getting in so late. Actually I would have contacted
earlier, but the sudden demise of my Grandfather prevented me from doing so.
I am
2007 May 23
Fisher's r to z' transformation - help needed
I am trying to use Fisher's z' transformation of the Pearson's r but the
standard error does not appear to be correct. I have simulated an example
using the R code below. The z' data appears to have a reasonably normal
distribution but the standard error given by the formula 1/sqrt(N-3) (from
http://davidmlane.com/hyperstat/A98696.html) gives a different result than
sd(z). Can
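
The poster's own simulation code is not shown in this excerpt. As an independent sketch of the comparison being described, the following simulates correlations from bivariate normal samples, applies Fisher's transformation (`atanh`), and compares `sd(z)` with `1/sqrt(N-3)`. The values of `N`, `rho`, and `reps` are assumptions chosen for illustration.

```r
set.seed(123)
N    <- 50     # observations per simulated sample (assumed)
rho  <- 0.5    # true correlation (assumed)
reps <- 5000   # number of simulated samples (assumed)

z <- replicate(reps, {
  x <- rnorm(N)
  y <- rho * x + sqrt(1 - rho^2) * rnorm(N)  # y has correlation rho with x
  atanh(cor(x, y))                           # Fisher's z' = atanh(r)
})

# Empirical spread of z' versus the formula from the post
c(empirical = sd(z), theoretical = 1 / sqrt(N - 3))
```

The formula is an approximation for bivariate normal data, so the two numbers should be close but need not match exactly.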
2006 Feb 15
a basic question about standardization?
Hi all,
I have a question about standardization.
Suppose I have training data which is a X matrix, of size N x p, where N is
the number of samples, p is the number of variables in the data set. Y is a
response vector of size N x 1, each element corresponding to each row of the
X matrix.
I do standardization on X, X1=scale(X, TRUE, TRUE), and Y1=scale(Y, TRUE,
And I got a regression
2008 Jun 02
LDA and centroids
I have carried out an lda analysis using the lda function of MASS
package. I have plotted the LD1xLD2 to represent the data. Now I would
like to get the centroids for each group of data and plot it on the
LD1xLD2 graph. How can I get the centroid value from the lda object?
Daniel Valverde Saubí
Grup de Biologia Molecular de Llevats
Facultat de Veterinària de la
2003 Sep 26
a. crossing branches with hclust, b. plot.dendrogram
a. when I use hclust with the methods median, centroid, and mcquitty,
and plot the results, the dendrograms have lines that are crossing each
other. Is this ok?
b. My next question refers to plot.dendrogram: How can I use parameters
as "hang" or "cex" here? E.g. for
st <- as.dendrogram(subtreeshc[[x]])
I would like to have something like this, where cex and hang