Displaying 20 results from an estimated 10000 matches similar to: "How to access to Data within a cluster"
2005 May 30
2
How to access to sum of dissimilarities in CLARA
Dear All ,
Since dissimilarity is one of the quality measures in clustering, I'm trying to access the sum of dissimilarities as an overall measure. But after running my data through CLARA I obtain:
1128 dissimilarities, summarized :
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.033155 0.934630 2.257000 2.941600 4.876600 8.943700
But I cannot find the sum of dissimilarities. How can I
2006 Apr 05
1
"partitioning cluster function"
Hi All,
For the function "bclust" (e1071), the argument "base.method" is
explained as "must be the name of a partitioning cluster function
returning a list with the same components as the return value of
'kmeans'".
In my understanding, there are three partitioning cluster functions in
R, which are "clara, pam, fanny". Then I check each of them to
2007 Jun 13
0
Setting a minimum number of observations within an individual cluster
Hi
I'm trying to cluster a continuous dataset with a varying number of clusters and with the restriction that each cluster must have more than 'x' observations.
I have tried the clara function, using silhouette to give me the neighbouring cluster medoid of each observation, then merging an observation from a cluster with fewer than 'x' obs. into its neighbour,
2003 Nov 17
1
CLARA
I need information about the clara routine. The online documentation says that the
argument stand is a logical, indicating if the measurements in x are
standardized before calculating the dissimilarities. Measurements are
standardized for each variable (column), by subtracting the variable's mean
value and dividing by the variable's mean absolute deviation. If we note
STAND = TRUE, I suppose that
2008 Aug 01
2
Exporting data to a text file
Hi R users
With the clara function I get a data frame (maybe this is not the exact word,
I'm new to R) with the following variables:
> names(myclara)
[1] "sample" "medoids" "i.med" "clustering" "objective"
[6] "clusinfo" "diss" "call" "silinfo" "data"
I want to
2004 Jun 10
1
question about similarities cluster using hierclust
My major is bioinformatics, and I'm trying to cluster (agglomerate
the closest pair of observations) in R.
I already have my own similarity metric, but do not know how to
cluster based on similarities instead of dissimilarities.
The help document of hierclust mentions the parameter "sim",
which seems good to me, but it doesn't appear in the code of
hierclust()
2008 Sep 30
1
CLARA and determining the right number of clusters
Hi everyone
I have a question about clustering. I've managed to use CLARA to get a
clustering analysis of a large data set. But now I want to find which is the
right number of clusters.
The clara.object gives some information like the ratio between maximal and
minimal dissimilarity that says (maybe if lower than 1??) if a cluster is
well-separated from the other. I've also read something
2005 Sep 12
4
Document clustering for R
I'm working on a project related to document clustering. I know that R
has clustering algorithms such as clara, but it only supports two distance
metrics: euclidean and manhattan, which are not very useful for
clustering documents. I was wondering how easy it would be to extend the
clustering package in R to support other distance metrics, such as
cosine distance, or if there was an API for
2004 Feb 28
1
cluster-gruop-match with other attributes after na.omit
Hi,
I want to run a cluster analysis with clara, but I am getting an
error because there are NAs in cldat.
Error in clara(cldat[, 1:3], 4) : Each of the random samples contains objects between which no distance can be computed.
cldatx <- subset(cldat,select=c(A,B,C))
cldaty <- na.omit(cldatx)
Now clara works, but cldat has ~193.000 obs
and cldatx without NAs ~75.000 obs.
How could I match
2006 Jan 08
2
Filters in waveslim
Dear R Users,
To run the wavelet functions dwt( ), modwt( ), and mra( ), a wavelet filter algorithm is applied. For all these functions, the default is "la8" and the other possibility is "haar". In the related documents, other possibilities such as symlet and coiflet ... are not mentioned.
Besides "la8" and "haar", which wavelet filters can be used?
Thank
2006 Nov 20
3
Creating a new vector ( another problem)
Dear R Users,
Suppose we are interested in generating a new vector (x) from a current vector (y) of length 1000 so that x contains the sum of every 5 values in y, in order from the beginning to the end of y. x should have the same length as y, so the other 4 positions (indices) in x are filled with NA.
I have no idea how to generate such a vector. I tried in some
2006 Mar 10
2
creating new vector
Hi R Users,
I don't know how difficult my problem is, or even whether it can be solved in R.
Given a vector with 2000 observations, I want to create a new vector from it so that the new vector is the sum of every 5 consecutive observations. That is, each new observation is the sum of 5 consecutive observations, from the first observation of the original vector to the end. So
2006 Apr 10
2
passing known medoids to clara() in the cluster package
Greetings,
I have had good success using the clara() function to perform a simple cluster
analysis on a large dataset (1 million+ records with 9 variables).
Since the clara function is a wrapper to pam(), which will accept known medoid
data - I am wondering if this too is possible with clara() ... The
documentation does not suggest that this is possible.
Essentially I am trying to
2007 Nov 28
2
Clustering
Hello all!
I am performing some clustering analysis on microarray data using
agnes{cluster} and I have created my own dissimilarity matrix according to a
distance measure different from "euclidean" or "manhattan" etc. My question
is, if I choose for example method="complete", how are the distances
between the elements calculated? Are they taken from the dissimilarity
2010 Jun 11
2
Clustering algorithms don't find obvious clusters
I have a directed graph which is represented as a matrix of the form
0 4 0 1
6 0 0 0
0 1 0 5
0 0 4 0
Each row corresponds to an author (A, B, C, D) and the values say how many
times this author has cited the other authors. Hence the first row says
that author A has cited author B four times and author D once. Thus the
matrix represents two groups of authors: (A,B) and (C,D) who cite
2009 Jun 11
1
Cluster analysis, defining center seeds or number of clusters
I use kmeans to classify spectral events in high and low 1/3 octave bands:
#Do cluster analysis
CyclA<-data.frame(LlowA,LhghA)
CntrA<-matrix(c(0.9,0.8,0.8,0.75,0.65,0.65), nrow = 3, ncol=2, byrow=TRUE)
ClstA<-kmeans(CyclA,centers=CntrA,nstart=50,algorithm="MacQueen")
This works well when the actual data shows 1,2 or 3 groups that are not
"too close" in a cross plot.
2008 Mar 06
2
Clustering large data matrix
Hello,
I have a large data matrix (68x13112), each row corresponding to one
observation (patients) and each column corresponding to the variables
(points within an NMR spectrum). I would like to carry out some kind of
clustering on these data to see how many clusters are there. I have
tried the function clara() from the package cluster. If I use the matrix
as is, I can perform the clara
2011 Mar 31
1
Cluster analysis, factor variables, large data set
Dear R helpers,
I have a large data set with 36 variables and about 50.000 cases. The
variables represent labour market status during 36 months; there are 8
different variable values (e.g. Full-time Employment, Student,...).
Only cases with at least one change in labour market status are
included in the data set.
To analyse sub sets of the data, I have used daisy in the
cluster-package to create
2005 Jun 01
2
How to name variables in a single plot
Dear R Friends ,
I want to label my variables (more than 2 variables in a single plot) within the plot to distinguish them from each other, but I can't. How can this be done? I don't mean labelling the x and y axes using xlab or ylab. Below are some lines, only as an example that you could try, if possible. Thank you very much for your attention.
Amir
library(graphics)
y<-
2007 Aug 14
1
Slack variable in OR
Hi dear R users,
Is it basically correct that a problem can be modeled (linearly or nonlinearly) so that the slack variable is bounded by an upper bound?
If so, how can it be handled and coded in practice?
for example:
x1 + x2 <= b  so ---->  x1 + x2 + s = b
s = b - x1 - x2
b - x1 - x2 <= upper value
But algorithms cannot calculate b - x1 - x2, because