similar to: Introduction and Doubts

Displaying 20 results from an estimated 800 matches similar to: "Introduction and Doubts"

2016 Mar 10
2
Introduction and Doubts
I was not sharing it on the mailing list because I thought that someone could use all the ideas I proposed in their GSoC proposal. Surely I will contribute to the Xapian project. Sorry if that was against the rules. The algorithm was not developed by me; after much research on various clustering techniques, I found that there is a new algorithm called CLUBS (Clustering Using Binary Splitting) which gives
2016 Mar 10
2
Introduction and Doubts
Tf-idf is the most widely used weighting scheme, is easy to understand, and has been used in other frameworks like Lucene and many other places. Okapi BM25 (implemented in Xapian) is theoretically a better/improved measure than tf-idf, and I am looking into various other weighting schemes which are in Xapian or can be implemented, like TF-ICF (term frequency inverse corpus frequency), TF-RF (term
2001 Aug 21
1
difference between trees in R?
Hi. I am wondering if anybody has studied and/or written code in R to calculate the distance between 2 "trees". For example, if one does a hierarchical agglomerative clustering and, say, a hierarchical divisive clustering (represented as trees) and wishes to compute a metric on them. I am thinking of something like the symmetric difference as mentioned in Margush and McMorris (1982).
2011 Jun 09
1
k-nn hierarchical clustering
Hi there, is there any R-function for k-nearest neighbour agglomerative hierarchical clustering? By this I mean standard agglomerative hierarchical clustering as in hclust or agnes, but with the k-nearest neighbour distance between clusters used on the higher levels where there are at least k>1 distances between two clusters (single linkage is 1-nearest neighbour clustering)? Best regards,
2003 Dec 03
3
non-uniqueness in cluster analysis
Hi, I'm clustering objects defined by categorical variables with a hierarchical algorithm - average linkage. My distance matrix (general dissimilarity coefficient) includes several distances with exactly the same values. As I see it, a standard agglomerative procedure ignores this problem, simply selecting, among equal distances, the one that comes first. For this reason the analysis in output
2009 Mar 12
2
Time-Ordered Clustering
Hello All, Does anyone know of a package that performs constraint-based clusters? Ideally the package could perform "Time-Ordered Clustering", a technique applied in a recent journal article by Runger, Nelson, Harnish (using MS Excel). Quote, "in our specific implementation of constrained clustering, the clustering algorithm remains agglomerative and hierarchical, but observations
2006 Feb 28
1
creating dendrogram from cluster hierarchy
Dear R users, I have created data for hierarchical agglomerative cluster analysis which consist of the merging pairs and the agglomeration heights, e.g. something like my.merge <- matrix(c(-1,-2,-3,1), ncol=2, byrow=TRUE) my.height <- c(0.5, 1) I'd like to plot a corresponding dendrogram but I don't know how to convert my data to achieve this. Is it possible to create a
2007 Jul 23
1
Cluster prediction from factor/numeric datasets
Hi all, I have a dataset with numeric and factor columns of data which I developed a Gower Dissimilarity Matrix for (Daisy) and used Agglomerative Nesting (Agnes) to develop 20 clusters. I would like to use the 20 clusters to determine cluster membership for a new dataset (using predict) but cannot find a way to do this (no way to "predict" in the cluster package). I know I can use
2011 May 11
2
hierarchical clustering within a size limit
Hello List, I am trying to implement hierarchical clustering using hclust with the agglomerative single linkage method, with a small wrinkle. I would like to cluster a set of numbers on a number line only if they are within a distance of 500. I would then like to print out the members of this list. So far I can put a vector: > x<-c(2,10,200,300,600,700) into a distance matrix: >
2004 Feb 04
1
Clustering with 'agnes'
Hello, I had a question regarding clustering using the agnes() function from the 'cluster' package. I was wondering if anyone knew how I can identify cluster points after running the agnes function. For example, I created a dataset with points randomly scattered around (0,0), (0,1) and (1,0). After clustering, the dendrogram shows all the clustered points and I get the ordering and
2007 Oct 16
0
doubts about Silhouette
Sorry for the long message. I'm doing my best to try to explain myself. I have fitted a spline to my data and filled in the missing data by replicating the spline coefficients associated with the last node. I obtained a number of dendrograms by different combinations of distance and link method by calling DIST and AGNES. The agglomerative coefficient is very high (~ 0.99) for
2016 Aug 15
2
KMeans - Evaluation Results
Hello, I've recently finished with an implementation of KMeans with two initialization techniques, random initialization and KMeans++. I would like to share my findings after evaluating the same. I have tested this implementation of KMeans with a BBC news article dataset. I am currently working on evaluating the same with FIRE datasets. Currently, clustering more than 500 documents
2005 Nov 02
1
x/y coordinates of dendrogram branches
Dear R-users, I need some help concerning the plotting of dendrograms for hierarchical agglomerative clustering. The agglomeration level of each step should be displayed at the branches of the dendrogram. For this I need the x/y coordinates of the branch-agglomerations of the dendrogram. The y-values are known (the heights of the agglomeration), but how can I get the x-values? > mydata
2007 Mar 01
1
data volume option, is it present in current version of ocfs2
Hi, I am trying to install Oracle RAC on Fedora Core 6 through iSCSI. When I try to mount, I get: mount.ocfs2: Invalid argument while mounting /dev/mapper/rac-crs on /mnt/crs. Check 'dmesg' for more information on this error. Error log in dmesg: ocfs2: Unmounting device (253,11) on (node 255) (27354,0):ocfs2_parse_options:753 ERROR: Unrecognized mount option "datavolume" or missing
2017 Mar 09
2
GSoC 2017 Project Proposal
Hello devs. I would like to propose how I plan to go about improving and getting a system that can be integrated into Xapian in this GSoC for the clustering branch. I have identified three areas of work which were not touched last time. 1) Automated Performance Analysis I had roughly implemented 2 evaluation techniques previously (Distance b/w document and centroids within clusters and
2008 Jan 24
1
Calculating sum of squares from density estimates
Hi, I have some density estimates obtained from density(). I would like to calculate the sum of squares of these. As the x values of the estimates are not the same, and I would prefer not to restrict the estimate to a certain range of x values, how can I do the calculation? Let's say: d1 <- density(Data1) d2 <- density(Data2) If the x values were the same, I would: ssq <- sum(
2002 Apr 29
2
cluster analyses
I'm clustering rather large data sets and would like to cut the dendrograms to get a better view of specific components. I calculate the dissimilarity matrix using daisy() because I have a mixture of variable types: factors, ordered factors and numerical variables. If I want one dendrogram, I use agnes() for the agglomerative nesting and pltree() to draw the dendrogram. That way, I get the
2008 Jan 26
2
using facet_grid() from ggplot2 with additional text in labels
Hi, I am using ggplot2 at the moment and I must say it is definitely better than ggplot - good work. My problem is that I am using facet_grid() in the following way: > p <- ggplot(ssq, aes(x=year, y=-log(ssq))) > p + geom_point() + facet_grid(me*gi~cs*rz) and it works nicely, except that I would like to have, in addition to the values of me, gi, cs and rz, the name of the variable.
2011 Jan 27
3
agnes clustering and NAs
Hello, In the documentation for agnes in the package 'cluster', it says that NAs are allowed, and sure enough it works for a small example like : > m <- matrix(c( 1, 1, 1, 2, 1, NA, 1, 1, 1, 2, 2, 2), nrow = 3, byrow = TRUE) > agnes(m) Call: agnes(x = m) Agglomerative coefficient: 0.1614168 Order of objects: [1] 1 2 3 Height (summary): Min. 1st Qu. Median Mean 3rd
2003 Jun 10
3
tftp server error
I have got a Solaris 9 package built from the latest tftp software at ftp.kernel.org. When I do the following: ./in.tftpd -l -s /tftpboot -m /tftprules -v the tftp server does not start. The error in /var/adm/messages is "too many -s directories". If I do ./in.tftpd -l -s /tftpboot -v then it works fine. But I need the remap feature for my project. The message I get when I do