2004 Dec 08
similarity matrix conversion to dissimilarity
I have a matrix of similarity scores that I want to convert into a matrix of dissimilarity scores so that I can apply some clustering methods to the data. That is, high values in my matrix signify similarity and low values (zero being the lowest) signify no similarity. What functions/options in R or its packages are a...
2020 Sep 01
[RFC] Framework for Finding and Using Similarity at the IR Level
...that perform nearly the same operation. By copying and pasting code throughout a code base, or using a piece of code as a reference to create a new piece of code that has nearly the same structure, redundancies are inadvertently introduced throughout programs. Furthermore, compilers can introduce similarity through sets of instructions that require the same code generation strategies, or optimizing to the same patterns. For example, these two pieces of code: int fn(const std::vector<int> &myVec) { for (auto it = myVec.begin(), et = myVec.end(); it != et; ++it) { if (*it &a...
2001 Apr 10
Similarity matrix
I frequently use hclust on a similarity matrix. In R only a distance matrix is allowed. Is there a simple reliable transformation of a similarity matrix that will result in a distance matrix making hclust work the same as S-Plus with a similarity matrix? Venables & Ripley 3rd edition implies that a simple reversal of values will s...
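One common reading of the "simple reversal of values" mentioned above is to subtract every similarity from the largest similarity in the matrix, so that the most similar pair ends up at the smallest distance. The sketch below illustrates only that idea, in plain Python rather than R. The function name `similarity_to_dissimilarity` and the sample matrix are hypothetical, and the sketch makes no claim that the result clusters the same way as S-Plus.

```python
def similarity_to_dissimilarity(sim):
    """Reverse similarity scores: d[i][j] = max(sim) - sim[i][j].

    Assumes `sim` is a square, symmetric list of lists of non-negative
    numbers. The diagonal is forced to 0 so that every item is at
    distance zero from itself.
    """
    n = len(sim)
    if any(len(row) != n for row in sim):
        raise ValueError("similarity matrix must be square")
    top = max(max(row) for row in sim)
    return [
        [0.0 if i == j else float(top - sim[i][j]) for j in range(n)]
        for i in range(n)
    ]


# Hypothetical 3x3 similarity matrix (1.0 = identical).
sim = [
    [1.0, 0.8, 0.1],
    [0.8, 1.0, 0.3],
    [0.1, 0.3, 1.0],
]
dis = similarity_to_dissimilarity(sim)
```

In R, a matrix built this way would typically be converted with `as.dist()` before being handed to `hclust()`.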
2011 Oct 07
R equivalent of proc varclus
Dear List, what is the R package equivalent of Proc Varclus or Information Value? Any assistance in determining R equivalents of Oblique Component Analysis (PROC VARCLUS), Information Value (IV) and Weight Of Evidence (WOE) analysis, and business intelligence Regards, Ajay Websites-
2015 Sep 15
DWARF info in readobj
Hi All, I see that llvm-readobj displays information similar to GNU readelf does except DWARF data. I also see llvm-dwarfdump dumps all DWARF data in user readable format. Is there a plan for readobj to incorporate similar options? This will make readobj more feature complete for reading objects similar to readelf. If this is not the plan, will llvm-dwarfdump be a tool that regular user
2006 Feb 08
OpenRico LiveGrid or similar
Hi Has anyone used OpenRico's "on-demand listbox" LiveGrid or something similar in a Rails app already? How well does it behave? I'm asking this because I need a scrollable list but the number of records in the table could be well above 5000. In the past I've used similar "on-demand fetchings" in desktop apps and it was a real blessing (the
2006 Jun 15
individual scales in random subset of pairwise distance survey
Hello, I'm curious if anyone has encountered a version of this problem (and its solution) involving finding a consistent set of scales for subsets of survey data. The goal is to obtain people's rankings of pairwise similarity of a large number of items, on a 1..5 scale for example, and average these across people to use as input to MDS: How similar is object A to B on a 1..5 scale ___ How similar is object A to C on a 1..5 scale ___ etc. Because there are many items, there are N(N-1)/2 pairs, so it is n...
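To make the N(N-1)/2 growth concrete, the following sketch enumerates every unordered pair of items and draws a random subset of pairs, as one respondent might be given. It is only an illustration of the counting argument: the helper names `all_pairs` and `sample_pairs`, the item labels, and the subset size are hypothetical, and nothing here addresses how to reconcile the scales across respondents.

```python
import random
from itertools import combinations


def all_pairs(items):
    # Every unordered pair of distinct items: N * (N - 1) / 2 of them.
    return list(combinations(items, 2))


def sample_pairs(items, k, seed=None):
    # Draw k distinct pairs at random for one (hypothetical) respondent.
    rng = random.Random(seed)
    pairs = all_pairs(items)
    return rng.sample(pairs, min(k, len(pairs)))


items = ["A", "B", "C", "D", "E"]
pairs = all_pairs(items)                 # 5 * 4 / 2 = 10 pairs
subset = sample_pairs(items, 4, seed=0)  # 4 of the 10 pairs
```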
2011 Sep 13
help with hclust
Hello, how can I get the similarity value (i.e., the inner cluster similarity) that was used to cut a hierarchical tree at a specific height? I would appreciate your help! Best regards, Madeleine
2007 Apr 30
Xapian document matching
Hi, I'm wondering whether there is a possibility to do what ABCSok does (, to make "Main article" and "Same articles" collapsed to it. Like on the same thing. "Parent" and "same article on other sites" (they do differ from each other a little bit). Maybe somebody knows how to do
2014 Jul 02
block level changes at the file system level?
I'm trying to streamline a backup system using ZFS. In our situation, we're writing pg_dump files repeatedly, each file being highly similar to the previous file. Is there a file system (EG: ext4? xfs?) that, when re-writing a similar file, will write only the changed blocks and not rewrite the entire file to a new set of blocks? Assume that we're writing a 500 MB file with only
2016 May 05
GSoC 2016 - Introduction
...hat cleared a few things out. Apologies for replying late because of exams going on. I was going through the previous clustering API to understand how it worked and it seems like the approach for construction of the termlists which are used for distance metrics uses TF-IDF weighting with cosine similarity, which is very similar to the approach I would need for this project. Just in this case, Euclidean distance would be the metric. Would it be good to structure it in a way similar to the previous API with a few changes? For example, the Xapian::DocSimCosine::similarity( ) function in itself calcul...
2004 Jun 29
PAM clustering: using my own dissimilarity matrix
Hello, I would like to use my own dissimilarity matrix in a PAM clustering with method "pam" (cluster package) instead of a dissimilarity matrix created by daisy. I read data from a file containing the dissimilarity values using "read.csv". This creates a matrix (alternatively: an array or vector) which is not accepted b...
2006 Sep 26
Scoring/similarity, biased towards small fields?
Lucene, and perhaps most search engines, are biased towards small fields with little content (where thus the term frequency is higher). Lucene has the option to define a custom (Similarity) class to calculate the similarity between two fields (custom calculation of lengthNorm and tf) in different documents. But how do I do this in ferret? (I know to boost a field, but this is not what I (think to) need, I need to be able to influence the relative importance between the same field...
2011 Jul 28
construct a data set
Hi, I want to construct a data set similar to "AirPassengers". Its attributes are following. > attributes(AirPassengers) $tsp [1] 1949.000 1960.917 12.000 $class [1] "ts" How can I construct a data set similar to it having same class and attributes. Thanks -- Amar Kumar Nandan ?:nandan.amar at
2016 Mar 10
Introduction and Doubts
...equecy inverse corpus frequency),TF-RF(term frequency-relevance frequency) for evaluating the speed and accuracy of final clustering system we can benchmark it against various other algos like k-means,HAC based on the measures mentioned in previous mail.(purity,F-measure,Entropy,F-Measure,Overall Similarity,Relative Margin,Variance Ratio) Please give your suggestions Have a Nice day Regards, Nirmal Singhania III Yr On Thu, Mar 10, 2016 at 5:46 PM, James Aylett <james-xapian at> wrote: > On Thu, Mar 10, 2016 at 05:47:29AM +0530, nirmal singhania wrote: > > >...
2008 Feb 22
lustre error
Dear All, Yesterday evening our cluster stopped. Two of our nodes tried to take the resource from each other; they hadn't seen the other side, if I saw well. I stopped heartbeat and the resources, started it again, and it came back online and worked fine. This morning I saw this in logs: Feb 22 03:25:07 node4 kernel: Lustre: 7:0:(linux-debug.c:98:libcfs_run_upcall()) Invoked LNET upcall
2020 Sep 30
[RFC] Framework for Finding and Using Similarity at the IR Level
...r] Adding option to enable outlining from linkonceodr functions > Flexibility for Isomorphic Predicates: [IRSim] Adding support for isomorphic predicates > Flexibility for Commutative Instructions: [IRSim] Adding commutativity matching to structure checking > Matching call instructions for similarity identification: [IRSim] Letting call instructions be legal for similarity identification. > Outlining Calls: [IRSim][IROutliner] Allowing call instructions to be outlined. > Matching GEP instructions for similarity identification: [IRSim] Letting gep instructions be legal for similarity ident...
2014 Jan 31
[LLVMdev] MergeFunctions: reduce complexity to O(log(N))
...the US > LLVM conference to hear me explain it using an animation). > > 1. Hash all functions into buckets > > In each bucket, separately: > > 2. Compare functions pair-wise and determine a > similarity metric for each pair (%age of equivalent > instructions) > > 3. Merge identical functions (with similarity = 100%), > update call > sites for those functions. > > 4. If the updates of call sites have touched other...
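The numbered steps quoted above (hash into buckets, compare pair-wise within each bucket, merge pairs at 100% similarity) can be sketched roughly as follows. This is a toy Python illustration, not LLVM's MergeFunctions pass: functions are modeled as lists of instruction strings, the names `bucket_key`, `similarity`, and `merge_identical` are hypothetical, and using the instruction count as the bucket key is an assumption made for brevity.

```python
from collections import defaultdict
from itertools import combinations


def bucket_key(instrs):
    # Hypothetical cheap hash: functions of different length can never
    # be identical, so the instruction count is a safe bucketing key.
    return len(instrs)


def similarity(a, b):
    # Percentage of positions that hold the same instruction.
    # Only called on equal-length functions from the same bucket.
    if not a:
        return 100.0
    same = sum(1 for x, y in zip(a, b) if x == y)
    return 100.0 * same / len(a)


def merge_identical(functions):
    """Map each duplicate function name to the name it is merged into."""
    # Step 1: hash all functions into buckets.
    buckets = defaultdict(list)
    for name, instrs in functions.items():
        buckets[bucket_key(instrs)].append(name)

    canonical = {}
    for names in buckets.values():
        # Step 2: compare functions pair-wise inside one bucket.
        for a, b in combinations(sorted(names), 2):
            if a in canonical or b in canonical:
                continue  # already merged into an earlier function
            # Step 3: merge only when similarity is exactly 100%.
            if similarity(functions[a], functions[b]) == 100.0:
                canonical[b] = a
    return canonical


# Hypothetical input: function name -> instruction list.
functions = {
    "f": ["load", "add", "ret"],
    "g": ["load", "add", "ret"],
    "h": ["load", "mul", "ret"],
    "k": ["ret"],
}
merged = merge_identical(functions)
```

Step 4 of the quoted list (revisiting functions whose call sites were updated) is not modeled here, since the snippet is cut off before describing it.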
2006 Aug 02
Does Ruby / Rails have something similar to PHP's 'virtual'
Hi all Is there a rails / ruby function that is analogous to PHP's 'virtual' function? "virtual() is an Apache-specific function which is similar to <!--#include virtual...--> in mod_include. It performs an Apache sub-request. It is useful for including CGI scripts or .shtml files, or anything else that you would parse through Apache. Note that for a CGI
2020 Sep 02
[RFC] Framework for Finding and Using Similarity at the IR Level
Indeed, an awesome project and an excellent report! Code size doesn't really get much attention, so the level of detail and the strong roadmap are refreshing. Hopefully, the project will provide execution times along with code-size > reductions. > I doubt it. Outlining will (almost?) always make for slower code due to a lot more calls being made. But that's ok for embedded targets,