thr3ads.net - similar to: "rsync performance on large files strongly depends on file's (dis)similarity"

Displaying 20 results from an estimated 10000 matches similar to: "rsync performance on large files strongly depends on file's (dis)similarity"

2004 Dec 08

similarity matrix conversion to dissimilarity

I have a matrix of similarity scores that I want to convert into a matrix of dissimilarity scores so that I can apply some clustering methods to the data. That is, high values in my matrix signify similarity and low values (zero being the lowest) signify no similarity. What functions/options in R or its packages are available for making this kind of transformation of a matrix?
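
One common way to do the conversion asked about above is to subtract every score from the largest score in the matrix. The sketch below is in Python rather than R and assumes non-negative scores where the maximum means "most similar"; the function name and sample matrix are illustrative, not taken from the thread. In R the same idea is often written as `as.dist(max(S) - S)`.

```python
def similarity_to_dissimilarity(sim):
    """Convert a square similarity matrix (list of lists) to dissimilarities.

    Assumption: scores are non-negative and the largest score means
    "most similar", so dissimilarity = max_score - score.
    """
    max_score = max(max(row) for row in sim)
    return [[max_score - value for value in row] for row in sim]


# Hypothetical 3 x 3 similarity matrix (symmetric, self-similarity = 5).
sim = [
    [5, 3, 0],
    [3, 5, 1],
    [0, 1, 5],
]
dis = similarity_to_dissimilarity(sim)
print(dis)  # [[0, 2, 5], [2, 0, 4], [5, 4, 0]]
```

Other transformations (for example `1 - s` for scores already scaled to [0, 1]) may suit a particular clustering method better; the choice depends on how the similarity scores were defined.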

Question

2005 Jan 19

It is quite simple but highly relevant to me ;-) I was looking for the icecast protocol specifications/documentation, but all I could find was documentation of applications and how to build or configure them. I'm interested in a Java implementation of an audio streaming client, therefore I was looking for some info about the protocol. Please -- where can I find information? Jürgen Knauth

how to group a large list of strings into categories based on string similarity?

2010 Jun 24

Hi, I want to group a large list (20 million) of strings into categories based on string similarity. The specific problem is: given a list of DNA sequences as below
ACTCCCGCCGTTCGCGCGCAGCATGATCCTG
ACTCCCGCCGTTCGCGCGCNNNNNNNNNNNN
CAGGATCATGCTGCGCGCGAACGGCGGGAGT
CAGGATCATGCTGCGCGCGAANNNNNNNNNN
CAGGATCATGCTGCGCGCGNNNNNNNNNNNN
...... .....
NNNNNNNCCGTTCGCGCGCAGCATGATCCTG

Rsync: Re: patch to enable faster mirroring of large filesystems

2001 Nov 30

I, too, was disappointed with rsync's performance when no changes were required (23 minutes to verify that a system of about 3200 files was identical). I wrote a little client/server Python app which does the verification, and then hands rsync the list of files to update. This reduced the optimal case compare time to under 30 seconds. Here's what it does, and forgive me if these sound
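
The snippet above is cut off before it describes the app, so the following is only a rough sketch of the general "verify first, then give rsync a list" idea, not the poster's program. It assumes both trees are readable locally and compares SHA-256 digests; the function names are hypothetical.

```python
import hashlib
import os


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its digest."""
    manifest = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            manifest[os.path.relpath(full, root)] = file_digest(full)
    return manifest


def files_to_update(src_manifest, dst_manifest):
    """List source paths that are missing or different on the destination."""
    return sorted(
        path
        for path, digest in src_manifest.items()
        if dst_manifest.get(path) != digest
    )
```

The resulting list can be written one path per line to a file and passed to rsync with its `--files-from=FILE` option, so that rsync only examines the files already known to differ.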

Clustering with R - efficient processing of large sparse data sets (text data)

2009 Sep 27

I checked the R procedure HCLUST (hierarchical clustering) but it looks like it requires a full triangular n x n similarity matrix as input, where n = number of observations. The number of variables is 200. My data set has n = 50,000 observations (keywords), and I use ad-hoc similarity measures, not available in R, to measure keyword similarity. Here, the vast majority of the n x n similarities

T-test to check equality, unable to interpret the results.

2009 Sep 16

Hi, I have the precision values of a system on two different data sets. Snippets of these results are shown below: sample1: (total 194 samples) 0.6000000238 0.8000000119 0.6000000238 0.2000000030 0.6000000238 ... ... sample2: (total 188 samples) 0.80000001 0.20000000 0.80000001 0.00000000 0.80000001 0.40000001 ... ... I want to check whether these results are statistically significant. Intuitively,

Comparing long species lists via Sorensons dissimilarity

2007 Aug 14

I have 4 very large species lists and I would like to compare them. I have the following results from running Sørensen's dissimilarity tests:
Norfolk Fens compared to Suffolk Coastal Fens: QS=0.583961142689298
Norfolk Fens compared to Breckland Edge Fens: QS=0.714896020281379
Norfolk Fens compared to Other Fens:
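
For reference, the Sørensen quotient for two species lists A and B is usually defined as QS = 2|A ∩ B| / (|A| + |B|), with dissimilarity taken as 1 - QS. The sketch below shows that calculation in Python; the species names are made up for illustration and are not from the thread.

```python
def sorensen_similarity(list_a, list_b):
    """Sørensen quotient: 2 * shared species / (species in A + species in B)."""
    a, b = set(list_a), set(list_b)
    if not a and not b:
        return 1.0  # convention assumed here: two empty lists are identical
    return 2 * len(a & b) / (len(a) + len(b))


def sorensen_dissimilarity(list_a, list_b):
    """Dissimilarity as the complement of the Sørensen quotient."""
    return 1.0 - sorensen_similarity(list_a, list_b)


# Hypothetical species lists for two sites.
site_1 = ["Carex", "Phragmites", "Juncus", "Salix"]
site_2 = ["Carex", "Juncus", "Typha"]

qs = sorensen_similarity(site_1, site_2)  # 2 * 2 / (4 + 3) = 4/7
print(round(qs, 3))
```

Whether a reported QS value is a similarity or a dissimilarity depends on the tool that produced it, so it is worth checking which of the two definitions was used before comparing sites.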

A --exclude-checksum option?

2013 Feb 12

Hi, I use rsync with hardlinks for backup, once a week doing checksums to ensure there's no filesystem corruption in the backed-up data. I also use tmpwatch, or something similar, to clean up /tmp; it removes files that have not been accessed recently (atime older than some configured limit). I back up /tmp because I throw stuff in tmp that I might possibly need again but don't want to

cluster-analysis and NA's

2002 Aug 07

Hi, are there any special cluster-analysis algorithms that can work with NAs? A further "problem" is that I want to cluster variables, not cases, to identify special variable sets. Is it common to transpose the data.frame and use kmeans, because this works with NAs, or does anybody have another method for finding "variable sets", with the exception of factor analysis? Thanks for

6251453 dis should decode rip-relative memory accesses

2006 Oct 31

Author: dmick
Repository: /hg/zfs-crypto/gate
Revision: 2120ccf2018170cfe16915dac09370ad30dc5285
Log message:
6251453 dis should decode rip-relative memory accesses
6279427 mdb's x64 disassembler doesn't decode %rip-relative addresses for data access
6427698 mdb/kmdb/dis should look up symbols for immediate operands
6428349 mdb/kmdb/dis (libdisasm) show odd offset for x86

similarity matrix

2011 Dec 04

Hello R-users, I've got a file with individuals as columns and the clusters they occur in as rows. I want a similarity matrix that tells me how many times each individual occurs with another. My eventual goal is to make Venn diagrams from the occurrence of my individuals. So I have this:
cluster ind1 ind2 ind3 etc.
1       0    1    2
2       3    0    1
3

How to measure level of similarity of two data frames

2012 May 26

Hi group, I've been thinking of calculating the Euclidean distance between each column of two data frames, each of which consists of standardized numerical columns. However, I don't know if there's a way of summarizing the overall distance with some kind of metric. If anyone knows a proper way of doing so and/or a package, I would greatly appreciate your suggestions. Thanks very much! Kel --

[LLVMdev] should we stop using llvm-as/llvm-dis in tests?

2009 Sep 05

A recent commit added the ability to opt and llc to read .ll files directly. Should we go through and update the existing tests?
  llvm-as < %s | opt ... | llvm-dis
would become:
  opt %s ... -print-module
and
  llvm-as < %s | llc
would become:
  llc < %s
The pro of this is that it would remove the bitcode write and read from the tests, making them faster. The con of this is

2008 Aug 24

similarity between two gene lists with varied length

Dear listers, a little off-topic: I am looking for, and want to compare, algorithms which can calculate "distance" or "similarity" between two gene lists with different lengths. Any paper, any implementation in R and any suggestion is welcome! Thanks, -- Weiwei Shi, Ph.D Research Scientist GeneGO, Inc. "Did you always know?" "No, I did not. But I believed..."

[LLVMdev] should we stop using llvm-as/llvm-dis in tests?

2009 Sep 06

On Sep 5, 2009, at 1:53 PM, Nick Lewycky wrote:
> A recent commit added the ability to opt and llc to read .ll files
> directly. Should we go through and update the existing tests?
Yes, I think that Dan is planning to do this. -Chris
>
> llvm-as < %s | opt ... | llvm-dis
>
> would become:
>
> opt %s ... -print-module
>
> and
>
> llvm-as < %s |

[LLVMdev] llvm-dis fails to parse bytecode emitted by clang

2010 Oct 26

Hi Lorenzo,
> Thanks to everyone for the quick replies! I spent some time looking
> into the issue. It turns out that llvm-dis crashes on CLANG-generated
> bytecode if LLVM is compiled for a 64-bit architecture. The problem
> disappears when compiling for a 32-bit architecture. Should I file a
> bug report?
it still sounds like LLVM is being miscompiled to me. What compiler did

similarity measure for binary data

2009 Oct 29

I am doing hierarchical clustering with the cluster package. I could not find similarity measures like the matching coefficient, the Jaccard coefficient, and Sokal and Sneath. Could anyone please tell me of a package with similarity measures for binary data? Kind regards, Ms. Karunambigai M, PhD Scholar, Dept. of Biostatistics, NIMHANS, Bangalore, India
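
Two of the measures named above have simple standard definitions in terms of the counts a (both 1), b (1 in the first vector only), c (1 in the second only), and d (both 0): the simple matching coefficient is (a + d) / (a + b + c + d) and the Jaccard coefficient is a / (a + b + c). The sketch below illustrates them in Python rather than R; the function names and sample vectors are illustrative, and the Sokal and Sneath variants are not covered.

```python
def binary_counts(x, y):
    """Count agreement patterns between two equal-length 0/1 vectors."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    a = sum(1 for i, j in zip(x, y) if i == 1 and j == 1)  # both present
    b = sum(1 for i, j in zip(x, y) if i == 1 and j == 0)  # only in x
    c = sum(1 for i, j in zip(x, y) if i == 0 and j == 1)  # only in y
    d = sum(1 for i, j in zip(x, y) if i == 0 and j == 0)  # both absent
    return a, b, c, d


def simple_matching(x, y):
    """Simple matching coefficient: share of positions that agree."""
    a, b, c, d = binary_counts(x, y)
    return (a + d) / (a + b + c + d)


def jaccard(x, y):
    """Jaccard coefficient: joint presences over positions with any presence."""
    a, b, c, _d = binary_counts(x, y)
    if a + b + c == 0:
        return 1.0  # convention assumed here for two all-zero vectors
    return a / (a + b + c)


x = [1, 1, 0, 0, 1]
y = [1, 0, 0, 1, 1]
print(simple_matching(x, y))  # 0.6
print(jaccard(x, y))          # 0.5
```

The two measures differ only in whether joint absences (d) count as agreement, which matters when most entries are 0.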

PIC preferred too strongly, even at CodeModel::Large?

2016 Jul 29

On Thu, Jul 28, 2016 at 6:13 PM, Ramkumar Ramachandra via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> Hi,
>
> We were just debugging a sporadic crash the other day, when we noticed
> that RIP-relative addressing was being used in a JumpTable, even when
> code and data were well over 4G apart. This is confusing, because we
> picked CodeModel::Large, and expected this to

(Dis)advantage of using lmtp?

2010 Nov 22

Hi all, are there any (dis)advantages in "connecting" dovecot and an MTA (in our case: exim) using LMTP over using other methods (e.g. the exim transports definitions that can be found in the wiki)? Thank you very much! Gruss/Regards, Christian Schmidt

[LLVMdev] llvm-dis fails to parse bytecode emitted by clang

2010 Oct 26

Thanks to everyone for the quick replies! I spent some time looking into the issue. It turns out that llvm-dis crashes on CLANG-generated bytecode if LLVM is compiled for a 64-bit architecture. The problem disappears when compiling for a 32-bit architecture. Should I file a bug report?
Lorenzo
On Tue, Oct 26, 2010 at 3:34 AM, Xinfinity <xinfinity_a at yahoo.com> wrote:
>
> Hi,
>
