thr3ads.net - similar to: "Calculate missing value using a correlation metric"

Displaying 20 results from an estimated 30000 matches similar to: "Calculate missing value using a correlation metric"

custom metric for dist for use with hclust/kmeans

2010 May 05

Hi guys, I've been using the kmeans and hclust functions for some time now and was wondering if I could specify a custom metric when passing my data frame into hclust as a distance matrix. Actually, kmeans doesn't even take a distance matrix; it takes the data frame directly. I was wondering if there's a way or if there's a package that lets you create distance matrices from

knn using custom distance metric

2004 Feb 07

Hi, There are two packages providing knn classification: class and knnTree. However, it seems both use Euclidean distances only. How can I use a custom distance function with either package? Thanks, Xiao-Jun
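One way to see what a custom distance in knn involves is to write the classifier by hand. The sketch below is a minimal illustration, not the API of class or knnTree: knn_custom and the manhattan helper are hypothetical names introduced here, and ties in the vote are simply resolved in favour of the first class level.

```r
# Minimal k-nearest-neighbour classifier with a user-supplied distance.
# knn_custom() is a hypothetical helper, not part of any package.
knn_custom <- function(train, test, cl, k, dist_fun) {
  apply(test, 1, function(q) {
    # distance from the query point q to every training row
    d <- apply(train, 1, function(p) dist_fun(p, q))
    # indices of the k closest training rows
    nn <- order(d)[seq_len(k)]
    # majority vote among the neighbours' class labels
    names(which.max(table(cl[nn])))
  })
}

# Example custom metric (assumed for illustration): Manhattan distance
manhattan <- function(p, q) sum(abs(p - q))

train <- rbind(c(0, 0), c(0, 1), c(5, 5), c(5, 6))
cl    <- factor(c("a", "a", "b", "b"))
test  <- rbind(c(0.2, 0.1), c(5.1, 5.2))

pred <- knn_custom(train, test, cl, k = 3, dist_fun = manhattan)
print(pred)
```

Any function of two numeric vectors can be passed as dist_fun. The row-by-row apply calls keep the sketch short but will be slow on large data sets.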

title for plot contain 4 subplots

2003 Sep 14

Hi, I'm plotting 4 graphs on one page (2x2 matrix) but I can't seem to get the title for the whole page right. I'm doing: op <- par(mfrow = c(2,2), pty="s") hist(var$V2, breaks="FD",main="Euclidean Metric", xlab="Sum of 3NN ... hist(var$V2, breaks="FD",main="Manhattan Metric", xlab="Sum of 3NN ... hist(var$V2,

nnclust: nnfind() distance metric?

2010 May 06

Hello, pardon my ignorance, but what distance metric is used in this function in the nnclust package? The manual only says: "Find the nearest neighbours of points in one data set from another data set. Useful for Mallows-type distance metrics." BR, Jay

help about agnes

2006 Aug 16

Hello. I have the following distance matrix between 8 points:
[1,] 0.000000 3.162278 7.280110 8.544004 7.071068 9.899495 6.403124 8.062258
[2,] 3.162278 0.000000 5.000000 6.403124 4.472136 8.944272 6.082763 8.062258
[3,] 7.280110 5.000000 0.000000 1.414214 1.000000 5.000000 4.242641 5.830952
[4,] 8.544004 6.403124 1.414214 0.000000 2.236068 4.123106 4.472136 5.656854
[5,] 7.071068 4.472136

dist() {"mva" package} bug: treats +/- Inf as NA

2002 Oct 21

Vince Carey found this (thank you!). Since the fix to the problem is not entirely obvious, I post this to R-devel as RFC: help(dist) says: >> Missing values are allowed, and are excluded from all computations >> involving the rows within which they occur. If some columns are >> excluded in calculating a Euclidean, Manhattan or Canberra >> distance, the sum is

nlme: spatial autocorrelation on a sphere

2012 Oct 01

I have spatial data on a sphere (the Earth) for which I would like to run a gls model assuming that the errors are autocorrelated, i.e. including a corSpatial correlation in the model specification. In this case the distance metric should be calculated on the sphere, therefore metric = "euclidean" in (for example) corSpher would be incorrect. I would be grateful for help on how to

estimating missing data

2002 Jul 26

Hello R group, do you know if an EM algorithm exists for R to estimate missing data in a sample? I just found a knn algorithm in the package emv, but it doesn't look like the usual EM algorithm. Thanks, Xavier

New to R, trying to use agnes, but can't load my ditance matrix

2011 Jun 27

Hi, I'm mighty new to R. I'm using it on Windows. I'm trying to cluster using a distance matrix I created from the data on my own and called it D10.dist. I loaded the cluster package. Then tried the following command... > agnes("E:D10.dist", diss = TRUE, metric = "euclidean", stand = FALSE, > method = "average", par.method, keep.diss = n < 1000,

about isoMDS method

2009 Aug 30

Hi, For example: I built a half matrix "w" using a daisy(x, metric = c("euclidean")) http://www.nabble.com/file/p25211016/1.jpg And next I transformed this matrix "w" using isoMDS function, for example isoMDS(w, k=2) and as result I got: http://www.nabble.com/file/p25211016/2.jpg And now I have two questions: 1. If number in matrix w[2, 1] (= 0.41538462) match

Mean correlation within cluster

2017 Jun 13

Hello all, I'd like to calculate the mean correlation within a cluster and understand if it's significantly >0. I'm using packages 'geomorph' and 'paleomorph'. #Simulate an array A <- array ( rep ( 1 : 36 , by = 4 ), dim = c ( 12 , 3 , 4 )) #Load 'geomorph' package and superimpose coordinates test.gpa <- gpagen ( A , print.progress = FALSE ) #Load

Specifying medoids in PAM?

2005 Jun 07

I am using the PAM algorithm in the CLUSTER library. When I allow PAM to seed the medoids using the default __build__ algorithm things work well: > pam(stats.table, metric="euclidean", stand=TRUE, k=5) But I have some clusters from a hierarchical analysis that I would like to use as seeds for the PAM algorithm. I can't figure out what the medoids argument wants. When I put in the

Specify feature weights in model prediction (CARET)

2011 Mar 16

Using the 'CARET' package, is it possible to specify weights for features used in model prediction? And for the 'knn' implementation, is there a way to choose a distance metric (e.g. Mahalanobis distance)? Thanks, ~Kendric

correlation between rows of data.frame

2008 Aug 01

Dear R users, I need to come up with an efficient method to compute the correlation (or at least, the euclidean distance if that's easier) between specific rows in a data frame (46,232 rows, 29 columns). The pairs of rows between which I want to find the correlation share a common value in one of the columns. So for example, in the following

Calculating the distance samples using distance metics method

2008 Feb 19

***********reading in data********** data<-read.table("microarray.txt",header=T, sep="\t") head(data) dim(data) attach(data) ***********creating matrix and calculating variance across probesets******** x<-1:20000 y<-2:141 data.matrix<-data.matrix(data[,y]) variableprobe<-apply(data.matrix[x,],1,var) hist(variableprobe) **************filter out low

stats 'dist' euclidean distance calculation

2018 Mar 15

Hello, I am working with a matrix of multilocus genotypes for ~180 individual snail samples, with substantial missing data. I am trying to calculate the pairwise genetic distance between individuals using the stats package 'dist' function, using euclidean distance. I took a subset of this dataset (3 samples x 3 loci) to test how euclidean distance is calculated: 3x3 subset used
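For readers checking how dist treats missing values, a small experiment can help. The sketch below uses made-up numbers (not the poster's genotype data) and compares the result of stats::dist with a manual calculation that drops the column containing NA and rescales the sum of squares by the ratio of total columns to columns actually used. This rescaling is how the base R implementation is understood to behave for Euclidean distance.

```r
# Two rows, three columns; the second column is missing for row "a".
# The values are invented purely for illustration.
x <- rbind(a = c(1, NA, 3),
           b = c(2,  5, 5))

# Distance as computed by stats::dist (Euclidean is the default method)
d_dist <- as.numeric(dist(x, method = "euclidean"))

# Manual version: use only the columns observed in both rows,
# then scale the sum of squares up by (total columns / columns used)
ok       <- !is.na(x["a", ]) & !is.na(x["b", ])
ss       <- sum((x["a", ok] - x["b", ok])^2)
d_manual <- sqrt(ss * ncol(x) / sum(ok))

print(c(dist = d_dist, manual = d_manual))
print(isTRUE(all.equal(d_dist, d_manual)))
```

With substantial missing data, this rescaling means that pairs sharing few observed loci get distances extrapolated from little information, which is worth keeping in mind when interpreting the matrix.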

Help: Strange MDS behavior

2003 Nov 13

Hi! I have a dissimilarity matrix X and try to compare it with X' = dist(cmdscale(X,k)). If I increase k, I should expect that the error (or fit) should monotonically decrease, right? Here is some sample code: library(mva) set.seed(12345) x <- as.matrix(dist(matrix(rnorm(100),ncol=10,byrow=T))) # x[1,2]<-x[2,1]<-1000 ## <<--** 1 # x[5,6]<-x[6,5]<-1000 ##

Bhattacharyya distance metric

2009 Nov 05

I need to use the Bhattacharyya distance metric to determine population separation. Has anyone written a Bhattacharyya distance metric function in R?
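For reference, the standard definition that such a function would implement is given below. The symbols p, q, and the normal parameters are notation introduced here, not taken from the thread. The first two lines give the discrete case, and the last gives the closed form for two univariate normal populations.

```latex
% Bhattacharyya coefficient and distance for discrete distributions p, q
BC(p, q) = \sum_{x} \sqrt{p(x)\, q(x)}, \qquad
D_B(p, q) = -\ln BC(p, q)

% Closed form for p = N(\mu_1, \sigma_1^2) and q = N(\mu_2, \sigma_2^2)
D_B(p, q) = \frac{1}{4}\,\frac{(\mu_1 - \mu_2)^2}{\sigma_1^2 + \sigma_2^2}
          + \frac{1}{2}\,\ln\!\left(\frac{\sigma_1^2 + \sigma_2^2}{2\,\sigma_1 \sigma_2}\right)
```

The first term of the normal-case formula measures separation of the means, and the second measures the difference in spread. The distance is zero only when both means and variances coincide.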

How to perform clustering without removing rows where NA is present in R

2013 Dec 07

I have data that contain some NA values. What I want to do is to **perform clustering without removing rows** where an NA is present. I understand that the `gower` distance measure in `daisy` allows for this situation. But why doesn't my code below work? __BEGIN__ # plot heat map with dendogram together. library("gplots") library("cluster")

Training with very few positives

2013 Feb 10

I have a binary classification problem where the fraction of positives is very low, e.g. 20 positives in 10,000 examples (0.2%). What is an appropriate cross validation scheme for training a classifier with very few positives? I currently have the following setup: ======================================== library(caret) tmp <- createDataPartition(Y, p = 9/10, times = 3, list = TRUE)