thr3ads.net - similar to: "Canberra distance"

Displaying 20 results from an estimated 6000 matches similar to: "Canberra distance"

2001 Mar 05

Canberra dist and double zeros

Canberra distance is defined in function `dist' (standard library `mva') as sum(|x_i - y_i| / |x_i + y_i|). Obviously this is undefined for cases where both x_i and y_i are zeros. Since double zeros are common in many data sets, this is a nuisance. In our field (from which the distance comes), it is customary to remove double zeros: contribution to distance is zero when both x_i
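
A minimal sketch of the double-zero convention described in this post, written directly from the formula quoted above rather than taken from `dist` itself. The helper name `canberra_drop00` is hypothetical, and counting a 0/0 term as a zero contribution is the assumption being illustrated:

```r
# Canberra distance: sum(|x_i - y_i| / |x_i + y_i|), with double zeros
# (x_i == 0 and y_i == 0) contributing zero instead of NaN.
# Hypothetical helper for illustration; not the code used by dist().
canberra_drop00 <- function(x, y) {
  stopifnot(length(x) == length(y))
  num <- abs(x - y)
  den <- abs(x + y)
  # A 0/0 term would be NaN; replace it with a zero contribution.
  terms <- ifelse(num == 0 & den == 0, 0, num / den)
  sum(terms)
}

x <- c(0, 1, 2)
y <- c(0, 2, 1)
canberra_drop00(x, y)  # 0 + 1/3 + 1/3
```

Whether the remaining terms should also be rescaled for the dropped coordinates is a separate design choice that this sketch does not make.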

2009 Dec 01

Canberra distance

Hi, I am using R 2.9.0. It seems the documentation for the calculation of Canberra distance using stats::dist is ambiguous. Does anyone have the original definition given in the Lance & Williams paper from Aust. Comput. J. 1, 15-20, 1967? When there are zeros at certain positions in both vectors, they are not omitted as documented in the function (see below). Instead, Canberra distance is

2018 Mar 15

stats 'dist' euclidean distance calculation

> 3x3 subset used
>        Locus1  Locus2  Locus3
> Samp1  GG      <NA>    GG
> Samp2  AG      CA      GA
> Samp3  AG      CA      GG
>
> The euclidean distance function is defined as: sqrt(sum((x_i - y_i)^2)). My
> assumption was that the difference between
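
As a hedged illustration of the formula quoted in this reply, the sketch below computes sqrt(sum((x_i - y_i)^2)) by hand on made-up numeric data with one missing value. The helper `euclid_na` is hypothetical. The rescaling by the share of usable columns follows what the `dist` help page describes for excluded columns, and the numeric coding is an assumption, since the genotype strings in the post are not numbers:

```r
# Euclidean distance that skips coordinates where either value is NA and
# scales the sum of squares up in proportion to the columns actually used.
# Hypothetical helper for illustration only.
euclid_na <- function(x, y) {
  ok <- !is.na(x) & !is.na(y)
  ss <- sum((x[ok] - y[ok])^2)
  sqrt(ss * length(x) / sum(ok))
}

a <- c(1, NA, 3)   # made-up numeric data, one missing value
b <- c(2, 5, 5)

euclid_na(a, b)                # sqrt((1 + 4) * 3 / 2)
as.numeric(dist(rbind(a, b)))  # stats::dist on the same rows, for comparison
```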

2018 Jan 17

mgcv::gam is it possible to have a 'simple' product of 1-d smooths?

I am trying to test out several mgcv::gam models in a scalar-on-function regression analysis. The following is the 'hierarchy' of models I would like to test:
(1) Y_i = a + integral[ X_i(t)*Beta(t) dt ]
(2) Y_i = a + integral[ F{X_i(t)}*Beta(t) dt ]
(3) Y_i = a + integral[ F{X_i(t),t} dt ]
equivalents for discrete data might be:
(1) Y_i = a + sum_t[ L_t * X_it * Beta_t ]
(2) Y_i

2018 Mar 15

stats 'dist' euclidean distance calculation

Hello, I am working with a matrix of multilocus genotypes for ~180 individual snail samples, with substantial missing data. I am trying to calculate the pairwise genetic distance between individuals using the stats package 'dist' function, using euclidean distance. I took a subset of this dataset (3 samples x 3 loci) to test how euclidean distance is calculated: 3x3 subset used

2004 Apr 18

lm with data=(means,sds,ns)

Hi Folks, I am dealing with data which have been presented as, at each x_i:
mean m_i of the y-values at x_i,
sd s_i of the y-values at x_i,
number n_i of the y-values at x_i,
and I want to linearly regress y on x. There does not seem to be an option to 'lm' which can deal with such data directly, though the regression problem could be algebraically

2004 Jun 29

camberra distance?

Hi! It's not an R-specific question, but I had no idea where else to ask. Does anyone know the original reference for the Canberra distance? Eryk. PS: I know that it's an off-topic question (sorry). Can anyone recommend a mailing list where such questions are on topic?

2010 Apr 25

function pointer question

Hello, I have the following function that receives a "function pointer" formal parameter named "fnc":
loocv <- function(data, fnc) {
  n <- length(data.x)
  score <- 0
  for (i in 1:n) {
    x_i <- data.x[-i]
    y_i <- data.y[-i]
    yhat <- fnc(x=x_i,y=y_i)
    score <- score + (y_i - yhat)^2
  }
  score <- score/n

2007 Feb 01

Help with efficient double sum of max (X_i, Y_i) (X & Y vectors)

Greetings. For R gurus this may be a no brainer, but I could not find pointers to efficient computation of this beast in past help files. Background - I wish to implement a Cramer-von Mises type test statistic which involves double sums of max(X_i,Y_j) where X and Y are vectors of differing length. I am currently using ifelse pointwise in a vector, but have a nagging suspicion that there is a
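
One possible way to express the double sum of max(X_i, Y_j) without a pointwise `ifelse`, sketched on made-up vectors. It relies on `outer()` with `pmax`, which builds the full length(X) by length(Y) matrix, so it assumes that matrix fits in memory. The variable names are illustrative only:

```r
X <- c(1, 3)      # made-up data; X and Y may differ in length
Y <- c(2, 0, 4)

# outer() applies pmax to every (X_i, Y_j) pair, giving a
# length(X) x length(Y) matrix of pairwise maxima.
M <- outer(X, Y, pmax)
double_sum <- sum(M)

# Same quantity with an explicit double loop, as a cross-check.
loop_sum <- 0
for (xi in X) for (yj in Y) loop_sum <- loop_sum + max(xi, yj)
```

For very long vectors a sort-based approach would avoid the full matrix, at the cost of more code.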

2007 Mar 01

covariance question which has nothing to do with R

This is a covariance calculation question so nothing to do with R but maybe someone could help me anyway. Suppose I have two random variables X and Y whose means are both known to be zero and I want to get an estimate of their covariance. I have n sample pairs (X1,Y1), (X2,Y2), ..., (Xn,Yn), so that the covariance estimate is clearly 1/n * (sum from i = 1 to n of (X_i*Y_i)). But,

2008 Dec 01

linear functional relationships with heteroscedastic & non-Gaussian errors - any packages around?

Hi, I have a situation where I have a set of pairs of X & Y variables for each of which I have a (fairly) well-defined PDF. The PDF(x_i)'s and PDF(y_i)'s are unfortunately often rather non-Gaussian although most of the time not multimodal. For these data (estimates of gas content in galaxies), I need to quantify a linear functional relationship and I am trying to do this as

2010 Feb 05

metafor package: effect sizes are not fully independent

In a classical meta-analysis model y_i = X_i * beta_i + e_i, data {y_i} are assumed to be independent effect sizes. However, I'm encountering the following two scenarios: (1) Each source has multiple effect sizes, thus {y_i} are not fully independent of each other. (2) Each source has multiple effect sizes, each of the effect sizes from a source can be categorized as one of a factor levels

2005 Jun 15

need help on computing double summation

Dear helpers in this forum, This is a clarified version of my previous questions in this forum. I really need your generous help on this issue.
> Suppose I have the following data set:
>
> id   x  y
> 023  1  2
> 023  2  5
> 023  4  6
> 023  5  7
> 412  2  5
> 412  3  4
> 412  4  6
> 412  7  9
> 220  5  7
> 220  4  8
> 220  9  8
> ......
> Now I want to compute the

2004 Dec 15

how to fit a weighted logistic regression?

I tried lrm in library(Design) but there is always some error message. Is this function really doing the weighted logistic regression as maximizing the following likelihood:
\sum w_i*(y_i*\beta*x_i-log(1+exp(\beta*x_i)))
Does anybody know a better way to fit this kind of model in R? FYI: one example of getting error message is like:
> x=runif(10,0,3)
> y=c(rep(0,5),rep(1,5))
>
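
The weighted likelihood written in this post can be maximized directly, which may help when checking what a fitting function does. The sketch below assumes a single coefficient beta with no intercept, exactly as the formula is written. The data, the weights, the helper name `wloglik`, and the search interval are all made up for illustration:

```r
# Weighted log-likelihood from the post, for a single coefficient beta:
#   sum(w_i * (y_i * beta * x_i - log(1 + exp(beta * x_i))))
# Hypothetical helper, not part of any package.
wloglik <- function(beta, x, y, w) {
  eta <- beta * x
  sum(w * (y * eta - log1p(exp(eta))))
}

# Made-up data and weights, for illustration only.
x <- c(-2, -1, 0, 1, 2, 1, -1)
y <- c(0, 1, 0, 1, 1, 0, 1)
w <- c(1, 2, 1, 0.5, 1, 2, 1)

# One-dimensional maximization over an assumed search interval.
fit <- optimize(wloglik, interval = c(-10, 10), x = x, y = y, w = w,
                maximum = TRUE)
beta_hat <- fit$maximum    # maximizing value of beta
fit$objective              # log-likelihood at beta_hat
```

With more than one coefficient, `optim()` on the negated log-likelihood would be the analogous approach.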

2003 Oct 23

Variance-covariance matrix for beta hat and b hat from lme

Dear all, Given a LME model (following the notation of Pinheiro and Bates 2000) y_i = X_i*beta + Z_i*b_i + e_i, is it possible to extract the variance-covariance matrix for the estimated beta_i hat and b_i hat from the lme fitted object? The reason for needing this is because I want to have interval prediction on the predicted values (at level = 0:1). The "predict.lme" seems to

2011 Jul 19

notation question

Dear list, I am currently writing up some of my R models in a more formal sense for a paper, and I am having trouble with the notation. Although this isn't really an 'R' question, it should help me to understand a bit better what I am actually doing when fitting my models! Using the analysis of co-variance example from MASS (fourth edition, p 142), what is the correct notation for the

2007 Oct 16

Canberra distance

Hi, I do not understand the definition of Canberra distance in R. On the Internet and in the function description pages of dist() from stats and Dist() from amap, the Canberra distance between vectors x and y, d(x,y), is: d(x,y) = sum(abs(x-y)/(x+y)). But in use, through simple examples, we find that the formula is: d(x,y) = (NZ + 1)/NZ * sum(abs(x-y)/(x+y)) with NZ = number of pairs of coordinates that are

2005 Jul 07

CDF plot

Dear all, I have defined a discrete distribution P(y_i=x_i)=p_i, for which I want to draw a CDF plot. However, I cannot find a function in R to draw it for me after searching R and the R archive. I only find the one for the sample CDF instead of my theoretical one. I find stepfun can do it for me; however, I want to plot several different CDFs with the same support x in one plot. I cannot manage how to do it with
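
A possible sketch of overlaying two theoretical CDFs on a shared support with `stepfun`, using the `add = TRUE` argument of its plot method. The support and the probability vectors are made-up values, and the object names are illustrative:

```r
x  <- c(1, 2, 3)            # common support (made-up values)
p1 <- c(0.2, 0.5, 0.3)      # first distribution, sums to 1
p2 <- c(0.6, 0.3, 0.1)      # second distribution, sums to 1

# stepfun() needs one more height than knots: 0 to the left of the
# support, then the cumulative probabilities at each support point.
cdf1 <- stepfun(x, c(0, cumsum(p1)))
cdf2 <- stepfun(x, c(0, cumsum(p2)))

# Draw the first CDF, then overlay the second on the same axes.
plot(cdf1, verticals = FALSE, pch = 16, ylim = c(0, 1),
     main = "Theoretical CDFs", ylab = "F(x)")
plot(cdf2, verticals = FALSE, pch = 16, col = "red", add = TRUE)
```

The resulting objects are ordinary functions, so `cdf1(2)` returns the cumulative probability at 2.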

2013 Mar 02

Errors-In-Variables in R

In reference to [1], how would you solve the following regression problem: Given observations (X_i,Y_i) with known respective error distributions (e_X_i,e_Y_i) (say, 0-mean Gaussian with known STD), find the parameters a and b which maximize the Likelihood of Y = a*X + b Taking the example further, how many of the very simplified assumptions from the above example can be lifted or eased and R
