Displaying 20 results from an estimated 5000 matches similar to: "how to fit a weighted logistic regression?"
2017 Dec 03
1
Discourage the weights= option of lm with summarized data
Peter,
This is a highly structured text. Just for the discussion, I separate
the building blocks, where (D) and (E) and (F) are new:
BEGIN OF TEXT --------------------
(A)
Non-'NULL' 'weights' can be used to indicate that different
observations have different variances (with the values in 'weights'
being inversely proportional to the variances);
(B)
or equivalently, when the elements of
2006 May 24
1
(PR#8877) predict.lm does not have a weights argument for
I am more than 'a little disappointed' that you expect a detailed
explanation of the problems with your 'bug' report, especially as you did
not provide any explanation yourself as to your reasoning (nor did you
provide any credentials nor references).
Note that
1) Your report did not make clear that this was only relevant to
prediction intervals, which are not commonly used.
2010 Feb 06
1
Canberra distance
Hi the list,
According to what I know, the Canberra distance between X and Y is: sum[
(|x_i - y_i|) / (|x_i|+|y_i|) ] (with | | denoting the function
'absolute value')
In the source code of the canberra distance in the file distance.c, we
find :
sum = fabs(x[i1] + x[i2]);
diff = fabs(x[i1] - x[i2]);
dev = diff/sum;
which corresponds to the formula: sum[ (|x_i - y_i|) /
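The difference between the two denominators discussed in this post can be checked numerically. The following is only a minimal sketch in R: the function names `canberra_abs_sum` and `canberra_sum_abs` and the test vectors are made up for illustration and are not taken from distance.c.

```r
# Two candidate definitions of the Canberra distance (illustrative names).
canberra_abs_sum <- function(x, y) sum(abs(x - y) / (abs(x) + abs(y)))  # |x_i| + |y_i|
canberra_sum_abs <- function(x, y) sum(abs(x - y) / abs(x + y))          # |x_i + y_i|

# Same-sign coordinates: both denominators coincide.
x_same <- c(1, 2, 3)
y_same <- c(2, 1, 5)
same_a <- canberra_abs_sum(x_same, y_same)
same_b <- canberra_sum_abs(x_same, y_same)

# Mixed-sign coordinates: |x_i + y_i| is smaller than |x_i| + |y_i|.
x_mixed <- c(1, -2, 3)
y_mixed <- c(2, 1, 5)
mixed_a <- canberra_abs_sum(x_mixed, y_mixed)
mixed_b <- canberra_sum_abs(x_mixed, y_mixed)

print(c(same_a, same_b, mixed_a, mixed_b))
```

Under these assumptions the two definitions agree whenever x_i and y_i share a sign and can differ only when the signs are mixed.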
2006 May 20
1
(PR#8877) predict.lm does not have a weights argument for newdata
Dear R developers,
I am a little disappointed that my bug report only made it to the
wishlist, with the argument:
Well, it does not say it has.
Only relevant to prediction intervals.
predict.lm does calculate prediction intervals for linear models from
weighted regression, so they should be correct, right?
As far as I can see they are bound to be wrong in almost all cases, if
no weights
2017 Oct 12
4
Discourage the weights= option of lm with summarized data
OK. We have now three suggestions to repair the text:
- remove the text
- add "not" at the beginning of the text
- add at the end of the text a warning; something like:
"Note that in this case the standard estimates of the parameters are
in general not correct, and hence also the t values and the p value.
Also the number of degrees of freedom is not correct. (The parameter
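The concern behind the proposed warning can be illustrated with a small sketch. It assumes the "summarized data" case where each row stands for several identical observations; the data frame `summ` and its values are invented for illustration.

```r
# Summarized data: each (x, y) row stands for w identical observations.
summ <- data.frame(x = c(1, 2, 3, 4, 5),
                   y = c(1.1, 1.9, 3.2, 3.9, 5.1),
                   w = c(3, 1, 4, 2, 5))

# Expand to one row per observation.
full <- summ[rep(seq_len(nrow(summ)), summ$w), c("x", "y")]

fit_w    <- lm(y ~ x, data = summ, weights = w)  # weights = counts
fit_full <- lm(y ~ x, data = full)               # unweighted fit on expanded data

coef_match <- all.equal(coef(fit_w), coef(fit_full))  # point estimates agree
df_w    <- df.residual(fit_w)      # 3  (5 rows - 2 parameters)
df_full <- df.residual(fit_full)   # 13 (15 observations - 2 parameters)

se_w    <- summary(fit_w)$coefficients[, "Std. Error"]
se_full <- summary(fit_full)$coefficients[, "Std. Error"]
print(rbind(se_w, se_full))        # standard errors differ
```

In this sketch the coefficients coincide, but the residual degrees of freedom and hence the standard errors, t values and p values do not.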
2004 Apr 18
2
lm with data=(means,sds,ns)
Hi Folks,
I am dealing with data which have been presented as
at each x_i, mean m_i of the y-values at x_i,
sd s_i of the y-values at x_i
number n_i of the y-values at x_i
and I want to linearly regress y on x.
There does not seem to be an option to 'lm' which can
deal with such data directly, though the regression
problem could be algebraically
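One way the algebra might be carried out is sketched below. It is an assumption-laden illustration, not the poster's solution: the simulated data and all variable names are hypothetical, and the approach is a weighted fit to the means with weights n_i, with the residual sum of squares rebuilt from the s_i.

```r
set.seed(1)
# Simulated raw data, used only to check the summary-based calculation.
x <- rep(1:5, times = c(4, 6, 3, 5, 7))
y <- 2 + 0.5 * x + rnorm(length(x))

# The summaries described in the post: means, sds and counts at each x_i.
m  <- as.vector(tapply(y, x, mean))
s  <- as.vector(tapply(y, x, sd))
n  <- as.vector(tapply(y, x, length))
xs <- sort(unique(x))

fit_full <- lm(y ~ x)                 # fit on the raw data
fit_sum  <- lm(m ~ xs, weights = n)   # fit on the means, weighted by n_i

# Residual SS of the raw fit = within-group SS + weighted lack-of-fit SS.
rss_rebuilt   <- sum((n - 1) * s^2) + sum(n * resid(fit_sum)^2)
sigma_rebuilt <- sqrt(rss_rebuilt / (sum(n) - 2))

print(rbind(full = coef(fit_full), summarized = coef(fit_sum)))
print(c(summary(fit_full)$sigma, sigma_rebuilt))
```

The coefficient estimates and the residual standard error agree with the raw-data fit, while the standard errors printed by `summary(fit_sum)` itself would not, because that fit does not see the within-group variation.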
2018 Jan 17
1
mgcv::gam is it possible to have a 'simple' product of 1-d smooths?
I am trying to test out several mgcv::gam models in a scalar-on-function regression analysis.
The following is the 'hierarchy' of models I would like to test:
(1) Y_i = a + integral[ X_i(t)*Beta(t) dt ]
(2) Y_i = a + integral[ F{X_i(t)}*Beta(t) dt ]
(3) Y_i = a + integral[ F{X_i(t),t} dt ]
equivalents for discrete data might be:
(1) Y_i = a + sum_t[ L_t * X_it * Beta_t ]
(2) Y_i
2001 Mar 05
1
Canberra dist and double zeros
Canberra distance is defined in function `dist' (standard library `mva') as
sum(|x_i - y_i| / |x_i + y_i|)
Obviously this is undefined for cases where both x_i and y_i are zeros. Since
double zeros are common in many data sets, this is a nuisance. In our field
(from which the distance is coming), it is customary to remove double zeros:
contribution to distance is zero when both x_i
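The double-zero convention described here could be sketched as follows. The function name `canberra_drop_zeros` and the example vectors are hypothetical; the formula is the one quoted in the post, not necessarily what `dist` does in any given R version.

```r
# Canberra distance with double zeros contributing nothing (illustrative).
canberra_drop_zeros <- function(x, y) {
  keep <- !(x == 0 & y == 0)          # drop coordinates where both are zero
  sum(abs(x[keep] - y[keep]) / abs(x[keep] + y[keep]))
}

x <- c(0, 1, 3)
y <- c(0, 2, 1)

naive   <- sum(abs(x - y) / abs(x + y))   # 0/0 in the first term gives NaN
dropped <- canberra_drop_zeros(x, y)      # 1/3 + 2/4

print(c(naive, dropped))
```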
2007 Feb 01
3
Help with efficient double sum of max (X_i, Y_i) (X & Y vectors)
Greetings.
For R gurus this may be a no-brainer, but I could not find pointers to
efficient computation of this beast in past help files.
Background - I wish to implement a Cramer-von Mises type test statistic
which involves double sums of max(X_i,Y_j) where X and Y are vectors of
differing length.
I am currently using ifelse pointwise in a vector, but have a nagging
suspicion that there is a
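Two possible vectorized approaches to the double sum are sketched below. Both are illustrative assumptions rather than the answer given in the thread: `outer()` with `pmax` builds the full length(X) by length(Y) matrix, while the hypothetical helper `double_max_sorted` avoids that matrix by sorting Y.

```r
x <- c(3, 1, 4)
y <- c(2, 5)

# Direct approach: all pairwise maxima, then sum.
direct <- sum(outer(x, y, pmax))   # 24

# Sort-based approach: for each x_i,
#   sum_j max(x_i, y_j) = x_i * #{y_j <= x_i} + sum of the y_j > x_i
double_max_sorted <- function(x, y) {
  ys <- sort(y)
  k <- findInterval(x, ys)                        # count of y values <= x_i
  tail_sum <- sum(ys) - c(0, cumsum(ys))[k + 1]   # sum of y values > x_i
  sum(x * k + tail_sum)
}

sorted <- double_max_sorted(x, y)
print(c(direct, sorted))
```

The `outer()` version is the simpler of the two; the sorted version may be preferable when both vectors are long enough that the full matrix becomes a memory concern.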
2010 Apr 25
1
function pointer question
Hello,
I have the following function that receives a "function pointer" formal parameter named "fnc":
loocv <- function(data, fnc) {
n <- length(data.x)
score <- 0
for (i in 1:n) {
x_i <- data.x[-i]
y_i <- data.y[-i]
yhat <- fnc(x=x_i,y=y_i)
score <- score + (y_i - yhat)^2
}
score <- score/n
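For comparison, here is a minimal sketch of leave-one-out cross-validation with the fitting routine passed as an ordinary function argument. It rests on several assumptions that are not in the post: `data` is a data frame with columns `x` and `y`, the passed function takes an extra argument `x0` (the held-out point to predict), and `line_fit` is a made-up example routine.

```r
# Leave-one-out CV; the fitting routine is passed like any other argument.
loocv_sketch <- function(data, fnc) {
  n <- nrow(data)
  score <- 0
  for (i in seq_len(n)) {
    train <- data[-i, , drop = FALSE]                        # all rows but the i-th
    yhat  <- fnc(x = train$x, y = train$y, x0 = data$x[i])   # predict held-out point
    score <- score + (data$y[i] - yhat)^2
  }
  score / n
}

# Hypothetical fitting routine: straight-line fit evaluated at x0.
line_fit <- function(x, y, x0) {
  cf <- coef(lm(y ~ x))
  unname(cf[1] + cf[2] * x0)
}

d <- data.frame(x = 1:10, y = 2 * (1:10) + 1)   # exactly linear toy data
cv_score <- loocv_sketch(d, line_fit)
print(cv_score)                                  # essentially zero here
```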
2010 Feb 05
3
metafor package: effect sizes are not fully independent
In a classical meta analysis model y_i = X_i * beta_i + e_i, data
{y_i} are assumed to be independent effect sizes. However, I'm
encountering the following two scenarios:
(1) Each source has multiple effect sizes, thus {y_i} are not fully
independent with each other.
(2) Each source has multiple effect sizes, and each effect size
from a source can be categorized as one level of a factor
2007 Mar 01
1
covariance question which has nothing to do with R
This is a covariance calculation question so nothing to do with R but
maybe someone could help me anyway.
Suppose, I have two random variables X and Y whose means are both known
to be zero and I want to get an estimate of their covariance.
I have n sample pairs (X1,Y1), (X2,Y2), ..., (Xn,Yn),
so that the covariance estimate is clearly 1/n * (sum from i = 1 to n
of (X_i*Y_i))
But,
2008 Dec 01
1
linear functional relationships with heteroscedastic & non-Gaussian errors - any packages around?
Hi,
I have a situation where I have a set of pairs of X & Y variables for
each of which I have a (fairly) well-defined PDF. The PDF(x_i)'s and
PDF(y_i)'s are unfortunately often rather non-Gaussian, although most
of the time not multi-modal.
For these data (estimates of gas content in galaxies), I need to
quantify a linear functional relationship and I am trying to do this
as
2018 Mar 15
0
stats 'dist' euclidean distance calculation
> 3x3 subset used
> Locus1 Locus2 Locus3
> Samp1 GG <NA> GG
> Samp2 AG CA GA
> Samp3 AG CA GG
>
> The euclidean distance function is defined as: sqrt(sum((x_i - y_i)^2)) My
> assumption was that the difference between
2018 Mar 15
3
stats 'dist' euclidean distance calculation
Hello,
I am working with a matrix of multilocus genotypes for ~180 individual snail samples, with substantial missing data. I am trying to calculate the pairwise genetic distance between individuals using the stats package 'dist' function, using euclidean distance. I took a subset of this dataset (3 samples x 3 loci) to test how euclidean distance is calculated:
3x3 subset used
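How `dist` treats missing values can be checked on a small numeric example. The matrix below is made up (it is not the genotype data from the post); the behaviour relied on is the one documented in `?dist`: pairs with an NA are excluded, and the sum is scaled up in proportion to the number of columns used.

```r
# Two rows, three columns, one missing value in row "a".
m <- rbind(a = c(1, NA, 3),
           b = c(2, 5, 5))

d <- dist(m, method = "euclidean")

# By hand: columns 1 and 3 are usable, so the squared differences
# (1 + 4 = 5) are scaled by 3/2 before the square root is taken.
by_hand <- sqrt((1 + 4) * 3 / 2)

print(c(as.numeric(d), by_hand))
```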
2005 Jun 15
2
need help on computing double summation
Dear helpers in this forum,
This is a clarified version of my previous
questions in this forum. I really need your generous
help on this issue.
> Suppose I have the following data set:
>
> id x y
> 023 1 2
> 023 2 5
> 023 4 6
> 023 5 7
> 412 2 5
> 412 3 4
> 412 4 6
> 412 7 9
> 220 5 7
> 220 4 8
> 220 9 8
> ......
>
Now I want to compute the
2013 Jun 23
1
2SLS / TSLS / SEM non-linear
Dear all, I am trying to fit an SEM / two-stage least squares regression with
the following equations:
First: X ~ IV1 + IV2 * Y
Second: Y ~ a + b X
Here, IV1 and IV2 are the two instruments I would like to use. I would like
to maintain this structure, as the model is derived from economic
theory. My problem is that I have trouble solving the equations to get
the reduced form so I can run
2011 Jul 19
1
notation question
Dear list, I am currently writing up some of my R models in a more
formal sense for a paper, and I am having trouble with the notation.
Although this isn't really an 'R' question, it should help me to
understand a bit better what I am actually doing when fitting my
models!
Using the analysis of co-variance example from MASS (fourth edition, p
142), what is the correct notation for the
2007 Jul 19
1
R
Hello!
I am using the svyglm procedure for logistic regression on survey data.
I wonder how the strata affect the estimated SEs (in addition to the
weights given proportional to population size).
I know that for simple regression the measurements in each stratum are assumed to
have a different variance.
But in a logistic model this is not the case.
Can anyone help me here?
Thank you
Ron
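As background to the search query at the top of this page, a plain weighted logistic regression with `glm` might look like the sketch below. This is not a survey-design analysis and does not use svyglm: the weights here are binomial group sizes, the data are invented, and design-based standard errors from a stratified survey would in general differ.

```r
# Grouped binary data: s successes out of n trials at each x (made-up values).
d <- data.frame(x = c(0, 1, 2, 3, 4),
                n = c(10, 12, 8, 15, 9),
                s = c(1, 3, 4, 10, 8))

# Form 1: two-column response (successes, failures).
fit_counts <- glm(cbind(s, n - s) ~ x, family = binomial, data = d)

# Form 2: observed proportion as response, group size as prior weight.
fit_weights <- glm(s / n ~ x, family = binomial, weights = n, data = d)

print(rbind(counts = coef(fit_counts), weights = coef(fit_weights)))
```

Both forms describe the same binomial likelihood, so the coefficient estimates coincide.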