Perhaps I am missing something, but because X1 and X2 are both drawn
with rnorm(N), the influence of X2 is much like a second sampling of
X1, so you would expect just what you observed, especially with a
large (1000) sample size. Try making X2 and X1 different.
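As a rough sketch of that suggestion (the seed and the shifted mean of 2
for X2 are assumptions for illustration, not values from the original
post), one could compare the mean population propensity with the
x1-only propensity under the known parameters:

```r
# Sketch only: the seed and the shifted X2 are illustrative assumptions.
set.seed(1)
N <- 1000
X1 <- rnorm(N)
X2 <- rnorm(N)                  # as in the original post: same distribution as X1
X2_shift <- rnorm(N, mean = 2)  # hypothetical X2 that differs from X1

alpha <- c(0, 0.4, 1.1)         # alpha0, alpha1, alpha2 from the post

# True propensities with the original and with the shifted X2
prop_true       <- plogis(alpha[1] + alpha[2] * X1 + alpha[3] * X2)
prop_true_shift <- plogis(alpha[1] + alpha[2] * X1 + alpha[3] * X2_shift)

# Propensity adjusting for x1 only, with known alpha0 and alpha1
prop_x1_only <- plogis(alpha[1] + alpha[2] * X1)

round(c(true       = mean(prop_true),
        x1_only    = mean(prop_x1_only),
        true_shift = mean(prop_true_shift)), 3)
```

With the original X2 the first two means should both sit near 0.5, so
leaving X2 out is hard to see in the average; with the shifted X2 the
population mean should move well away from the x1-only value.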
Charles Annis, P.E.
Charles.Annis at StatisticalEngineering.com
561-352-9699
http://www.StatisticalEngineering.com
-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Ana De Barros
Sent: Thursday, March 25, 2010 12:19 PM
To: r-help at r-project.org
Subject: [R] Logit/probit model
Dear all,
I have a population with the following characteristics:
N=1000
X0=rep(1,N)
X1=rnorm(N)
X2=rnorm(N)
I also know that the population distribution is a linear logistic function
with parameters alpha0=0 (intercept), alpha1=0.4 and alpha2=1.1, so I can
easily get the dependent variable (in my case the response propensities) by
doing:
alpha=as.vector(c(0, 0.4, 1.1))
X=cbind(X0,X1, X2)
X=matrix(X, ncol=3, nrow=N)
P=X%*%alpha
propensity=1/(1+exp(-(P)))
proptrue=mean(propensity)
I have to estimate by sampling simulation the response propensity (dependent
variable), assuming I don't know the population distribution and assuming:
1. a linear logistic function adjusting for x1 only
   1.1 assuming I know the true parameters (alpha0=0 and alpha1=0.4)
   1.2 assuming I don't know the true parameters
2. a probit function adjusting for x1 only
   2.1 assuming I know the true parameters (alpha0=0 and alpha1=0.4)
   2.2 assuming I don't know the true parameters
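Cases 1.1 and 2.1 (known parameters) could be sketched roughly as
follows. The seed, sample.size = 100 and the names ps_logit_known and
ps_probit_known are assumptions made for this illustration; the known
parameters are applied to x1 alone, through the logistic and the normal
CDF respectively:

```r
# Sketch only: seed, sample.size and the ps_* names are assumptions.
set.seed(2)
N <- 1000
X1 <- rnorm(N)

sample.size <- 100
labels <- sample(N, sample.size, replace = FALSE)
x1 <- X1[labels]

alpha0 <- 0
alpha1 <- 0.4

# 1.1: linear logistic function in x1 with known parameters
ps_logit_known <- mean(plogis(alpha0 + alpha1 * x1))

# 2.1: probit function in x1 with the same known parameters
ps_probit_known <- mean(pnorm(alpha0 + alpha1 * x1))

c(logit = ps_logit_known, probit = ps_probit_known)
```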
When I assume I don't know the true parameters, I sample by doing:
for (g in 1:replicas)
{
  labels=sample(N, sample.size, replace=FALSE)
  x0=X0[labels]
  x1=X1[labels]
  x2=X2[labels]
  propsample=propensity[labels]
  logitx1=glm(propsample~x1, family=binomial(link="logit"))
  coefx1= logitx1$coefficients
  fitx1= logitx1$fitted.values
  PSprob=mean(fitx1)
  probx1=glm(propsample~x1, family=binomial(link="probit"))
  c33= probx1$coefficients
  cc33= probx1$fitted.values
  PSprob=mean(cc33)
}
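A self-contained variant of the loop above, as a sketch: replicas = 200,
sample.size = 100 and the seed are assumed values, the result vectors
PS_logit and PS_probit are hypothetical names added so that each
replicate is kept instead of being overwritten, and quasibinomial is
swapped in for binomial only to avoid the non-integer-response warning
(the fitted values are the same).

```r
# Sketch only: replicas, sample.size, the seed and the PS_* names are assumptions.
set.seed(3)
N <- 1000
X1 <- rnorm(N)
X2 <- rnorm(N)
propensity <- plogis(0 + 0.4 * X1 + 1.1 * X2)  # same as 1/(1+exp(-P))
proptrue <- mean(propensity)

replicas <- 200
sample.size <- 100
PS_logit <- numeric(replicas)   # one estimate per replicate
PS_probit <- numeric(replicas)

for (g in seq_len(replicas)) {
  labels <- sample(N, sample.size, replace = FALSE)
  d <- data.frame(y = propensity[labels], x1 = X1[labels])

  # x2 is deliberately left out of both models
  fit_logit  <- glm(y ~ x1, family = quasibinomial(link = "logit"),  data = d)
  fit_probit <- glm(y ~ x1, family = quasibinomial(link = "probit"), data = d)

  PS_logit[g]  <- mean(fitted(fit_logit))
  PS_probit[g] <- mean(fitted(fit_probit))
}

c(true = proptrue, logit = mean(PS_logit), probit = mean(PS_probit))
```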
My problem is that although I omit x2 in the simulations, I still get results
(response propensities) very similar to the population response
propensity, and it doesn't make any sense... I must be doing something wrong
but I can't find the error. Can you help me, please?
Thanks a lot
Ana