On Wed, 2 Jun 2010, Misha Spisok wrote:
> Hello,
>
> I can't figure out why using and not using weights in mlogit yields
> identical results.  My motivation is for the case when an
> "observation" or "individual" represents a number of individuals.  For
> example,
>
> library(mlogit)
> library(AER)
> data("TravelMode", package = "AER")
> TM <- mlogit.data(TravelMode, choice = "choice", shape = "long",
>                 alt.levels = c("air", "train", "bus", "car"))
> myweight = rep(floor(1000*runif(nrow(TravelMode)/4)), each = 4)
>
> summary(mlogit(choice ~ wait + vcost + travel + gcost, data=TM))
> summary(mlogit(choice ~ wait + vcost + travel + gcost, weights=income,
> data=TM))
> summary(mlogit(choice ~ wait + vcost + travel + gcost,
> weights=myweight, data=TM))
>
> Each gives the same result.
I can't replicate that. For me all three give different results. For 
example, the first two (which do not contain random elements) are
    alttrain      altbus      altcar        wait       vcost      travel
-0.84413818 -1.44150828 -5.20474275 -0.10364955 -0.08493182 -0.01333220
       gcost
  0.06929537
and
    alttrain      altbus      altcar        wait       vcost      travel
-1.56910793 -1.67020936 -5.44725428 -0.11157800 -0.08866886 -0.01435371
       gcost
  0.08087749
respectively. I'm using the current "mlogit" version from CRAN: 0.1-7.
> Am I specifying "weights" incorrectly?
Yes, I think so.
> Is there a better way to do what I want to do?  That is, if "myweight"
> contains the number of observations represented by an "observation,"
> is this the correct approach?
You will get the correct parameter estimates but not the correct
inference. As in most of the basic model-fitting functions (such as
lm() or glm()), the weights are _not_ interpreted as case weights. I.e.,
the function treats
   sum(weights > 0)
as the number of observations and not
   sum(weights)
A simple example using lm():
   x <- 1:5
   y <- c(0, 2, 1, 4, 5)
   w <- rep(2, 5)
   xx <- c(x, x)
   yy <- c(y, y)
Then you can fit both models
   fm1 <- lm(y ~ x, weights = w)
   fm2 <- lm(yy ~ xx)
and you get the same coefficients
   all.equal(coef(fm1), coef(fm2))
(which only mentions that the strings 'xx' and 'x' are different.) But fm1
thinks 2 parameters have been estimated from 5 observations while fm2
thinks 2 parameters have been estimated from 10 observations. Hence the
residual degrees of freedom are 3 and 8, and the ratios
   df.residual(fm1) / df.residual(fm2)
   vcov(fm2) / vcov(fm1)
both equal 3/8 = 0.375 (elementwise for the vcov ratio).
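If the weights really are case counts, one way to recover case-weight inference in the lm() example is to rescale the covariance matrix by the ratio of residual degrees of freedom. The following is only a minimal sketch for lm() with integer weights. The names n_case, p and vcov_case are illustrative, and nothing here is claimed about how mlogit() handles weights internally.

```r
# Toy data from the example above: 5 rows, each standing for 2 identical cases
x <- 1:5
y <- c(0, 2, 1, 4, 5)
w <- rep(2, 5)

# Weighted fit on the 5 rows
fm1 <- lm(y ~ x, weights = w)

# Unweighted fit on the data expanded to one row per case
xx <- rep(x, times = w)
yy <- rep(y, times = w)
fm2 <- lm(yy ~ xx)

# Assumption: w holds integer case counts, so the "true" sample size is sum(w)
n_case <- sum(w)
p <- length(coef(fm1))

# Rescale the weighted covariance matrix to the case-count degrees of freedom
vcov_case <- vcov(fm1) * df.residual(fm1) / (n_case - p)

# Compare with the expanded-data fit (names dropped, since 'x' != 'xx')
all.equal(unname(vcov_case), unname(vcov(fm2)))
```

The rescaling works here because the weighted and expanded fits share the same residual sum of squares and the same cross-product matrix, so only the denominator of the variance estimate differs.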
Hope that helps,
Z
> If so, what am I doing wrong?  If not,
> what suggestions are there?
>
> Thank you for your time.
>
> Best,
>
> Misha
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>