thr3ads.net - R help - [R] Using mlogit with case weights [Aug 2014]


Hylton, Ronald

2014-Aug-28 20:29 UTC

[R] Using mlogit with case weights

I have a set of data with ~ 250,000 observations summarized in ~ 1000 rows that
I'm trying to analyze with mlogit.  Based on the discussion in
https://stat.ethz.ch/pipermail/r-help/2010-June/241161.html
I understand that using weights= does not (fully) do what I need.  I tried
expanding my data to one row per observation to sidestep this issue, but after
waiting several hours for mlogit to finish I decided this was not a feasible
strategy.  I therefore need to use weights= and make whatever adjustments are
necessary for the inferences.

My solution is the following:
  1. Define W = sum(weights) / length(weights)
  2. Multiply the Log-Likelihood by W
  3. Divide the Std. Errors by sqrt(W) (and therefore multiply the t-values
     by sqrt(W))

Can anyone confirm that this is correct (at least as a large-N approximation)?
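
One way to check the adjustment is to write out the two log-likelihoods. The sketch below rests on two assumptions that are not confirmed here from the mlogit source: that weights= is rescaled internally to have mean 1 (which the ratios in the test case suggest), and that the standard errors come from the inverse Hessian of the log-likelihood. The symbols are introduced for illustration only: w_i is the count for row i, n is the number of rows, and P_i(beta) is the model probability of the observed choice in row i.

```latex
\begin{align*}
W &= \frac{1}{n}\sum_{i=1}^{n} w_i \\
\ell_{\text{expanded}}(\beta) &= \sum_{i=1}^{n} w_i \log P_i(\beta) \\
\ell_{\text{weighted}}(\beta) &= \sum_{i=1}^{n} \frac{w_i}{W} \log P_i(\beta)
  = \frac{1}{W}\,\ell_{\text{expanded}}(\beta) \\
\hat\beta_{\text{weighted}} &= \hat\beta_{\text{expanded}}
  \quad\text{(same maximizer, since } W > 0 \text{ is a constant)} \\
H_{\text{weighted}}(\hat\beta) &= \frac{1}{W}\, H_{\text{expanded}}(\hat\beta) \\
\bigl(-H_{\text{expanded}}\bigr)^{-1} &= \frac{1}{W}\,\bigl(-H_{\text{weighted}}\bigr)^{-1} \\
\mathrm{SE}_{\text{expanded}} &= \frac{\mathrm{SE}_{\text{weighted}}}{\sqrt{W}},
  \qquad t_{\text{expanded}} = \sqrt{W}\; t_{\text{weighted}}
\end{align*}
```

If both assumptions hold, the three relations are exact identities rather than large-N approximations, and any small discrepancy in the printed ratios would presumably come from the optimizer's convergence tolerance.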

The code below provides a test case where I compare duplicating rows to using
weights and adjusting the inferences (the original code was from Kenneth
Train's exercises using the mlogit package for R).  The last few lines
printed (Ratios: ...) show that the coefficients in the two cases are the same
to high accuracy, and that the Log-Likelihood, Std. Errors and t-values
also have the expected ratios to a decent accuracy.  However, it would be good
to know that this approach is conceptually sound.

Thanks,
Ron

library("mlogit")
data("Heating", package = "mlogit")
H <- mlogit.data(Heating, shape="wide", choice="depvar",
                 varying=c(3:12))
m <- mlogit(depvar~ic+oc|0, H)
# print(summary(m))

w <- sample(1:200, nrow(Heating), replace=TRUE) # random weights
# index vector for duplicating rows according to the weights
i <- rep(1:nrow(Heating), times=w)
H2 <- mlogit.data(Heating[i,], shape="wide", choice="depvar",
                  varying=c(3:12))
m2 <- mlogit(depvar~ic+oc|0, H2) # fit on the expanded (duplicated) data
# print(summary(m2))
# weights= needs one value per long-format row (5 alternatives per observation)
m3 <- mlogit(depvar~ic+oc|0, H, weights=rep(w,each=5))
# print(summary(m3))
print(all.equal(coef(m2),coef(m3)))

# keep one fitted value per original row (the last copy in each block of duplicates)
f2 <- fitted(m2)[cumsum(w)]
f3 <- fitted(m3)
names(f2) <- names(f3)
print(all.equal(f2,f3))

# log-likelihood ratio, followed by W, sqrt(W) and 1/sqrt(W) for comparison
cat("\nRatios:", m2$logLik/m3$logLik, sum(w)/length(w),
    sqrt(sum(w)/length(w)), sqrt(length(w)/sum(w)), "\n\n")

s2 <- summary(m2)
s3 <- summary(m3)

print(s2$CoefTable / s3$CoefTable)


