thr3ads.net - R help - [R] Different LLRs on multinomial logit models in R and SPSS [Jan 2011]

If this information is useful, please help other people find it:
Share via:

Sören Vogel

2011-Jan-06 16:06 UTC

[R] Different LLRs on multinomial logit models in R and SPSS

Hello, after calculating a multinomial logit regression on my data, I
compared the output to an output retrieved with SPSS 18 (Mac). The
coefficients appear to be the same, but the logLik (and therefore fit)
values differ widely. Why?

The regression in R:

set.seed(1234)
df <- data.frame(
  "y"=factor(sample(LETTERS[1:3], 143, repl=T, prob=c(4, 1, 10))),
  "a"=sample(1:5, 143, repl=T),
  "b"=sample(1:7, 143, repl=T),
  "c"=sample(1:2, 143, repl=T)
)
library(nnet)
mod1 <- multinom(y ~ ., data=df, trace=F)
deviance(mod1) # 199.0659
mod0 <- update(mod1, . ~ 1, trace=FALSE)
deviance(mod0) # 204.2904

Output data and syntax for SPSS:

df2 <- df
df2[, 1] <- as.numeric(df[, 1])
write.csv(df2, file="dfxy.csv", row.names=F, na="")
syntaxfile <- "dfxy.sps"
cat('GET DATA
  /TYPE=TXT
  /FILE=\'', getwd(), '/dfxy.csv\'
  /DELCASE=LINE
  /DELIMITERS=","
  /QUALIFIER=\'"\'
  /ARRANGEMENT=DELIMITED
  /FIRSTCASE=2
  /IMPORTCASE=ALL
  /VARIABLES  y "F1.0"
  a "F8.4"
  b "F8.4"
  c "F8.4".
CACHE.
EXECUTE.
DATASET NAME DataSet1 WINDOW=FRONT.

VALUE LABELS
  /y 1 "A" 2 "B" 3 "C".
EXECUTE.

NOMREG y (BASE=1 ORDER=ASCENDING) WITH a b c
  /CRITERIA CIN(95) DELTA(0) MXITER(100) MXSTEP(5) CHKSEP(20)
LCONVERGE(0) PCONVERGE(0.000001)
    SINGULAR(0.00000001)
  /MODEL
  /STEPWISE=PIN(.05) POUT(0.1) MINEFFECT(0) RULE(SINGLE)
ENTRYMETHOD(LR) REMOVALMETHOD(LR)
  /INTERCEPT=INCLUDE
  /PRINT=FIT PARAMETER SUMMARY LRT CPS STEP MFI IC.
', file=syntaxfile, sep="", append=F)

-> Loglik0: 135.02
-> Loglik1: 129.80

Thanks, S?ren

David Winsemius

2011-Jan-06 16:16 UTC

head link

[R] Different LLRs on multinomial logit models in R and SPSS

On Jan 6, 2011, at 11:06 AM, S?ren Vogel wrote:
> Hello, after calculating a multinomial logit regression on my data, I
> compared the output to an output retrieved with SPSS 18 (Mac). The
> coefficients appear to be the same, but the logLik (and therefore fit)
> values differ widely. Why?
The likelihood is arbitrary. It is the difference in likelihoods that  
is important and in this respect the answers you got from the two  
software packages (delta deviance= 5.22) is equivalent. If you run  
logistic models with individual records versus using grouped data for  
equivalent models with the same software, the reported deviance will  
be widely different but the comparison of nested models will imply the  
same inferential conclusions.

-- 
David.>
> The regression in R:
>
> set.seed(1234)
> df <- data.frame(
>  "y"=factor(sample(LETTERS[1:3], 143, repl=T, prob=c(4, 1, 10))),
>  "a"=sample(1:5, 143, repl=T),
>  "b"=sample(1:7, 143, repl=T),
>  "c"=sample(1:2, 143, repl=T)
> )
> library(nnet)
> mod1 <- multinom(y ~ ., data=df, trace=F)
> deviance(mod1) # 199.0659
> mod0 <- update(mod1, . ~ 1, trace=FALSE)
> deviance(mod0) # 204.2904
>
> Output data and syntax for SPSS:
>
> df2 <- df
> df2[, 1] <- as.numeric(df[, 1])
> write.csv(df2, file="dfxy.csv", row.names=F, na="")
> syntaxfile <- "dfxy.sps"
> cat('GET DATA
>  /TYPE=TXT
>  /FILE=\'', getwd(), '/dfxy.csv\'
>  /DELCASE=LINE
>  /DELIMITERS=","
>  /QUALIFIER=\'"\'
>  /ARRANGEMENT=DELIMITED
>  /FIRSTCASE=2
>  /IMPORTCASE=ALL
>  /VARIABLES>  y "F1.0"
>  a "F8.4"
>  b "F8.4"
>  c "F8.4".
> CACHE.
> EXECUTE.
> DATASET NAME DataSet1 WINDOW=FRONT.
>
> VALUE LABELS
>  /y 1 "A" 2 "B" 3 "C".
> EXECUTE.
>
> NOMREG y (BASE=1 ORDER=ASCENDING) WITH a b c
>  /CRITERIA CIN(95) DELTA(0) MXITER(100) MXSTEP(5) CHKSEP(20)
> LCONVERGE(0) PCONVERGE(0.000001)
>    SINGULAR(0.00000001)
>  /MODEL
>  /STEPWISE=PIN(.05) POUT(0.1) MINEFFECT(0) RULE(SINGLE)
> ENTRYMETHOD(LR) REMOVALMETHOD(LR)
>  /INTERCEPT=INCLUDE
>  /PRINT=FIT PARAMETER SUMMARY LRT CPS STEP MFI IC.
> ', file=syntaxfile, sep="", append=F)
>
> -> Loglik0: 135.02
> -> Loglik1: 129.80
>
> Thanks, S?ren
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
David Winsemius, MD
West Hartford, CT

Ben Bolker

2011-Jan-06 16:17 UTC

head link

[R] Different LLRs on multinomial logit models in R and SPSS

S?ren Vogel <sovo0815 <at> gmail.com> writes:
> 
> Hello, after calculating a multinomial logit regression on my data, I
> compared the output to an output retrieved with SPSS 18 (Mac). The
> coefficients appear to be the same, but the logLik (and therefore fit)
> values differ widely. Why?
  Since constants that are independent of the model
parameters can be left out of a log-likelihood calculation
without affecting inference among models, it is quite common
for likelihoods to be calculated differently in different
software packages (for example, the (n/2 log(2*pi)) constant in the
Gaussian log-likelihood) -- this is true even among different
computations within the R ecosystem.  What's important is the difference
among log-likelihoods between models, which is the same (up to
the numeric precision of what you've shown us) for both
software packages.
> 135.02-129.8
[1] 5.22> 204.2904-199.0659[1] 5.2245

  Ben Bolker

Sören Vogel

2011-Jan-06 16:23 UTC

head link

[R] Different LLRs on multinomial logit models in R and SPSS

Thanks for your replies. I am no mathematician or statistician by far,
however, it appears to me that the actual value of any of the two LLs
is indeed important when it comes to calculation of
Pseudo-R-Squared-s. If Rnagel devides by (some transformation of) the
actiual value of llnull then any calculation of Rnagel should differ.
How come? Or is my function wrong? And if my function is right, how
can I calculate a R-Squared independent from the software used?

Rfits <- function(mod) {
  llnull <- deviance(update(mod, . ~ 1, trace=F))
  llmod <- deviance(mod)
  n <- length(predict(mod))
  Rcs <- 1 - exp( (llmod - llnull) / n )
  Rnagel <- Rcs / (1 - exp(-llnull/n))
  out <- list(
    "Rcs"=Rcs,
    "Rnagel"=Rnagel
  )
  class(out) <- c("list", "table")
  return(out)
}

David Winsemius

2011-Jan-06 17:03 UTC

head link

[R] Different LLRs on multinomial logit models in R and SPSS

On Jan 6, 2011, at 11:23 AM, S?ren Vogel wrote:
> Thanks for your replies. I am no mathematician or statistician by far,
> however, it appears to me that the actual value of any of the two LLs
> is indeed important when it comes to calculation of
> Pseudo-R-Squared-s. If Rnagel devides by (some transformation of) the
> actiual value of llnull then any calculation of Rnagel should differ.
> How come? Or is my function wrong? And if my function is right, how
> can I calculate a R-Squared independent from the software used?
You have two models in that function, the null one with ".~ 1" and the
origianl one and you are getting a ratio on the likelihood scale  
( which is a difference on the log-likelihood or deviance scale).
>
> Rfits <- function(mod) {
>  llnull <- deviance(update(mod, . ~ 1, trace=F))
>  llmod <- deviance(mod)
>  n <- length(predict(mod))
>  Rcs <- 1 - exp( (llmod - llnull) / n )
>  Rnagel <- Rcs / (1 - exp(-llnull/n))
>  out <- list(
>    "Rcs"=Rcs,
>    "Rnagel"=Rnagel
>  )
>  class(out) <- c("list", "table")
>  return(out)
> }

-- 
David Winsemius, MD
West Hartford, CT

Possibly Parallel Threads

Search for more reasonably related threads

R help - Jan 2011 - Different LLRs on multinomial logit models in R and SPSS

[R] Different LLRs on multinomial logit models in R and SPSS

[R] Different LLRs on multinomial logit models in R and SPSS

[R] Different LLRs on multinomial logit models in R and SPSS

[R] Different LLRs on multinomial logit models in R and SPSS

[R] Different LLRs on multinomial logit models in R and SPSS

Possibly Parallel Threads