thr3ads.net - R devel - [Rd] lm() gives different results to lm.ridge() and SPSS [May 2017]

If this information is useful, please help other people find it:
Share via:

Nick Brown

2017-May-05 08:40 UTC

[Rd] lm() gives different results to lm.ridge() and SPSS

Hi, 

Here is (I hope) all the relevant output from R. 
> mean(s1$ZDEPRESSION, na.rm=T) [1] -1.041546e-16 > mean(s1$ZDIVERSITY_PA,
na.rm=T) [1] -9.660583e-16 > mean(s1$ZMEAN_PA, na.rm=T) [1] -5.430282e-15
> lm.ridge(ZDEPRESSION ~ ZMEAN_PA * ZDIVERSITY_PA, data=s1)$coef ZMEAN_PA    
ZDIVERSITY_PA ZMEAN_PA:ZDIVERSITY_PA            -0.3962254             -0.3636026             -0.1425772      ##
This is what I thought was the problem originally. :-)

> coefficients(lm(ZDEPRESSION ~ ZMEAN_PA * ZDIVERSITY_PA, data=s1))
(Intercept)               ZMEAN_PA          ZDIVERSITY_PA ZMEAN_PA:ZDIVERSITY_PA            0.07342198            -0.39650356            -0.36569488           
-0.09435788 > coefficients(lm.ridge(ZDEPRESSION ~ ZMEAN_PA * ZDIVERSITY_PA,
data=s1)) ZMEAN_PA          ZDIVERSITY_PA ZMEAN_PA:ZDIVERSITY_PA
            0.07342198            -0.39650356            -0.36569488           
-0.09435788 The equivalent from SPSS is attached. The unstandardized
coefficients in SPSS look nothing like those in R. The standardized coefficients
in SPSS match the lm.ridge()$coef numbers very closely indeed, suggesting that
the same algorithm may be in use.

I have put the dataset file, which is the untouched original I received from the
authors, in this Dropbox folder:
https://www.dropbox.com/sh/xsebjy55ius1ysb/AADwYUyV1bl6-iAw7ACuF1_La?dl=0. You
can read it into R with this code (one variable needs to be standardized and
centered; everything else is already in the file):

s1 <- read.csv("Emodiversity_Study1.csv", stringsAsFactors=FALSE)
s1$ZDEPRESSION <- scale(s1$DEPRESSION)
Hey, maybe R is fine and I've stumbled on a bug in SPSS? If so, I'm sure
IBM will want to fix it quickly (ha ha ha).

Nick 



----- Original Message -----

From: "peter dalgaard" <pdalgd at gmail.com> 
To: "Nick Brown" <nick.brown at free.fr> 
Cc: "Simon Bonner" <sbonner6 at uwo.ca>, r-devel at
r-project.org
Sent: Friday, 5 May, 2017 10:02:10 AM 
Subject: Re: [Rd] lm() gives different results to lm.ridge() and SPSS 

I asked you before, but in case you missed it: Are you looking at the right
place in SPSS output?

The UNstandardized coefficients should be comparable to R, i.e. the
"B" column, not "Beta".

-pd 
> On 5 May 2017, at 01:58 , Nick Brown <nick.brown at free.fr> wrote: 
> 
> Hi Simon, 
> 
> Yes, if I uses coefficients() I get the same results for lm() and
lm.ridge(). So that's consistent, at least.
> 
> Interestingly, the "wrong" number I get from lm.ridge()$coef
agrees with the value from SPSS to 5dp, which is an interesting coincidence if
these numbers have no particular external meaning in lm.ridge().
> 
> Kind regards, 
> Nick 
> 
> ----- Original Message ----- 
> 
> From: "Simon Bonner" <sbonner6 at uwo.ca> 
> To: "Nick Brown" <nick.brown at free.fr>, r-devel at
r-project.org
> Sent: Thursday, 4 May, 2017 7:07:33 PM 
> Subject: RE: [Rd] lm() gives different results to lm.ridge() and SPSS 
> 
> Hi Nick, 
> 
> I think that the problem here is your use of $coef to extract the
coefficients of the ridge regression. The help for lm.ridge states that coef is
a "matrix of coefficients, one row for each value of lambda. Note that
these are not on the original scale and are for use by the coef method."
> 
> I ran a small test with simulated data, code is copied below, and indeed
the output from lm.ridge differs depending on whether the coefficients are
accessed via $coef or via the coefficients() function. The latter does produce
results that match the output from lm.
> 
> I hope that helps. 
> 
> Cheers, 
> 
> Simon 
> 
> ## Load packages 
> library(MASS) 
> 
> ## Set seed 
> set.seed(8888) 
> 
> ## Set parameters 
> n <- 100 
> beta <- c(1,0,1) 
> sigma <- .5 
> rho <- .75 
> 
> ## Simulate correlated covariates 
> Sigma <- matrix(c(1,rho,rho,1),ncol=2) 
> X <- mvrnorm(n,c(0,0),Sigma=Sigma) 
> 
> ## Simulate data 
> mu <- beta[1] + X %*% beta[-1] 
> y <- rnorm(n,mu,sigma) 
> 
> ## Fit model with lm() 
> fit1 <- lm(y ~ X) 
> 
> ## Fit model with lm.ridge() 
> fit2 <- lm.ridge(y ~ X) 
> 
> ## Compare coefficients 
> cbind(fit1$coefficients,c(NA,fit2$coef),coefficients(fit2)) 
> 
> [,1] [,2] [,3] 
> (Intercept) 0.99276001 NA 0.99276001 
> X1 -0.03980772 -0.04282391 -0.03980772 
> X2 1.11167179 1.06200476 1.11167179 
> 
> -- 
> 
> Simon Bonner 
> Assistant Professor of Environmetrics/ Director MMASc 
> Department of Statistical and Actuarial Sciences/Department of Biology 
> University of Western Ontario 
> 
> Office: Western Science Centre rm 276 
> 
> Email: sbonner6 at uwo.ca | Telephone: 519-661-2111 x88205 | Fax:
519-661-3813
> Twitter: @bonnerstatslab | Website:
http://simon.bonners.ca/bonner-lab/wpblog/
> 
>> -----Original Message----- 
>> From: R-devel [mailto:r-devel-bounces at r-project.org] On Behalf Of
Nick
>> Brown 
>> Sent: May 4, 2017 10:29 AM 
>> To: r-devel at r-project.org 
>> Subject: [Rd] lm() gives different results to lm.ridge() and SPSS 
>> 
>> Hallo, 
>> 
>> I hope I am posting to the right place. I was advised to try this list
by Ben Bolker
>> (https://twitter.com/bolkerb/status/859909918446497795). I also posted
this
>> question to StackOverflow 
>>
(http://stackoverflow.com/questions/43771269/lm-gives-different-results-
>> from-lm-ridgelambda-0). I am a relative newcomer to R, but I wrote my
first
>> program in 1975 and have been paid to program in about 15 different 
>> languages, so I have some general background knowledge. 
>> 
>> 
>> I have a regression from which I extract the coefficients like this: 
>> lm(y ~ x1 * x2, data=ds)$coef 
>> That gives: x1=0.40, x2=0.37, x1*x2=0.09 
>> 
>> 
>> 
>> When I do the same regression in SPSS, I get: 
>> beta(x1)=0.40, beta(x2)=0.37, beta(x1*x2)=0.14. 
>> So the main effects are in agreement, but there is quite a difference
in the
>> coefficient for the interaction. 
>> 
>> 
>> X1 and X2 are correlated about .75 (yes, yes, I know - this model
wasn't my
>> idea, but it got published), so there is quite possibly something going
on with
>> collinearity. So I thought I'd try lm.ridge() to see if I can get
an idea of where
>> the problems are occurring. 
>> 
>> 
>> The starting point is to run lm.ridge() with lambda=0 (i.e., no ridge
penalty) and
>> check we get the same results as with lm(): 
>> lm.ridge(y ~ x1 * x2, lambda=0, data=ds)$coef 
>> x1=0.40, x2=0.37, x1*x2=0.14 
>> So lm.ridge() agrees with SPSS, but not with lm(). (Of course, lambda=0
is the
>> default, so it can be omitted; I can alternate between including or
deleting
>> ".ridge" in the function call, and watch the coefficient for
the interaction
>> change.) 
>> 
>> 
>> 
>> What seems slightly strange to me here is that I assumed that
lm.ridge() just
>> piggybacks on lm() anyway, so in the specific case where lambda=0 and
there
>> is no "ridging" to do, I'd expect exactly the same
results.
>> 
>> 
>> Unfortunately there are 34,000 cases in the dataset, so a
"minimal" reprex will
>> not be easy to make, but I can share the data via Dropbox or something
if that
>> would help. 
>> 
>> 
>> 
>> I appreciate that when there is strong collinearity then all bets are
off in terms
>> of what the betas mean, but I would really expect lm() and lm.ridge()
to give
>> the same results. (I would be happy to ignore SPSS, but for the moment
it's
>> part of the majority!) 
>> 
>> 
>> 
>> Thanks for reading, 
>> Nick 
>> 
>> 
>> [[alternative HTML version deleted]] 
>> 
>> ______________________________________________ 
>> R-devel at r-project.org mailing list 
>> https://stat.ethz.ch/mailman/listinfo/r-devel 
> 
> 
> [[alternative HTML version deleted]] 
> 
> ______________________________________________ 
> R-devel at r-project.org mailing list 
> https://stat.ethz.ch/mailman/listinfo/r-devel 
-- 
Peter Dalgaard, Professor, 
Center for Statistics, Copenhagen Business School 
Solbjerg Plads 3, 2000 Frederiksberg, Denmark 
Phone: (+45)38153501 
Office: A 4.23 
Email: pd.mes at cbs.dk Priv: PDalgd at gmail.com 










-------------- next part --------------
A non-text attachment was scrubbed...
Name: SPSS.png
Type: image/png
Size: 38704 bytes
Desc: not available
URL:
<https://stat.ethz.ch/pipermail/r-devel/attachments/20170505/24adf071/attachment.png>

Viechtbauer Wolfgang (SP)

2017-May-05 12:43 UTC

head link

[Rd] lm() gives different results to lm.ridge() and SPSS

I had no problems running regression models in SPSS and R that yielded the same
results for these data.

The difference you are observing is from fitting different models. In R, you
fitted:

res <- lm(DEPRESSION ~ ZMEAN_PA * ZDIVERSITY_PA, data=dat)
summary(res)

The interaction term is the product of ZMEAN_PA and ZDIVERSITY_PA. This is not a
standardized variable itself and not the same as "ZINTER_PA_C" in the
png you showed, which is not a variable in the dataset, but can be created with:

dat$ZINTER_PA_C <- with(dat, scale(ZMEAN_PA * ZDIVERSITY_PA))

If you want the same results as in SPSS, then you need to fit:

res <- lm(DEPRESSION ~ ZMEAN_PA + ZDIVERSITY_PA + ZINTER_PA_C, data=dat)
summary(res)

This yields:

Coefficients:
              Estimate Std. Error t value Pr(>|t|)    
(Intercept)    6.41041    0.01722  372.21   <2e-16 ***
ZMEAN_PA      -1.62726    0.04200  -38.74   <2e-16 ***
ZDIVERSITY_PA -1.50082    0.07447  -20.15   <2e-16 ***
ZINTER_PA_C   -0.58955    0.05288  -11.15   <2e-16 ***
---
Signif. codes:  0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1

Exactly the same as in the png.

Peter already mentioned this as a possible reason for the discrepancy:
https://stat.ethz.ch/pipermail/r-devel/2017-May/074191.html ("Is it perhaps
the case that x1 and x2 have already been scaled to have standard deviation 1?
In that case, x1*x2 won't be.")

Best,
Wolfgang

-----Original Message-----
From: R-devel [mailto:r-devel-bounces at r-project.org] On Behalf Of Nick Brown
Sent: Friday, May 05, 2017 10:40
To: peter dalgaard
Cc: r-devel at r-project.org
Subject: Re: [Rd] lm() gives different results to lm.ridge() and SPSS

Hi, 

Here is (I hope) all the relevant output from R. 
> mean(s1$ZDEPRESSION, na.rm=T) [1] -1.041546e-16 > mean(s1$ZDIVERSITY_PA,
na.rm=T) [1] -9.660583e-16 > mean(s1$ZMEAN_PA, na.rm=T) [1] -5.430282e-15
> lm.ridge(ZDEPRESSION ~ ZMEAN_PA * ZDIVERSITY_PA, data=s1)$coef ZMEAN_PA    
ZDIVERSITY_PA ZMEAN_PA:ZDIVERSITY_PA            -0.3962254             -0.3636026             -0.1425772      ##
This is what I thought was the problem originally. :-)

> coefficients(lm(ZDEPRESSION ~ ZMEAN_PA * ZDIVERSITY_PA, data=s1))
(Intercept)               ZMEAN_PA          ZDIVERSITY_PA ZMEAN_PA:ZDIVERSITY_PA            0.07342198            -0.39650356            -0.36569488           
-0.09435788 > coefficients(lm.ridge(ZDEPRESSION ~ ZMEAN_PA * ZDIVERSITY_PA,
data=s1)) ZMEAN_PA          ZDIVERSITY_PA ZMEAN_PA:ZDIVERSITY_PA
            0.07342198            -0.39650356            -0.36569488           
-0.09435788 The equivalent from SPSS is attached. The unstandardized
coefficients in SPSS look nothing like those in R. The standardized coefficients
in SPSS match the lm.ridge()$coef numbers very closely indeed, suggesting that
the same algorithm may be in use.

I have put the dataset file, which is the untouched original I received from the
authors, in this Dropbox folder:
https://www.dropbox.com/sh/xsebjy55ius1ysb/AADwYUyV1bl6-iAw7ACuF1_La?dl=0. You
can read it into R with this code (one variable needs to be standardized and
centered; everything else is already in the file):

s1 <- read.csv("Emodiversity_Study1.csv", stringsAsFactors=FALSE)
s1$ZDEPRESSION <- scale(s1$DEPRESSION)
Hey, maybe R is fine and I've stumbled on a bug in SPSS? If so, I'm sure
IBM will want to fix it quickly (ha ha ha).

Nick 

----- Original Message -----

From: "peter dalgaard" <pdalgd at gmail.com> 
To: "Nick Brown" <nick.brown at free.fr> 
Cc: "Simon Bonner" <sbonner6 at uwo.ca>, r-devel at
r-project.org
Sent: Friday, 5 May, 2017 10:02:10 AM 
Subject: Re: [Rd] lm() gives different results to lm.ridge() and SPSS 

I asked you before, but in case you missed it: Are you looking at the right
place in SPSS output?

The UNstandardized coefficients should be comparable to R, i.e. the
"B" column, not "Beta".

-pd 
> On 5 May 2017, at 01:58 , Nick Brown <nick.brown at free.fr> wrote: 
> 
> Hi Simon, 
> 
> Yes, if I uses coefficients() I get the same results for lm() and
lm.ridge(). So that's consistent, at least.
> 
> Interestingly, the "wrong" number I get from lm.ridge()$coef
agrees with the value from SPSS to 5dp, which is an interesting coincidence if
these numbers have no particular external meaning in lm.ridge().
> 
> Kind regards, 
> Nick 
> 
> ----- Original Message ----- 
> 
> From: "Simon Bonner" <sbonner6 at uwo.ca> 
> To: "Nick Brown" <nick.brown at free.fr>, r-devel at
r-project.org
> Sent: Thursday, 4 May, 2017 7:07:33 PM 
> Subject: RE: [Rd] lm() gives different results to lm.ridge() and SPSS 
> 
> Hi Nick, 
> 
> I think that the problem here is your use of $coef to extract the
coefficients of the ridge regression. The help for lm.ridge states that coef is
a "matrix of coefficients, one row for each value of lambda. Note that
these are not on the original scale and are for use by the coef method."
> 
> I ran a small test with simulated data, code is copied below, and indeed
the output from lm.ridge differs depending on whether the coefficients are
accessed via $coef or via the coefficients() function. The latter does produce
results that match the output from lm.
> 
> I hope that helps. 
> 
> Cheers, 
> 
> Simon 
> 
> ## Load packages 
> library(MASS) 
> 
> ## Set seed 
> set.seed(8888) 
> 
> ## Set parameters 
> n <- 100 
> beta <- c(1,0,1) 
> sigma <- .5 
> rho <- .75 
> 
> ## Simulate correlated covariates 
> Sigma <- matrix(c(1,rho,rho,1),ncol=2) 
> X <- mvrnorm(n,c(0,0),Sigma=Sigma) 
> 
> ## Simulate data 
> mu <- beta[1] + X %*% beta[-1] 
> y <- rnorm(n,mu,sigma) 
> 
> ## Fit model with lm() 
> fit1 <- lm(y ~ X) 
> 
> ## Fit model with lm.ridge() 
> fit2 <- lm.ridge(y ~ X) 
> 
> ## Compare coefficients 
> cbind(fit1$coefficients,c(NA,fit2$coef),coefficients(fit2)) 
> 
> [,1] [,2] [,3] 
> (Intercept) 0.99276001 NA 0.99276001 
> X1 -0.03980772 -0.04282391 -0.03980772 
> X2 1.11167179 1.06200476 1.11167179 
> 
> -- 
> 
> Simon Bonner 
> Assistant Professor of Environmetrics/ Director MMASc 
> Department of Statistical and Actuarial Sciences/Department of Biology 
> University of Western Ontario 
> 
> Office: Western Science Centre rm 276 
> 
> Email: sbonner6 at uwo.ca | Telephone: 519-661-2111 x88205 | Fax:
519-661-3813
> Twitter: @bonnerstatslab | Website:
http://simon.bonners.ca/bonner-lab/wpblog/
> 
>> -----Original Message----- 
>> From: R-devel [mailto:r-devel-bounces at r-project.org] On Behalf Of
Nick
>> Brown 
>> Sent: May 4, 2017 10:29 AM 
>> To: r-devel at r-project.org 
>> Subject: [Rd] lm() gives different results to lm.ridge() and SPSS 
>> 
>> Hallo, 
>> 
>> I hope I am posting to the right place. I was advised to try this list
by Ben Bolker
>> (https://twitter.com/bolkerb/status/859909918446497795). I also posted
this
>> question to StackOverflow 
>>
(http://stackoverflow.com/questions/43771269/lm-gives-different-results-
>> from-lm-ridgelambda-0). I am a relative newcomer to R, but I wrote my
first
>> program in 1975 and have been paid to program in about 15 different 
>> languages, so I have some general background knowledge. 
>> 
>> I have a regression from which I extract the coefficients like this: 
>> lm(y ~ x1 * x2, data=ds)$coef 
>> That gives: x1=0.40, x2=0.37, x1*x2=0.09 
>> 
>> When I do the same regression in SPSS, I get: 
>> beta(x1)=0.40, beta(x2)=0.37, beta(x1*x2)=0.14. 
>> So the main effects are in agreement, but there is quite a difference
in the
>> coefficient for the interaction. 
>> 
>> X1 and X2 are correlated about .75 (yes, yes, I know - this model
wasn't my
>> idea, but it got published), so there is quite possibly something going
on with
>> collinearity. So I thought I'd try lm.ridge() to see if I can get
an idea of where
>> the problems are occurring. 
>> 
>> The starting point is to run lm.ridge() with lambda=0 (i.e., no ridge
penalty) and
>> check we get the same results as with lm(): 
>> lm.ridge(y ~ x1 * x2, lambda=0, data=ds)$coef 
>> x1=0.40, x2=0.37, x1*x2=0.14 
>> So lm.ridge() agrees with SPSS, but not with lm(). (Of course, lambda=0
is the
>> default, so it can be omitted; I can alternate between including or
deleting
>> ".ridge" in the function call, and watch the coefficient for
the interaction
>> change.) 
>> 
>> What seems slightly strange to me here is that I assumed that
lm.ridge() just
>> piggybacks on lm() anyway, so in the specific case where lambda=0 and
there
>> is no "ridging" to do, I'd expect exactly the same
results.
>> 
>> Unfortunately there are 34,000 cases in the dataset, so a
"minimal" reprex will
>> not be easy to make, but I can share the data via Dropbox or something
if that
>> would help. 
>> 
>> I appreciate that when there is strong collinearity then all bets are
off in terms
>> of what the betas mean, but I would really expect lm() and lm.ridge()
to give
>> the same results. (I would be happy to ignore SPSS, but for the moment
it's
>> part of the majority!) 
>> 
>> Thanks for reading, 
>> Nick

peter dalgaard

2017-May-05 13:33 UTC

head link

[Rd] lm() gives different results to lm.ridge() and SPSS

Thanks, I was getting to try this, but got side tracked by actual work...

Your analysis reproduces the SPSS unscaled estimates. It still remains to figure
out how Nick got
> coefficients(lm(ZDEPRESSION ~ ZMEAN_PA * ZDIVERSITY_PA, data=s1))

           (Intercept)               ZMEAN_PA          ZDIVERSITY_PA
ZMEAN_PA:ZDIVERSITY_PA
            0.07342198            -0.39650356            -0.36569488           
-0.09435788


which does not match your output. I suspect that ZMEAN_PA and ZDIVERSITY_PA were
scaled for this analysis (but the interaction term still obviously is not). I
conjecture that something in the vicinity of

res <- lm(DEPRESSION ~ scale(ZMEAN_PA) + scale(ZDIVERSITY_PA) +
scale(ZMEAN_PA * ZDIVERSITY_PA), data=dat)
summary(res)

would reproduce the SPSS Beta values.

> On 5 May 2017, at 14:43 , Viechtbauer Wolfgang (SP)
<wolfgang.viechtbauer at maastrichtuniversity.nl> wrote:
> 
> I had no problems running regression models in SPSS and R that yielded the
same results for these data.
> 
> The difference you are observing is from fitting different models. In R,
you fitted:
> 
> res <- lm(DEPRESSION ~ ZMEAN_PA * ZDIVERSITY_PA, data=dat)
> summary(res)
> 
> The interaction term is the product of ZMEAN_PA and ZDIVERSITY_PA. This is
not a standardized variable itself and not the same as "ZINTER_PA_C"
in the png you showed, which is not a variable in the dataset, but can be
created with:
> 
> dat$ZINTER_PA_C <- with(dat, scale(ZMEAN_PA * ZDIVERSITY_PA))
> 
> If you want the same results as in SPSS, then you need to fit:
> 
> res <- lm(DEPRESSION ~ ZMEAN_PA + ZDIVERSITY_PA + ZINTER_PA_C, data=dat)
> summary(res)
> 
> This yields:
> 
> Coefficients:
>              Estimate Std. Error t value Pr(>|t|)    
> (Intercept)    6.41041    0.01722  372.21   <2e-16 ***
> ZMEAN_PA      -1.62726    0.04200  -38.74   <2e-16 ***
> ZDIVERSITY_PA -1.50082    0.07447  -20.15   <2e-16 ***
> ZINTER_PA_C   -0.58955    0.05288  -11.15   <2e-16 ***
> ---
> Signif. codes:  0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1
> 
> Exactly the same as in the png.
> 
> Peter already mentioned this as a possible reason for the discrepancy:
https://stat.ethz.ch/pipermail/r-devel/2017-May/074191.html ("Is it perhaps
the case that x1 and x2 have already been scaled to have standard deviation 1?
In that case, x1*x2 won't be.")
> 
> Best,
> Wolfgang
> 
> -----Original Message-----
> From: R-devel [mailto:r-devel-bounces at r-project.org] On Behalf Of Nick
Brown
> Sent: Friday, May 05, 2017 10:40
> To: peter dalgaard
> Cc: r-devel at r-project.org
> Subject: Re: [Rd] lm() gives different results to lm.ridge() and SPSS
> 
> Hi, 
> 
> Here is (I hope) all the relevant output from R. 
> 
>> mean(s1$ZDEPRESSION, na.rm=T) [1] -1.041546e-16 >
mean(s1$ZDIVERSITY_PA, na.rm=T) [1] -9.660583e-16 > mean(s1$ZMEAN_PA,
na.rm=T) [1] -5.430282e-15 > lm.ridge(ZDEPRESSION ~ ZMEAN_PA * ZDIVERSITY_PA,
data=s1)$coef ZMEAN_PA          ZDIVERSITY_PA ZMEAN_PA:ZDIVERSITY_PA
>            -0.3962254             -0.3636026             -0.1425772      ##
This is what I thought was the problem originally. :-)
> 
> 
>> coefficients(lm(ZDEPRESSION ~ ZMEAN_PA * ZDIVERSITY_PA, data=s1))
(Intercept)               ZMEAN_PA          ZDIVERSITY_PA ZMEAN_PA:ZDIVERSITY_PA
>            0.07342198            -0.39650356            -0.36569488        
-0.09435788 > coefficients(lm.ridge(ZDEPRESSION ~ ZMEAN_PA * ZDIVERSITY_PA,
data=s1)) ZMEAN_PA          ZDIVERSITY_PA ZMEAN_PA:ZDIVERSITY_PA
>            0.07342198            -0.39650356            -0.36569488        
-0.09435788 The equivalent from SPSS is attached. The unstandardized
coefficients in SPSS look nothing like those in R. The standardized coefficients
in SPSS match the lm.ridge()$coef numbers very closely indeed, suggesting that
the same algorithm may be in use.
> 
> I have put the dataset file, which is the untouched original I received
from the authors, in this Dropbox folder:
https://www.dropbox.com/sh/xsebjy55ius1ysb/AADwYUyV1bl6-iAw7ACuF1_La?dl=0. You
can read it into R with this code (one variable needs to be standardized and
centered; everything else is already in the file):
> 
> s1 <- read.csv("Emodiversity_Study1.csv",
stringsAsFactors=FALSE) s1$ZDEPRESSION <- scale(s1$DEPRESSION)
> Hey, maybe R is fine and I've stumbled on a bug in SPSS? If so, I'm
sure IBM will want to fix it quickly (ha ha ha).
> 
> Nick 
> 
> ----- Original Message -----
> 
> From: "peter dalgaard" <pdalgd at gmail.com> 
> To: "Nick Brown" <nick.brown at free.fr> 
> Cc: "Simon Bonner" <sbonner6 at uwo.ca>, r-devel at
r-project.org
> Sent: Friday, 5 May, 2017 10:02:10 AM 
> Subject: Re: [Rd] lm() gives different results to lm.ridge() and SPSS 
> 
> I asked you before, but in case you missed it: Are you looking at the right
place in SPSS output?
> 
> The UNstandardized coefficients should be comparable to R, i.e. the
"B" column, not "Beta".
> 
> -pd 
> 
>> On 5 May 2017, at 01:58 , Nick Brown <nick.brown at free.fr>
wrote:
>> 
>> Hi Simon, 
>> 
>> Yes, if I uses coefficients() I get the same results for lm() and
lm.ridge(). So that's consistent, at least.
>> 
>> Interestingly, the "wrong" number I get from lm.ridge()$coef
agrees with the value from SPSS to 5dp, which is an interesting coincidence if
these numbers have no particular external meaning in lm.ridge().
>> 
>> Kind regards, 
>> Nick 
>> 
>> ----- Original Message ----- 
>> 
>> From: "Simon Bonner" <sbonner6 at uwo.ca> 
>> To: "Nick Brown" <nick.brown at free.fr>, r-devel at
r-project.org
>> Sent: Thursday, 4 May, 2017 7:07:33 PM 
>> Subject: RE: [Rd] lm() gives different results to lm.ridge() and SPSS 
>> 
>> Hi Nick, 
>> 
>> I think that the problem here is your use of $coef to extract the
coefficients of the ridge regression. The help for lm.ridge states that coef is
a "matrix of coefficients, one row for each value of lambda. Note that
these are not on the original scale and are for use by the coef method."
>> 
>> I ran a small test with simulated data, code is copied below, and
indeed the output from lm.ridge differs depending on whether the coefficients
are accessed via $coef or via the coefficients() function. The latter does
produce results that match the output from lm.
>> 
>> I hope that helps. 
>> 
>> Cheers, 
>> 
>> Simon 
>> 
>> ## Load packages 
>> library(MASS) 
>> 
>> ## Set seed 
>> set.seed(8888) 
>> 
>> ## Set parameters 
>> n <- 100 
>> beta <- c(1,0,1) 
>> sigma <- .5 
>> rho <- .75 
>> 
>> ## Simulate correlated covariates 
>> Sigma <- matrix(c(1,rho,rho,1),ncol=2) 
>> X <- mvrnorm(n,c(0,0),Sigma=Sigma) 
>> 
>> ## Simulate data 
>> mu <- beta[1] + X %*% beta[-1] 
>> y <- rnorm(n,mu,sigma) 
>> 
>> ## Fit model with lm() 
>> fit1 <- lm(y ~ X) 
>> 
>> ## Fit model with lm.ridge() 
>> fit2 <- lm.ridge(y ~ X) 
>> 
>> ## Compare coefficients 
>> cbind(fit1$coefficients,c(NA,fit2$coef),coefficients(fit2)) 
>> 
>> [,1] [,2] [,3] 
>> (Intercept) 0.99276001 NA 0.99276001 
>> X1 -0.03980772 -0.04282391 -0.03980772 
>> X2 1.11167179 1.06200476 1.11167179 
>> 
>> -- 
>> 
>> Simon Bonner 
>> Assistant Professor of Environmetrics/ Director MMASc 
>> Department of Statistical and Actuarial Sciences/Department of Biology 
>> University of Western Ontario 
>> 
>> Office: Western Science Centre rm 276 
>> 
>> Email: sbonner6 at uwo.ca | Telephone: 519-661-2111 x88205 | Fax:
519-661-3813
>> Twitter: @bonnerstatslab | Website:
http://simon.bonners.ca/bonner-lab/wpblog/
>> 
>>> -----Original Message----- 
>>> From: R-devel [mailto:r-devel-bounces at r-project.org] On Behalf
Of Nick
>>> Brown 
>>> Sent: May 4, 2017 10:29 AM 
>>> To: r-devel at r-project.org 
>>> Subject: [Rd] lm() gives different results to lm.ridge() and SPSS 
>>> 
>>> Hallo, 
>>> 
>>> I hope I am posting to the right place. I was advised to try this
list by Ben Bolker
>>> (https://twitter.com/bolkerb/status/859909918446497795). I also
posted this
>>> question to StackOverflow 
>>>
(http://stackoverflow.com/questions/43771269/lm-gives-different-results-
>>> from-lm-ridgelambda-0). I am a relative newcomer to R, but I wrote
my first
>>> program in 1975 and have been paid to program in about 15 different
>>> languages, so I have some general background knowledge. 
>>> 
>>> I have a regression from which I extract the coefficients like
this:
>>> lm(y ~ x1 * x2, data=ds)$coef 
>>> That gives: x1=0.40, x2=0.37, x1*x2=0.09 
>>> 
>>> When I do the same regression in SPSS, I get: 
>>> beta(x1)=0.40, beta(x2)=0.37, beta(x1*x2)=0.14. 
>>> So the main effects are in agreement, but there is quite a
difference in the
>>> coefficient for the interaction. 
>>> 
>>> X1 and X2 are correlated about .75 (yes, yes, I know - this model
wasn't my
>>> idea, but it got published), so there is quite possibly something
going on with
>>> collinearity. So I thought I'd try lm.ridge() to see if I can
get an idea of where
>>> the problems are occurring. 
>>> 
>>> The starting point is to run lm.ridge() with lambda=0 (i.e., no
ridge penalty) and
>>> check we get the same results as with lm(): 
>>> lm.ridge(y ~ x1 * x2, lambda=0, data=ds)$coef 
>>> x1=0.40, x2=0.37, x1*x2=0.14 
>>> So lm.ridge() agrees with SPSS, but not with lm(). (Of course,
lambda=0 is the
>>> default, so it can be omitted; I can alternate between including or
deleting
>>> ".ridge" in the function call, and watch the coefficient
for the interaction
>>> change.) 
>>> 
>>> What seems slightly strange to me here is that I assumed that
lm.ridge() just
>>> piggybacks on lm() anyway, so in the specific case where lambda=0
and there
>>> is no "ridging" to do, I'd expect exactly the same
results.
>>> 
>>> Unfortunately there are 34,000 cases in the dataset, so a
"minimal" reprex will
>>> not be easy to make, but I can share the data via Dropbox or
something if that
>>> would help. 
>>> 
>>> I appreciate that when there is strong collinearity then all bets
are off in terms
>>> of what the betas mean, but I would really expect lm() and
lm.ridge() to give
>>> the same results. (I would be happy to ignore SPSS, but for the
moment it's
>>> part of the majority!) 
>>> 
>>> Thanks for reading, 
>>> Nick 
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com

Fox, John

2017-May-05 18:22 UTC

head link

[Rd] lm() gives different results to lm.ridge() and SPSS

Dear Nick,


On 2017-05-05, 9:40 AM, "R-devel on behalf of Nick Brown"
<r-devel-bounces at r-project.org on behalf of nick.brown at free.fr>
wrote:
>>I conjecture that something in the vicinity of
>> res <- lm(DEPRESSION ~ scale(ZMEAN_PA) + scale(ZDIVERSITY_PA) +
>>scale(ZMEAN_PA * ZDIVERSITY_PA), data=dat)
>>summary(res) 
>> would reproduce the SPSS Beta values.
>
>Yes, that works. Thanks!
That you have to work hard in R to match the SPSS results isn?t such a bad
thing when you factor in the observation that standardizing the
interaction regressor, ZMEAN_PA * ZDIVERSITY_PA, separately from each of
its components, ZMEAN_PA and ZDIVERSITY_PA, is nonsense.

Best,
 John

-------------------------------------
John Fox, Professor
McMaster University
Hamilton, Ontario, Canada
Web: http://socserv.mcmaster.ca/jfox/

> 
>
>----- Original Message -----
>
>From: "peter dalgaard" <pdalgd at gmail.com>
>To: "Viechtbauer Wolfgang (SP)"
><wolfgang.viechtbauer at maastrichtuniversity.nl>, "Nick
Brown"
><nick.brown at free.fr>
>Cc: r-devel at r-project.org
>Sent: Friday, 5 May, 2017 3:33:29 PM
>Subject: Re: [Rd] lm() gives different results to lm.ridge() and SPSS
>
>Thanks, I was getting to try this, but got side tracked by actual work...
>
>Your analysis reproduces the SPSS unscaled estimates. It still remains to
>figure out how Nick got
>
>> 
>coefficients(lm(ZDEPRESSION ~ ZMEAN_PA * ZDIVERSITY_PA, data=s1))
>
>(Intercept) ZMEAN_PA ZDIVERSITY_PA ZMEAN_PA:ZDIVERSITY_PA
>0.07342198 -0.39650356 -0.36569488 -0.09435788
>
>
>which does not match your output. I suspect that ZMEAN_PA and
>ZDIVERSITY_PA were scaled for this analysis (but the interaction term
>still obviously is not). I conjecture that something in the vicinity of
>
>res <- lm(DEPRESSION ~ scale(ZMEAN_PA) + scale(ZDIVERSITY_PA) +
>scale(ZMEAN_PA * ZDIVERSITY_PA), data=dat)
>summary(res) 
>
>would reproduce the SPSS Beta values.
>
>
>> On 5 May 2017, at 14:43 , Viechtbauer Wolfgang (SP)
>><wolfgang.viechtbauer at maastrichtuniversity.nl> wrote:
>> 
>> I had no problems running regression models in SPSS and R that yielded
>>the same results for these data.
>> 
>> The difference you are observing is from fitting different models. In
>>R, you fitted: 
>> 
>> res <- lm(DEPRESSION ~ ZMEAN_PA * ZDIVERSITY_PA, data=dat)
>> summary(res) 
>> 
>> The interaction term is the product of ZMEAN_PA and ZDIVERSITY_PA. This
>>is not a standardized variable itself and not the same as
"ZINTER_PA_C"
>>in the png you showed, which is not a variable in the dataset, but can
>>be created with: 
>> 
>> dat$ZINTER_PA_C <- with(dat, scale(ZMEAN_PA * ZDIVERSITY_PA))
>> 
>> If you want the same results as in SPSS, then you need to fit:
>> 
>> res <- lm(DEPRESSION ~ ZMEAN_PA + ZDIVERSITY_PA + ZINTER_PA_C,
>>data=dat) 
>> summary(res) 
>> 
>> This yields: 
>> 
>> Coefficients: 
>> Estimate Std. Error t value Pr(>|t|)
>> (Intercept) 6.41041 0.01722 372.21 <2e-16 ***
>> ZMEAN_PA -1.62726 0.04200 -38.74 <2e-16 ***
>> ZDIVERSITY_PA -1.50082 0.07447 -20.15 <2e-16 ***
>> ZINTER_PA_C -0.58955 0.05288 -11.15 <2e-16 ***
>> --- 
>> Signif. codes: 0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1
>> 
>> Exactly the same as in the png.
>> 
>> Peter already mentioned this as a possible reason for the discrepancy:
>>https://stat.ethz.ch/pipermail/r-devel/2017-May/074191.html ("Is it
>>perhaps the case that x1 and x2 have already been scaled to have
>>standard deviation 1? In that case, x1*x2 won't be.")
>> 
>> Best, 
>> Wolfgang 
>> 
>> -----Original Message-----
>> From: R-devel [mailto:r-devel-bounces at r-project.org] On Behalf Of
Nick
>>Brown 
>> Sent: Friday, May 05, 2017 10:40
>> To: peter dalgaard
>> Cc: r-devel at r-project.org
>> Subject: Re: [Rd] lm() gives different results to lm.ridge() and SPSS
>> 
>> Hi, 
>> 
>> Here is (I hope) all the relevant output from R.
>> 
>>> mean(s1$ZDEPRESSION, na.rm=T) [1] -1.041546e-16 >
>>>mean(s1$ZDIVERSITY_PA, na.rm=T) [1] -9.660583e-16 >
mean(s1$ZMEAN_PA,
>>>na.rm=T) [1] -5.430282e-15 > lm.ridge(ZDEPRESSION ~ ZMEAN_PA *
>>>ZDIVERSITY_PA, data=s1)$coef ZMEAN_PA ZDIVERSITY_PA
>>>ZMEAN_PA:ZDIVERSITY_PA
>> -0.3962254 -0.3636026 -0.1425772 ## This is what I thought was the
>>problem originally. :-)
>> 
>> 
>>> coefficients(lm(ZDEPRESSION ~ ZMEAN_PA * ZDIVERSITY_PA, data=s1))
>>>(Intercept) ZMEAN_PA ZDIVERSITY_PA ZMEAN_PA:ZDIVERSITY_PA
>> 0.07342198 -0.39650356 -0.36569488 -0.09435788 >
>>coefficients(lm.ridge(ZDEPRESSION ~ ZMEAN_PA * ZDIVERSITY_PA, data=s1))
>>ZMEAN_PA ZDIVERSITY_PA ZMEAN_PA:ZDIVERSITY_PA
>> 0.07342198 -0.39650356 -0.36569488 -0.09435788 The equivalent from SPSS
>>is attached. The unstandardized coefficients in SPSS look nothing like
>>those in R. The standardized coefficients in SPSS match the
>>lm.ridge()$coef numbers very closely indeed, suggesting that the same
>>algorithm may be in use.
>> 
>> I have put the dataset file, which is the untouched original I received
>>from the authors, in this Dropbox folder:
>>https://www.dropbox.com/sh/xsebjy55ius1ysb/AADwYUyV1bl6-iAw7ACuF1_La?dl=0
>>. You can read it into R with this code (one variable needs to be
>>standardized and centered; everything else is already in the file):
>> 
>> s1 <- read.csv("Emodiversity_Study1.csv",
stringsAsFactors=FALSE)
>>s1$ZDEPRESSION <- scale(s1$DEPRESSION)
>> Hey, maybe R is fine and I've stumbled on a bug in SPSS? If so,
I'm
>>sure IBM will want to fix it quickly (ha ha ha).
>> 
>> Nick 
>> 
>> ----- Original Message -----
>> 
>> From: "peter dalgaard" <pdalgd at gmail.com>
>> To: "Nick Brown" <nick.brown at free.fr>
>> Cc: "Simon Bonner" <sbonner6 at uwo.ca>, r-devel at
r-project.org
>> Sent: Friday, 5 May, 2017 10:02:10 AM
>> Subject: Re: [Rd] lm() gives different results to lm.ridge() and SPSS
>> 
>> I asked you before, but in case you missed it: Are you looking at the
>>right place in SPSS output?
>> 
>> The UNstandardized coefficients should be comparable to R, i.e. the
"B"
>>column, not "Beta".
>> 
>> -pd 
>> 
>>> On 5 May 2017, at 01:58 , Nick Brown <nick.brown at free.fr>
wrote:
>>> 
>>> Hi Simon, 
>>> 
>>> Yes, if I uses coefficients() I get the same results for lm() and
>>>lm.ridge(). So that's consistent, at least.
>>> 
>>> Interestingly, the "wrong" number I get from
lm.ridge()$coef agrees
>>>with the value from SPSS to 5dp, which is an interesting coincidence
if
>>>these numbers have no particular external meaning in lm.ridge().
>>> 
>>> Kind regards, 
>>> Nick 
>>> 
>>> ----- Original Message -----
>>> 
>>> From: "Simon Bonner" <sbonner6 at uwo.ca>
>>> To: "Nick Brown" <nick.brown at free.fr>, r-devel
at r-project.org
>>> Sent: Thursday, 4 May, 2017 7:07:33 PM
>>> Subject: RE: [Rd] lm() gives different results to lm.ridge() and
SPSS
>>> 
>>> Hi Nick, 
>>> 
>>> I think that the problem here is your use of $coef to extract the
>>>coefficients of the ridge regression. The help for lm.ridge states
that
>>>coef is a "matrix of coefficients, one row for each value of
lambda.
>>>Note that these are not on the original scale and are for use by the
>>>coef method." 
>>> 
>>> I ran a small test with simulated data, code is copied below, and
>>>indeed the output from lm.ridge differs depending on whether the
>>>coefficients are accessed via $coef or via the coefficients()
function.
>>>The latter does produce results that match the output from lm.
>>> 
>>> I hope that helps.
>>> 
>>> Cheers, 
>>> 
>>> Simon 
>>> 
>>> ## Load packages
>>> library(MASS) 
>>> 
>>> ## Set seed 
>>> set.seed(8888) 
>>> 
>>> ## Set parameters
>>> n <- 100 
>>> beta <- c(1,0,1)
>>> sigma <- .5 
>>> rho <- .75 
>>> 
>>> ## Simulate correlated covariates
>>> Sigma <- matrix(c(1,rho,rho,1),ncol=2)
>>> X <- mvrnorm(n,c(0,0),Sigma=Sigma)
>>> 
>>> ## Simulate data
>>> mu <- beta[1] + X %*% beta[-1]
>>> y <- rnorm(n,mu,sigma)
>>> 
>>> ## Fit model with lm()
>>> fit1 <- lm(y ~ X)
>>> 
>>> ## Fit model with lm.ridge()
>>> fit2 <- lm.ridge(y ~ X)
>>> 
>>> ## Compare coefficients
>>> cbind(fit1$coefficients,c(NA,fit2$coef),coefficients(fit2))
>>> 
>>> [,1] [,2] [,3] 
>>> (Intercept) 0.99276001 NA 0.99276001
>>> X1 -0.03980772 -0.04282391 -0.03980772
>>> X2 1.11167179 1.06200476 1.11167179
>>> 
>>> -- 
>>> 
>>> Simon Bonner 
>>> Assistant Professor of Environmetrics/ Director MMASc
>>> Department of Statistical and Actuarial Sciences/Department of
Biology
>>> University of Western Ontario
>>> 
>>> Office: Western Science Centre rm 276
>>> 
>>> Email: sbonner6 at uwo.ca | Telephone: 519-661-2111 x88205 | Fax:
>>>519-661-3813 
>>> Twitter: @bonnerstatslab | Website:
>>>http://simon.bonners.ca/bonner-lab/wpblog/
>>> 
>>>> -----Original Message-----
>>>> From: R-devel [mailto:r-devel-bounces at r-project.org] On
Behalf Of
>>>>Nick 
>>>> Brown 
>>>> Sent: May 4, 2017 10:29 AM
>>>> To: r-devel at r-project.org
>>>> Subject: [Rd] lm() gives different results to lm.ridge() and
SPSS
>>>> 
>>>> Hallo, 
>>>> 
>>>> I hope I am posting to the right place. I was advised to try
this
>>>>list by Ben Bolker
>>>> (https://twitter.com/bolkerb/status/859909918446497795). I also
>>>>posted this 
>>>> question to StackOverflow
>>>> 
>>>>(http://stackoverflow.com/questions/43771269/lm-gives-different-results
>>>>- 
>>>> from-lm-ridgelambda-0). I am a relative newcomer to R, but I
wrote my
>>>>first 
>>>> program in 1975 and have been paid to program in about 15
different
>>>> languages, so I have some general background knowledge.
>>>> 
>>>> I have a regression from which I extract the coefficients like
this:
>>>> lm(y ~ x1 * x2, data=ds)$coef
>>>> That gives: x1=0.40, x2=0.37, x1*x2=0.09
>>>> 
>>>> When I do the same regression in SPSS, I get:
>>>> beta(x1)=0.40, beta(x2)=0.37, beta(x1*x2)=0.14.
>>>> So the main effects are in agreement, but there is quite a
difference
>>>>in the 
>>>> coefficient for the interaction.
>>>> 
>>>> X1 and X2 are correlated about .75 (yes, yes, I know - this
model
>>>>wasn't my 
>>>> idea, but it got published), so there is quite possibly
something
>>>>going on with 
>>>> collinearity. So I thought I'd try lm.ridge() to see if I
can get an
>>>>idea of where 
>>>> the problems are occurring.
>>>> 
>>>> The starting point is to run lm.ridge() with lambda=0 (i.e., no
ridge
>>>>penalty) and 
>>>> check we get the same results as with lm():
>>>> lm.ridge(y ~ x1 * x2, lambda=0, data=ds)$coef
>>>> x1=0.40, x2=0.37, x1*x2=0.14
>>>> So lm.ridge() agrees with SPSS, but not with lm(). (Of course,
>>>>lambda=0 is the
>>>> default, so it can be omitted; I can alternate between
including or
>>>>deleting 
>>>> ".ridge" in the function call, and watch the
coefficient for the
>>>>interaction 
>>>> change.) 
>>>> 
>>>> What seems slightly strange to me here is that I assumed that
>>>>lm.ridge() just
>>>> piggybacks on lm() anyway, so in the specific case where
lambda=0 and
>>>>there 
>>>> is no "ridging" to do, I'd expect exactly the
same results.
>>>> 
>>>> Unfortunately there are 34,000 cases in the dataset, so a
"minimal"
>>>>reprex will 
>>>> not be easy to make, but I can share the data via Dropbox or
>>>>something if that
>>>> would help. 
>>>> 
>>>> I appreciate that when there is strong collinearity then all
bets are
>>>>off in terms 
>>>> of what the betas mean, but I would really expect lm() and
lm.ridge()
>>>>to give 
>>>> the same results. (I would be happy to ignore SPSS, but for the
>>>>moment it's 
>>>> part of the majority!)
>>>> 
>>>> Thanks for reading,
>>>> Nick 
>> ______________________________________________
>> R-devel at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel
>
>-- 
>Peter Dalgaard, Professor,
>Center for Statistics, Copenhagen Business School
>Solbjerg Plads 3, 2000 Frederiksberg, Denmark
>Phone: (+45)38153501
>Office: A 4.23 
>Email: pd.mes at cbs.dk Priv: PDalgd at gmail.com
>
>
>
>
>
>
>
>
>
>
>
>	[[alternative HTML version deleted]]
>
>______________________________________________
>R-devel at r-project.org mailing list
>https://stat.ethz.ch/mailman/listinfo/r-devel

Viechtbauer Wolfgang (SP)

2017-May-05 20:34 UTC

head link

[Rd] lm() gives different results to lm.ridge() and SPSS

Totally agree that standardizing the interaction term is nonsense. But in all
fairness, SPSS doesn't do that. In fact, the 'REGRESSION' command in
SPSS doesn't compute any interaction terms -- one has to first compute them
'by hand' and then add them to the model like any other variable. So
somebody worked extra hard to standardize that interaction term in SPSS as well
:/

Best,
Wolfgang

-----Original Message-----
From: R-devel [mailto:r-devel-bounces at r-project.org] On Behalf Of Fox, John
Sent: Friday, May 05, 2017 20:23
To: Nick Brown; peter dalgaard
Cc: r-devel at r-project.org
Subject: Re: [Rd] lm() gives different results to lm.ridge() and SPSS

Dear Nick,

On 2017-05-05, 9:40 AM, "R-devel on behalf of Nick Brown"
<r-devel-bounces at r-project.org on behalf of nick.brown at free.fr>
wrote:
>>I conjecture that something in the vicinity of
>> res <- lm(DEPRESSION ~ scale(ZMEAN_PA) + scale(ZDIVERSITY_PA) +
>>scale(ZMEAN_PA * ZDIVERSITY_PA), data=dat)
>>summary(res) 
>> would reproduce the SPSS Beta values.
>
>Yes, that works. Thanks!
That you have to work hard in R to match the SPSS results isn?t such a bad
thing when you factor in the observation that standardizing the
interaction regressor, ZMEAN_PA * ZDIVERSITY_PA, separately from each of
its components, ZMEAN_PA and ZDIVERSITY_PA, is nonsense.

Best,
 John

-------------------------------------
John Fox, Professor
McMaster University
Hamilton, Ontario, Canada
Web: http://socserv.mcmaster.ca/jfox/
>----- Original Message -----
>
>From: "peter dalgaard" <pdalgd at gmail.com>
>To: "Viechtbauer Wolfgang (SP)"
><wolfgang.viechtbauer at maastrichtuniversity.nl>, "Nick
Brown"
><nick.brown at free.fr>
>Cc: r-devel at r-project.org
>Sent: Friday, 5 May, 2017 3:33:29 PM
>Subject: Re: [Rd] lm() gives different results to lm.ridge() and SPSS
>
>Thanks, I was getting to try this, but got side tracked by actual work...
>
>Your analysis reproduces the SPSS unscaled estimates. It still remains to
>figure out how Nick got
>
>> 
>coefficients(lm(ZDEPRESSION ~ ZMEAN_PA * ZDIVERSITY_PA, data=s1))
>
>(Intercept) ZMEAN_PA ZDIVERSITY_PA ZMEAN_PA:ZDIVERSITY_PA
>0.07342198 -0.39650356 -0.36569488 -0.09435788
>
>
>which does not match your output. I suspect that ZMEAN_PA and
>ZDIVERSITY_PA were scaled for this analysis (but the interaction term
>still obviously is not). I conjecture that something in the vicinity of
>
>res <- lm(DEPRESSION ~ scale(ZMEAN_PA) + scale(ZDIVERSITY_PA) +
>scale(ZMEAN_PA * ZDIVERSITY_PA), data=dat)
>summary(res) 
>
>would reproduce the SPSS Beta values.
>
>> On 5 May 2017, at 14:43 , Viechtbauer Wolfgang (SP)
>><wolfgang.viechtbauer at maastrichtuniversity.nl> wrote:
>> 
>> I had no problems running regression models in SPSS and R that yielded
>>the same results for these data.
>> 
>> The difference you are observing is from fitting different models. In
>>R, you fitted: 
>> 
>> res <- lm(DEPRESSION ~ ZMEAN_PA * ZDIVERSITY_PA, data=dat)
>> summary(res) 
>> 
>> The interaction term is the product of ZMEAN_PA and ZDIVERSITY_PA. This
>>is not a standardized variable itself and not the same as
"ZINTER_PA_C"
>>in the png you showed, which is not a variable in the dataset, but can
>>be created with: 
>> 
>> dat$ZINTER_PA_C <- with(dat, scale(ZMEAN_PA * ZDIVERSITY_PA))
>> 
>> If you want the same results as in SPSS, then you need to fit:
>> 
>> res <- lm(DEPRESSION ~ ZMEAN_PA + ZDIVERSITY_PA + ZINTER_PA_C,
>>data=dat) 
>> summary(res) 
>> 
>> This yields: 
>> 
>> Coefficients: 
>> Estimate Std. Error t value Pr(>|t|)
>> (Intercept) 6.41041 0.01722 372.21 <2e-16 ***
>> ZMEAN_PA -1.62726 0.04200 -38.74 <2e-16 ***
>> ZDIVERSITY_PA -1.50082 0.07447 -20.15 <2e-16 ***
>> ZINTER_PA_C -0.58955 0.05288 -11.15 <2e-16 ***
>> --- 
>> Signif. codes: 0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1
>> 
>> Exactly the same as in the png.
>> 
>> Peter already mentioned this as a possible reason for the discrepancy:
>>https://stat.ethz.ch/pipermail/r-devel/2017-May/074191.html ("Is it
>>perhaps the case that x1 and x2 have already been scaled to have
>>standard deviation 1? In that case, x1*x2 won't be.")
>> 
>> Best, 
>> Wolfgang

Viechtbauer Wolfgang (SP)

2017-May-05 20:41 UTC

head link

[Rd] lm() gives different results to lm.ridge() and SPSS

Well, one correction -- the 'standardized coefficients' that SPSS shows
are based on standardizing all variables separately (so x1, x2, and x1*x2 are
all standardized). So with respect to that, the criticism certainly stands.

-----Original Message-----
From: Viechtbauer Wolfgang (SP) 
Sent: Friday, May 05, 2017 22:35
To: r-devel at r-project.org
Subject: RE: [Rd] lm() gives different results to lm.ridge() and SPSS

Totally agree that standardizing the interaction term is nonsense. But in all
fairness, SPSS doesn't do that. In fact, the 'REGRESSION' command in
SPSS doesn't compute any interaction terms -- one has to first compute them
'by hand' and then add them to the model like any other variable. So
somebody worked extra hard to standardize that interaction term in SPSS as well
:/

Best,
Wolfgang

-----Original Message-----
From: R-devel [mailto:r-devel-bounces at r-project.org] On Behalf Of Fox, John
Sent: Friday, May 05, 2017 20:23
To: Nick Brown; peter dalgaard
Cc: r-devel at r-project.org
Subject: Re: [Rd] lm() gives different results to lm.ridge() and SPSS

Dear Nick,

On 2017-05-05, 9:40 AM, "R-devel on behalf of Nick Brown"
<r-devel-bounces at r-project.org on behalf of nick.brown at free.fr>
wrote:
>>I conjecture that something in the vicinity of
>> res <- lm(DEPRESSION ~ scale(ZMEAN_PA) + scale(ZDIVERSITY_PA) +
>>scale(ZMEAN_PA * ZDIVERSITY_PA), data=dat)
>>summary(res) 
>> would reproduce the SPSS Beta values.
>
>Yes, that works. Thanks!
That you have to work hard in R to match the SPSS results isn?t such a bad
thing when you factor in the observation that standardizing the
interaction regressor, ZMEAN_PA * ZDIVERSITY_PA, separately from each of
its components, ZMEAN_PA and ZDIVERSITY_PA, is nonsense.

Best,
 John

-------------------------------------
John Fox, Professor
McMaster University
Hamilton, Ontario, Canada
Web: http://socserv.mcmaster.ca/jfox/
>----- Original Message -----
>
>From: "peter dalgaard" <pdalgd at gmail.com>
>To: "Viechtbauer Wolfgang (SP)"
><wolfgang.viechtbauer at maastrichtuniversity.nl>, "Nick
Brown"
><nick.brown at free.fr>
>Cc: r-devel at r-project.org
>Sent: Friday, 5 May, 2017 3:33:29 PM
>Subject: Re: [Rd] lm() gives different results to lm.ridge() and SPSS
>
>Thanks, I was getting to try this, but got side tracked by actual work...
>
>Your analysis reproduces the SPSS unscaled estimates. It still remains to
>figure out how Nick got
>
>> 
>coefficients(lm(ZDEPRESSION ~ ZMEAN_PA * ZDIVERSITY_PA, data=s1))
>
>(Intercept) ZMEAN_PA ZDIVERSITY_PA ZMEAN_PA:ZDIVERSITY_PA
>0.07342198 -0.39650356 -0.36569488 -0.09435788
>
>
>which does not match your output. I suspect that ZMEAN_PA and
>ZDIVERSITY_PA were scaled for this analysis (but the interaction term
>still obviously is not). I conjecture that something in the vicinity of
>
>res <- lm(DEPRESSION ~ scale(ZMEAN_PA) + scale(ZDIVERSITY_PA) +
>scale(ZMEAN_PA * ZDIVERSITY_PA), data=dat)
>summary(res) 
>
>would reproduce the SPSS Beta values.
>
>> On 5 May 2017, at 14:43 , Viechtbauer Wolfgang (SP)
>><wolfgang.viechtbauer at maastrichtuniversity.nl> wrote:
>> 
>> I had no problems running regression models in SPSS and R that yielded
>>the same results for these data.
>> 
>> The difference you are observing is from fitting different models. In
>>R, you fitted: 
>> 
>> res <- lm(DEPRESSION ~ ZMEAN_PA * ZDIVERSITY_PA, data=dat)
>> summary(res) 
>> 
>> The interaction term is the product of ZMEAN_PA and ZDIVERSITY_PA. This
>>is not a standardized variable itself and not the same as
"ZINTER_PA_C"
>>in the png you showed, which is not a variable in the dataset, but can
>>be created with: 
>> 
>> dat$ZINTER_PA_C <- with(dat, scale(ZMEAN_PA * ZDIVERSITY_PA))
>> 
>> If you want the same results as in SPSS, then you need to fit:
>> 
>> res <- lm(DEPRESSION ~ ZMEAN_PA + ZDIVERSITY_PA + ZINTER_PA_C,
>>data=dat) 
>> summary(res) 
>> 
>> This yields: 
>> 
>> Coefficients: 
>> Estimate Std. Error t value Pr(>|t|)
>> (Intercept) 6.41041 0.01722 372.21 <2e-16 ***
>> ZMEAN_PA -1.62726 0.04200 -38.74 <2e-16 ***
>> ZDIVERSITY_PA -1.50082 0.07447 -20.15 <2e-16 ***
>> ZINTER_PA_C -0.58955 0.05288 -11.15 <2e-16 ***
>> --- 
>> Signif. codes: 0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1
>> 
>> Exactly the same as in the png.
>> 
>> Peter already mentioned this as a possible reason for the discrepancy:
>>https://stat.ethz.ch/pipermail/r-devel/2017-May/074191.html ("Is it
>>perhaps the case that x1 and x2 have already been scaled to have
>>standard deviation 1? In that case, x1*x2 won't be.")
>> 
>> Best, 
>> Wolfgang

Nick Brown

2017-May-05 23:49 UTC

head link

[Rd] lm() gives different results to lm.ridge() and SPSS

Hi John, 

Thanks for the comment... but that appears to mean that SPSS has a big problem.
I have always been told that to include an interaction term in a regression, the
only way is to do the multiplication by hand. But then it seems to be impossible
to stop SPSS from re-standardizing the variable that corresponds to the
interaction term. Am I missing something? Is there a way to perform the
regression with the interaction in SPSS without computing the interaction as a
separate variable?

Best, 
Nick 

----- Original Message -----

From: "John Fox" <jfox at mcmaster.ca> 
To: "Nick Brown" <nick.brown at free.fr>, "peter
dalgaard" <pdalgd at gmail.com>
Cc: r-devel at r-project.org 
Sent: Friday, 5 May, 2017 8:22:53 PM 
Subject: Re: [Rd] lm() gives different results to lm.ridge() and SPSS 

Dear Nick, 


On 2017-05-05, 9:40 AM, "R-devel on behalf of Nick Brown" 
<r-devel-bounces at r-project.org on behalf of nick.brown at free.fr>
wrote:
>>I conjecture that something in the vicinity of 
>> res <- lm(DEPRESSION ~ scale(ZMEAN_PA) + scale(ZDIVERSITY_PA) + 
>>scale(ZMEAN_PA * ZDIVERSITY_PA), data=dat) 
>>summary(res) 
>> would reproduce the SPSS Beta values. 
> 
>Yes, that works. Thanks! 
That you have to work hard in R to match the SPSS results isn?t such a bad 
thing when you factor in the observation that standardizing the 
interaction regressor, ZMEAN_PA * ZDIVERSITY_PA, separately from each of 
its components, ZMEAN_PA and ZDIVERSITY_PA, is nonsense. 

Best, 
John 

------------------------------------- 
John Fox, Professor 
McMaster University 
Hamilton, Ontario, Canada 
Web: http://socserv.mcmaster.ca/jfox/ 

> 
> 
>----- Original Message ----- 
> 
>From: "peter dalgaard" <pdalgd at gmail.com> 
>To: "Viechtbauer Wolfgang (SP)" 
><wolfgang.viechtbauer at maastrichtuniversity.nl>, "Nick
Brown"
><nick.brown at free.fr> 
>Cc: r-devel at r-project.org 
>Sent: Friday, 5 May, 2017 3:33:29 PM 
>Subject: Re: [Rd] lm() gives different results to lm.ridge() and SPSS 
> 
>Thanks, I was getting to try this, but got side tracked by actual work...
> 
>Your analysis reproduces the SPSS unscaled estimates. It still remains to
>figure out how Nick got 
> 
>> 
>coefficients(lm(ZDEPRESSION ~ ZMEAN_PA * ZDIVERSITY_PA, data=s1)) 
> 
>(Intercept) ZMEAN_PA ZDIVERSITY_PA ZMEAN_PA:ZDIVERSITY_PA 
>0.07342198 -0.39650356 -0.36569488 -0.09435788 
> 
> 
>which does not match your output. I suspect that ZMEAN_PA and 
>ZDIVERSITY_PA were scaled for this analysis (but the interaction term 
>still obviously is not). I conjecture that something in the vicinity of 
> 
>res <- lm(DEPRESSION ~ scale(ZMEAN_PA) + scale(ZDIVERSITY_PA) + 
>scale(ZMEAN_PA * ZDIVERSITY_PA), data=dat) 
>summary(res) 
> 
>would reproduce the SPSS Beta values. 
> 
> 
>> On 5 May 2017, at 14:43 , Viechtbauer Wolfgang (SP) 
>><wolfgang.viechtbauer at maastrichtuniversity.nl> wrote: 
>> 
>> I had no problems running regression models in SPSS and R that yielded 
>>the same results for these data. 
>> 
>> The difference you are observing is from fitting different models. In 
>>R, you fitted: 
>> 
>> res <- lm(DEPRESSION ~ ZMEAN_PA * ZDIVERSITY_PA, data=dat) 
>> summary(res) 
>> 
>> The interaction term is the product of ZMEAN_PA and ZDIVERSITY_PA. This
>>is not a standardized variable itself and not the same as
"ZINTER_PA_C"
>>in the png you showed, which is not a variable in the dataset, but can 
>>be created with: 
>> 
>> dat$ZINTER_PA_C <- with(dat, scale(ZMEAN_PA * ZDIVERSITY_PA)) 
>> 
>> If you want the same results as in SPSS, then you need to fit: 
>> 
>> res <- lm(DEPRESSION ~ ZMEAN_PA + ZDIVERSITY_PA + ZINTER_PA_C, 
>>data=dat) 
>> summary(res) 
>> 
>> This yields: 
>> 
>> Coefficients: 
>> Estimate Std. Error t value Pr(>|t|) 
>> (Intercept) 6.41041 0.01722 372.21 <2e-16 *** 
>> ZMEAN_PA -1.62726 0.04200 -38.74 <2e-16 *** 
>> ZDIVERSITY_PA -1.50082 0.07447 -20.15 <2e-16 *** 
>> ZINTER_PA_C -0.58955 0.05288 -11.15 <2e-16 *** 
>> --- 
>> Signif. codes: 0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1 
>> 
>> Exactly the same as in the png. 
>> 
>> Peter already mentioned this as a possible reason for the discrepancy: 
>>https://stat.ethz.ch/pipermail/r-devel/2017-May/074191.html ("Is it
>>perhaps the case that x1 and x2 have already been scaled to have 
>>standard deviation 1? In that case, x1*x2 won't be.") 
>> 
>> Best, 
>> Wolfgang 
>> 
>> -----Original Message----- 
>> From: R-devel [mailto:r-devel-bounces at r-project.org] On Behalf Of
Nick
>>Brown 
>> Sent: Friday, May 05, 2017 10:40 
>> To: peter dalgaard 
>> Cc: r-devel at r-project.org 
>> Subject: Re: [Rd] lm() gives different results to lm.ridge() and SPSS 
>> 
>> Hi, 
>> 
>> Here is (I hope) all the relevant output from R. 
>> 
>>> mean(s1$ZDEPRESSION, na.rm=T) [1] -1.041546e-16 > 
>>>mean(s1$ZDIVERSITY_PA, na.rm=T) [1] -9.660583e-16 >
mean(s1$ZMEAN_PA,
>>>na.rm=T) [1] -5.430282e-15 > lm.ridge(ZDEPRESSION ~ ZMEAN_PA * 
>>>ZDIVERSITY_PA, data=s1)$coef ZMEAN_PA ZDIVERSITY_PA 
>>>ZMEAN_PA:ZDIVERSITY_PA 
>> -0.3962254 -0.3636026 -0.1425772 ## This is what I thought was the 
>>problem originally. :-) 
>> 
>> 
>>> coefficients(lm(ZDEPRESSION ~ ZMEAN_PA * ZDIVERSITY_PA, data=s1)) 
>>>(Intercept) ZMEAN_PA ZDIVERSITY_PA ZMEAN_PA:ZDIVERSITY_PA 
>> 0.07342198 -0.39650356 -0.36569488 -0.09435788 > 
>>coefficients(lm.ridge(ZDEPRESSION ~ ZMEAN_PA * ZDIVERSITY_PA, data=s1))
>>ZMEAN_PA ZDIVERSITY_PA ZMEAN_PA:ZDIVERSITY_PA 
>> 0.07342198 -0.39650356 -0.36569488 -0.09435788 The equivalent from SPSS
>>is attached. The unstandardized coefficients in SPSS look nothing like 
>>those in R. The standardized coefficients in SPSS match the 
>>lm.ridge()$coef numbers very closely indeed, suggesting that the same 
>>algorithm may be in use. 
>> 
>> I have put the dataset file, which is the untouched original I received
>>from the authors, in this Dropbox folder: 
>>https://www.dropbox.com/sh/xsebjy55ius1ysb/AADwYUyV1bl6-iAw7ACuF1_La?dl=0
>>. You can read it into R with this code (one variable needs to be 
>>standardized and centered; everything else is already in the file): 
>> 
>> s1 <- read.csv("Emodiversity_Study1.csv",
stringsAsFactors=FALSE)
>>s1$ZDEPRESSION <- scale(s1$DEPRESSION) 
>> Hey, maybe R is fine and I've stumbled on a bug in SPSS? If so,
I'm
>>sure IBM will want to fix it quickly (ha ha ha). 
>> 
>> Nick 
>> 
>> ----- Original Message ----- 
>> 
>> From: "peter dalgaard" <pdalgd at gmail.com> 
>> To: "Nick Brown" <nick.brown at free.fr> 
>> Cc: "Simon Bonner" <sbonner6 at uwo.ca>, r-devel at
r-project.org
>> Sent: Friday, 5 May, 2017 10:02:10 AM 
>> Subject: Re: [Rd] lm() gives different results to lm.ridge() and SPSS 
>> 
>> I asked you before, but in case you missed it: Are you looking at the 
>>right place in SPSS output? 
>> 
>> The UNstandardized coefficients should be comparable to R, i.e. the
"B"
>>column, not "Beta". 
>> 
>> -pd 
>> 
>>> On 5 May 2017, at 01:58 , Nick Brown <nick.brown at free.fr>
wrote:
>>> 
>>> Hi Simon, 
>>> 
>>> Yes, if I uses coefficients() I get the same results for lm() and 
>>>lm.ridge(). So that's consistent, at least. 
>>> 
>>> Interestingly, the "wrong" number I get from
lm.ridge()$coef agrees
>>>with the value from SPSS to 5dp, which is an interesting coincidence
if
>>>these numbers have no particular external meaning in lm.ridge(). 
>>> 
>>> Kind regards, 
>>> Nick 
>>> 
>>> ----- Original Message ----- 
>>> 
>>> From: "Simon Bonner" <sbonner6 at uwo.ca> 
>>> To: "Nick Brown" <nick.brown at free.fr>, r-devel
at r-project.org
>>> Sent: Thursday, 4 May, 2017 7:07:33 PM 
>>> Subject: RE: [Rd] lm() gives different results to lm.ridge() and
SPSS
>>> 
>>> Hi Nick, 
>>> 
>>> I think that the problem here is your use of $coef to extract the 
>>>coefficients of the ridge regression. The help for lm.ridge states
that
>>>coef is a "matrix of coefficients, one row for each value of
lambda.
>>>Note that these are not on the original scale and are for use by the
>>>coef method." 
>>> 
>>> I ran a small test with simulated data, code is copied below, and 
>>>indeed the output from lm.ridge differs depending on whether the 
>>>coefficients are accessed via $coef or via the coefficients()
function.
>>>The latter does produce results that match the output from lm. 
>>> 
>>> I hope that helps. 
>>> 
>>> Cheers, 
>>> 
>>> Simon 
>>> 
>>> ## Load packages 
>>> library(MASS) 
>>> 
>>> ## Set seed 
>>> set.seed(8888) 
>>> 
>>> ## Set parameters 
>>> n <- 100 
>>> beta <- c(1,0,1) 
>>> sigma <- .5 
>>> rho <- .75 
>>> 
>>> ## Simulate correlated covariates 
>>> Sigma <- matrix(c(1,rho,rho,1),ncol=2) 
>>> X <- mvrnorm(n,c(0,0),Sigma=Sigma) 
>>> 
>>> ## Simulate data 
>>> mu <- beta[1] + X %*% beta[-1] 
>>> y <- rnorm(n,mu,sigma) 
>>> 
>>> ## Fit model with lm() 
>>> fit1 <- lm(y ~ X) 
>>> 
>>> ## Fit model with lm.ridge() 
>>> fit2 <- lm.ridge(y ~ X) 
>>> 
>>> ## Compare coefficients 
>>> cbind(fit1$coefficients,c(NA,fit2$coef),coefficients(fit2)) 
>>> 
>>> [,1] [,2] [,3] 
>>> (Intercept) 0.99276001 NA 0.99276001 
>>> X1 -0.03980772 -0.04282391 -0.03980772 
>>> X2 1.11167179 1.06200476 1.11167179 
>>> 
>>> -- 
>>> 
>>> Simon Bonner 
>>> Assistant Professor of Environmetrics/ Director MMASc 
>>> Department of Statistical and Actuarial Sciences/Department of
Biology
>>> University of Western Ontario 
>>> 
>>> Office: Western Science Centre rm 276 
>>> 
>>> Email: sbonner6 at uwo.ca | Telephone: 519-661-2111 x88205 | Fax: 
>>>519-661-3813 
>>> Twitter: @bonnerstatslab | Website: 
>>>http://simon.bonners.ca/bonner-lab/wpblog/ 
>>> 
>>>> -----Original Message----- 
>>>> From: R-devel [mailto:r-devel-bounces at r-project.org] On
Behalf Of
>>>>Nick 
>>>> Brown 
>>>> Sent: May 4, 2017 10:29 AM 
>>>> To: r-devel at r-project.org 
>>>> Subject: [Rd] lm() gives different results to lm.ridge() and
SPSS
>>>> 
>>>> Hallo, 
>>>> 
>>>> I hope I am posting to the right place. I was advised to try
this
>>>>list by Ben Bolker 
>>>> (https://twitter.com/bolkerb/status/859909918446497795). I also
>>>>posted this 
>>>> question to StackOverflow 
>>>> 
>>>>(http://stackoverflow.com/questions/43771269/lm-gives-different-results
>>>>- 
>>>> from-lm-ridgelambda-0). I am a relative newcomer to R, but I
wrote my
>>>>first 
>>>> program in 1975 and have been paid to program in about 15
different
>>>> languages, so I have some general background knowledge. 
>>>> 
>>>> I have a regression from which I extract the coefficients like
this:
>>>> lm(y ~ x1 * x2, data=ds)$coef 
>>>> That gives: x1=0.40, x2=0.37, x1*x2=0.09 
>>>> 
>>>> When I do the same regression in SPSS, I get: 
>>>> beta(x1)=0.40, beta(x2)=0.37, beta(x1*x2)=0.14. 
>>>> So the main effects are in agreement, but there is quite a
difference
>>>>in the 
>>>> coefficient for the interaction. 
>>>> 
>>>> X1 and X2 are correlated about .75 (yes, yes, I know - this
model
>>>>wasn't my 
>>>> idea, but it got published), so there is quite possibly
something
>>>>going on with 
>>>> collinearity. So I thought I'd try lm.ridge() to see if I
can get an
>>>>idea of where 
>>>> the problems are occurring. 
>>>> 
>>>> The starting point is to run lm.ridge() with lambda=0 (i.e., no
ridge
>>>>penalty) and 
>>>> check we get the same results as with lm(): 
>>>> lm.ridge(y ~ x1 * x2, lambda=0, data=ds)$coef 
>>>> x1=0.40, x2=0.37, x1*x2=0.14 
>>>> So lm.ridge() agrees with SPSS, but not with lm(). (Of course, 
>>>>lambda=0 is the 
>>>> default, so it can be omitted; I can alternate between
including or
>>>>deleting 
>>>> ".ridge" in the function call, and watch the
coefficient for the
>>>>interaction 
>>>> change.) 
>>>> 
>>>> What seems slightly strange to me here is that I assumed that 
>>>>lm.ridge() just 
>>>> piggybacks on lm() anyway, so in the specific case where
lambda=0 and
>>>>there 
>>>> is no "ridging" to do, I'd expect exactly the
same results.
>>>> 
>>>> Unfortunately there are 34,000 cases in the dataset, so a
"minimal"
>>>>reprex will 
>>>> not be easy to make, but I can share the data via Dropbox or 
>>>>something if that 
>>>> would help. 
>>>> 
>>>> I appreciate that when there is strong collinearity then all
bets are
>>>>off in terms 
>>>> of what the betas mean, but I would really expect lm() and
lm.ridge()
>>>>to give 
>>>> the same results. (I would be happy to ignore SPSS, but for the
>>>>moment it's 
>>>> part of the majority!) 
>>>> 
>>>> Thanks for reading, 
>>>> Nick 
>> ______________________________________________ 
>> R-devel at r-project.org mailing list 
>> https://stat.ethz.ch/mailman/listinfo/r-devel 
> 
>-- 
>Peter Dalgaard, Professor, 
>Center for Statistics, Copenhagen Business School 
>Solbjerg Plads 3, 2000 Frederiksberg, Denmark 
>Phone: (+45)38153501 
>Office: A 4.23 
>Email: pd.mes at cbs.dk Priv: PDalgd at gmail.com 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> [[alternative HTML version deleted]] 
> 
>______________________________________________ 
>R-devel at r-project.org mailing list 
>https://stat.ethz.ch/mailman/listinfo/r-devel 


	[[alternative HTML version deleted]]

Apparently Analagous Threads

Search for more reasonably related threads

R devel - May 2017 - lm() gives different results to lm.ridge() and SPSS

[Rd] lm() gives different results to lm.ridge() and SPSS

[Rd] lm() gives different results to lm.ridge() and SPSS

[Rd] lm() gives different results to lm.ridge() and SPSS

[Rd] lm() gives different results to lm.ridge() and SPSS

[Rd] lm() gives different results to lm.ridge() and SPSS

[Rd] lm() gives different results to lm.ridge() and SPSS

[Rd] lm() gives different results to lm.ridge() and SPSS

Apparently Analagous Threads