thr3ads.net - R help - [R] glmmLasso with interactions errors [Jul 2016]

Walker Pedersen

2016-Jul-13 20:20 UTC

[R] glmmLasso with interactions errors

Hi Everyone,

I am having trouble running glmmLasso.

An abbreviated version of my dataset is here:

https://drive.google.com/open?id=0B_LliPDGUoZbVVFQS2VOV3hGN3c

Activity is a measure of brain activity, Novelty and Valence are
categorical variables coding the type of stimulus used to elicit the
response, ROI is a categorical variable coding three regions of the
brain that we have sampled this activity from, and STAIt is a
continuous measure representing degree of a specific personality trait
of the subjects. Subject is an ID number for the individuals the data
was sampled from.

Before glmmLasso I am running:

KNov$Subject <- factor(KNov$Subject)

to ensure the subject ID is not treated as a continuous variable.

If I run:

glm1 <- glmmLasso(Activity ~ as.factor(Novelty) + as.factor(Valence) +
                    STAIt + as.factor(ROI) +
                    as.factor(Valence):as.factor(ROI),
                  list(Subject = ~1), data = KNov, lambda = 10)
summary(glm1)

I don't get any warning messages, but the output contains b estimates
only, no SE or p-values.

If I try to include a 3-way interaction, such as:

glm2 <- glmmLasso(Activity ~ as.factor(Novelty) + as.factor(Valence) +
                    STAIt + as.factor(ROI) +
                    as.factor(Novelty):as.factor(Valence):as.factor(ROI),
                  list(Subject = ~1), data = Nov7T, lambda = 10)
summary(glm2)

I get the warnings:

Warning messages:
1: In split.default((1:ncol(X))[-inotpen.which], ipen) :
  data length is not a multiple of split variable
2: In lambda_vec * sqrt(block2) :
  longer object length is not a multiple of shorter object length

And again, I get parameter estimates but no SEs or p-values.
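
Both warnings are generic base-R recycling warnings: they appear whenever two
vectors of incompatible lengths are combined. The sketch below reproduces the
same messages with made-up vectors. It is only an illustration of the
mechanism, not glmmLasso's actual code, and every name in it (cols, group,
block_size, catch_warning) is hypothetical.

```r
# Generic illustration of the two base-R warnings; this is NOT glmmLasso's code.
# All vector names below are hypothetical.

# Return the warning message raised by an expression (NA if none is raised).
catch_warning <- function(expr) {
  tryCatch({ expr; NA_character_ }, warning = function(w) conditionMessage(w))
}

cols  <- 1:7                 # e.g. seven column indices of a design matrix
group <- c(1, 1, 2, 2, 3)    # only five group labels: the lengths disagree

# split() recycles the shorter grouping vector and warns about it.
w1 <- catch_warning(split(cols, group))

# Arithmetic on vectors of mismatched length recycles and warns as well.
lambda_vec <- rep(10, 3)
block_size <- c(2, 2, 2, 1)
w2 <- catch_warning(lambda_vec * sqrt(block_size))

print(w1)
print(w2)
```

If the same kind of mismatch is what happens inside the fitting routine, the
penalty would be applied to recycled groups, so estimates produced alongside
these warnings should probably be treated with caution.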

If I include my continuous variable in any interaction, such as:

glm3 <- glmmLasso(Activity ~ as.factor(Novelty) + as.factor(Valence) +
                    STAIt + as.factor(ROI) +
                    as.factor(Valence):as.factor(ROI) +
                    as.factor(Novelty):STAIt,
                  list(Subject = ~1), data = Nov7T, lambda = 10)
summary(glm3)

I get the error message:

Error in rep(control$index[i], length.fac) : invalid 'times' argument

and no output.
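
For reference, base R's rep() raises this error when its 'times' argument is
not a usable count, for example when it is NA. The sketch below shows that
generic behaviour only. It does not claim to identify what goes wrong inside
glmmLasso, and length_fac is a hypothetical stand-in name.

```r
# Generic base-R illustration; not taken from glmmLasso's source.
# `length_fac` is a hypothetical stand-in for a count that could not be computed.
length_fac <- NA_integer_

# rep() needs a non-negative, non-missing count, so this call fails.
msg <- tryCatch(
  rep(1L, length_fac),
  error = function(e) conditionMessage(e)
)
print(msg)
```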

If anyone has any input as to (1) why I am not getting SEs or p-values
in my output, (2) what the warnings I get when I include a three-way
interaction mean, whether they are something to worry about, and how to fix
them, and (3) how to fix the error I get when I include my continuous
variable in an interaction, I would be very appreciative.

Thanks!

Walker

Cade, Brian

2016-Jul-14 15:08 UTC


It has never been obvious to me that the lasso approach can handle
interactions among predictor variables well at all.  I'd be curious to see
what others think and what you learn.

Brian

Brian S. Cade, PhD

U. S. Geological Survey
Fort Collins Science Center
2150 Centre Ave., Bldg. C
Fort Collins, CO  80526-8818

email:  cadeb at usgs.gov <brian_cade at usgs.gov>
tel:  970 226-9326



Ben Bolker

2016-Jul-15 14:23 UTC


Cade, Brian <cadeb <at> usgs.gov> writes:
> 
> It has never been obvious to me that the lasso approach can handle
> interactions among predictor variables well at all.  I'd be curious to see
> what others think and what you learn.
> 
> Brian
> 
  For what it's worth I think lasso *does* handle interactions
reasonably (although I forget where I read that) -- there is a
newer "hierarchical lasso" that tries to deal with marginality
concerns more carefully.

  Related questions asked on StackOverflow:

http://stackoverflow.com/questions/37910042/glmmlasso-warning-messages/37922918#37922918

My answer (in comments) there was

my guess is that you're going to have to build your own model
matrix/dummy variables; I think that as.factor() in formulas is
treated specially, so including the interaction term will probably
just confuse it. (It would be worth trying as.factor(Novelty:ROI) - I
doubt it'll work but if it does it would be the easiest way forward.)
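
A minimal base-R sketch of the "single combined factor" idea, using made-up
data with the thread's column names. Note that `:` only builds an interaction
factor when both operands are already factors, and that interaction() extends
the same idea to three or more factors. Whether glmmLasso accepts the result
inside as.factor() is not confirmed here.

```r
# Toy data; the column names follow the thread, the values are made up.
toy <- data.frame(
  Novelty = factor(c("new", "old", "new", "old")),
  ROI     = factor(c("A", "A", "B", "B"))
)

# For two factors, `:` returns their interaction as a single factor.
toy$NovROI <- toy$Novelty:toy$ROI

# interaction() does the same job and also accepts three or more factors.
# sep and lex.order are set so that the levels match those produced by `:`.
toy$NovROI2 <- interaction(toy$Novelty, toy$ROI, sep = ":", lex.order = TRUE)

print(levels(toy$NovROI))
print(table(toy$NovROI))
```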


Walker Pedersen

2016-Jul-16 16:29 UTC


Thank you for the input Brian and Ben.

It is odd that it seems to handle a two-way interaction fine (as long
as the continuous variable is not in the mix), but not a three-way one.

In any case, would anyone be able to give me a rundown of how I would
create a matrix of dummy variables for these interactions to pass to
glmmLasso?
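
One way to build such dummy columns in base R is model.matrix(), which expands
factors and their interactions into numeric columns. The sketch below uses
made-up data with the thread's column names. The data frame toy and the object
dummies are hypothetical, and nothing here has been checked against glmmLasso
itself.

```r
# Toy data with the thread's column names; all values are made up.
set.seed(1)
toy <- expand.grid(
  Novelty = factor(c("new", "old")),
  Valence = factor(c("neg", "neu")),
  ROI     = factor(c("A", "B", "C"))
)
toy$STAIt <- rnorm(nrow(toy))

# Expand main effects and interactions into numeric design-matrix columns.
X <- model.matrix(~ Novelty + Valence + ROI + STAIt +
                    Novelty:Valence:ROI + Novelty:STAIt, data = toy)

# Drop the intercept column and give the rest syntactic names so that they
# can be used as ordinary numeric predictors in a formula.
X <- X[, colnames(X) != "(Intercept)", drop = FALSE]
colnames(X) <- make.names(colnames(X), unique = TRUE)
dummies <- as.data.frame(X)

print(names(dummies))
```

The columns of dummies could then be bound to the original data with cbind()
and named individually in the fixed-effects formula, so that no as.factor() or
`:` terms need to appear in it.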

Alternatively, is there a method for paring down a model that is a bit
less sketchy than simple backfitting, and that you would expect to be more
straightforward software-wise?

Thanks!

Walker

UW-MKE
