thr3ads.net - R help - [R] Problem extracting enough coefs from gam (mgcv package) [Apr 2012]


Martijn Wieling

2012-Apr-23 17:26 UTC

[R] Problem extracting enough coefs from gam (mgcv package)

Dear useRs,

I have been using the excellent mgcv package (version 1.7-12) to
create a generalized additive model (gam) including random effects -
represented with s(...,bs="re") - on the basis of dialect data.

My model contains two random-effect factors (Word and Key - the latter
representing a speaker) and I have added both random intercepts and
various random slopes for these random-effect factors. There is no
missing data in my dataset. When I try to extract the by-word random
intercepts from my model, using coef(model), I find 357 values, equal
to the number of words in my dataset. Using coef(model) I get
uninformative names: s(Word,1) until s(Word,357), but I'm assuming (I
might be wrong though?) that I can link the labels of the words to
these values by obtaining the 357 labels from the original dataset:
unique(dat[,c("Word")])

Unfortunately, I cannot use this procedure to label the by-word random
slopes, because I find a varying number of values for these (ranging
from 346 to 356) which is always less than 357. (The number of
by-speaker random slopes does equal the number of speakers, though.)

Does anybody i) have an idea why I obtain fewer by-word random slopes
than words, and/or ii) how I can link the random slopes which are
present to the correct labels of the words?

(I did not include the model as it is >300 MB in size, but let me know
if this is necessary.)

Any help would be greatly appreciated!

With kind regards,
Martijn Wieling
University of Groningen
http://www.martijnwieling.nl

Simon Wood

2012-Apr-24 08:50 UTC

Martijn,

It's a bit hard to know without seeing the full model structure, but 
it's possible that the issue is related to an undesirable side effect of 
the handling of identifiability constraints on smooth terms, prior to 
mgcv 1.7-13: the standard side constraint approach used for smooths 
could lead to unexpected constraints being applied to s(...,bs="re") 
terms in some cases.

So, could you send me the gam call that generates the problem, and 
perhaps try out if it still happens in 1.7-13?

best,
Simon


-- 
Simon Wood, Mathematical Science, University of Bath BA2 7AY UK
+44 (0)1225 386603               http://people.bath.ac.uk/sw283
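
One way to check whether a fitted model shows the symptom described above is to count how many coefficients each s(...,bs="re") term contributes and compare that with the number of factor levels. The following is only a minimal sketch on simulated data: the names toy, fit and n_coef are hypothetical and not from the thread, and it assumes the first.para, last.para and label fields of the smooth objects stored in fit$smooth.

```r
library(mgcv)

set.seed(42)
# Hypothetical toy data: 12 words and one numeric covariate
n <- 600
toy <- data.frame(
  Word = factor(sample(sprintf("word%02d", 1:12), n, replace = TRUE)),
  x    = rnorm(n)
)
toy$y <- 0.5 * toy$x + rnorm(12)[as.integer(toy$Word)] + rnorm(n)

# Random intercept and random slope per word, both as "re" terms
fit <- gam(y ~ x + s(Word, bs = "re") + s(Word, x, bs = "re"),
           data = toy, method = "REML")

# Number of coefficients each smooth term contributes to coef(fit)
n_coef <- sapply(fit$smooth, function(sm) sm$last.para - sm$first.para + 1)
names(n_coef) <- sapply(fit$smooth, function(sm) sm$label)

print(packageVersion("mgcv"))  # installed mgcv version
print(n_coef)                  # coefficients per term
print(nlevels(toy$Word))       # number of words
```

If a term reports fewer coefficients than nlevels() of its factor, some columns were dropped for that term, which is the situation reported in the original question.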

Martijn Wieling

2012-Apr-24 09:22 UTC

Hi Simon,

Thanks for your quick reply. I'm now running the model again with mgcv
1.7-13. This might take some time (half a day or so) as the dataset is
quite large (112,608 rows).
The call I've used was (I've simplified some variable names):

model = bam(LingDist ~ s(Lon,Lat) + VowelRatio + IsDem + WordLength +
SpBirthYear + IsAragon + SpBirthYear_IsAragon + PopCnt +
s(Word,bs="re") + s(Speaker,bs="re") +
s(Word,SpBirthYear,bs="re") +
s(Word,IsAragon,bs="re") + s(Word,PopCnt,bs="re") +
s(Speaker,VowelRatio,bs="re") + s(Speaker,IsDem,bs="re") +
s(Speaker,WordLength,bs="re") + s(Word,Tourism,bs="re") +
s(Word,PopAge,bs="re")+ s(Word,PopIncome,bs="re") +
s(Word,SpEdu,bs="re") +
s(Word,SpBirthYear_IsAragon,bs="re"),
data=dat)

I'll post the results w.r.t. the random slopes.

Is my procedure for assigning labels, when the number of slope estimates
equals the number of words, correct: rownames(slopes) =
unique(dat[,c("Word")])?

With kind regards,
Martijn
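
On the labelling question, a hedged note: as far as I understand it, a bs="re" term builds its columns with model.matrix, so for a factor the coefficients should come out in levels() order, whereas unique() returns values in order of first appearance in the data. The two orders need not agree. The sketch below uses simulated data; the names toy, fit, true_int, re_int and first_seen are hypothetical and not from the thread.

```r
library(mgcv)

set.seed(1)
# Hypothetical toy data: six words, deliberately not in alphabetical order
words <- c("pear", "apple", "fig", "cherry", "banana", "date")
n <- 480
toy <- data.frame(Word = factor(sample(words, n, replace = TRUE)),
                  x    = rnorm(n))

# Simulated by-word intercepts, stored in levels() order
true_int <- setNames(rnorm(nlevels(toy$Word)), levels(toy$Word))
toy$y <- unname(true_int[as.character(toy$Word)]) + 0.3 * toy$x +
  rnorm(n, sd = 0.5)

fit <- gam(y ~ x + s(Word, bs = "re"), data = toy, method = "REML")

# Extract the coefficients of the random-intercept term by position
sm <- fit$smooth[[1]]
re_int <- coef(fit)[sm$first.para:sm$last.para]

# Label with levels(), assuming coefficients follow the factor level order
names(re_int) <- levels(toy$Word)

# For comparison: unique() gives order of first appearance instead
first_seen <- as.character(unique(toy$Word))

print(round(re_int, 2))
print(first_seen)
```

Comparing the labelled estimates with true_int is a quick sanity check that the levels() labelling lines up. The same approach should apply to a slope term such as s(Word, x, bs="re") when it has one coefficient per level.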

