Fischer, Felix
2013-Jan-29 13:26 UTC
[R] starting values in glm(..., family = binomial(link = log))
Dear R-helpers,
I have a problem with a glm model. I am trying to fit models with the log as
link function instead of the logit. However, in some cases glm fails to estimate
those models and suggests supplying starting values. When I set start =
coef(logistic_model) in the function call, glm still says it cannot find
starting values. This seems to be more of a problem when I include a continuous
predictor in the model (age instead of group). A minimal example is below.
Do I need to set other/better starting values? I would greatly appreciate any
hints!
Best, Felix
x = structure(list(Alter = c(28, 72, 48, 53, 49, 56, 47, 20, 72,
26, 28, 28, 25, 63, 42, 23, 68, 63, 44, 23, 23, 47,
30, 22, 21,
30, 26, 47, 40, 43, 23, 78, 29, 20, 49, 70, 24, 49,
43, 49, 68,
50, 42, 27, 70, 68, 46, 42, 40, 44, 48, 24, 23, 24,
56, 60, 66,
40, 71, 45, 37, 71, 41, 53, 48, 34, 52, 26, 76, 46,
65, 69, 75,
59, 30, 54, 69, 46, 50, 62, 38, 34, 30, 29, 73, 20,
57, 64, 40,
28, 21, 36, 65, 22, 69, 24, 38, 61, 70, 47, 61, 20,
58, 29, 35,
23, 29, 22, 21, 56, 37, 79, 27, 25, 75, 64, 22, 48,
36, 24, 44,
38, 23, 54, 76, 43, 30, 47, 48, 23, 68, 28, 44, 54,
43, 35, 47,
49, 44, 53, 26, 24, 56, 34, 39, 67, 74, 49, 55, 39,
58, 69, 46,
56, 69, 69, 26, 58, 41, 46, 40, 49, 24, 29, 24, 71,
41, 61, 27,
25, 38, 56, 26, 53, 39, 77, 40, 53, 61, 61, 54, 62,
28, 71, 42,
67, 44, 20, 40, 27, 27, 22, 71, 24, 31, 63, 24, 22,
30, 42, 43,
23, 46, 49, 21, 25, 30, 64, 29, 52, 29, 50, 57, 50,
53, 50, 34,
58, 42, 35, 50, 35, 35, 63, 42, 37, 64, 34, 56, 70,
48, 23, 43,
26, 52, 24, 31, 27, 34, 23, 44, 51, 41, 69, 47, 37,
68, 42, 28,
25), Arthrose = structure(c(1L, 1L, 1L, 1L, 1L, 1L,
2L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L,
1L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L,
2L, 1L, 1L, 1L, 2L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 2L,
1L, 1L, 1L, 1L, 2L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 1L,
1L, 1L, 1L, 1L, 1L, 2L,
1L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 2L, 1L,
1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 2L, 1L, 1L, 1L, 1L, 2L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L,
2L, 1L, 1L, 2L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 1L, 2L,
1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 2L, 1L, 2L, 2L, 1L, 1L, 1L,
2L, 2L, 1L, 1L, 2L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
2L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 2L, 1L, 1L, 1L, 1L, 1L, 2L, 1L,
1L, 1L, 1L, 1L, 1L, 1L,
2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 2L, 1L, 1L, 1L, 1L, 1L, 2L, 1L,
1L), .Label =
c("nicht erkrankt", "erkrankt"), class =
"factor"),
Gruppe = c(2, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1,
1, 1, 2, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
2, 2,
2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
2, 2,
2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
2, 2,
2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
2, 2,
2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
2, 2,
2, 2, 2, 2, 2, 2)), .Names = c("Alter",
"Arthrose", "Gruppe"
), row.names = c(6L, 8L, 49L, 53L, 54L, 84L, 87L,
88L, 110L,
139L, 145L, 156L, 167L, 176L,
177L, 178L, 189L, 193L, 216L, 237L,
245L, 269L, 272L, 280L, 303L,
314L, 315L, 326L, 338L, 344L, 345L,
352L, 365L, 366L, 377L, 393L,
401L, 404L, 409L, 439L, 469L, 505L,
510L, 544L, 552L, 559L, 561L,
586L, 597L, 598L, 601L, 607L, 611L,
630L, 650L, 663L, 672L, 673L,
689L, 690L, 719L, 747L, 794L, 809L,
818L, 819L, 840L, 869L, 878L,
886L, 905L, 913L, 915L, 924L, 937L,
955L, 963L, 970L, 978L, 985L,
997L, 1005L, 1021L, 1022L, 1033L,
1040L, 1041L, 1043L, 1066L,
1068L, 1084L, 1099L, 1112L, 1113L,
1125L, 1134L, 1154L, 1155L,
1166L, 1171L, 1195L, 1208L, 1216L,
1217L, 1229L, 1230L, 1236L,
1242L, 1252L, 1288L, 1308L, 1360L,
1365L, 1371L, 1383L, 1384L,
1402L, 1406L, 1412L, 1413L, 1438L,
1448L, 1451L, 1455L, 1459L,
1478L, 1483L, 1492L, 1508L, 1511L,
1519L, 1531L, 1554L, 1569L,
1573L, 1590L, 1615L, 1629L, 1649L,
1651L, 1654L, 1660L, 1661L,
1674L, 1684L, 1687L, 1690L, 1696L,
1724L, 1730L, 1767L, 1775L,
1779L, 1780L, 1800L, 1801L, 1829L,
1837L, 1848L, 1884L, 1909L,
1916L, 1933L, 1934L, 1952L, 1970L,
1991L, 2021L, 2024L, 2029L,
2040L, 2060L, 2095L, 2112L, 2115L,
2122L, 2131L, 2145L, 2150L,
2173L, 2188L, 2189L, 2193L, 2197L,
2240L, 2251L, 2252L, 2264L,
2266L, 2277L, 2313L, 2315L, 2318L,
2324L, 2331L, 2336L, 2344L,
2345L, 2357L, 2366L, 2384L, 2392L,
2413L, 2422L, 2453L, 2474L,
2477L, 2480L, 2484L, 2499L, 2502L,
2518L, 2548L, 2551L, 2565L,
2575L, 2584L, 2607L, 2608L, 2617L,
2620L, 2644L, 2653L, 2654L,
2655L, 2667L, 2669L, 2672L, 2686L,
2697L, 2733L, 2739L, 2742L,
2750L, 2764L, 2774L, 2783L, 2787L,
2807L, 2817L, 2841L, 2847L,
2850L, 2852L, 2860L, 2863L, 2889L,
2908L, 2917L, 2924L), class =
"data.frame")
Group_logit_model = glm(data = x, Arthrose ~ Gruppe, family=binomial(link =
logit))
Group_log_model = glm(data = x, Arthrose ~ Gruppe, family=binomial(link = log))
Age_logit_model = glm(data = x, Arthrose ~ Alter, family=binomial(link = logit))
Age_log_model = glm(data = x, Arthrose ~ Alter, family=binomial(link = log))
Age_log_model_start = glm(data = x, start = coef(Age_logit_model), Arthrose ~
Alter, family=binomial(link = log))
Dr. rer. nat. Felix Fischer
Diplom-Psychologe
Institut für Sozialmedizin, Epidemiologie und Gesundheitsökonomie
Charité - Universitätsmedizin Berlin
Luisenstrasse 57
10117 Berlin
Tel: 030 450 529 104
Fax: 030 450 529 902
http://epidemiologie.charite.de
Ben Bolker
2013-Jan-29 15:27 UTC
[R] starting values in glm(..., family = binomial(link = log))
Fischer, Felix <Felix.Fischer <at> charite.de> writes:

> Dear R-helpers,
> i have a problem with a glm-model. I am trying to fit models with
> the log as link function instead of the logit. However, in some
> cases glm fails to estimate those models and suggests to give start
> values. However, when I set start = coef(logistic_model) within the
> function call, glm still says it cannot find starting values? This
> seems to be more of a problem, when I include a continuous predictor
> in the model (age instead of group). find below a minimal example.

[Sorry for snipping context: gmane doesn't like it]

Group_logit_model = glm(data = x, Arthrose ~ Gruppe, family=binomial(link = logit))
Group_log_model = glm(data = x, Arthrose ~ Gruppe, family=binomial(link = log))
Age_logit_model = glm(data = x, Arthrose ~ Alter, family=binomial(link = logit))
Age_log_model = glm(data = x, Arthrose ~ Alter, family=binomial(link = log),
    start=c(coef(Group_log_model)[1],0))

Using the intercept from Group_log_model combined with 0 for the log-slope
appears to work. It makes more sense to use this than to use the results from
a logit fit (as you tried), because those parameters would be on a different
scale.

Another possibility for the starting intercept value would be the coefficient
of a null model with a log-link:

Null_log_model = glm(data = x, Arthrose ~ 1, family=binomial(link = log))
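
For readers who want to try the null-model suggestion without the posted data, here is a minimal sketch on simulated data. Everything in it is an assumption for illustration: the data frame `sim`, the true coefficients, and the names `null_log`, `start_vals`, and `valid_log_start` are hypothetical and not part of the original posts. The helper rests on the fact that with a log link the fitted probability is exp(linear predictor), so starting values are only usable if every linear predictor is below 0. Coefficients taken from a logit fit can break that condition for older subjects, which is one plausible reason they are rejected.

```r
# Simulated stand-in for the poster's data (hypothetical values, not the real data set)
set.seed(42)
n        <- 500
Alter    <- round(runif(n, min = 20, max = 80))
p        <- exp(-4 + 0.03 * Alter)   # log-linear in age, probabilities well below 1
Arthrose <- rbinom(n, size = 1, prob = p)
sim      <- data.frame(Alter = Alter, Arthrose = Arthrose)

# Intercept-only model with a log link; its coefficient is close to log(mean(Arthrose))
null_log <- glm(Arthrose ~ 1, data = sim, family = binomial(link = "log"))

# Start from the null intercept and a zero slope on the log scale
start_vals <- c(coef(null_log)[1], 0)

# Hypothetical helper: TRUE if the starting values keep every fitted
# probability below 1, i.e. every linear predictor below 0
valid_log_start <- function(formula, data, start) {
  X   <- model.matrix(formula, data)
  eta <- drop(X %*% start)
  all(eta < 0)
}

valid_log_start(Arthrose ~ Alter, sim, start_vals)

# Log-binomial model with a continuous predictor, using those starting values
age_log <- glm(Arthrose ~ Alter, data = sim,
               family = binomial(link = "log"), start = start_vals)
coef(age_log)
```

The same helper can be pointed at any candidate start vector, for example `coef()` of a logit fit, to see whether it implies fitted probabilities of 1 or more before passing it to glm.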