Fischer, Felix
2013-Jan-29 13:26 UTC
[R] starting values in glm(..., family = binomial(link = log))
Dear R-helpers,

I have a problem with a glm model. I am trying to fit models with the log link instead of the logit. In some cases glm fails to estimate these models and suggests supplying starting values. However, when I set start = coef(logistic_model) in the function call, glm still says it cannot find starting values. This seems to be more of a problem when I include a continuous predictor in the model (age instead of group). A minimal example is below. Do I need to set other or better starting values? I would greatly appreciate any hints!

Best,
Felix

x = structure(list(Alter = c(28, 72, 48, 53, 49, 56, 47, 20, 72, 26, 28, 28, 25, 63, 42, 23, 68, 63, 44, 23, 23, 47, 30, 22, 21, 30, 26, 47, 40, 43, 23, 78, 29, 20, 49, 70, 24, 49, 43, 49, 68, 50, 42, 27, 70, 68, 46, 42, 40, 44, 48, 24, 23, 24, 56, 60, 66, 40, 71, 45, 37, 71, 41, 53, 48, 34, 52, 26, 76, 46, 65, 69, 75, 59, 30, 54, 69, 46, 50, 62, 38, 34, 30, 29, 73, 20, 57, 64, 40, 28, 21, 36, 65, 22, 69, 24, 38, 61, 70, 47, 61, 20, 58, 29, 35, 23, 29, 22, 21, 56, 37, 79, 27, 25, 75, 64, 22, 48, 36, 24, 44, 38, 23, 54, 76, 43, 30, 47, 48, 23, 68, 28, 44, 54, 43, 35, 47, 49, 44, 53, 26, 24, 56, 34, 39, 67, 74, 49, 55, 39, 58, 69, 46, 56, 69, 69, 26, 58, 41, 46, 40, 49, 24, 29, 24, 71, 41, 61, 27, 25, 38, 56, 26, 53, 39, 77, 40, 53, 61, 61, 54, 62, 28, 71, 42, 67, 44, 20, 40, 27, 27, 22, 71, 24, 31, 63, 24, 22, 30, 42, 43, 23, 46, 49, 21, 25, 30, 64, 29, 52, 29, 50, 57, 50, 53, 50, 34, 58, 42, 35, 50, 35, 35, 63, 42, 37, 64, 34, 56, 70, 48, 23, 43, 26, 52, 24, 31, 27, 34, 23, 44, 51, 41, 69, 47, 37, 68, 42, 28, 25), Arthrose = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 1L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 2L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 2L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 1L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 2L, 2L, 1L, 1L, 1L, 2L, 2L, 1L, 1L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 1L), .Label = c("nicht erkrankt", "erkrankt"), class = "factor"), Gruppe = c(2, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2)), .Names = c("Alter", "Arthrose", "Gruppe" ), row.names = c(6L, 8L, 49L, 53L, 54L, 84L, 87L, 88L, 110L, 139L, 145L, 156L, 167L, 176L, 177L, 178L, 189L, 193L, 216L, 237L, 245L, 269L, 272L, 280L, 303L, 314L, 315L, 326L, 338L, 344L, 345L, 352L, 365L, 366L, 377L, 393L, 401L, 404L, 409L, 439L, 469L, 505L, 510L, 544L, 552L, 559L, 561L, 586L, 597L, 598L, 601L, 607L, 611L, 630L, 650L, 663L, 672L, 673L, 689L, 690L, 719L, 747L, 794L, 809L, 818L, 819L, 840L, 869L, 878L, 886L, 905L, 913L, 915L, 924L, 937L, 955L, 963L, 970L, 978L, 985L, 997L, 
1005L, 1021L, 1022L, 1033L, 1040L, 1041L, 1043L, 1066L, 1068L, 1084L, 1099L, 1112L, 1113L, 1125L, 1134L, 1154L, 1155L, 1166L, 1171L, 1195L, 1208L, 1216L, 1217L, 1229L, 1230L, 1236L, 1242L, 1252L, 1288L, 1308L, 1360L, 1365L, 1371L, 1383L, 1384L, 1402L, 1406L, 1412L, 1413L, 1438L, 1448L, 1451L, 1455L, 1459L, 1478L, 1483L, 1492L, 1508L, 1511L, 1519L, 1531L, 1554L, 1569L, 1573L, 1590L, 1615L, 1629L, 1649L, 1651L, 1654L, 1660L, 1661L, 1674L, 1684L, 1687L, 1690L, 1696L, 1724L, 1730L, 1767L, 1775L, 1779L, 1780L, 1800L, 1801L, 1829L, 1837L, 1848L, 1884L, 1909L, 1916L, 1933L, 1934L, 1952L, 1970L, 1991L, 2021L, 2024L, 2029L, 2040L, 2060L, 2095L, 2112L, 2115L, 2122L, 2131L, 2145L, 2150L, 2173L, 2188L, 2189L, 2193L, 2197L, 2240L, 2251L, 2252L, 2264L, 2266L, 2277L, 2313L, 2315L, 2318L, 2324L, 2331L, 2336L, 2344L, 2345L, 2357L, 2366L, 2384L, 2392L, 2413L, 2422L, 2453L, 2474L, 2477L, 2480L, 2484L, 2499L, 2502L, 2518L, 2548L, 2551L, 2565L, 2575L, 2584L, 2607L, 2608L, 2617L, 2620L, 2644L, 2653L, 2654L, 2655L, 2667L, 2669L, 2672L, 2686L, 2697L, 2733L, 2739L, 2742L, 2750L, 2764L, 2774L, 2783L, 2787L, 2807L, 2817L, 2841L, 2847L, 2850L, 2852L, 2860L, 2863L, 2889L, 2908L, 2917L, 2924L), class = "data.frame")

Group_logit_model = glm(data = x, Arthrose ~ Gruppe, family=binomial(link = logit))
Group_log_model = glm(data = x, Arthrose ~ Gruppe, family=binomial(link = log))
Age_logit_model = glm(data = x, Arthrose ~ Alter, family=binomial(link = logit))
Age_log_model = glm(data = x, Arthrose ~ Alter, family=binomial(link = log))
Age_log_model_start = glm(data = x, start = coef(Age_logit_model), Arthrose ~ Alter, family=binomial(link = log))

Dr. rer. nat. Felix Fischer
Diplom-Psychologe
Institut für Sozialmedizin, Epidemiologie und Gesundheitsökonomie
Charité - Universitätsmedizin Berlin
Luisenstrasse 57
10117 Berlin
Tel: 030 450 529 104
Fax: 030 450 529 902
http://epidemiologie.charite.de
Ben Bolker
2013-Jan-29 15:27 UTC
[R] starting values in glm(..., family = binomial(link = log))
Fischer, Felix <Felix.Fischer <at> charite.de> writes:

> Dear R-helpers,
> I have a problem with a glm model. I am trying to fit models with
> the log as link function instead of the logit. However, in some
> cases glm fails to estimate those models and suggests to give start
> values. However, when I set start = coef(logistic_model) within the
> function call, glm still says it cannot find starting values? This
> seems to be more of a problem when I include a continuous predictor
> in the model (age instead of group). Find below a minimal example.

[Sorry for snipping context: gmane doesn't like it]

Group_logit_model = glm(data = x, Arthrose ~ Gruppe, family=binomial(link = logit))
Group_log_model = glm(data = x, Arthrose ~ Gruppe, family=binomial(link = log))
Age_logit_model = glm(data = x, Arthrose ~ Alter, family=binomial(link = logit))
Age_log_model = glm(data = x, Arthrose ~ Alter, family=binomial(link = log), start=c(coef(Group_log_model)[1],0))

Using the intercept from Group_log_model combined with 0 for the slope on the log scale appears to work. This makes more sense than starting from the results of a logit fit (as you tried), because those parameters are on a different scale: logit coefficients describe log-odds, whereas log-link coefficients describe log-probabilities. Another possibility for the starting intercept value would be the coefficient of a null model with a log link:

Null_log_model = glm(data = x, Arthrose ~ 1, family=binomial(link = log))
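
To make the null-model suggestion concrete, here is a minimal sketch of how it could be carried through. It runs on simulated data rather than the data frame x from the original post: the data frame sim, its coefficients, and the helper valid_log_start() are hypothetical names introduced only for illustration. The helper mirrors the condition the binomial family imposes on fitted means (strictly between 0 and 1), which under a log link means every linear predictor must be negative.

```r
# Simulated stand-in for the poster's data (an assumption, NOT the original x):
# same column names, with the event probability rising with age on the log scale.
set.seed(1)
n <- 250
sim <- data.frame(Alter = sample(20:79, n, replace = TRUE))
sim$Arthrose <- rbinom(n, 1, exp(-4 + 0.03 * sim$Alter))

# Hypothetical helper: do these starting values give fitted probabilities
# strictly inside (0, 1) under a log link, i.e. mu = exp(X %*% start)?
valid_log_start <- function(formula, data, start) {
  X  <- model.matrix(formula, data)
  mu <- exp(drop(X %*% start))
  all(is.finite(mu)) && all(mu > 0 & mu < 1)
}

# Null model with log link: its intercept is the log of the overall prevalence.
null_log <- glm(Arthrose ~ 1, data = sim, family = binomial(link = "log"))

# Start at (null intercept, 0): every observation begins at the overall
# prevalence, which is a valid probability as long as both outcomes occur.
start_null <- c(unname(coef(null_log)), 0)
print(valid_log_start(Arthrose ~ Alter, sim, start_null))

age_log <- glm(Arthrose ~ Alter, data = sim, family = binomial(link = "log"),
               start = start_null)
print(coef(age_log))

# For comparison, check the logit coefficients as log-link starting values.
# Whether they pass depends on the data; they are on the log-odds scale.
logit_fit <- glm(Arthrose ~ Alter, data = sim, family = binomial(link = "logit"))
print(valid_log_start(Arthrose ~ Alter, sim, coef(logit_fit)))
```

Running the same check on the real data, with x in place of sim, should show whether coef(Age_logit_model) yields any fitted probability of 1 or more, which would be one reason for glm to reject it as a starting point.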