laura roncaglia
2016-Sep-20 10:04 UTC
[R] Run a fixed effect regression and a logit regression on a national survey that need to be "weighted"
I am a beginner R user. I am using a national survey to test which variables influence participation in complementary pensions (participation in a complementary pension is voluntary in my country).

Since the dependent variable is a dummy (1 if the person participates and 0 otherwise), I want to run a logit or probit regression. I also want to run a fixed-effects regression, since I subset the survey to keep only the individuals interviewed more than once.

The data frame is composed of several social and economic variables, and it also contains a variable "weight", which is the survey weight (weighting coefficients that adjust the sample results to the national data).

  family pers sex income pension
1     10    1   F  10000       1
2     20    1   F  20000       1
3     20    2   M  40000       0
4     30    1   M  25000       0
5     30    2   F  50000       0
6     40    1   M  60000       1

pers is the member of the family, and pension takes the value 1 if the person participates in a complementary pension. (This is a simplification of the original survey, which contains more variables and around 22k observations.)

I know how to use the plm and glm functions for a fixed-effects or logit regression; in this case I don't know what to do, since I need to take the survey weights into account.

I used the svydesign function to "weight" the data frame:

    df1 <- svydesign(ids=~1, data=df, weights=~dfweight)

I used ids=~1 because there isn't a "cluster" variable in the survey (I know that the towns are randomly selected and then individuals are randomly selected, but there isn't a variable that indicates the stratification).

At this point I am lost: I don't know whether it is right to use the survey package and, if so, which function to use to run the regression, or whether there is a way to use the plm or glm functions while taking the weights into account.

I searched hard for a solution on the web, but I would be glad if you could give me an answer.
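For readers following the question about which function to use once a design object exists: the survey package provides svyglm(), which fits a GLM against a design created by svydesign(). The following is a minimal sketch under stated assumptions, not the poster's actual analysis. The data are simulated, the column names (sex, income, pension, dfweight) only mimic the toy table in the post, and the weights are hypothetical. It covers the logit part only, not the fixed-effects part.

```r
# Sketch only: simulated data stand in for the real survey.
library(survey)

set.seed(1)
n <- 200
df <- data.frame(
  sex      = factor(sample(c("F", "M"), n, replace = TRUE)),
  income   = round(runif(n, 10000, 60000)),
  dfweight = runif(n, 0.5, 3)          # hypothetical survey weights
)
# Simulated 0/1 outcome that depends on income (assumption for illustration)
df$pension <- rbinom(n, 1, plogis(-1 + 0.03 * df$income / 1000))

# Same design call as in the post: no cluster variable, weights only
df1 <- svydesign(ids = ~1, data = df, weights = ~dfweight)

# Weighted logit; quasibinomial avoids the warning about
# non-integer numbers of successes
fit <- svyglm(pension ~ sex + I(income / 1000), design = df1,
              family = quasibinomial(link = "logit"))
summary(fit)
```

The standard errors reported by summary() here are design-based, which is the main reason to go through a design object rather than passing the weights straight to glm().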
Adams, Jean
2016-Sep-20 16:23 UTC
[R] Run a fixed effect regression and a logit regression on a national survey that need to be "weighted"
If you want your records to be weighted by the survey weights during the analysis, then use the weights= argument of the glm() function.

Jean
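One point worth checking with the weights= approach: glm() treats these as prior weights, so in a binomial model a weight of w behaves like w replicated observations. The sketch below, on simulated data with hypothetical integer weights, illustrates the consequence: multiplying all weights by 10 leaves the coefficients unchanged but shrinks the model-based standard errors by about sqrt(10). Sampling weights are usually not meant to carry that kind of information, so the standard errors may need separate treatment.

```r
# Sketch only: simulated data, hypothetical weights.
set.seed(42)
n <- 500
x <- rnorm(n)
y <- rbinom(n, 1, plogis(0.5 * x))
w <- sample(1:5, n, replace = TRUE)    # integer weights, so no warning

fit1  <- glm(y ~ x, family = binomial, weights = w)
fit10 <- glm(y ~ x, family = binomial, weights = 10 * w)

# Point estimates do not depend on the overall scale of the weights
coef(fit1)
coef(fit10)

# Model-based standard errors do depend on it
se1  <- sqrt(diag(vcov(fit1)))
se10 <- sqrt(diag(vcov(fit10)))
se1 / se10                             # about sqrt(10) for each coefficient
```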
laura roncaglia
2016-Sep-21 06:42 UTC
[R] Run a fixed effect regression and a logit regression on a national survey that need to be "weighted"
Thank you for the answer, but I had already tried that. When I introduce weights in glm, I get the warning:

    Warning: non-integer #successes in a binomial glm!

I tried to run the glm regression using the quasibinomial family:

    eq <- glm(pip ~ men + age_pr + age_c + I(age_pr^2) + I(age_c^2),
              weights = dfweights, data = df,
              family = quasibinomial(link = "logit"))

Do you think this could be the right solution?
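The behaviour described above can be reproduced on simulated data. This is a sketch with made-up variables (x, y) and hypothetical non-integer weights, not the poster's data: the binomial family warns because weight times outcome is not an integer, while quasibinomial fits the same model without the warning and returns the same coefficients.

```r
# Sketch only: simulated data, hypothetical non-integer weights.
set.seed(7)
n <- 300
x <- rnorm(n)
y <- rbinom(n, 1, plogis(-0.3 + 0.8 * x))
wt <- runif(n, 0.5, 3)

# Fit with family = binomial and record whether a warning is raised
warned <- FALSE
fit_bin <- withCallingHandlers(
  glm(y ~ x, family = binomial(link = "logit"), weights = wt),
  warning = function(cond) {
    warned <<- TRUE
    invokeRestart("muffleWarning")
  }
)

# Same model with quasibinomial: no integer check, so no warning
fit_quasi <- glm(y ~ x, family = quasibinomial(link = "logit"), weights = wt)

warned                                       # TRUE: binomial complained
all.equal(coef(fit_bin), coef(fit_quasi))    # same point estimates
```

The two fits differ only in how the standard errors are scaled: quasibinomial estimates a dispersion parameter instead of fixing it at 1. Neither set of standard errors accounts for the sampling design.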