Rafael Costa
2015-May-07 18:43 UTC
[R] Getting INDIVIDUAL effects of multiple qualitative variables (ordered and unordered factors)
Dear R users,

I have data from a questionnaire and I want to estimate the individual effect of each explanatory variable (all are qualitative) on the dependent variable (continuous). However, by default each estimated coefficient is the difference between that group and the reference group (whose estimated value is the intercept). Each qualitative variable relates to a characteristic of a particular activity, and the continuous variable is the time taken to perform this activity. I emphasize that the reference level of each factor corresponds to the case where none of the options for that factor was marked.

The data is at "http://www.datafilehost.com/d/c7f0d342". I did not include it in the script because I do not yet know how to do this; I hope this is not a problem, and I apologize. I did not post just a sample of the data, because a sample gave singular-matrix problems.

First (and main) issue - In order to obtain the individual effect of the levels of each factor, I assumed that the reference group has zero effect and took the following steps:

# Since the file is not loaded in the script, it is assumed here that it
# was downloaded from the internet and is already loaded in R.

# I will fit a quantile regression, so the package follows.
install.packages("quantreg")
library(quantreg)

# Transforming factors into individual objects
# (one 0/1 indicator column per level, one row per observation):
p_1  <- table(1:length(tabela1.1$p1),  as.factor(tabela1.1$p1))
p_21 <- table(1:length(tabela1.1$p21), as.factor(tabela1.1$p21))
p_22 <- table(1:length(tabela1.1$p22), as.factor(tabela1.1$p22))
p_23 <- table(1:length(tabela1.1$p23), as.factor(tabela1.1$p23))
p_24 <- table(1:length(tabela1.1$p24), as.factor(tabela1.1$p24))
p_25 <- table(1:length(tabela1.1$p25), as.factor(tabela1.1$p25))
p_34 <- table(1:length(tabela1.1$p34), as.ordered(tabela1.1$p34))
p_5  <- table(1:length(tabela1.1$p5),  as.ordered(tabela1.1$p5))
p_6  <- table(1:length(tabela1.1$p6),  as.ordered(tabela1.1$p6))
p_7  <- table(1:length(tabela1.1$p7),  as.ordered(tabela1.1$p7))
p_8  <- table(1:length(tabela1.1$p8),  as.ordered(tabela1.1$p8))
p_9  <- table(1:length(tabela1.1$p9),  as.ordered(tabela1.1$p9))

# Regressing the model without intercept, with the reference group fixed at 0.
# The reference group means that none of the options of the factor has been
# marked (if none was marked, I believe that the time taken to perform the
# activity is practically zero).
qrModel <- rq(pontoefetivo ~ 0 + p_1[, -1] + p_21[, -1] + p_22[, -1] +
                p_23[, -1] + p_24[, -1] + p_25[, -1] + p_34[, -1] +
                p_5[, -1] + p_6[, -1] + p_7[, -1] + p_8[, -1] + p_9[, -1],
              data = tabela1.1, tau = 0.5)
summary(qrModel)

My idea was that, since the effect of the reference group is zero, the estimated coefficient of each level is precisely the individual effect of that level. Is my idea right? If not, what should I do to get these individual effects?

Problem 2 - Assuming all of the above is right, the ordered factors do not have increasing effects [see summary(qrModel)]. Shouldn't they? If so, what do I do to ensure such an effect?

Problem 3 - Again assuming that everything is correct, I expect that none of the estimated coefficients (individual effects on the runtime of the activity) should be negative. Am I right about that? If so, what do I do to ensure that no value is negative?

I look forward to any help.

Thanks in advance,

Rafael Costa.
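
The indicator-matrix construction in the post above can be checked on a small example. This is only a sketch on made-up data: the vector `p1` below is hypothetical and stands in for a column of `tabela1.1`, with level "0" playing the role of "nothing marked". It suggests that `table(1:length(x), as.factor(x))` yields the same 0/1 columns as base R's `model.matrix(~ 0 + factor(x))`, and that dropping the first column leaves rows of zeros for the reference level.

```r
# Hypothetical toy factor; level "0" plays the role of "nothing marked".
p1 <- c("0", "2", "1", "0", "2", "1")

# The construction used in the post: cross-tabulate row index against level.
ind_table <- table(seq_along(p1), as.factor(p1))

# The same indicators from model.matrix(), with no intercept column.
ind_mm <- model.matrix(~ 0 + factor(p1))

# Both are 6 x 3 matrices of 0/1 values with exactly one 1 per row.
same <- all(as.vector(ind_table) == as.vector(ind_mm))

# Dropping the first column leaves all-zero rows for the reference level "0".
reduced <- ind_mm[, -1, drop = FALSE]
ref_rows_all_zero <- all(rowSums(reduced[p1 == "0", , drop = FALSE]) == 0)

print(same)
print(ref_rows_all_zero)
```

With this coding, a no-intercept fit implicitly sets the fitted value to zero for an observation in which every factor is at its reference level; whether that assumption is reasonable for the questionnaire data is a separate question.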
Richard M. Heiberger
2015-May-07 21:50 UTC
[R] Getting INDIVIDUAL effects of multiple qualitative variables (ordered and unordered factors)
## I think this is what you are looking for.
## Your download host seems to want to give me software, so I am not taking it.

tmp <- data.frame(y=rnorm(20), a=factor(rep(letters[1:4], each=5)))
tmp.aov <- aov(y ~ a, data=tmp)
summary(tmp.aov)
summary(tmp.aov, split=list(a=list(b=1, c=2, d=3)))
summary.lm(tmp.aov)

Rich
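
To make the coefficients reported by `summary.lm(tmp.aov)` easier to read, here is a small sketch with made-up, non-random response values (the numbers are illustrative only and are not from the poster's data). Under R's default treatment contrasts, the intercept is the mean of the reference level and every other coefficient is the difference between a level's mean and the reference mean.

```r
# Made-up response values: four groups of three observations each.
tmp <- data.frame(
  y = c(1, 2, 3,  4, 5, 6,  7, 8, 9,  10, 11, 12),
  a = factor(rep(letters[1:4], each = 3))
)

fit <- lm(y ~ a, data = tmp)

# Group means as a plain named vector.
group_means <- sapply(split(tmp$y, tmp$a), mean)

# Intercept = mean of reference level "a";
# remaining coefficients = group mean minus reference mean.
print(coef(fit))
print(group_means - group_means[["a"]])
```

The two printed vectors agree except in the first position, where `coef(fit)` holds the reference mean itself and the difference vector holds zero.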
Rafael Costa
2015-May-08 20:07 UTC
[R] Getting INDIVIDUAL effects of multiple qualitative variables (ordered and unordered factors)
Dear Richard,

I really appreciate your help.

> ## Your download host seems to want to give me software, so I am not taking it.

To download my file, please uncheck the "Use our download manager and get recommended downloads" option. But if you prefer, I can send my file as an email attachment.

# In fact, what I want to compute is:
library(quantreg)
qrModel2 <- rq(pontoefetivo ~ p1 + p21 + p22 + p23 + p24 + p25 + p34 +
                 p5 + p6 + p7 + p8 + p9,
               data = tabela1.1, tau = 0.5)
summary(qrModel2)

# But I have to suppress the intercept and also treat the reference group as zero.
# When I run
qrModel3 <- rq(pontoefetivo ~ 0 + p1 + p21 + p22 + p23 + p24 + p25 + p34 +
                 p5 + p6 + p7 + p8 + p9,
               data = tabela1.1, tau = 0.5)
summary(qrModel3)
# the value of the reference group reappears in the first estimated coefficient.

Is there any way to do this?

Thanks in advance,

Rafael Costa.
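
One possible way to get a no-intercept fit in which the reference level of every factor stays at zero is to build the design matrix with an intercept and then drop the intercept column before fitting. The sketch below uses a hypothetical data frame `toy` in place of `tabela1.1` (only the column names `p1`, `p21` and `pontoefetivo` are borrowed from the thread), and it uses base R's `lm()` as a stand-in so that it runs without extra packages. The same matrix `X` could presumably be passed to `rq(toy$pontoefetivo ~ 0 + X, tau = 0.5)` from quantreg.

```r
# Hypothetical stand-in for tabela1.1: two factors whose level "0"
# means "nothing marked".
toy <- data.frame(
  p1  = factor(rep(c("0", "1", "2"), times = 20)),
  p21 = factor(rep(c("0", "1"), each = 30))
)
# Made-up response: reference levels contribute nothing.
toy$pontoefetivo <- 2 * (toy$p1 == "1") + 3 * (toy$p1 == "2") +
  1 * (toy$p21 == "1")

# With "0 +" in the formula, the first factor gets a column for every
# level, so its reference level "0" reappears as a coefficient.
X_formula <- model.matrix(~ 0 + p1 + p21, data = toy)
print(colnames(X_formula))

# Build the design with an intercept, then drop the intercept column:
# every factor keeps its reference level coded as all zeros.
X <- model.matrix(~ p1 + p21, data = toy)[, -1, drop = FALSE]
print(colnames(X))

# lm() is used here only as a base-R stand-in for rq().
fit <- lm(toy$pontoefetivo ~ 0 + X)
print(coef(fit))
```

Whether forcing the fitted value to zero when nothing is marked is appropriate depends on the data; the sketch only shows how the coding can be obtained.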