Tanya Yatsunenko
2008-Apr-03 20:54 UTC
[R] coding for categorical variables with unequal observations
Hi, I am doing multiple regression and have several X variables that are categorical. I read that I can use dummy or contrast codes for that, but are there any special rules when there are unequal numbers of observations in each group (4 females vs 7 males in a "gender" variable)? Also, can R generate these codes for me? Thanks.
Nordlund, Dan (DSHS/RDA)
2008-Apr-03 21:45 UTC
[R] coding for categorical variables with unequal observations
You don't need to do anything special, and yes, you can just let SAS do it for you. For most of the regression PROCs you can put your categorical variables in a CLASS statement. Depending on which procedure you are using, you may be able to specify whether you want effects or dummy coding, and which level of the categorical variable should be the "comparison" level. It is also possible to use PROC GLMMOD to create your design variables to be fed into other PROCs. Other approaches are possible as well.

If you provide more detail on what analyses you plan to undertake, someone may be able to provide more specific advice.

Hope this is helpful,

Dan

Daniel J. Nordlund
Research and Data Analysis
Washington State Department of Social and Health Services
Olympia, WA 98504-5204
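Since the original question asked about R rather than SAS, here is a minimal sketch of the rough R counterpart: declare the variable as a factor and let the modelling functions build the design columns. The data frame `d`, the response `y`, and the column `gender_m` are made up for illustration; only the 4-versus-7 group sizes come from the question.

```r
# Hypothetical data: 4 females and 7 males, as in the question; y is simulated.
set.seed(1)
d <- data.frame(
  gender = factor(c(rep("F", 4), rep("M", 7))),
  y      = rnorm(11)
)

# Default treatment (dummy) coding: the first level, "F", is the reference.
X_dummy <- model.matrix(~ gender, data = d)

# Choose a different comparison level by releveling the factor.
d$gender_m <- relevel(d$gender, ref = "M")
X_ref_m <- model.matrix(~ gender_m, data = d)

# Effects (sum-to-zero) coding instead of dummy coding.
X_effects <- model.matrix(~ gender, data = d,
                          contrasts.arg = list(gender = "contr.sum"))

# lm() builds the same kind of design matrix internally, so the factor
# can be used directly in the formula; unequal group sizes need no extra step.
fit <- lm(y ~ gender, data = d)
```

Printing `X_dummy` or `X_effects` shows the generated columns row by row, which can help when checking which level is being treated as the comparison level.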
Tanya Yatsunenko
2008-Apr-04 00:04 UTC
[R] coding for categorical variables with unequal observations
Also, since I just started to use R, I have trouble generating and understanding some of the codes, especially choosing the correct ones. Thanks! Tanya
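One way to get a feel for the different codes is to print the coding matrices that R's built-in contrast functions produce. The sketch below is only illustrative: the three level names and the factor `f` are hypothetical, chosen because a three-level factor shows the differences more clearly than a two-level one.

```r
# Hypothetical three-level factor, used only to display the coding schemes.
lv <- c("low", "mid", "high")

dummy_codes   <- contr.treatment(lv)  # dummy coding; first level is the reference
effects_codes <- contr.sum(lv)        # effects coding; each column sums to zero
helmert_codes <- contr.helmert(lv)    # Helmert contrasts

print(dummy_codes)
print(effects_codes)
print(helmert_codes)

# Attach a chosen scheme to a factor so that model functions use it.
f <- factor(c("low", "mid", "high", "mid"), levels = lv)
contrasts(f) <- contr.sum(3)
print(contrasts(f))
```

Each row of a printed matrix is one level of the factor and each column is one generated variable, so the reference level under dummy coding is the row of all zeros.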