thr3ads.net - R help - [R] dummy variable encoding [Mar 2009]


news at aleblanc.cotse.net

2009-Mar-05 14:38 UTC

[R] dummy variable encoding

Hi,
   can anyone tell me why an encoding of 1/2 for a dummy variable for
   two groups (e.g. gender) seems to be preferred over 0/1?
   It's been bugging me for a while, 0/1 seems more natural, but I have
   been told (without explanation) that 1/2 is better. Why?

-- 
aleblanc

Richard.Cotton at hsl.gov.uk

2009-Mar-05 15:49 UTC

>    can anyone tell me why an encoding of 1/2 for a dummy variable for
>    two groups (e.g. gender) seems to be preferred over 0/1?
>    It's been bugging me for a while, 0/1 seems more natural, but I have
>    been told (without explanation) that 1/2 is better. Why?
The best encoding depends upon which language you would like to manipulate
the variable in.  In R, genders are most naturally represented as factors.
That means that in an external data source (like a spreadsheet of data),
you should ideally have the gender recorded as human-understandable text
("male" and "female", or "M" and "F").  Once the data is read into R, by
default R will convert the strings to factors (keeping the human-readable
labels).  This way you avoid having to remember that 1 means male (or
whatever).
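
As a rough illustration of the factor approach, here is a minimal sketch.
The data frame `people` and its values are made up for the example, and
`stringsAsFactors = TRUE` is spelled out because the default changed to
FALSE in R 4.0.0.  It also shows where the 1/2 and 0/1 codings each turn
up: a factor stores integer codes starting at 1, while the default
treatment contrasts give a 0/1 dummy column in the model matrix.

```r
# Hypothetical data, as it might arrive from a spreadsheet.
people <- data.frame(
  gender = c("male", "female", "female", "male"),
  height = c(180, 165, 170, 175),
  stringsAsFactors = TRUE  # explicit, since the default differs between R versions
)

# The human-readable labels are kept as the factor levels (sorted alphabetically).
levels(people$gender)      # "female" "male"

# Internally the factor is stored as integer codes starting at 1.
as.integer(people$gender)  # 2 1 1 2

# In a model, the default treatment contrasts produce a 0/1 dummy column.
fit <- lm(height ~ gender, data = people)
colnames(model.matrix(fit))  # "(Intercept)" "gendermale"
```

Either way, the numeric coding is handled by R, and you only work with
the labels.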

If you were manipulating the data in a different language that didn't have
factors, then it might be more appropriate to use an integer.  Which
integers you use doesn't matter: whatever you choose, you need a look-up
table to know what each number refers to.
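
If the data does arrive integer-coded, the look-up table can be applied
when building the factor.  This is only a sketch: the vector `codes` and
the mapping in `lookup` (1 = male, 2 = female) are assumptions made up
for the example, not something fixed by any convention.

```r
# Hypothetical integer-coded column from an external source.
codes <- c(1L, 2L, 2L, 1L)

# Hypothetical look-up table: position 1 = "male", position 2 = "female".
lookup <- c("male", "female")

# Attach the labels so the meaning of each code travels with the data.
gender <- factor(codes, levels = c(1L, 2L), labels = lookup)

as.character(gender)  # "male" "female" "female" "male"
```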

Regards,
Richie.

Mathematical Sciences Unit
HSL


