Does anyone know if there are any special considerations with Random Forest and correlated fields, or rather derived fields? For example, if we are trying to predict who might leave our company to go work for another company, some of the variables we may look at are below (in addition to others).

Do we need to be cautious about commingling these, especially since, for example, all of the age variables are based on the same underlying variable: birthdate? Or about rollup fields: Age rolls up to "Age Cohorts", and "Age Cohorts" rolls up to "Age Career Cohort"?

- BIRTHDATE BASED VARIABLES
1. Age
2. Age Cohorts (e.g. 20-30, 30-40 yrs old, etc.)
3. Age Career Cohort (similar to above but with wider bins, e.g. "Early (Age <35)", "Mid (Age 35-49)", etc.)
4. Birth year (probably not in R, since it has more than 32 categories)
5. Generation (e.g. Boomers, Generation X, Y, etc.)
# all are categorical variables except 'Birth year' and Age

- HIRE DATE BASED VARIABLES
6. Years of Service
7. Years of Service cohorts

Or even variables that are merely correlated rather than derived: for example, age and service are correlated (r ~ 0.57). Is that a concern?

Daniel Lopez
Workforce Analyst
HRIM - Workforce Analytics & Metrics
Strategic Human Resources Management