Hi all,

I've been using the randomForest package on a dataset (described below), and my problem is this: even though I specify proximity=TRUE in the call, I get a NULL proximity matrix. Any thoughts on why that might happen?

Unfortunately I can't post my dataset, which is particularly problematic here since I believe that's where the problem is, so I'll try to give as detailed an account as I can.

The outcome is binary and highly skewed, with the positive outcome being 1.5% of the data. The dataset has ~7000 observations and 200 predictors, each either a two-level factor or a continuous variable, and it is extremely sparse.

Here is my call:

    # I pass a balanced sample to each tree, to deal with the skewed outcome.
    rf <- randomForest(y ~ ., data=train, ntree=800, replace=TRUE, sampsize=c(112, 112), proximilty=TRUE)

Any ideas on why I'm getting a NULL proximity matrix, or any solutions?

Thanks!