similar to: different randomForest performance for same data

Displaying 20 results from an estimated 200 matches similar to: "different randomForest performance for same data"

2008 Sep 24
4
rowSums()
Say I have the following data: testDat <- data.frame(A = c(1,NA,3), B = c(NA, NA, 3)) > testDat A B 1 1 NA 2 NA NA 3 3 3 rowSums() with na.rm=TRUE generates the following, which is not desired: > rowSums(testDat[, c('A', 'B')], na.rm=T) [1] 1 0 6 rowSums() with na.rm=F generates the following, which is also not desired: > rowSums(testDat[, c('A',
2011 Jan 17
1
Replacing rows in a data frame
R-helpers, Below is a simple example of some output that I am getting while trying to work with a data frame in R 2.12.1 for Mac. ----- > testdat <- data.frame(matrix(ncol=10, nrow=10)) > colnames(testdat) <- c('a','b','c','d','e','f','g','h','i','j') > testdat[seq(1,10,3),] <-
2009 Jan 22
4
dimnames in pkg "ipred"
Hello List, I'm trying to make predictions using a bagged tree with the package ipred. I tried to follow the manual but I'm getting an error message. Also, browsing through the list archive I didn't find any hint. Maybe someone can help me? selbag <- bagging(SOIL_UNIT ~., data=traindat.bin, coob=TRUE) Error in dimnames(X) <- list(dn[[1L]], unlist(collabs, use.names = FALSE)) :
2010 Sep 29
2
resampling issue
I am trying to get R to resample my dataset of two columns of age and length data for fish. I got it to work, but it is not resampling every replicate. Instead, it resamples my data once and then repeats it 5 times. Here is my dataset of 9 fish samples with an age and length for each one: Age Length 2 200 5 450 6 600 7 702 8 798 5 453 4 399 1 120 2 202 Here is my code which resamples my
2011 Mar 12
1
Column order in stacking/unstacking
Dear R users, I'm having some problems with the stack() and unstack() functions, and wondered if you could help. I have a large data frame (400 rows x 2000 columns), which I need to reduce to a single column of values (and therefore 800000 rows), so that I can use it in other operations (e.g., generating predictions from a GLM object). However, the problem I'm having can be reproduced
2010 Sep 29
2
repeat a function
I have R randomly sampling my array made up of 2 columns of data. Here is my code randomly sampling 5 different rows from my dataset to create a new dataset of 8 rows of data: testdat<-growth[sample(5,8,replace=T),] Now I want to tell R to repeat this function 50 times and give me the output. I have been searching the internet and have been unable to figure this out. Any advice
2011 Feb 15
3
expected behavior when parsing lines with special characters
Say I have a tab-delimited table I want to read into R. What should I expect to happen if some of the entries contain the character " ' "? I thought it would read the file fine, but that is not what happens. Instead, all the values in between two " ' "s get read into one field, and things are just seriously messed up. Is this a bug, and besides removing the offending
2010 Oct 20
1
problem with predict(mboost,...)
Hi, I use an mboost model to predict my dependent variable on new data. I get the following warning message: In bs(mf[[i]], knots = args$knots[[i]]$knots, degree = args$degree, : some 'x' values beyond boundary knots may cause ill-conditioned bases The new predicted values are partly negative although the variable in the training data ranges from 3 to 8 on a numeric scale. In order to
2013 Jul 20
2
Different x-axis scales using c() in latticeExtra
Hi, I would like to combine multiple xyplots into a single, multipanel display. Using R 3.0.1 in Ubuntu, I have used c() from latticeExtra to combine three plots, but the x-axes for two plots are on a log scale and the other is on a normal scale. I also have included equispace.log=FALSE to clean up the tick labels. However, when I try all of these, the x-axis scale of the first panel is used
2011 Apr 13
3
predict()
Hi, I am experimenting with the function predict() in two versions of R and the R extension package "survival". library(survival) set.seed(123) testdat=data.frame(otime=rexp(10),event=rep(0:1,each=5),x=rnorm(10)) testfm=as.formula('Surv(otime,event)~x') testfun=function(dat,fm) { predict(coxph(fm,data=dat),type='lp',newdata=dat) } # Under R 2.11.1 and
2002 Nov 26
0
degenerate cases in RPART
RPART doesn't seem to handle the degenerate case when all training samples are drawn from a single class: > TrainType [1] 0 0 0 0 > TrainDat V1 V2 V3 V4 V5 1 0.6434392 0.5105860 0.3048803 0.3161728 0.5449632 2 0.1710005 0.5973921 0.1267061 0.6146834 0.7299928 3 0.6919125 0.8880789 0.9123243 0.9061885 0.9553663 4 0.3094843 0.6475508
2009 Dec 16
2
rcart - classification and regression trees (CART)
Hi, I am trying to use CART to find an ideal cut-off value for a simple diagnostic test (ie when the test score is above x, diagnose the condition). When I put in the model fit=rpart(outcome ~ predictor1(TB144), method="class", data=data8) sometimes it gives me a tree with multiple nodes for the same predictor (see below for example of tree with 1 or multiple nodes). Is there a way
2007 Aug 01
1
Predict using SparseM.slm
Hi, I am trying out the SparseM package and had a question. The following piece of code works fine: ... fit = slm(model, data = trainData, weights = weight) ... But how do I use the fit object to predict the values on, say, a reserved testDataSet? In the regular lm function I would do something like this: predict.lm(fit,testDataSet) Thanks -Bala
2005 Mar 10
2
Logistic regression goodness of fit tests
I was unsure of what suitable goodness-of-fit tests existed in R for logistic regression. After searching the R-help archive I found that the Design models and resid could be used to calculate this as follows: d <- datadist(mydataframe) options(datadist = 'd') fit <- lrm(response ~ predictor1 + predictor2..., data=mydataframe, x =T, y=T) resid(fit, 'gof'). I set up a
2009 Jul 15
0
strategy to iterate over repeated measures/longitudinal data
Hi Group, Create some example data. set.seed(1) wide_data <- data.frame( id=c(1:10), predictor1 = sample(c("a","b"),10,replace=TRUE), predictor2 = sample(c("a","b"),10,replace=TRUE), predictor3 = sample(c("a","b"),10,replace=TRUE), measurement1=rnorm(10), measurement2=rnorm(10)) head(wide_data) id
2005 Jan 24
0
Follow-up on nls convergence failure with SSfol
A couple of weeks ago there was a question regarding apparent convergence in nls when using the SSfol selfStart model for fitting a first-order pharmacokinetic model. I can't manage to find the original message either in my archive or in the list archives but the data were time conc dose 0.50 5.40 1 0.75 11.10 1 1.00 8.40 1 1.25 13.80 1 1.50 15.50 1
2008 May 30
1
Get all X iterations in optim output when controls(trace=6)
Hi, I would like to get all X iterations in optim output in matrix form. I know about the following approach: sink("reportOptim") optim( ......., control=list( trace=6,..........) ) sink() all_iterOptim <- readLines("reportOptim") unlink("reportOptim") all_iterOptim <- all_iterOptim[ grep( '^X', all_iterOptim ) ] ### TODO: the rest !!! :-) But it is very
2012 Jul 11
4
MODE , VARIANCE , NTH PERCENTAILE
Hi, Here I have a matrix like this, ABC PQR XYZ MNO ------ ------- ------ -------- 3 6 7 15 2 12 24 15 20 5 1 2 25 50 15 35 I need to get the "MODE" - for each column-wise "VARIANCE" - for
2009 Dec 14
0
GBM package: Extract coefficients
I am using the gbm package for generalized boosted regression models, and would like to be able to extract the coefficients produced for storage in a database. I am already using R to automatically generate formulas that I can export to a database and store. For example, I have been using Dr. Harrell's lrm package to perform logistic regression, e.g.: output <-