Displaying 20 results from an estimated 200 matches similar to: "different randomForest performance for same data"
2008 Sep 24
4
rowSums()
Say I have the following data:
testDat <- data.frame(A = c(1,NA,3), B = c(NA, NA, 3))
> testDat
A B
1 1 NA
2 NA NA
3 3 3
rowSums() with na.rm=TRUE generates the following, which is not desired:
> rowSums(testDat[, c('A', 'B')], na.rm=T)
[1] 1 0 6
rowSums() with na.rm=F generates the following, which is also not
desired:
> rowSums(testDat[, c('A',
2011 Jan 17
1
Replacing rows in a data frame
R-helpers,
Below is a simple example of some output that I am getting while trying to work with a data frame in R 2.12.1 for Mac.
-----
> testdat <- data.frame(matrix(ncol=10, nrow=10))
> colnames(testdat) <- c('a','b','c','d','e','f','g','h','i','j')
> testdat[seq(1,10,3),] <-
2009 Jan 22
4
dimnames in pkg "ipred"
Hello List,
I'm trying to make predictions using a bagged tree with the package ipred. I tried to follow the manual but I'm getting an error message. Browsing through the list archive, I didn't find any hint either.
Maybe someone can help me?
selbag <- bagging(SOIL_UNIT ~., data=traindat.bin, coob=TRUE)
Error in dimnames(X) <- list(dn[[1L]], unlist(collabs, use.names = FALSE)) :
2010 Sep 29
2
resampling issue
I am trying to get R to resample my dataset of two columns of age and length
data for fish. I got it to work, but it is not resampling every replicate.
Instead, it resamples my data once and then repeats it 5 times.
Here is my dataset of 9 fish samples with an age and length for each one:
Age Length
2 200
5 450
6 600
7 702
8 798
5 453
4 399
1 120
2 202
Here is my code which resamples my
2011 Mar 12
1
Column order in stacking/unstacking
Dear R users,
I'm having some problems with the stack() and unstack() functions, and
wondered if you could help.
I have a large data frame (400 rows x 2000 columns), which I need to reduce
to a single column of values (and therefore 800000 rows), so that I can use
it in other operations (e.g., generating predictions from a GLM object).
However, the problem I'm having can be reproduced
2010 Sep 29
2
repeat a function
I have R randomly sampling my array made up of 2 columns of data. Here is
my code randomly sampling 5 different rows from my dataset to create a new
dataset of 8 rows of data:
testdat<-growth[sample(5,8,replace=T),]
Now I want to tell R to repeat this function 50 times and give me the
output. I have been searching the internet and have been unable to figure
this out. Any advice
2011 Feb 15
3
expected behavior when parsing lines with special characters
Say I have a tab-delimited table I want to read into R. What should I
expect to happen if some of the entries contain the character " ' "? I
thought it would read the file fine, but that is not what happens.
Instead, all the values in between two " ' "s get read into one field,
and things are just seriously messed up. Is this a bug, and besides
removing the offending
2010 Oct 20
1
problem with predict(mboost,...)
Hi,
I use an mboost model to predict my dependent variable on new data. I get the following warning message:
In bs(mf[[i]], knots = args$knots[[i]]$knots, degree = args$degree, :
some 'x' values beyond boundary knots may cause ill-conditioned bases
The new predicted values are partly negative although the variable in the training data ranges from 3 to 8 on a numeric scale. In order to
2013 Jul 20
2
Different x-axis scales using c() in latticeExtra
Hi,
I would like to combine multiple xyplots into a single, multipanel
display. Using R 3.0.1 in Ubuntu, I have used c() from latticeExtra
to combine three plots, but the x-axes of two plots are on a log
scale and the other is on a normal scale. I also have included
equispace.log=FALSE to clean up the tick labels. However, when I try
all of these, the x-axis scale of the first panel is used
2011 Apr 13
3
predict()
Hi,
I am experimenting with the function predict() in two versions of R and the R extension package "survival".
library(survival)
set.seed(123)
testdat=data.frame(otime=rexp(10),event=rep(0:1,each=5),x=rnorm(10))
testfm=as.formula('Surv(otime,event)~x')
testfun=function(dat,fm)
{
predict(coxph(fm,data=dat),type='lp',newdata=dat)
}
# Under R 2.11.1 and
2002 Nov 26
0
degenerate cases in RPART
RPART doesn't seem to handle the degenerate case when all training
samples are drawn from a single class:
> TrainType
[1] 0 0 0 0
> TrainDat
V1 V2 V3 V4 V5
1 0.6434392 0.5105860 0.3048803 0.3161728 0.5449632
2 0.1710005 0.5973921 0.1267061 0.6146834 0.7299928
3 0.6919125 0.8880789 0.9123243 0.9061885 0.9553663
4 0.3094843 0.6475508
2009 Dec 16
2
rcart - classification and regression trees (CART)
Hi,
I am trying to use CART to find an ideal cut-off value for a simple
diagnostic test (i.e., when the test score is above x, diagnose the condition).
When I put in the model
fit=rpart(outcome ~ predictor1(TB144), method="class", data=data8)
sometimes it gives me a tree with multiple nodes for the same predictor (see
below for example of tree with 1 or multiple nodes). Is there a way
2007 Aug 01
1
Predict using SparseM.slm
Hi,
I am trying out the SparseM package and had a
question. The following piece of code works fine:
...
fit = slm(model, data = trainData, weights = weight)
...
But how do I use the fit object to predict the values
on say a reserved testDataSet? In the regular lm
function I would do something like this:
predict.lm(fit,testDataSet)
Thanks
-Bala
2005 Mar 10
2
Logistic regression goodness of fit tests
I was unsure what suitable goodness-of-fit tests exist in R for logistic regression. After searching the R-help archive I found that the Design models and resid could be used to calculate this as follows:
d <- datadist(mydataframe)
options(datadist = 'd')
fit <- lrm(response ~ predictor1 + predictor2..., data=mydataframe, x =T, y=T)
resid(fit, 'gof').
I set up a
2009 Jul 15
0
strategy to iterate over repeated measures/longitudinal data
Hi Group,
Create some example data.
set.seed(1)
wide_data <- data.frame(
id=c(1:10),
predictor1 = sample(c("a","b"),10,replace=TRUE),
predictor2 = sample(c("a","b"),10,replace=TRUE),
predictor3 = sample(c("a","b"),10,replace=TRUE),
measurement1=rnorm(10),
measurement2=rnorm(10))
head(wide_data)
id
2005 Jan 24
0
Follow-up on nls convergence failure with SSfol
A couple of weeks ago there was a question regarding apparent
convergence in nls when using the SSfol selfStart model for fitting a
first-order pharmacokinetic model. I can't manage to find the original
message either in my archive or in the list archives but the data were
time conc dose
0.50 5.40 1
0.75 11.10 1
1.00 8.40 1
1.25 13.80 1
1.50 15.50 1
2008 May 30
1
Get all X iterations in optim output when controls(trace=6)
Hi,
I would like to get all X iterations in optim output in matrix form.
I know about the following approach:
sink("reportOptim")
optim( ......., control=list( trace=6,..........) )
sink()
all_iterOptim <- readLines("reportOptim")
unlink("reportOptim")
all_iterOptim <- all_iterOptim[ grep( '^X', all_iterOptim ) ]
### TODO: the rest !!! :-)
But it is very
2012 Jul 11
4
MODE, VARIANCE, NTH PERCENTILE
Hi,
Here I have a matrix like this,
ABC PQR XYZ MNO
------ ------- ------ --------
3 6 7 15
2 12 24 15
20 5 1 2
25 50 15 35
I need to get the
"MODE" - for each column-wise
"VARIANCE" - for
2009 Dec 14
0
GBM package: Extract coefficients
I am using the gbm package for generalized boosted regression models,
and would like to be able to extract the coefficients produced for
storage in a database.
I am already using R to automatically generate formulas that I can
export to a database and store. For example, I have been using Dr.
Harrell's lrm package to perform logistic regression, e.g.:
output <-