Displaying 20 results from an estimated 6000 matches similar to: "randomForest error"
2005 Jul 21
4
RandomForest question
Hello,
I'm trying to find out the optimal number of splits (mtry parameter) for a randomForest classification. The classification is binary and there are 32 explanatory variables (mostly factors with each up to 4 levels but also some numeric variables) and 575 cases.
I've seen that although there are only 32 explanatory variables the best classification performance is reached when
2005 Jul 07
2
randomForest
> From: Weiwei Shi
>
> it works.
> thanks,
>
> but: (just curious)
> why i tried previously and i got
>
> > is.vector(sample.size)
> [1] TRUE
Because a list is also a vector:
> a <- c(list(1), list(2))
> a
[[1]]
[1] 1
[[2]]
[1] 2
> is.vector(a)
[1] TRUE
> is.numeric(a)
[1] FALSE
Actually, the way I initialize a list of known length is by
2006 Mar 08
5
data import problem
Dear All,
I'm trying to read a text data file that contains several records separated by a blank line. Each record starts with a row that contains its ID and the number of rows for the record (two columns), then the data table itself, e.g.
123 5
89.1791 1.1024
90.5735 1.1024
92.5666 1.1024
95.0725 1.1024
101.2070 1.1024
321 3
60.1601 1.1024
64.8023 1.1024
70.0593
2006 Sep 15
3
graphics and 'layout' question
Hello,
I got stuck with a graphics question: I've 3 figures that I present on a single page (window) via 'layout'. The layout is
layout(matrix(c(1,1,2,3), 2, 2, byrow=TRUE));
so that the first plot spans both columns in row one. Now I'd like to magnify the first figure so that it takes 20% more vertical space (i.e. more space for the y-axis). How would I do this in R?
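One possible approach, sketched here rather than taken from the thread, is the `heights` argument of `layout()`, which sets relative row heights. The 1.2:1 ratio is an assumed reading of "20% more vertical space", and the three plots are placeholders.

```r
# Same arrangement as in the question: figure 1 spans the whole top row.
m <- matrix(c(1, 1, 2, 3), nrow = 2, ncol = 2, byrow = TRUE)

# Relative row heights: the top row gets 1.2 units, the bottom row 1 unit.
# (Assumption: "20% more" means 20% taller than the bottom row.)
row_heights <- c(1.2, 1)
layout(m, heights = row_heights)

# Placeholder figures, drawn in the order 1, 2, 3.
plot(1:10, main = "Figure 1 (top row, taller)")
plot(10:1, main = "Figure 2")
plot((1:10)^2, main = "Figure 3")
```

`layout.show(3)` can be called right after `layout()` to preview the regions before plotting.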
2006 Nov 01
4
splitting very long character string
Hello,
I've a very long character array (>500k characters) that I need to split by '\n', resulting in an array of about 60k numbers. The help on strsplit says to use perl=TRUE to get better performance, but it still takes several minutes to split this string.
The massive string is the return value of a call to xmlElementsByTagName from the XML library and looks like this:
...
12345
2005 Jun 28
2
svm and scaling input
Dear All,
I've a question about scaling the input variables for an analysis with svm (package e1071). Most of my variables are factors with 4 to 6 levels but there are also some numeric variables.
I'm not familiar with the math behind svms, so my assumptions may be completely wrong ... or obvious. Will the svm automatically expand the factors into a binary matrix? If I add numeric
2006 Feb 02
2
calculating IC50
Hello,
I was wondering if there is an R-package to automatically calculate the IC50 value (concentration of a substance that inhibits cell growth to 50%) for some measurements.
kind regards,
Arne
2007 Jan 28
2
help with RandomForest classwt option
Hello there,
I am working on an extremely unbalanced two-class classification problem. I
want to use "classwt" together with "down sampling". By checking the rfNews()
in R, it looks like classwt is not working yet. Then I looked at the
software from Salford. I did not find the down sampling option. I am
wondering if you have any experience to deal with this problem. Do you
2004 May 13
3
storage of lm objects in a database
Hello,
I'd like to use DBI to store lm objects in a database. I've to analyze many linear models and I cannot store them in a single R-session (not enough memory). Also it'd be nice to have them persistent.
Maybe it's possible to create a compact binary representation of the object (the kind of format created by "save"), so that one doesn't need to write
2005 May 13
1
error in plot.lmList
Hello,
in R-2.1.0 I'm trying to produce trellis plots from an lmList object as described in the help for plot.lmList. I can generate the plots from the help, but on my own data plotting fails with an error message that I cannot interpret (please see below). Any hints are greatly appreciated.
kind regards,
Arne
> dim(d)
[1] 575 4
> d[1:3,]
Level_of_Expression SSPos1 SSPos19
2005 Jul 01
1
p-values for classification
Dear All,
I'm classifying some data with various methods (binary classification). I'm interpreting the results via a confusion matrix from which I calculate the sensitivity and the fdr. The classifiers are trained on 575 data points and my test set has 50 data points.
I'd like to calculate p-values for obtaining <=fdr and >=sensitivity for each classifier. I was thinking about
2006 Jul 26
0
randomForest question [Broadcast]
When mtry is equal to total number of features, you just get regular bagging
(in the R package -- Breiman & Cutler's Fortran code samples variables with
replacement, so you can't do bagging with that). There are cases when
bagging will do better than random feature selection (i.e., RF), even in
simulated data, but I'd say not very often.
HTH,
Andy
From: Arne.Muller at
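The point made in the reply above, that setting mtry to the total number of features gives ordinary bagging in the R package, can be sketched as follows. This is only an illustration: the built-in `iris` data and the seed are assumptions and have nothing to do with the poster's data, and the code only runs if the randomForest package is installed.

```r
# Illustrative data (assumption): 4 predictors, 3-class response.
x <- iris[, 1:4]
y <- iris$Species
n_features <- ncol(x)

if (requireNamespace("randomForest", quietly = TRUE)) {
  set.seed(1)
  # Default forest: a random subset of the predictors is tried at each split.
  rf <- randomForest::randomForest(x, y)
  # mtry = number of predictors: every split considers all of them (bagging).
  bag <- randomForest::randomForest(x, y, mtry = n_features)
  print(c(default_mtry = rf$mtry, bagging_mtry = bag$mtry))
}
```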
2005 May 23
2
Trouble with drplot
Hi, I am a newbie with R, so I hope my question isn't too stupid. I am trying to generate dose-response curves using the "drfit" package. I have formatted my CSV files to the correct format, and have no trouble running drfit to get a summary of my data. The problem is that when I try to use "drplot" to graph my data I get an error. The message is:
Error in
2003 Oct 17
4
sub data frame by expression
Hi All,
I've the following data frame with 54 rows and 4 columns:
> x
Ratio Dose Time Batch
R.010mM.04h.NEW 0.02 010mM 04h NEW
R.010mM.04h.NEW.1 0.07 010mM 04h NEW
...
R.010mM.24h.NEW.2 0.06 010mM 24h NEW
R.010mM.04h.OLD 0.19 010mM 04h OLD
...
R.010mM.04h.OLD.1 0.49 010mM 04h OLD
R.100mM.24h.OLD 0.40 100mM 24h OLD
I'd
2003 Sep 05
3
all values from a data frame
Hello,
I've a data frame with 15 columns and 6000 rows, and I need the data in a
single vector of size 90000 for a t-test. Is there such a conversion function in
R, or would I have to write my own loop over the columns?
thanks for your help + kind regards
Arne
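A minimal sketch of one way to do this without a loop, assuming all columns are numeric. The data frame `df` below is a random stand-in for the poster's data.

```r
# Stand-in data frame (assumption): 6000 rows, 15 numeric columns.
df <- as.data.frame(matrix(rnorm(15 * 6000), nrow = 6000, ncol = 15))

# unlist() concatenates the columns, one after another, into a single vector.
v <- unlist(df, use.names = FALSE)
length(v)

# Alternative for an all-numeric data frame: convert to a matrix first.
v2 <- as.vector(as.matrix(df))
```

Both results are in column order: the 6000 values of the first column, then those of the second, and so on.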
2004 May 10
5
R versus SAS: lm performance
Hello,
A colleague of mine has compared the runtime of a linear model + anova in SAS and S+. He got the same results, but SAS took a bit more than a minute whereas S+ took 17 minutes. I've tried it in R (1.9.0) and it took 15 min. None of the machines ran out of memory, and I assume that all machines have similar hardware, but the S+ and SAS machines are on windows whereas the R machine is Redhat
2005 Aug 26
1
basic anova and t-test question
Hello,
I'm posting this to receive some comments/hints about a question that is statistical rather than R-technical.
In an anova of a lme, factor SSPos11 shows up non-significant, but in the t-test of the summary 2 of the 4 levels (one for contrast) are significant. See below for some truncated output.
I realize that the two tests are different (F-test/t-test), but I'm looking for a
2004 Jul 26
5
binning a vector
Hello,
I was wondering whether there's a function in R that takes two vectors (of same length) as input and computes mean values for bins (intervals) or even a sliding window over these vectors.
I've several x/y data sets (input/response) that I'd like to plot together. Say the x-data for one data set goes from -5 to 14 with 12,000 values, then I'd like to bin the x-vector in steps of
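Binned means of this kind can be sketched with `cut()` and `tapply()`. The simulated data and the bin width of 0.5 are assumptions made for illustration; the step size in the original post is cut off and is not known.

```r
set.seed(1)
# Simulated x/y data (assumption): 12,000 points with x between -5 and 14.
x <- runif(12000, min = -5, max = 14)
y <- sin(x) + rnorm(12000, sd = 0.2)

# Bin edges; the width 0.5 is a hypothetical choice.
breaks <- seq(-5, 14, by = 0.5)

# Assign each x value to an interval, then average y within each interval.
bins <- cut(x, breaks = breaks, include.lowest = TRUE)
bin_means <- tapply(y, bins, mean)

# Plot the binned means against the bin midpoints.
mids <- head(breaks, -1) + diff(breaks) / 2
plot(mids, bin_means, type = "b", xlab = "x (bin midpoint)", ylab = "mean of y")
```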
2004 Jun 28
1
unbalanced design for anova with low number of replicates
Hello,
I'm wondering what's the best way to analyse an unbalanced design with a low number of replicates. I'm not a statistician, and I'm looking for some direction for this problem.
I've a 2 factor design:
Factor batch with 3 levels, and factor dose within each batch with 5 levels. Dose level 1 in batch one is replicated 4 times, level 3 is replicated only 2 times. all
2004 Feb 04
3
number point under-flow
Hello,
I've come across the following situation in R-1.8.1 (compile + running under
RedHat 7.1):
> phyper(24, 514, 5961-514, 53, lower.tail=T)
[1] 1
> phyper(24, 514, 5961-514, 53, lower.tail=F)
[1] -1.037310e-11
I'd expect the latter to be 0 or some very small positive number. Is this a
number under-flow of the calculation? Do you think I'm safe if I just set the
result to 0
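Clamping as described in the question can be written as a small guard. This is only a sketch of that workaround, not a statement about the cause of the negative value, and current versions of R may well return a small positive number for this call.

```r
# Upper-tail probability from the question.
p <- phyper(24, 514, 5961 - 514, 53, lower.tail = FALSE)

# A probability cannot be negative, so clamp the result at 0.
p_clamped <- max(p, 0)
p_clamped
```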