Displaying 20 results from an estimated 2000 matches similar to: "cross validation and parameter determination"
2005 Feb 15
1
shrinkage estimates in lme
Hello. Slope estimates in lme are shrinkage estimates which pull the
OLS slope estimates towards the population estimates, the degree of
which depends on the group sample size and the distance between the
group-based estimate and the overall population estimate. Although
these shrinkage estimates are said to be more precise with respect to the
true values, they are also biased. So there is a
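
The shrinkage described in this post can be seen by comparing per-group OLS fits with the group-level coefficients of a mixed model. What follows is a minimal sketch, assuming the nlme package and its bundled Orthodont data; neither is named in the post, and they only stand in for the poster's own data.

```r
library(nlme)  # assumed available; provides lme(), lmList() and Orthodont

# Separate OLS regression for each subject (no pooling across groups).
ols_fits <- lmList(distance ~ age | Subject, data = Orthodont)

# Mixed model with a random intercept and a random slope per subject.
mixed_fit <- lme(distance ~ age, data = Orthodont, random = ~ age | Subject)

ols_slopes   <- coef(ols_fits)[, "age"]
mixed_slopes <- coef(mixed_fit)[, "age"]
pop_slope    <- fixef(mixed_fit)[["age"]]

# The subject-level slopes from lme are pulled towards the population
# slope, so they should vary less across subjects than the OLS slopes.
spread <- c(ols = sd(ols_slopes), lme = sd(mixed_slopes))
print(spread)
print(pop_slope)
```

Comparing the two standard deviations gives a rough picture of how strongly the group estimates are pulled towards the population estimate in this particular data set.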
2006 Jan 04
2
Looking for packages to do Feature Selection and Classification
Hi All,
Sorry if this is a repost (a quick browse didn't give me the answer).
I wonder if there are packages that can do the feature selection and
classification at the same time. For instance, I am using SVM to classify my
samples, but it's easy to get overfitted if using all of the features. Thus,
it is necessary to select "good" features to build an optimum hyperplane
(?).
2017 Oct 27
1
genetics: backward haplotype transmission association algorithm
Dear friends - a couple of papers in PNAS (most recently: framework for making
better predictions by directly estimating variables' predictivity, Lo et
al PNAS 2016; 113:14277-14282) have focused interest on mapping complex
traits to multiple loci spread all over the genome. I have been around
on the relevant taskview(s) I hope but fail to see that the backward
haplotype transmission association
2007 Jan 05
5
eval(parse(text vs. get when accessing a function
Dear All,
I've read Thomas Lumley's fortune "If the answer is parse() you should usually
rethink the question.". But I am not sure if that also applies (and why) to
other situations (Lumley's comment
http://tolstoy.newcastle.edu.au/R/help/05/02/12204.html
was in reply to accessing a list).
Suppose I have similarly called functions, except for a postfix. E.g.
f.1 <-
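
For the situation described here, functions whose names differ only by a postfix, looking the function up by name with get() avoids building and parsing a code string. This is a minimal sketch: the function bodies, the second function f.2, and the helper names call_by_postfix and call_by_parse are hypothetical, since the post's own definitions are cut off.

```r
# Hypothetical functions sharing a prefix; the bodies are placeholders.
f.1 <- function(x) x + 1
f.2 <- function(x) x * 2

call_by_postfix <- function(postfix, x) {
  # Look the function up by its name instead of parsing a code string.
  fun <- get(paste0("f.", postfix), mode = "function")
  fun(x)
}

# The eval(parse()) form, shown only for comparison.
call_by_parse <- function(postfix, x) {
  eval(parse(text = paste0("f.", postfix, "(x)")))
}

# A named list of functions is often simpler than either lookup.
funs <- list("1" = f.1, "2" = f.2)

print(call_by_postfix(1, 3))
print(call_by_parse(2, 3))
print(funs[["2"]](3))
```

Both lookups return the same results here; the get() and list versions never turn arbitrary text into code, which is one reason they may be preferred.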
2005 May 05
1
building from source after installing binary package
Dear All,
I've got into the habit of installing R from the precompiled Debian binaries, including many of the packages from the r-cran-* Debian packages, and later building from source (e.g., to link against Goto's BLAS, or to build patched versions, etc). I install the newly built R to the very same place (/usr/lib/R). This allows me to build and update R when I wish, AND provides the
2005 Sep 19
5
FDR analyses: minimum number of features
Dear List,
We are planning a genotyping study to be analyzed using false discovery
rates (FDRs) (See Storey and Tibshirani PNAS 2003; 100:9440-5). I am
interested in learning if there is any consensus as to how many
features (ie. how many P values) need to be studied before reasonably
reliable FDRs can be derived. Does anyone know of a citation where
this is discussed?
Bill Dupont
William D.
2006 Feb 16
1
Interaction between R and Perl
Hello!
I'm calling R from Perl with Statistics-R perl module for a microarray
analysis integrated web tool.
I have some questions about multi-user use:
- Can I change the directory where R is running in order to have a directory
per user? Then there is no problem of erasing the R data of another user.
- If it's not possible, can I limit the number of users at the same time? I
see
2006 Jul 05
2
Colinearity Function in R
Is there a collinearity function implemented in R? I
have tried help.search("colinearity") and
help.search("collinearity") and have searched for
"colinearity" and "collinearity" on
http://www.rpad.org/Rpad/Rpad-refcard.pdf but with no
success.
Many thanks in advance,
Peter Lauren.
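
One common way to check for collinearity uses only base R: variance inflation factors (VIFs) computed from auxiliary regressions, plus the condition number from kappa(). The sketch below is illustrative only; the helper name vif_base and the choice of mtcars columns are assumptions, not something from the post.

```r
# Variance inflation factors from first principles:
# VIF_j = 1 / (1 - R^2_j), where R^2_j comes from regressing
# predictor j on all the other predictors.
vif_base <- function(X) {
  X <- as.data.frame(X)
  sapply(names(X), function(v) {
    others <- X[, setdiff(names(X), v), drop = FALSE]
    r2 <- summary(lm(X[[v]] ~ ., data = others))$r.squared
    1 / (1 - r2)
  })
}

# Illustrative predictors from the built-in mtcars data (an assumption).
X <- mtcars[, c("disp", "hp", "wt", "drat")]
vifs <- vif_base(X)
print(round(vifs, 2))

# Condition number of the standardised predictor matrix; large values
# also point to near-collinearity.
cond <- kappa(scale(X), exact = TRUE)
print(cond)
```

A VIF of 1 means a predictor is uncorrelated with the others; what counts as "too large" is a rule of thumb rather than a fixed threshold.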
2006 Aug 11
1
rpvm/snow packages on a cluster with dual-processor machines
Hi,
does anybody know how to use the dual processors in the machines of a cluster? I am using R with rpvm and snow packages. I usually start pvm daemon and add host machines first, and then run R to start my computing work. But I find that only one processor in each machine is used in this way and the other one always stays idle. Is there any simple way to tell pvm to use the two processors at
2006 Oct 25
1
Cross-compilation
Hi everyone,
I am trying to cross-compile a package I wrote using the Yan and Rossini
tutorial "Building Microsoft Windows versions of R and R packages using
Intel Linux". I have got reasonably far with this but when doing the
linking using the line:
i586-mingw32-g++ -shared -s -o mylibrary.dll mylibrary.def mylibrary.o
mylibrary_res.o
2006 Jul 05
2
Editors which have strong/solid support for SWeave?
Greetings!
I have a few colleagues who like the idea of Sweave, but have failed
to become enlightened monks of the One True Editor
(http://www.dina.dk/~abraham/religion/)
Are there any other Microsoft-centric editors or IDEs which have solid
support for writing SWeave documents (dual R / LaTeX enhancements
similar to ESS's support)? Has anyone tried the folding editors which
support Noweb?
2004 Sep 21
3
can't understand "R"
hi. i really need help using this program. computer language is a foreign
language to me, and thus, i cannot make heads nor tails of the user manuals
from the website. i need to locate step-by-step examples of simple
problems such as "graph f(x)+g(x) and f(g(x)) for the domain 0<x<2" and
"graph 2H(x), H(x)+1, H(x+1)" i do know how to define the functions, but
2004 Nov 24
2
LDA with previous PCA for dimensionality reduction
Dear all, not really an R question but:
If I want to check for the classification accuracy of a LDA with
previous PCA for dimensionality reduction by means of the LOOCV method:
Is it ok to do the PCA on the WHOLE dataset ONCE and then run the LDA
with the CV option set to TRUE (runs LOOCV)
-- OR--
do I need
- to compute for each 'test-bag' (the n-1 observations) a PCA
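
The second option in this post, re-fitting the PCA on the n-1 training observations inside every leave-one-out fold, can be sketched as follows. This assumes the MASS package for lda(); the iris data and the choice of two components are illustrative assumptions, not taken from the post.

```r
library(MASS)  # for lda(); assumed installed

# Illustrative data (an assumption): iris measurements and species.
X <- as.matrix(iris[, 1:4])
y <- iris$Species
n_comp <- 2  # number of principal components kept; chosen arbitrarily

pred <- character(nrow(X))
for (i in seq_len(nrow(X))) {
  # The PCA is fitted on the n-1 training observations only.
  pca <- prcomp(X[-i, , drop = FALSE], center = TRUE, scale. = TRUE)
  train_scores <- pca$x[, 1:n_comp, drop = FALSE]

  # Project the held-out observation with the training rotation.
  test_scores <- predict(pca, newdata = X[i, , drop = FALSE])[, 1:n_comp, drop = FALSE]

  # LDA on the training scores, then classify the held-out observation.
  fit <- lda(train_scores, grouping = y[-i])
  pred[i] <- as.character(predict(fit, test_scores)$class)
}

loocv_accuracy <- mean(pred == as.character(y))
print(loocv_accuracy)
```

Because the held-out observation never influences the centring, scaling or rotation, the accuracy estimate does not depend on information from the test case. Running the PCA once on the whole dataset is cheaper, and comparing the two estimates on the same data shows how much the choice matters in practice.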
2007 Jun 21
1
mgcv: lowest estimated degrees of freedom
Dear list,
I do apologize if these are basic questions. I am fitting some GAM
models using the mgcv package and following the model selection criteria
proposed by Wood and Augustin (2002, Ecol. Model. 157, p. 157-177). One
criterion to decide if a term should be dropped from a model is if the
estimated degrees of freedom (EDF) for the term are close to their lower
limit.
What would be the
2007 Jan 30
4
Speed of for loops
Hi Everyone,
I have a question about for loops. If you have something like:
f <- function(x) {
  y <- rep(NA, 10)
  for (i in 1:10) {
    if (i > 3) {
      if (!is.na(y[i - 3])) {
        # some calculation F which depends on one or more of the
        # previously generated values in the series
        y[i] <- y[i - 1] + x[i]
      } else {
        y[i] <- x[i]
      }
    }
  }
  y
}
e.g.
>
2006 Nov 07
2
snow's makeCluster hanging (using Rmpi)
Hello everyone,
I've been fiddling around with the snow and Rmpi packages on my new Intel
Mac, and have run into a few problems. When I make a cluster on my machine,
both slaves start up just fine, and everything works as expected. When I try
to make a cluster including another networked machine it hangs. I've
followed the suggestions at
2002 Mar 01
2
step, leaps, lasso, LSE or what?
Hi,
I am trying to understand the alternative methods that are available for
selecting variables in a regression without simply imposing my own bias
(having "good judgement"). The methods implemented in leaps and step and
stepAIC seem to fall into the general class of stepwise procedures. But
these are commonly condemned for inducing overfitting.
In Hastie, Tibshirani and Friedman
2006 May 27
2
boosting - second posting
Hi
I am using boosting for a classification and prediction problem.
For some reason it is giving me an outcome that doesn't fall between 0
and 1 for the predictions. I have tried type="response" but it made no
difference.
Can anyone see what I am doing wrong?
Screen output shown below:
> boost.model <- gbm(as.factor(train$simNuance) ~ ., # formula
+
2009 Aug 14
1
Permutation test and R2 problem
Hi,
I have optimized the shrinkage parameter (GCV) for ridge and got an r2
value of 70%. To check the sensitivity of the result, I did a permutation
test. I permuted the response vector, ran it 1000 times and drew a
distribution. But now I get r2 values as high as 98%, and some of them are
more than 70%. Is this expected from this type of test?
*I was under impression that, r2 with real data set
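
The general shape of such a permutation test can be sketched in base R. Note the substitutions: this uses ordinary least squares and the in-sample R2 from lm(), not the poster's ridge fit with a GCV-tuned shrinkage parameter, and the mtcars data and the helper r2_of are illustrative assumptions.

```r
set.seed(1)  # reproducible permutations

# In-sample R^2 of a linear model of y on all columns of X.
r2_of <- function(y, X) summary(lm(y ~ ., data = X))$r.squared

# Illustrative data (an assumption): predict mpg from the other columns.
X <- mtcars[, -1]
y <- mtcars$mpg

observed_r2 <- r2_of(y, X)

# Null distribution: refit after shuffling the response, which breaks
# any real association between y and X.
n_perm <- 1000
perm_r2 <- replicate(n_perm, r2_of(sample(y), X))

# Permutation p-value with the usual +1 correction.
p_value <- (sum(perm_r2 >= observed_r2) + 1) / (n_perm + 1)
print(c(observed = observed_r2, p = p_value))
print(summary(perm_r2))
```

With a tuned method such as ridge, the whole fitting procedure, including the choice of the shrinkage parameter, would arguably need to be repeated inside each permutation so that the null distribution is comparable with the observed value.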