Displaying 20 results from an estimated 200 matches similar to: "Selecting a subsample so that it follows a distribution."
2011 Oct 30
1
Normality tests on groups of rows in a data frame, grouped based on content in other columns
Dear R users,
I have a data frame in the form below, on which I would like to make normality tests on the values in the ExpressionLevel column.
> head(df)
  ID Plant Tissue Gene ExpressionLevel
1  1    p1     t1   g1          366.53
2  2    p1     t1   g2            0.57
3  3    p1     t1   g3           11.81
4  4    p1     t2   g1          498.43
5  5    p1     t2   g2            2.14
6  6    p1     t2   g3            7.85
I
2008 Jul 02
1
help on list comparison
Hi,
I want to compare two lists by their names and get the values of the matching elements.
Can anybody let me know the syntax for comparing lists by their names
using a for loop?
c.genes<- list()
for(i in 1:100)
c.genes[[1]]<- geneset(which(geneset == tobecampared[i]))
}
Here geneset is a list, and tobecampared is also a list.
Thank you
Ramya
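A minimal sketch of one way to do this without a for loop, assuming both geneset and tobecampared are named lists and that "comparing by names" means keeping the elements of geneset whose names also occur in tobecampared. The list contents below are hypothetical stand-ins, not the poster's data.

```r
# Hypothetical stand-ins for the poster's two named lists
geneset <- list(g1 = c(1, 2), g2 = c(3, 4), g3 = 5)
tobecampared <- list(g2 = "x", g3 = "y", g4 = "z")

# Names that occur in both lists
common <- intersect(names(geneset), names(tobecampared))

# Subset geneset by name; indexing with a character vector replaces the loop
c.genes <- geneset[common]
print(names(c.genes))
```

Indexing a list with a character vector returns the matching elements in the order given, so the result keeps the names as well as the values.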
2008 Jun 27
3
For loop
Hi,
Could you please let me know how to use a list in a for loop? Here geneset is a
list. I am trying to match the names of the list with the 1st row of the output.
result<- list()
for(i in 1:length(output)
{
result[[i]] <- geneset(which(geneset %n% output[,1]))
}
Kindly help me out.
2008 Jul 28
2
writing the plots
Hi there,
I want to write the plots to PDF files and the details about each graph to a
separate Notepad file.
plot(as.numeric(lapply(resultgenes,length)),
main= "Geneset.gene#.bias.test",xlab="Top.Ranked.Genesets",
ylab="gene.number.per.geneset")
lines(loess.smooth(c(1:1000),as.numeric(lapply(resultgenes,length)), span =
2/3, degree = 1,
family =
2008 Jul 21
3
vector help
Hi,
I have a character vector test. It has 3 elements. I want to join the three into
one element:
"Geneset=HSA04910_INSULIN_SIGNALING_PATHWAY-157- 20".
How can I do it?
> class(test)
[1] "character"
> test
[1] "Geneset=HSA04910_INSULIN_SIGNALING_PATHWAY" "157"
[3] "20"
Ramya
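A short sketch of one common approach, using base R's paste() with its collapse argument. It assumes the space before "20" in the desired output is a line-wrap artifact and that a plain "-" separator is wanted.

```r
# The vector as shown in the post
test <- c("Geneset=HSA04910_INSULIN_SIGNALING_PATHWAY", "157", "20")

# collapse joins all elements of one vector into a single string
joined <- paste(test, collapse = "-")
print(joined)
# [1] "Geneset=HSA04910_INSULIN_SIGNALING_PATHWAY-157-20"
```

Note that sep would not help here: sep separates the arguments passed to paste(), while collapse joins the elements within a vector.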
2007 Sep 25
1
'load' does not properly add 'show' methods for classes extending 'list'
The GeneSetCollection class in the Bioconductor package GSEABase
extends 'list'
> library(GSEABase)
> showClass("GeneSetCollection")
Slots:
Name: .Data
Class: list
Extends:
Class "list", from data part
Class "vector", by class "list", distance 2
Class "AssayData", by class "list", distance 2
If I create
2010 Apr 19
2
Error message GSA package
Dear list,
I have gene expression measurements obtained by PCR on 11 genes,
tabulated as a data matrix.
I'm attempting to use GSA package to distinguish any significant changes
in these genes as a pathway.
My response variable is binary, 0=no disease, 1=disease.
I have read the PCR data into R as follows:
data <-
2009 Jun 24
1
Rscript segfaults with lazy loading
Hi,
I have an RData file containing a GeneSetCollection object
(Bioconductor), http://www.cs.mu.oz.au/~gabraham/c2.RData. I think it
uses lazy loading because packages are only loaded when I access the
object (see below) in the R console.
When I try the same with Rscript, it segfaults. This happens on 2.9.0
both on Linux and Mac:
Rscript -e 'load("c2.RData"); c2[1]'
***
2009 Apr 09
1
.Call()
Hi guys,
I want to convert the following R code into a .Call-compatible form. How
can I do that?
Thanks!!!
INT sim;
for(i in 1:sim){
if(i>2) genemat <- genemat[,sample(1:ncol(genemat))]
ranklist[,1] <- apply(genemat, 1, function(x){
(mean(x[cols]) -
mean(x[-cols]))/sd(x)})
ranklist <- ranklist[order(ranklist[,1]),]
2013 Jan 18
0
repeat resampling with different subsample sizes
Hi,
I'm trying to write code (see below) to randomly resample measurements of
one variable (say here the variable "counts" in the data frame "dat") with
different resampled subsample sizes.
The code works fine for a single resampled subsample size (in the code below
= 10).
I then tried to generalize this by writing a function with a loop, where in
each loop the function
2008 Sep 16
1
analyze subsample of dataframe
Hi there,
I'm dealing with a pretty big dataset (~22,000 entries) with numerous
entries for every day over a period of several years. I have a column
"judy" (for Julian Day) with 0 beginning on Jan. 1st of every new year (I
want to compare tendencies between years). However, in order to control for
a leap year (2004), I simply need to subtract 1 from every judy value for
the year
2009 Jul 21
1
Subsample points for mclust
Hi all!
I have an ordered vector of values. The distribution of these values can
be modeled by a sum of Gaussians.
So I'm using the package 'mclust' to get the Gaussians' parameters for
this 1D distribution. It works very well, but, for input sizes above
100.000 values it starts taking really forever. Unfortunately my dataset
has around 4.6M values...
My question: is it
2012 Jun 28
2
Size of subsample in ecodist mantel()
What is the size of the bootstrapped subsample in ecodist mantel()?
Thanks
2012 Aug 16
1
Big Data reading subsample csv
Hello,
I'm most grateful for your time to read this.
I have an uber-size 30GB file of 6 million records and 3000 (mostly
categorical) columns in csv format. I want to bootstrap subsamples for
multinomial regression, but it's proving difficult: even with 64GB of RAM
in my machine and twice that in swap, the process becomes super slow
and halts.
I'm thinking about generating
2010 Dec 19
1
Random selection from a subsample
Dear Mailing List
I have a data set (data4) consisting of a number of factors and a response variable. I wish to randomly sample from a combination of two of those factors (GIS_station and Distance_code2) and return a new dataframe containing the original data structure (i.e. all the columns) but only containing the randomly selected rows. The number of rows in each combination of GIS_station
2009 Jun 26
1
Where can I find information on how to subsample a time series?
I suspect I'm looking in the wrong places, so guidance to the relevant
documentation would be as welcome as a little code snippet.
I have time series data stored in a MySQL database. There is the usual DATE
field, along with a double precision number: there are daily values
(including only normal working days: Monday through Friday). I actually
have to do a couple things here. Because of
2017 Sep 25
0
Sample of a subsample
For personal aesthetic reasons, I changed the name "data" to "dat".
Your code, with a slight modification:
set.seed (1357) ## for reproducibility
dat <- data.frame(var1=seq(1:40), var2=seq(40,1))
dat$sampleNo <- 0
idx <- sample(seq(1,nrow(dat)), size=10, replace=F)
dat[idx,"sampleNo"] <-1
## yielding
> dat
var1 var2 sampleNo
1 1 40
2017 Sep 25
2
Sample of a subsample
Hello everybody!
I have the following problem: I'd like to select a sample from a subsample
in a dataset. Actually, I don't want to select it, but to create a new
variable sampleNo that indicates to which sample (one or two) a case
belongs to.
Let's suppose I have a dataset containing 40 cases:
data <- data.frame(var1=seq(1:40), var2=seq(40,1))
The first sample (n=10) I drew like
2017 Sep 25
1
Sample of a subsample
Hi David,
I was about to post a reply when Bert responded. His answer is good
and his comment to use the name 'dat' rather than 'data' is instructive.
I am providing my suggestion as well because I think it may address
what was causing you some confusion (mainly to use "which", but also
the missing !)
idx2 <- sample( which( (!data$var1%%2) & data$sampleNo==0 ),
2009 Apr 06
3
how to subsample all possible combinations of n species taken 1:n at a time?
Hello
I apologise for the length of this entry but please bear with me.
In short:
I need a way of subsampling communities from all possible communities of n
taxa taken 1:n at a time without having to calculate all possible
combinations (because this gives me a memory error - using
combn() or expand.grid() at least). Does anyone know of a function? Or can
you help me edit the
combn
or