Displaying 20 results from an estimated 20000 matches similar to: "sample from very large distribution"
2010 Jul 05
2
to remove duplicate values
Dear R family,
Suppose I have two series.
order value
1 0.52
2 0.23
3 0.43
4 0.21
5 0.32
6 0.32
7 0.32
8 0.32
9 0.32
10 0.12
11 0.46
12 0.09
13 0.32
14 0.25
For these two series, I figured out the way to detect the locations of
duplicate values.
The next thing to do is remove the repeated values, except for repeats
that are not next to each other.
In other words, while keeping the
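A minimal sketch of one reading of this question, assuming the goal is to collapse each run of adjacent equal values to its first element while keeping a later repeat that is not adjacent (such as the 0.32 at order 13). The data frame name `d` and the result name `deduped` are hypothetical.

```r
# Data copied from the post; the name `d` is an assumption for illustration.
d <- data.frame(
  order = 1:14,
  value = c(0.52, 0.23, 0.43, 0.21, 0.32, 0.32, 0.32, 0.32, 0.32,
            0.12, 0.46, 0.09, 0.32, 0.25)
)

# Keep a row when its value differs from the value on the previous row.
# The first row has no predecessor, so it is always kept.
keep <- c(TRUE, d$value[-1] != d$value[-nrow(d)])
deduped <- d[keep, ]
```

If the values come from floating-point computation rather than being typed in, an exact `!=` comparison may be too strict, and a tolerance-based comparison could be substituted.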
2006 Dec 10
4
sample "n" random positions from a matrix
Hi there,
I have a binary matrix (dim 100x100) filled with values 0 and 1. I need to select and record "n" positions of that matrix where the values are 1. How can I do that?
Thanks for all,
Miltinho
Brazil
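One possible approach, sketched under assumptions: `which(..., arr.ind = TRUE)` lists the row and column of every cell equal to 1, and `sample()` then draws row indices from that list without replacement. The stand-in matrix `m`, the value of `n`, and the name `picked` are hypothetical.

```r
set.seed(1)  # only to make the illustration reproducible

# Stand-in 100x100 binary matrix; the real data would replace this.
m <- matrix(rbinom(100 * 100, size = 1, prob = 0.5), nrow = 100)

n <- 10  # hypothetical number of positions to draw

# Two-column matrix (row, col) listing every cell whose value is 1.
ones <- which(m == 1, arr.ind = TRUE)

# Draw n distinct rows of that listing; this requires n <= nrow(ones).
picked <- ones[sample(nrow(ones), n), , drop = FALSE]
```

`picked` can be used directly as a matrix index, so `m[picked]` returns the values at the sampled positions.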
2009 Nov 16
3
Cluster analysis: hclust manipulation possible?
I am doing cluster analysis [hclust(Dist, method="average")] on
data that potentially contains redundant objects. As expected,
the inclusion of redundant objects affects the clustering result,
i.e., the data a1, = a2, = a3, b, c, d, e1, = e2 is likely to
cluster differently from the same data without the redundancy,
i.e., a1, b, c, d, e1. This is apparent when the outcome is
visualized
2008 Oct 30
3
why does sample(x, n) give the same n items in every separate run?
Hello R users,
I have gene expression data of two groups of genes (large and small). Gene expression intensities of those genes are classified into levels 1 to 10. What I want is to make a random set of genes from the large group that have the same levels as the small group, using sample().
I used smallvec to hold the number of genes in each level (1 to 10) for the small group, and largevec for the large group.
2008 Jul 17
2
Sampling distribution (PDF & CDF) of correlation
Hi all,
I'm looking for an analytic method to obtain the PDF & CDF of the
sampling distribution of a given correlation (rho) at a given sample
size (N).
I've attached code describing a monte carlo method of achieving this,
and while it is relatively fast, an analytic solution would obviously
be optimal.
get.cors <- function(i, x, y, N){
end=i*N
2008 Aug 28
2
sample consecutive integers efficiently
Hi all,
I have some rough code to sample consecutive integers with length
according to a vector of lengths
#sample space (representing positions)
pos<-c(1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20)
#sample lengths
lengths<-c(2,3,2)
From these two vectors I need a vector of sampled positions.
the sampling is without replacement, making things tough as the sampled
integers need
2007 Aug 04
2
Mixture of Normals with Large Data
All:
I am trying to fit a mixture of 2 normals with > 110 million observations. I
am running R 2.5.1 on a box with 1gb RAM running 32-bit windows and I
continue to run out of memory. Does anyone have any suggestions?
Thanks so much,
Tim
2008 Nov 06
2
Confidence limits for the parameter of the Poisson distribution
Hi all,
So far, the only way I know to get the confidence limits for the Poisson
distribution is to use a look-up table indexed by two parameters (the
number of observations x and the confidence level, e.g. 95%), and the table is
limited by the maximum number of observations (x <= 50).
I know the formula to compute the CI; however, mathematically it is not
easy to do. So, anyone know an R
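A hedged sketch, assuming the formula in question is the exact interval based on chi-squared quantiles for a single observed count x. The helper name `pois_ci` is hypothetical; `qchisq()` and `poisson.test()` are from the base stats package, and the latter reports an interval of the same kind.

```r
# Exact two-sided confidence interval for a Poisson mean given one
# observed count x. `pois_ci` is a hypothetical helper name.
pois_ci <- function(x, conf.level = 0.95) {
  alpha <- 1 - conf.level
  lower <- qchisq(alpha / 2, df = 2 * x) / 2            # equals 0 when x = 0
  upper <- qchisq(1 - alpha / 2, df = 2 * (x + 1)) / 2
  c(lower = lower, upper = upper)
}

ci_small <- pois_ci(10)    # a count that a look-up table would cover
ci_large <- pois_ci(500)   # a count beyond the x <= 50 table limit

# Built-in alternative from the stats package.
ci_builtin <- poisson.test(10, conf.level = 0.95)$conf.int
```

Because the quantile functions are evaluated numerically, the table limit on x does not apply.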
2008 Mar 14
1
Rejection sampling to draw from distributions
Dear friends,
Please find below the code that I have employed for a rejection sampler to draw from asymmetric laplace distributions. I was wondering if this code can be written more efficiently? Are there more efficient ways of drawing random numbers from asymmetric laplace distributions??
Thanks in advance for your help and have a great weekend.
Regards
Anup
2008 Nov 01
2
calculation for standard normal cumulative distribution
Does anyone know a function or method for the standard normal cumulative
distribution?
Φ(z = -0.1) = ?
also
Φ(z = ?) = 0.025
Thank you,
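A short sketch, assuming the question asks for the standard normal CDF and its inverse: in base R these are `pnorm()` and `qnorm()`. The variable names `p` and `z` are only for illustration.

```r
# Standard normal CDF evaluated at z = -0.1 (roughly 0.46).
p <- pnorm(-0.1)

# Inverse CDF: the z whose lower-tail probability is 0.025 (roughly -1.96).
z <- qnorm(0.025)
```

Both functions also accept `mean` and `sd` arguments for non-standard normal distributions.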
2009 Oct 13
7
lapply() recursively
Hi all,
I was wondering whether it is possible to use the lapply() function
to alter the value of the input, something in the spirit of :
a1<-runif(100)
a2<-function(i){
a1[i]<-a1[i-1]*a1[i];a1[i]
}
a3<-lapply(2:100,a2)
Something akin to a for() loop, but using the lapply() infrastructure.
I haven't been able to get rapply() to do this.
The reason is that the "real"
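A sketch of why the snippet above does not accumulate, and one alternative. `lapply()` evaluates each element independently, and the assignment inside `a2` changes only a local copy of `a1`. `Reduce()` with `accumulate = TRUE` carries the previous result forward instead. The names `loop_result` and `reduce_result` are hypothetical, and the post's "real" function is unknown, so the product is used only because it appears in the example.

```r
set.seed(1)  # only to make the illustration reproducible
a1 <- runif(100)

# The intended recurrence written as an explicit loop, for reference.
loop_result <- a1
for (i in 2:100) {
  loop_result[i] <- loop_result[i - 1] * loop_result[i]
}

# Same recurrence with Reduce(): `prev` is the running value,
# `cur` is the next element of a1.
reduce_result <- Reduce(function(prev, cur) prev * cur, a1, accumulate = TRUE)
```

For this particular recurrence `cumprod(a1)` gives the same values; `Reduce()` is the more general form when the update step is an arbitrary two-argument function.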
2007 Nov 19
1
Finding proportion of observations that are outliers from the left tail of the normal distribution
Hi fellow users
I have a new R problem I am hoping to get some pointers on. I have a
dataset that is approximately normally distributed but with a fat left
tail. I am interested in a good measure of how much fatter the
left tail is than can be expected from a normal distribution. One
thing I've tried was fitting a two-component mixture model with the
Rmix package, but I am also interested
2008 Aug 21
1
[dist] how to analyse a large matrix?
Hi all,
I have a matrix of about 100.000 x 4 that I need to classify using
euclidean metric. For that I am using dist or daisy functions, but I
am afraid that the message: Error in vector("double", length) : vector
size specified is too large, means too many rows.
Can anyone suggest how I should analyse this matrix?
Thanks in advance,
Diogo André Alagador
MNCN,CSIC, Madrid, Spain
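A rough sketch of the arithmetic behind that error, assuming "100.000" means one hundred thousand rows. `dist()` stores one double per pair of rows, so the memory needed grows with the square of the row count. As one possible workaround, `kmeans()` from the stats package partitions on Euclidean distance to cluster centres without building the full distance matrix. The stand-in data `x` and the choice of 3 clusters are hypothetical.

```r
n_obs <- 1e5   # assumed number of rows
n_var <- 4

# Number of pairwise distances and the bytes needed at 8 bytes per double.
n_pairs <- n_obs * (n_obs - 1) / 2
bytes_needed <- n_pairs * 8          # about 40 GB

# Stand-in data; the real matrix would replace this.
set.seed(1)
x <- matrix(rnorm(n_obs * n_var), ncol = n_var)

# k-means works from the data and the centres only, so memory use
# stays proportional to the number of rows.
fit <- kmeans(x, centers = 3, iter.max = 50)
```

Whether a partitioning method is an acceptable substitute depends on the classification wanted; hierarchical clustering on a subsample is another option.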
2008 Dec 03
1
help on tapply using sample with differing sample-sizes
Hello, My question likely got buried so I am reposting it in the hopes that someone has an answer. I have thought more about the question and modified my question. I hope tha
my specific question is:
I am attempting to create a bootstrap procedure for a finite sample using the theory of Rao and Wu, JASA (1988) that replicates within each strata (h) n_h - 1 times. To this end, I require a
2007 Sep 11
1
Fitting Data to a Noncentral Chi-Squared Distribution using MLE
Hi, I have written out the log-likelihood function to fit some data I have (called ONES20) to the non-central chi-squared distribution.
>library(stats4)
>ll<-function(lambda,k){x<-ONES20; 25573*0.5*lambda-25573*log(2)-sum(-x/2)-log((x/lambda)^(0.25*k-0.5))-log(besselI(sqrt(lambda*x),0.5*k-1,expon.scaled=FALSE))}
> est<-mle(minuslog=ll,start=list(lambda=0.05,k=0.006))
2007 Aug 29
3
OT: distribution of a pathological random variate
Folks,
I wonder if anything could be said about the distribution of a random variate x, where
x = N(0,1)/N(0,1)
Obviously x is pathological because it could be 0/0. If we exclude this point, so the set is {x/(0/0)}, does x have a well-defined distribution? Or does there exist a distribution that approximates x?
(The case could be generalized of course to N(mu1, sigma1)/N(mu2, sigma2) and one
2008 Jun 05
1
Limit distribution of continuous-time Markov process
I have (below) an attempt at an R script to find the limit distribution
of a continuous-time Markov process, using the formulae outlined at
http://www.uwm.edu/~ziyu/ctc.pdf, page 5.
First, is there a better exposition of a practical algorithm for doing
this? I have not found an R package that does this specifically, nor
anything on the web.
Second, the script below will give the right
2006 Jul 20
3
Permutation Distribution
Hallo
Is there an elegant way to do the following:
Dataset consists of 2 variables: var1: some measurements, and var2: a grouping variable with two values, 1 and 2.
There are (say) 10 measurements from group 1 and 15 measurements from group 2.
The idea is to study the permutation distribution of mean(group 1) * mean(group2).
One way would be to permute 1s and 2s and select the corresponding
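A minimal sketch of the label-permutation idea described here, assuming a Monte Carlo approximation is acceptable in place of full enumeration. The stand-in data, the helper name `stat`, and the number of permutations `n_perm` are hypothetical.

```r
set.seed(1)  # only to make the illustration reproducible

# Stand-in data: 10 measurements in group 1 and 15 in group 2.
var1 <- rnorm(25)
var2 <- rep(c(1, 2), times = c(10, 15))

# Statistic of interest: product of the two group means.
stat <- function(x, g) mean(x[g == 1]) * mean(x[g == 2])

observed <- stat(var1, var2)

# Shuffle the group labels and recompute the statistic each time.
n_perm <- 2000
perm <- replicate(n_perm, stat(var1, sample(var2)))
```

`sample(var2)` returns a random reordering of the labels, so each replicate keeps the group sizes at 10 and 15. A histogram of `perm` with `observed` marked on it gives a picture of the permutation distribution.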
2010 Mar 30
1
Multivariate hypergeometric distribution version of phyper()
Dear R Users,
I employed the phyper() function to estimate the likelihood that the
number of genes overlapping between 2 different lists of genes is due to
chance. This appears to work appropriately.
Now I want to try this with 3 lists of genes, which phyper() does not
appear to support.
Some googling suggests I can utilize the multivariate hypergeometric
distribution to achieve this, e.g.:
2011 Dec 21
4
qqnorm & huge datasets
Hi,
When I run qqnorm on a vector of length 10M+ I get a huge pdf file which
cannot be loaded by acroread or evince.
Any suggestions? (apart from sampling the data).
Thanks.
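One possible approach that uses all of the data rather than a random subsample: compute sample quantiles on a fixed grid of probabilities and plot those against the matching normal quantiles, so the PDF holds a few thousand points instead of millions. The stand-in vector `x`, the grid size of 2000, and the output file are assumptions for illustration.

```r
set.seed(1)
x <- rnorm(1e6)   # stand-in for the 10M+ vector

# Fixed grid of probabilities; 2000 plotted points is a hypothetical choice.
probs <- ppoints(2000)
sample_q <- quantile(x, probs, names = FALSE)   # quantiles of the full data
theory_q <- qnorm(probs)                        # matching normal quantiles

out <- tempfile(fileext = ".pdf")
pdf(out)
plot(theory_q, sample_q,
     xlab = "Theoretical Quantiles", ylab = "Sample Quantiles")
# Reference line with intercept mean(x) and slope sd(x).
abline(mean(x), sd(x))
dev.off()
```

A grid of quantiles thins out the extreme tails, so a denser grid near 0 and 1 may be worth adding if the tails are the point of interest. Writing to a raster device such as `png()` is another way to keep the file size bounded.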
--
Sam Steingold (http://sds.podval.org/) on Ubuntu 11.10 (oneiric) X 11.0.11004000