thr3ads.net - similar to: "sample from very large distribution"

Displaying 20 results from an estimated 20000 matches similar to: "sample from very large distribution"

2010 Jul 05

to remove duplicate values

Dear R family, Suppose I have two series.

order  value
1      0.52
2      0.23
3      0.43
4      0.21
5      0.32
6      0.32
7      0.32
8      0.32
9      0.32
10     0.12
11     0.46
12     0.09
13     0.32
14     0.25

For these two series, I figured out the way to detect the locations of duplicate values. The next thing to do is remove the repeated values except for a value that would not be next to each other. In other words, while keeping the

sample "n" random positions from a matrix

2006 Dec 10

Hi there, I have a binary matrix (dim 100x100) filled with values 0 and 1. I need to select and record "n" positions of that matrix where the values are 1. How can I do that? Thanks for all, Miltinho Brazil
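One possible approach to the question above, sketched in R. It assumes "positions" means (row, column) pairs and that sampling is without replacement; the matrix `m`, the count `n`, and the seed are made-up illustration values, not taken from the original post.

```r
# Illustrative data: a 100x100 binary matrix (assumed, not from the post)
set.seed(1)
m <- matrix(rbinom(100 * 100, size = 1, prob = 0.3), nrow = 100)
n <- 10

# All (row, col) positions whose value is 1, as a two-column matrix
ones <- which(m == 1, arr.ind = TRUE)

# Guard against asking for more positions than there are 1s
stopifnot(nrow(ones) >= n)

# Draw n distinct row indices of 'ones', then keep those positions
picked <- ones[sample(nrow(ones), n), , drop = FALSE]
```

Indexing the matrix with the two-column result, as in `m[picked]`, returns the values at the sampled positions, which is a quick way to confirm that every one of them is 1.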

Cluster analysis: hclust manipulation possible?

2009 Nov 16

I am doing cluster analysis [hclust(Dist, method="average")] on data that potentially contains redundant objects. As expected, the inclusion of redundant objects affects the clustering result, i.e., the data a1 = a2 = a3, b, c, d, e1 = e2 is likely to cluster differently from the same data without the redundancy, i.e., a1, b, c, d, e1. This is apparent when the outcome is visualized

why does sample(x, n) give the same n items in every separate runs?

2008 Oct 30

Hello R users, I have gene expression data for two groups of genes (large and small). Gene expression intensities of those genes are classified into levels 1 to 10. What I want is to draw from the large group a random set of genes that has the same levels as the small group, using sample(). I used smallvec to hold the number of genes in each level (1 to 10) for the small group, and largevec for the large group.

Sampling distribution (PDF & CDF) of correlation

2008 Jul 17

Hi all, I'm looking for an analytic method to obtain the PDF & CDF of the sampling distribution of a given correlation (rho) at a given sample size (N). I've attached code describing a monte carlo method of achieving this, and while it is relatively fast, an analytic solution would obviously be optimal. get.cors <- function(i, x, y, N){ end=i*N

sample consecutive integers efficiently

2008 Aug 28

Hi all, I have some rough code to sample consecutive integers with length according to a vector of lengths

#sample space (representing positions)
pos<-c(1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20)
#sample lengths
lengths<-c(2,3,2)

From these two vectors I need a vector of sampled positions. The sampling is without replacement, making things tough as the sampled integers need

Mixture of Normals with Large Data

2007 Aug 04

All: I am trying to fit a mixture of 2 normals with > 110 million observations. I am running R 2.5.1 on a box with 1 GB RAM running 32-bit Windows and I continue to run out of memory. Does anyone have any suggestions? Thanks so much, Tim

Confidence limits for the parameter of the Poisson distribution

2008 Nov 06

Hi all, So far the only way I know to get confidence limits for the Poisson distribution is to use the look-up table indexed by two parameters (the number of observations x and the confidence level, e.g. 95%), and the table is limited by the maximum number of observations (x <= 50). I know the formula to compute the CI; however, mathematically it is not easy to do. So, does anyone know an R
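A short R sketch that may help with the question above. It assumes the poster means the exact (Garwood) interval for a single observed count, which can be written with chi-squared quantiles; the count `x = 10` and the 95% level are illustration values. Base R's `poisson.test()` reports the same kind of interval and is used here only as a cross-check.

```r
# Illustrative inputs (assumed): one observed count and a confidence level
x <- 10
conf <- 0.95
alpha <- 1 - conf

# Exact limits via chi-squared quantiles; the lower limit is 0 when x is 0
lower <- if (x == 0) 0 else qchisq(alpha / 2, df = 2 * x) / 2
upper <- qchisq(1 - alpha / 2, df = 2 * x + 2) / 2

# Cross-check against the interval reported by stats::poisson.test()
builtin <- as.numeric(poisson.test(x, conf.level = conf)$conf.int)
```

Because the quantile functions work for any non-negative count, this is not limited to x <= 50 the way a printed table is.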

Rejection sampling to draw from distributions

2008 Mar 14

Dear friends, Please find below the code that I have employed for a rejection sampler to draw from asymmetric Laplace distributions. I was wondering if this code can be written more efficiently? Are there more efficient ways of drawing random numbers from asymmetric Laplace distributions? Thanks in advance for your help and have a great weekend. Regards Anup

calculation for standard normal cumulative distribution

2008 Nov 01

Does anyone know a function or method for the standard normal cumulative distribution? Φ(z=-0.1)=? also Φ(z=?)=0.025 Thank you,
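For reference, base R covers both directions of the question above: `pnorm()` gives the standard normal CDF and `qnorm()` gives its inverse. A minimal sketch; the variable names `p` and `z` are only for illustration.

```r
# Cumulative probability at z = -0.1 under the standard normal
p <- pnorm(-0.1)

# The z value whose cumulative probability is 0.025
z <- qnorm(0.025)

print(p)
print(z)
```

Both functions default to mean 0 and standard deviation 1, so no further arguments are needed for the standard normal case.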

lapply() reccursively

2009 Oct 13

Hi all, I was wondering whether it is possible to use the lapply() function to alter the value of the input, something in the spirit of : a1<-runif(100) a2<-function(i){ a1[i]<-a1[i-1]*a1[i];a1[i] } a3<-lapply(2:100,a2) Something akin to a for() loop, but using the lapply() infrastructure. I haven't been able to get rapply() to do this. The reason is that the "real"
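One way to express the kind of recurrence described above without a for() loop is `Reduce()` with `accumulate = TRUE`, which passes each result on to the next step. This is a sketch for the toy running-product example only; the poster's "real" function is not shown in the excerpt, so the helper `step` is a hypothetical stand-in for it.

```r
set.seed(42)
a1 <- runif(100)

# Hypothetical stand-in for the real update rule: new value from the
# previous result and the current element
step <- function(prev, current) prev * current

# accumulate = TRUE keeps every intermediate result, so running[i]
# depends on running[i - 1], as in the loop version
running <- Reduce(step, a1, accumulate = TRUE)
```

For this particular rule the result matches `cumprod(a1)`; `Reduce()` is the more general tool when the update is not a plain product. `lapply()` itself cannot do this, because each call works on its own copy of `a1` and assignments inside the function are not visible to later calls.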

Finding proportion of observations that are outliers from the left tail of the normal distribution

2007 Nov 19

Hi fellow users, I have a new R problem I am hoping to get some pointers on. I have a dataset that is approximately normally distributed but with a fat left tail. I am interested in a good measurement of how much fatter the left tail is than can be expected from a normal distribution. One thing I tried was fitting a two component mixture model with the Rmix package but I am also interested

[dist]how to analise a large matrix?

2008 Aug 21

Hi all, I have a matrix of about 100.000 x 4 that I need to classify using the Euclidean metric. For that I am using the dist or daisy functions, but I am afraid that the message "Error in vector("double", length) : vector size specified is too large" means too many rows. Can anyone suggest how I should analyse this matrix? Thanks in advance, Diogo André Alagador MNCN,CSIC, Madrid, Spain

help on tapply using sample with differing sample-sizes

2008 Dec 03

Hello, My question likely got buried so I am reposting it in the hopes that someone has an answer. I have thought more about the question and modified my question. I hope that my specific question is: I am attempting to create a bootstrap procedure for a finite sample using the theory of Rao and Wu, JASA (1988) that replicates within each stratum (h) n_h - 1 times. To this end, I require a

Fitting Data to a Noncentral Chi-Squared Distribution using MLE

2007 Sep 11

Hi, I have written out the log-likelihood function to fit some data I have (called ONES20) to the non-central chi-squared distribution.

> library(stats4)
> ll<-function(lambda,k){x<-ONES20; 25573*0.5*lambda-25573*log(2)-sum(-x/2)-log((x/lambda)^(0.25*k-0.5))-log(besselI(sqrt(lambda*x),0.5*k-1,expon.scaled=FALSE))}
> est<-mle(minuslog=ll,start=list(lambda=0.05,k=0.006))

OT: distribution of a pathological random variate

2007 Aug 29

Folks, I wonder if anything could be said about the distribution of a random variate x, where x = N(0,1)/N(0,1). Obviously x is pathological because it could be 0/0. If we exclude this point, so the set is {x/(0/0)}, does x have a well defined distribution? Or does there exist a distribution that approximates x? (The case could be generalized of course to N(mu1, sigma1)/N(mu2, sigma2) and one

Limit distribution of continuous-time Markov process

2008 Jun 05

I have (below) an attempt at an R script to find the limit distribution of a continuous-time Markov process, using the formulae outlined at http://www.uwm.edu/~ziyu/ctc.pdf, page 5. First, is there a better exposition of a practical algorithm for doing this? I have not found an R package that does this specifically, nor anything on the web. Second, the script below will give the right

Permutation Distribution

2006 Jul 20

Hallo Is there an elegant way to do the following: Dataset consists of 2 variables: var1: some measurements, and var2: a grouping variable with two values, 1 and 2. There are (say) 10 measurements from group 1 and 15 measurements from group 2. The idea is to study the permutation distribution of mean(group 1) * mean(group2). One way would be to permute 1s and 2s and select the corresponding

Multivariate hypergeometric distribution version of phyper()

2010 Mar 30

Dear R Users, I employed the phyper() function to estimate the likelihood that the number of genes overlapping between 2 different lists of genes is due to chance. This appears to work appropriately. Now I want to try this with 3 lists of genes, which phyper() does not appear to support. Some googling suggests I can utilize the multivariate hypergeometric distribution to achieve this, e.g.:

qqnorm & huge datasets

2011 Dec 21

Hi, When I run qqnorm on a vector of length 10M+ I get a huge pdf file which cannot be loaded by acroread or evince. Any suggestions? (apart from sampling the data). Thanks. -- Sam Steingold
