similar to: Randomly split a sample in two equal subsamples

Displaying 20 results from an estimated 4000 matches similar to: "Randomly split a sample in two equal subsamples"

2003 Feb 12
1
Na/NaN error in subsampling script
R-help readers, I'm having a problem with an R script (see below), which regularly generates the error message, Error in start:(start + (sample.length - 1)) : NA/NaN argument, for which I am unsure of the cause. In essence, the script (below) generates the start and end points for random subsamples from along a vector (in reality a transect (of a given length,
2011 May 21
2
unbalanced anova with subsampling (Type III SS)
Hello R-users, I am trying to obtain Type III SS for an ANOVA with subsampling. My design is slightly unbalanced with either 3 or 4 subsamples per replicate. The basic aov model would be: fit <- aov(y~x+Error(subsample)) But this gives Type I SS and not Type III. But, using the drop() option: drop1(fit, test="F") I get an error message: "Error in
2005 Jan 14
5
subsampling
hi, I would like to subsample the array c(1:200) at random into ten subsamples v1,v2,...,v10. I tried with to go progressively like this: > x<-c(1:200) > v1<-sample(x,20) > y<-x[-v1] > v2<-sample(y,20) and then I want to do: >x<-y[-v2] Error: subscript out of bounds.
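One thing the snippet above shows is that `v2` holds values drawn from `y`, not positions in `y`, so `y[-v2]` does not remove the sampled elements. A minimal sketch that avoids the bookkeeping altogether is to shuffle once and cut the shuffled vector into blocks. The seed and the list name `groups` are assumptions made for illustration, not part of the original post.

```r
set.seed(1)  # assumption: fixed seed only so the sketch is reproducible

x <- 1:200

# Shuffle the whole vector once, then cut it into ten consecutive blocks of 20.
# Every element lands in exactly one block, so no element is drawn twice.
shuffled <- sample(x)
groups <- split(shuffled, rep(1:10, each = 20))
names(groups) <- paste0("v", 1:10)  # v1 ... v10, as in the question

groups$v1  # first random subsample of 20 values
```

If the subsamples must be drawn one at a time, `setdiff(x, v1)` removes by value rather than by position.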
2012 Aug 16
1
Big Data reading subsample csv
Hello, I'm most grateful for your time to read this. I have a uber size 30GB file of 6 million records and 3000 (mostly categorical data) columns in csv format. I want to bootstrap subsamples for multinomial regression, but it's proving difficult even with my 64GB RAM in my machine and twice that swap file , the process becomes super slow and halts. I'm thinking about generating
2009 Apr 06
3
how to subsample all possible combinations of n species taken 1:n at a time?
Hello I apologise for the length of this entry but please bear with me. In short: I need a way of subsampling communities from all possible communities of n taxa taken 1:n at a time without having to calculate all possible combinations (because this gives me a memory error - using combn() or expand.grid() at least). Does anyone know of a function? Or can you help me edit the combn or
2003 Dec 04
4
Selecting subsamples
Hi all, I'm working with a dataset with 9 columns and 2000 rows. Each row represents an individual and one of the columns represents the volume of that individual (measured in cubic meters). I'd like to select a sample from this dataset (without considering any probability of the rows) in which the sum of the volume of the individuals in that sample >= 100 cubic m. I'll appreciate any
2003 Aug 06
1
Standard error of standard deviation: bootstrap or theoretical results?
Dear R users, This is more a statistical question rather than an R question. I'd appreciate it if you can give me some suggestions. I have a sample of a time series (sample size 500, fat tail in density). I am trying to calculate the Standard error of standard deviation of a sub-block-sample (sample size 250). I take 100 this kind of sub-block-sample, randomly. For these 100 subsamples, I
2007 Oct 09
1
pseudo code
Hey there! I got a pseudo code and don't know how to apply it to R, maybe someone can help me: Input: A dataset X, kmax: maximum number of clusters, num_subsamples: number of subsamples. Output: S(i; k) - a distribution of similarities between partitions into k clusters of a reference clustering and clustering of subsamples; i = 1 to num_subsamples Requires: T = cluster(X): A hierarchical
2012 Mar 21
1
fwdmsa package: Error in search.normal(X[samp, ], verbose = FALSE) : At least one item has no variance
I'm using the fwdmsa package to identify deviant cases in a Mokken scale analysis. I've run into a problem., separate from the one I posted previously. The problem comes with items that are "easy" by IRT standards. A good scale should include a range of difficulties; yet when I include "easy" items in a forward search I continuously run into the problem that these items
2011 Sep 08
1
random sampling but with caveats!
Hi, I wonder if someone can help me. I have built a gam model to predict the presence of cold water corals and am now trying to evaluate my model by splitting my dataset into training/test datasets. In an ideal world I would use the sample() function to randomly select rows of data for me so for example with 936 rows of data in my HH dataset I might say ss <- sample(nrow(HH), size =
2011 May 13
0
routine for dependent correlation test with stratified random sample
Dear R-List, I would like to have a large number of stratified random subsamples drawn from my dataframe and automatically test for correlation differences in every subsample. Let this be my dataframe df<-data.frame(group=c(rep(1,5),rep(2,5),rep(3,5)),a=c(3,4,5,6,3,4,5,4,5,4,1,2,1,2,1),b=c(1,2,3,4,5,3,4,3,4,5,6,5,6,2,3),c=c(2,2,3,3,5,1,1,6,6,5,6,1,1,2,1)) Then I would like to have n
2006 Mar 28
2
Skewed t distribution
Dear All, I am working with skewed-t copula in my research recently, so I needed to write an mle procedure instead of using a standard fit one; I stick to the sn package. On subsamples of the entire population that I deal with, everything is fine. However, on the total sample (difference in cross-sectional dimension: 30 vs 240) things go wrong - the objective function diverges to infinity. I
2004 Jan 23
2
keyboard activity logging in FreeBSD
Hi, I would like to log all keyboard activities in all ttys in my FreeBSD 5.2 box. Is there anyway to do it? I read the watch man page and it seems like I should run watch with tty as many times as number of ttys. Am I right? Also is it possible to do the log in invisible way? The main reason is to log all commands typed in shell and tty and send the log to the remote server. How can I
2009 May 25
1
Working with daily data
Hello I have daily S&P 500 from 1950 for which I would like to do some time series analysis in R. Could someone please show me an example of how to create a ts/ irts object for my data? Additionally, how do I create monthly subsamples of the data. I've experimented with the window function but haven't had any luck. Thank you Ian
2011 Jun 22
1
Subsetting data systematically
I would like to subset data from a larger dataset and generate a smaller dataset. However, I don't want to use sample() because it does it randomly. I would like to take non-random subsamples, for example, every 2nd number, or every 3rd number. Is there a procedure that does this? Thanks, Nate
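Systematic (non-random) subsampling of this kind can be sketched with `seq()` used as an index. The vector `x` and data frame `df` below are hypothetical stand-ins for the poster's data.

```r
# Hypothetical data standing in for the poster's dataset.
x  <- 101:120
df <- data.frame(id = 1:12, value = (1:12) * 10)

# Every 2nd element of a vector, starting from the first.
every_second <- x[seq(1, length(x), by = 2)]

# Every 3rd row of a data frame, starting from the third.
every_third_row <- df[seq(3, nrow(df), by = 3), ]
```

Changing the first argument of `seq()` shifts the starting position; changing `by` sets the step.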
2001 Dec 10
1
Graphics with moderately large amounts of data
Hi, A major attraction to R and to S-plus are the graphics. (Up to now my experience is with STATA and SAS.) Most of the graphical examples that I have seen in the documentation are for relatively small size data sets. I am working with a moderately large data set -- the order of magnitude is 180,000 observations by 50 variables. There seem to be standard problems that I keep bumping into in
2004 Jun 18
1
how to store estimates results as scalars of a matrix?
Dear R users, I've written a loop to generate Moran's test (spdep package) on several subsamples of a large dataset. See below a short example. My loop is working fine, however I would like to be able to store the test results as lines of a matrix, that I would later be able to export as a dataset. My problem is that I'm not sure how I could do this using R. Any help will be much
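A common base-R pattern for this is to preallocate a matrix with one row per subsample and fill one row on each pass through the loop. The sketch below uses `t.test()` on simulated data purely as a stand-in for the poster's Moran test, whose code is not shown; the object names are assumptions.

```r
set.seed(123)  # assumption: fixed seed only so the sketch is reproducible

# Hypothetical data: 5 subsamples of 30 observations each.
subsamples <- replicate(5, rnorm(30), simplify = FALSE)

# Preallocate one row per subsample and one named column per stored quantity.
results <- matrix(NA_real_, nrow = length(subsamples), ncol = 2,
                  dimnames = list(NULL, c("statistic", "p.value")))

for (i in seq_along(subsamples)) {
  tt <- t.test(subsamples[[i]])               # stand-in for the real test
  results[i, ] <- c(tt$statistic, tt$p.value) # store this run as row i
}

results_df <- as.data.frame(results)  # ready for write.csv() if needed
```

The same loop works for any test object that returns its estimates as list components; only the line that fills `results[i, ]` needs to change.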
2011 Sep 21
2
Limitations of audio processing in R
Hello everybody I am trying to process audio files in R and had some problems with files size. I'm using R packages 'audio' and 'sound'. I'm trying a really simple thing and it is working well with small sized .wav files. When I try to open huge audio files I received this error message: "cannot allocate vector of size 2.7 Gb". My job is open in R a 3-hour .wav file,
2010 Jul 12
2
exercise in frustration: applying a function to subsamples
From the documentation I have found, it seems that one of the functions from package plyr, or a combination of functions like split and lapply would allow me to have a really short R script to analyze all my data (I have reduced it to a couple hundred thousand records with about half a dozen records. I get the same result from ddply and split/lapply: >
2004 Jul 26
1
group definition for a bootstrap
Hi, This is probably really simple, but I am clearly not R-minded, I have read the help files, and reread them, and I still can't work out what to do... I have a data frame (d) with 3 columns (age (0-5), quarter (1-4) and x). I want to estimate the precision of my mean x by age and quarter, so I want to carry out a bootstrap for each group. I am trying to do this within a loop, so I don't
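A per-group bootstrap of the mean can be sketched without an explicit loop by letting `aggregate()` apply a bootstrap function to each age-by-quarter cell. This swaps in `aggregate()` for the loop the poster describes; the simulated data frame `d`, the replicate count `B`, and the helper `boot_se` are all assumptions for illustration.

```r
set.seed(42)  # assumption: fixed seed only so the sketch is reproducible

# Hypothetical stand-in for the poster's data frame d (age 0-5, quarter 1-4, x).
d <- data.frame(
  age     = rep(0:5, each = 40),
  quarter = rep(rep(1:4, each = 10), times = 6),
  x       = rnorm(240)
)

B <- 500  # number of bootstrap replicates (assumed)

# Bootstrap standard error of the mean for one group's x values:
# resample with replacement B times and take the sd of the resampled means.
boot_se <- function(v) sd(replicate(B, mean(sample(v, replace = TRUE))))

# Apply boot_se to x within every age-by-quarter combination.
se_by_group <- aggregate(x ~ age + quarter, data = d, FUN = boot_se)
names(se_by_group)[3] <- "boot_se"
```

Each row of `se_by_group` then holds one group and its bootstrap standard error.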