thr3ads.net - similar to: "Randomly split a sample in two equal subsamples"

Displaying 20 results from an estimated 4000 matches similar to: "Randomly split a sample in two equal subsamples"

2003 Feb 12

Na/NaN error in subsampling script

R-help readers, I'm having a problem with an R script (see below), which regularly generates the error message, Error in start:(start + (sample.length - 1)) : NA/NaN argument, for which I am unsure of the cause. In essence, the script (below) generates the start and end points for random subsamples from along a vector (in reality a transect (of a given length,

2011 May 21

unbalanced anova with subsampling (Type III SS)

Hello R-users, I am trying to obtain Type III SS for an ANOVA with subsampling. My design is slightly unbalanced with either 3 or 4 subsamples per replicate. The basic aov model would be: fit <- aov(y~x+Error(subsample)) But this gives Type I SS and not Type III. But, using the drop() option: drop1(fit, test="F") I get an error message: "Error in

2005 Jan 14

subsampling

hi, I would like to subsample the array c(1:200) at random into ten subsamples v1,v2,...,v10. I tried with to go progressively like this: > x<-c(1:200) > v1<-sample(x,20) > y<-x[-v1] > v2<-sample(y,20) and then I want to do: >x<-y[-v2] Error: subscript out of bounds.
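
A note on the snippet above: `v1` and `v2` hold sampled values, not positions, so using them as negative indices only lines up while the values happen to equal their positions (true for `x`, no longer true for `y`). A minimal sketch of one alternative, assuming the goal is ten disjoint random subsamples of 20 elements each; the seed and the names `shuffled`, `groups`, and `subsamples` are illustrative, not from the original post:

```r
set.seed(42)                            # assumed seed, only for reproducibility
x <- 1:200
shuffled <- sample(x)                   # random permutation of all 200 values
groups <- rep(1:10, each = 20)          # labels 1..10, twenty of each
subsamples <- split(shuffled, groups)   # list of ten disjoint subsamples
v1 <- subsamples[[1]]                   # first subsample; v2..v10 likewise
```

Shuffling once and cutting the permutation into blocks avoids repeated removal by index, and every element of `x` ends up in exactly one subsample.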

2012 Aug 16

Big Data reading subsample csv

Hello, I'm most grateful for your time to read this. I have a uber size 30GB file of 6 million records and 3000 (mostly categorical data) columns in csv format. I want to bootstrap subsamples for multinomial regression, but it's proving difficult even with my 64GB RAM in my machine and twice that swap file , the process becomes super slow and halts. I'm thinking about generating

2009 Apr 06

how to subsample all possible combinations of n species taken 1:n at a time?

Hello I apologise for the length of this entry but please bear with me. In short: I need a way of subsampling communities from all possible communities of n taxa taken 1:n at a time without having to calculate all possible combinations (because this gives me a memory error - using combn() or expand.grid() at least). Does anyone know of a function? Or can you help me edit the combn or

2003 Dec 04

Selecting subsamples

Hi all, I'm working with a dataset with 9 columns and 2000 rows. Each row represents an individual and one of the columns represents the volume of that individual (measured in cubic meters). I'd like to select a sample from this dataset (without considering any probability of the rows) in which the sum of the volume of the individuals in that sample >= 100 cubic m. I'll appreciate any

2011 Sep 08

random sampling but with caveats!

Hi, I wonder if someone can help me. I have built a gam model to predict the presence of cold water corals and am now trying to evaluate my model by splitting my dataset into training/test datasets. In an ideal world I would use the sample() function to randomly select rows of data for me so for example with 936 rows of data in my HH dataset I might say ss <- sample(nrow(HH), size =

2011 May 13

routine for dependent correlation test with stratified random sample

Dear R-List, I would like to have a large number of stratified random subsamples drawn from my dataframe and automatically test for correlation differences in every subsample. Let this be my dataframe df<-data.frame(group=c(rep(1,5),rep(2,5),rep(3,5)),a=c(3,4,5,6,3,4,5,4,5,4,1,2,1,2,1),b=c(1,2,3,4,5,3,4,3,4,5,6,5,6,2,3),c=c(2,2,3,3,5,1,1,6,6,5,6,1,1,2,1)) Then I would like to have n

2003 Aug 06

Standard error of standard deviation: bootstrap or theoretical results?

Dear R users, This is more a statistical question rather than an R question. I'd appreciate it if you can give me some suggestions. I have a sample of a time series (sample size 500, fat tail in density). I am trying to calculate the Standard error of standard deviation of a sub-block-sample (sample size 250). I take 100 this kind of sub-block-sample, randomly. For these 100 subsamples, I

2007 Oct 09

pseudo code

Hey there! I got a pseudo code and don't know how to apply it to R, maybe someone can help me: Input: A dataset X, kmax: maximum number of clusters, num_subsamples: number of subsamples. Output: S(i; k) - a distribution of similarities between partitions into k clusters of a reference clustering and clustering of subsamples; i = 1 to num_subsamples Requires: T = cluster(X): A hierarchical

2012 Mar 21

fwdmsa package: Error in search.normal(X[samp, ], verbose = FALSE) : At least one item has no variance

I'm using the fwdmsa package to identify deviant cases in a Mokken scale analysis. I've run into a problem, separate from the one I posted previously. The problem comes with items that are "easy" by IRT standards. A good scale should include a range of difficulties; yet when I include "easy" items in a forward search I continuously run into the problem that these items

2006 Mar 28

Skewed t distribution

Dear All, I am working with skewed-t copula in my research recently, so I needed to write an mle procedure instead of using a standard fit one; I stick to the sn package. On subsamples of the entire population that I deal with, everything is fine. However, on the total sample (difference in cross-sectional dimension: 30 vs 240) things go wrong - the objective function diverges to infinity. I

2004 Jan 23

keyboard activity logging in FreeBSD

Hi, I would like to log all keyboard activities in all ttys in my FreeBSD 5.2 box. Is there anyway to do it? I read the watch man page and it seems like I should run watch with tty as many times as number of ttys. Am I right? Also is it possible to do the log in invisible way? The main reason is to log all commands typed in shell and tty and send the log to the remote server. How can I

2017 Sep 25

Sample of a subsample

Hello everybody! I have the following problem: I'd like to select a sample from a subsample in a dataset. Actually, I don't want to select it, but to create a new variable sampleNo that indicates to which sample (one or two) a case belongs to. Lets suppose I have a dataset containing 40 cases: data <- data.frame(var1=seq(1:40), var2=seq(40,1)) The first sample (n=10) I drew like

2009 May 25

Working with daily data

Hello I have daily S&P 500 from 1950 for which I would like to do some time series analysis in R. Could someone please show me an example of how to create a ts/irts object for my data? Additionally, how do I create monthly subsamples of the data. I've experimented with the window function but haven't had any luck. Thank you Ian

2011 Jun 22

Subsetting data systematically

I would like to subset data from a larger dataset and generate a smaller dataset. However, I don't want to use sample() because it does it randomly. I would like to take non-random subsamples, for example, every 2nd number, or every 3rd number. Is there a procedure that does this? Thanks, Nate
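
A minimal sketch of the kind of systematic (non-random) subsetting asked about above, using base R's `seq()` to build the positions to keep. The vector `x` and the data frame `df` are hypothetical example data, not taken from the post:

```r
x <- 1:20                                       # hypothetical example vector
every_2nd <- x[seq(2, length(x), by = 2)]       # elements 2, 4, 6, ...
every_3rd <- x[seq(3, length(x), by = 3)]       # elements 3, 6, 9, ...

df <- data.frame(id = 1:20, value = (1:20)^2)   # hypothetical data frame
df_every_3rd <- df[seq(1, nrow(df), by = 3), ]  # rows 1, 4, 7, ...
```

Changing the first argument of `seq()` shifts the starting position, so the same pattern covers "every k-th element starting at position s".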

2001 Dec 10

Graphics with moderately large amounts of data

Hi, A major attraction to R and to S-plus are the graphics. (Up to now my experience is with STATA and SAS.) Most of the graphical examples that I have seen in the documentation are for relatively small size data sets. I am working with a moderately large data set -- the order of magnitude is 180,000 observations by 50 variables. There seem to be standard problems that I keep bumping into in

2017 Sep 25

Sample of a subsample

For personal aesthetic reasons, I changed the name "data" to "dat". Your code, with a slight modification:

    set.seed(1357)  ## for reproducibility
    dat <- data.frame(var1=seq(1:40), var2=seq(40,1))
    dat$sampleNo <- 0
    idx <- sample(seq(1,nrow(dat)), size=10, replace=F)
    dat[idx,"sampleNo"] <- 1
    ## yielding
    > dat
      var1 var2 sampleNo
    1    1   40

2004 Jun 18

how to store estimates results as scalars of a matrix?

Dear R users, I've written a loop to generate Moran's test (spdep package) on several subsamples of a large dataset. See below a short example. My loop is working fine, however I would like to be able to store the test results as lines of a matrix, that I would later be able to export as a dataset. My problem is that I'm not sure how I could do this using R. Any help will be much

2011 Sep 21

Limitations of audio processing in R

Hello everybody I am trying to process audio files in R and had some problems with files size. I'm using R packages 'audio' and 'sound'. I'm trying a really simple thing and it is working well with small sized .wav files. When I try to open huge audio files I received this error message: "cannot allocate vector of size 2.7 Gb". My job is open in R a 3-hour .wav file,