thr3ads.net - similar to: "sampling and testing"

Displaying 20 results from an estimated 700 matches similar to: "sampling and testing"

2010 Oct 21

RandomForest Proximity Matrix

Greetings R Users! I am posting to inquire about the proximity matrix in the randomForest R-package. I am having difficulty pushing very large data through the algorithm and it appears to hang on the building of the prox matrix. I have read on Dr. Breiman's website that in the original code a choice can be made between using an N x N matrix OR to increase the ability to compute large

2011 Nov 28

Retain parts of a matrix

Hi all, I'm working to apply a function that will generate a matrix of results only when a specific criterion is met. I want my final results to be a matrix with both the values that meet the criterion (the results of the function) and those that do not (the original numbers), in the same positions in the matrix. Here's a sample of what I would like to do: t.mean.1.c <- c(-15, -20,

2011 May 06

Installing rgdal in R: correct -configure flags for GDAL install on Linux Redhat

Hi, I'm installing rgdal but I keep having failures because I have not been able to find a good source of information for the correct configuration settings when installing GDAL. My error from the R install.packages("rgdal") is below. Can someone point me to a good source to tell me how to set after ./configure when installing GDAL? I'd like to be able to work with raster

2011 Feb 16

Timeseries Data Plotted as Monthly Boxplots

Hello, I'm trying to develop a box plot of time series data to look at the range in the data values over the entire period of record. My data initially starts out as a list of hourly data, and I've been using this code to turn it into the final ts array.

# Read in the station list
stn.list <- read.csv("/home/kbennett/fews/stnlist3", as.is=T, header=F)
# Read in

2006 Sep 27

Converting text to numbers

Hi, I have Forecast Class and Observed Class in a data matrix as below.

> Sample1
  FCT OBS
1 1   5
2 2   4
3 3-  3+
4 3   3
5 3+  3-
6 4   2
7 5   1

I want to find the difference between the Observed and Forecast Classes. How can I get this done? I tried the following to convert the 1 through 5 classes to 1 through 7 for both the OBS and FCT columns. > Sample1$OBS2 <- Sample1$OBS

2012 Jul 31

phantom NA/NaN/Inf in foreign function call (or something altogether different?)

Dear experts, Please forgive the puzzled title and the length of this message - I thought it would be best to be as complete as possible and to show the avenues I have explored. I'm trying to fit a linear model to data with a binary dependent variable (i.e. Target.ACC: accuracy of response) using lrm, and thought I would start from the most complex model (of which "sample1.lrm1" is

2012 Dec 04

computing marginal values based on multiple columns?

Hello all, I have what feels like a simple problem, but I can't find a simple answer. Consider this data frame:

> x <- data.frame(sample1=c(35,176,182,193,124), sample2=c(198,176,190,23,15), sample3=c(12,154,21,191,156), class=c('a','a','c','b','c'))
> x
  sample1 sample2 sample3 class
1      35     198      12     a
2     176     176

2011 Sep 26

How to Store the executed values in a dataframe & rle function

Hi group, This is what my test file looks like:

Chr   start     end       sample1  sample2
chr2  9896633   9896683   0        0
chr2  9896639   9896690   0        0
chr2  14314039  14314098  0        -0.35
chr2  14404467  14404502  0        -0.35
chr2  14421718  14421777  -0.43    -0.35
chr2  16031710  16031769  -0.43    -0.35
chr2  16036178  16036237  -0.43    -0.35
chr2  16048665  16048724  -0.43    -0.35
chr2  37491676  37491735  0        0
chr2  37702947  37703009  0        0

2010 Sep 07

average columns of data frame corresponding to replicates

Hi Group, I have a data frame below. Within this data frame there are samples (columns) that are measured more than once. Samples are indicated by "idx". So "id1" is present in columns 1, 3, and 5. Not every id is repeated. I would like to create a new data frame so that the repeated ids are averaged. For example, in the new data frame, columns 1, 3, and 5 of the original

2011 Sep 03

Loop with random sampling and write.table

Hi! I need to perform this simple sampling function several hundred times:

x1=as.character(rnorm(1000, 100, 15))
x2=as.character(rnorm(1000, 150, 10))
y1=as.data.frame(x1,x2)
sample1=as.data.frame(sample(y1$x1, 12, replace = FALSE, prob = NULL))
sample1
write.table(sample1, "sample1.txt", sep=" ",row.names=F,quote=F)

My knowledge of loops is quite low. How can I produce 100
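One way the repeated sampling in this question might be looped is sketched below. The repetition count of 100, the output directory (a temporary one), and the file-name pattern are assumptions, since the post is cut off before it says what exactly should be produced. The sketch also keeps `x1` numeric instead of converting it to character, which is a simplification and not the poster's code.

```r
# Minimal sketch: draw 12 values without replacement, n_rep times,
# and write each draw to its own text file.
set.seed(42)                    # assumed, only to make the run reproducible
n_rep   <- 100                  # assumed number of repetitions
out_dir <- tempdir()            # assumed output location

x1 <- rnorm(1000, 100, 15)      # population to sample from, as in the post

for (i in seq_len(n_rep)) {
  # One column of 12 sampled values
  s <- data.frame(x1 = sample(x1, 12, replace = FALSE))
  # Hypothetical naming scheme: sample1.txt, sample2.txt, ...
  out_file <- file.path(out_dir, paste0("sample", i, ".txt"))
  write.table(s, out_file, sep = " ", row.names = FALSE, quote = FALSE)
}
```

Building the file name with `paste0()` inside the loop is what replaces the hard-coded "sample1.txt" from the original snippet.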

2012 Jan 25

help to split a file name using "strsplit" function

Dear Researchers, I have several files like this example: Myfile_MyArea1_sample1.txt. I wish to split it into "Myfile", "MyArea1", "sample1", and "txt", because I need to use the "sample1" label. I tried to use "strsplit", but I am only able to split it as "Myfile_MyArea1_sample1" and "txt" OR "Myfile", "MyArea1",
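A minimal sketch of one possible approach to the split described in this question: `strsplit` takes a regular expression, so a character class can match both separators at once. The variable names `fname`, `parts`, and `label` are hypothetical and not from the post.

```r
# File name taken from the example in the post
fname <- "Myfile_MyArea1_sample1.txt"

# "[_.]" is a character class matching either "_" or a literal "."
parts <- strsplit(fname, "[_.]")[[1]]

# The third piece is the label the poster wants to reuse
label <- parts[3]
```

Inside square brackets the dot is treated literally, so no escaping is needed. This assumes every file name follows the same three-part pattern before the extension.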

2011 Aug 14

conditional filter resulting in 2 new dataframes

This is what I am starting with: initial<- matrix(c(1,5,4,8,4,4,8,6,4,2,7,5,4,5,3,2,4,6), nrow=6, ncol=3,dimnames=list(c("1900","1901","1902","1903","1904","1905"), c("sample1","sample2","sample3"))) And I need to apply a filter (in this case, any value <5) to give me one dataframe with only the

2009 Feb 02

A question regarding bootstrap

Dear List Members, I have two small samples (n=20), and the distributions are highly skewed. Does it make any sense to do a bootstrap test to check for a difference in means? And if so, could this be done like this:

x <- numeric(10000)
for(i in 1:10000) {
  x[i] <- mean(sample(sample1,replace=TRUE)) - mean(sample(sample2,replace=TRUE))
}
(mean(sample1)-mean(sample2))/sd(x)

Regards, Erika

2010 May 02

Re :argument is not numeric or logical

Hi all, I have data of size: > dim(sample) [1] 35943 17 The first column, "stdate", is a date (01/11/2009 00:00:00, 02/11/2009 00:00:00, 02/11/2009 00:00:00, etc.). Login is the 13th column and is numeric (12, 0, 1, etc.). The operation below returns the following error. > sample1 <- read.csv(file="sample1.csv",sep=",",header=TRUE) > avglog <-

2012 Mar 29

Random sample from a data frame where ID column values don't match the values in an ID column in a second data frame

Hello, Let's say I've drawn a random sample (sample1.df) from a large data frame (main.df), and I want to create a second random sample (sample2.df) where the values in its ID column *are not* in the equivalent ID column in the first sample (sample1.df). How would I go about doing this? In other words: The values in sample2.df$ID *are not found* in sample1.df$ID, and both samples are
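A minimal sketch of one way to draw the second sample described in this question, assuming both data frames share an `ID` column as the post states. The toy `main.df`, its `value` column, the sample size of 10, and the intermediate name `remaining` are assumptions for illustration only.

```r
set.seed(1)  # assumed, only to make the run reproducible

# Toy stand-in for the poster's large data frame
main.df <- data.frame(ID = 1:100, value = rnorm(100))

# First random sample of rows
sample1.df <- main.df[sample(nrow(main.df), 10), ]

# Keep only rows whose ID does not appear in the first sample
remaining <- main.df[!(main.df$ID %in% sample1.df$ID), ]

# Second random sample, drawn from the remaining rows only
sample2.df <- remaining[sample(nrow(remaining), 10), ]
```

Filtering with `%in%` before sampling guarantees the two samples cannot share an ID, which is simpler than drawing and then rejecting overlaps.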

2007 May 10

how to pass "arguments" to a function within a function?

I have searched the r-help files but have not been able to find an answer to this question. I apologize if this question has been asked previously. (Please excuse the ludicrousness of this example, as I have simplified my task for the purposes of this help inquiry. Please trust me that something like this will in fact be useful for what I am trying to accomplish. I am using R 2.4.1 in Windows XP.)

2013 Nov 21

Running R embedded in an mpiexec spawned process - Fatal error: you must specify '--save', '--no-save' or '--vanilla'

I'd like someone familiar with the R options initialization to comment on a difference in behavior with and without mpiexec. I have a (.NET) application with embedded R that is proven to run in a single process on a Debian Linux with R 3.0.2: ./Sample1.exe Running the same code with mpiexec, it fails at the R engine initialization: mpiexec -n 1 ./Sample1.exe Fatal error: you must

2011 Mar 26

simple if question

Hi everyone, I have just drawn different samples from a dataframe (independent and exclusive; there are no common elements among them). I want to create a variable that indicates which sample each element of the original dataframe was selected into (for example, 0 = not selected, 1 = sample 1, 2 = sample 2, etc.). I have tried to do it with the ifelse command, but the problem is that the second line

2010 Nov 04

Converting Strings to Variable names

Hi all, I am processing data from 24 samples and combining them into a single table called CombinedSamples using the following: CombinedSamples<-rbind(Sample1,Sample2,Sample3) The variables Sample1, Sample2 and Sample3 have many different columns. To make it more flexible for other samples, I'm replacing the above code with a for loop: #Sample is a string vector containing all 24 sample names for (k in

2009 Sep 16

T-test to check equality, unable to interpret the results.

Hi, I have the precision values of a system on two different data sets. Snippets of these results are shown below: sample1: (total 194 samples) 0.6000000238 0.8000000119 0.6000000238 0.2000000030 0.6000000238 ... ... sample2: (total 188 samples) 0.80000001 0.20000000 0.80000001 0.00000000 0.80000001 0.40000001 ... ... I want to check whether the difference between these results is statistically significant. Intuitively,
