similar to: randomForest memory footprint

Displaying 20 results from an estimated 700 matches similar to: "randomForest memory footprint"

2011 Feb 25
1
speed up process
Dear users, I have a double for loop that does exactly what I want, but it is quite slow. It is not so bad with this simplified example, but with the real data it is slow. Can anyone help me improve it? The data and the code for foo_reg() are available at the end of the email; I preferred to go directly to the problematic part. Here is the code (I tried to simplify it, but I cannot do it too much or else it
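The real foo_reg() and data are only in the truncated tail of the message, so the following is a generic, hypothetical sketch of the usual speed-up: a nested for loop that fills a matrix cell by cell can often be replaced by outer(), which evaluates a vectorised function on every (i, j) pair at once.

# slow_fill() stands in for the original double loop; f stands in for foo_reg()
slow_fill <- function(x, y, f) {
  out <- matrix(NA_real_, length(x), length(y))
  for (i in seq_along(x)) {
    for (j in seq_along(y)) {
      out[i, j] <- f(x[i], y[j])
    }
  }
  out
}
x <- runif(200); y <- runif(300)
f <- function(a, b) (a - b)^2
m1 <- slow_fill(x, y, f)
m2 <- outer(x, y, f)        # same result, no explicit loop
all.equal(m1, m2)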
2006 Feb 20
3
Boxplot Help for Neophyte
R helpers, I am getting to grips with R but came across a small problem today that I could not fix by myself. I have 3 text files, each with a single column of data. I read them in using: myData1<-scan("C:/Program Files/R/myData1.txt") myData2<-scan("C:/Program Files/R/myData2.txt") myData3<-scan("C:/Program Files/R/myData3.txt") I wanted to produce a
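The question is cut off, but a minimal sketch of the usual answer, with rnorm() standing in for the three scanned files: boxplot() accepts a list (or a data frame), so the three vectors can be plotted side by side without further reshaping.

myData1 <- rnorm(50)            # stand-ins for the scan() results
myData2 <- rnorm(50, mean = 1)
myData3 <- rnorm(50, mean = 2)
boxplot(list(myData1 = myData1, myData2 = myData2, myData3 = myData3),
        ylab = "value")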
2009 Jan 06
5
Using apply for two datasets
I can run a one-sample t-test on an array, for example a matrix myData1, with the following: apply(myData1, 2, t.test) Is there a similar way, using apply() or something else, to run a 2-sample t-test on datasets from two groups, myData1 and myData2, without looping? TIA, Gang
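One hedged possibility, assuming myData1 and myData2 have the same number of columns: iterate over the column index with sapply() and call t.test() on each pair of columns, which avoids an explicit for loop.

myData1 <- matrix(rnorm(100), ncol = 5)              # simulated stand-ins
myData2 <- matrix(rnorm(100, mean = 0.5), ncol = 5)
pvals <- sapply(seq_len(ncol(myData1)),
                function(i) t.test(myData1[, i], myData2[, i])$p.value)
pvals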
2012 Jul 03
1
insert missing dates
Hello, I have data frames. mydata1 <-data.frame(value=c(15,20,25,30,45,50),dates=c("2005-05-25 07:00:00 ","2005-05-25 19:00:00","2005-06-25 07:00:00","2005-06-25 19:00:00 ","2005-07-25 07:00:00","2005-8-25 19:00:00")) or mydata2 <-data.frame(value=c(15,20,25,30,45,50),dates=c("2005-05-25 00:00:00 ","2005-05-25
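The message is truncated before it says what the complete date grid should be; assuming a regular 12-hour grid between the first and last timestamps, one sketch is to build the full sequence with seq() and merge() it back, leaving NA where a date was missing.

mydata1 <- data.frame(value = c(15, 20, 25, 30, 45, 50),
                      dates = as.POSIXct(c("2005-05-25 07:00:00",
                                           "2005-05-25 19:00:00",
                                           "2005-06-25 07:00:00",
                                           "2005-06-25 19:00:00",
                                           "2005-07-25 07:00:00",
                                           "2005-08-25 19:00:00"), tz = "UTC"))
full   <- data.frame(dates = seq(min(mydata1$dates), max(mydata1$dates),
                                 by = "12 hours"))
merged <- merge(full, mydata1, all.x = TRUE)   # NA value where a date was absent
head(merged)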
2012 May 15
4
reading data into R
Hi, I am really new to R, so this is really beginner stuff! I created a very small data set in Excel and then converted it to a .csv file. I am able to open the data in R using the command "read.table("mydata1.csv", sep=",", header=T)" and it works just fine. But when I want to work on the data (e.g. calculate the mean of variable "X"), R says
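The likely culprit (a guess, since the error message is cut off): read.table() prints the data but the result is never assigned, so there is no object to compute on afterwards. A self-contained sketch, writing a small stand-in CSV first:

tmp <- tempfile(fileext = ".csv")
write.csv(data.frame(X = c(1, 2, 3), Y = c(4, 5, 6)), tmp, row.names = FALSE)
mydata <- read.table(tmp, sep = ",", header = TRUE)  # assign, don't just print
mean(mydata$X)                                       # assumes a column named X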
2011 Jul 11
1
GLS - Plotting Graphs with 95% conf interval
Hi, I am trying to plot the original data with the line of the model using the predict function. I want to add SEs to the graph, but I am not sure how to get them, as the predict function for gls does not appear to accept an se.fit=TRUE argument. Here is my code so far: f1<-formula(MaxNASC40_50~hu3+flcmax+TidalFlag) vf1Exp<-varExp(form=~hu3) B1D<-gls(f1,correlation=corGaus(form=Lat~Lon,
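predict() for gls objects indeed returns no standard errors. A common workaround, sketched here with simulated data rather than the poster's acoustic survey variables, is to rebuild the fixed-effects design matrix for the new data and combine it with vcov() of the fit.

library(nlme)
set.seed(1)
d <- data.frame(x = runif(50))
d$y <- 2 + 3 * d$x + rnorm(50)
fit  <- gls(y ~ x, data = d)
newd <- data.frame(x = seq(0, 1, length.out = 25))
X    <- model.matrix(~ x, data = newd)
newd$fit <- as.vector(X %*% coef(fit))
newd$se  <- sqrt(diag(X %*% vcov(fit) %*% t(X)))
plot(y ~ x, data = d)
with(newd, {
  lines(x, fit)
  lines(x, fit + 1.96 * se, lty = 2)   # approximate 95% confidence band
  lines(x, fit - 1.96 * se, lty = 2)
})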
2012 Jun 06
3
problem about set operation and computation after split
Hi, I have run into some problems in R; please help me. 1. How can I do an intersect operation among several groups in one list, without a loop statement? (I think the result may be a list) Create the data: myData <- data.frame(product = c(1,2,3,1,2,3,1,2,2), year=c(2009,2009,2009,2010,2010,2010,2011,2011,2011),value=c(1104,608,606,1504,508,1312,900,1100,800)) mySplit<- split(myData,myData$year)
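One loop-free possibility for part 1, reusing the myData/mySplit objects defined in the question: extract the product column from each year and fold intersect() over the list with Reduce().

myData <- data.frame(product = c(1,2,3,1,2,3,1,2,2),
                     year    = c(2009,2009,2009,2010,2010,2010,2011,2011,2011),
                     value   = c(1104,608,606,1504,508,1312,900,1100,800))
mySplit <- split(myData, myData$year)
Reduce(intersect, lapply(mySplit, function(d) d$product))  # products present in every year: 1 2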
2013 Jan 10
1
Semi Parametric Bootstrap
Greetings to you all, I am performing a semi-parametric bootstrap in R on Gamma-distributed data and Binomial-distributed data. The main challenge I am facing is that the residual variance depends on the mean (if I am correct). I strongly suspect that the script below may be wrong because of this mean-variance relationship. #####R code####### fit1s
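The poster's script is cut off, so the following is only a heavily hedged sketch of one semi-parametric scheme for the Gamma case: because the variance grows with the square of the mean, resampling raw residuals breaks that relationship, so scale-free residuals (here the ratios y/mu) are resampled and multiplied back onto the fitted means, which also keeps the bootstrap responses positive.

set.seed(42)
x   <- runif(100, 1, 5)
y   <- rgamma(100, shape = 5, rate = 5 / exp(0.5 + 0.3 * x))
fit <- glm(y ~ x, family = Gamma(link = "log"))
mu  <- fitted(fit)
r   <- y / mu                                # scale-free residuals
boot_coef <- replicate(200, {
  ystar <- mu * sample(r, replace = TRUE)    # semi-parametric resample
  coef(glm(ystar ~ x, family = Gamma(link = "log")))
})
apply(boot_coef, 1, sd)                      # bootstrap SEs of the coefficients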
2008 Apr 15
2
How can I import user-defined missings from Spss?
Hi, I can import SPSS datasets via library(foreign) with read.spss or via library(Hmisc) with spss.get. But no matter which way I import the data, user-defined missing values from SPSS are always lost (it makes no difference whether they are a single value, a range, or any combination of them; they are always ignored). Is there any way in R to find out whether a value was user-defined missing
2010 Jan 22
1
confidence intervals for mean (GLM)
Dear useRs, How could I obtain the confidence intervals for the means of my treatments, when my data was fitted to a GLM? I need the CI's for the Poisson and Negative Binomial distributions. Here's what I have: mydata1 <- data.frame('treatments'=gl(4,20), 'value'=rpois(80, 1)) model1 <- glm(value ~ treatments, data=mydata1, family=poisson) means1 <-
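One commonly used approach, shown as a sketch on the Poisson example above: predict on the link scale with se.fit = TRUE, form a Wald interval there, and back-transform with the inverse link. The same pattern works for a negative binomial fit from MASS::glm.nb.

set.seed(1)
mydata1 <- data.frame(treatments = gl(4, 20), value = rpois(80, 1))
model1  <- glm(value ~ treatments, data = mydata1, family = poisson)
newd    <- data.frame(treatments = factor(levels(mydata1$treatments)))
pr      <- predict(model1, newdata = newd, type = "link", se.fit = TRUE)
data.frame(treatment = newd$treatments,
           mean  = exp(pr$fit),
           lower = exp(pr$fit - 1.96 * pr$se.fit),
           upper = exp(pr$fit + 1.96 * pr$se.fit))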
2018 Mar 15
3
stats 'dist' euclidean distance calculation
Hello, I am working with a matrix of multilocus genotypes for ~180 individual snail samples, with substantial missing data. I am trying to calculate the pairwise genetic distance between individuals with the 'dist' function from the stats package, using euclidean distance. I took a subset of this dataset (3 samples x 3 loci) to test how the euclidean distance is calculated: 3x3 subset used
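The subset itself is cut off, but the behaviour being tested is documented in ?dist: for a pair of rows with NAs, the usable squared differences are summed and the sum is scaled up by ncol/number_used before taking the square root. A small stand-in check:

m <- rbind(s1 = c(1, 2, 3),
           s2 = c(2, NA, 5),
           s3 = c(4, 0, NA))
dist(m, method = "euclidean")
# Manual version of the s1-s2 entry: only columns 1 and 3 are usable
sqrt(sum((1 - 2)^2, (3 - 5)^2) * 3 / 2)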
2011 Feb 28
0
Fwd: Re: speed up process
Dear Jim, Here again is exactly what I did, together with the output of Rprof (with this reduced dataset and a simpler function, it is much faster here than in real life). Thank you again for your help! ## CODE ## mydata1<- structure(list(species = structure(1:8, .Label = c("alsen","gogor", "loalb", "mafas", "pacyn", "patro",
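For readers following the thread, the general Rprof pattern looks like this (a generic sketch with a stand-in workload, not the poster's foo_reg() run):

Rprof(tmp <- tempfile())
x <- matrix(rnorm(500 * 500), 500)
for (i in 1:20) solve(x %*% t(x))       # stand-in for the slow double loop
Rprof(NULL)
head(summaryRprof(tmp)$by.self)         # where the time was actually spent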
2008 Jun 24
1
Error Handling
Hi All, The for-loop below stopped when the error ("Cannot get confidence intervals on var-cov components: Non-positive definite approximate variance-covariance") occurred. I assigned a row of NA values to the data frame "m1" manually and reset "j" in the for-loop every time the error returned. I'm wondering if there is a function that can detect an error or failure, so the
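try() or tryCatch() does exactly this. A hedged sketch of the usual pattern, with a made-up computation standing in for the original model fit:

results <- vector("list", 10)
for (j in 1:10) {
  results[[j]] <- tryCatch({
    if (j %% 4 == 0)                       # simulate the occasional failure
      stop("Non-positive definite approximate variance-covariance")
    c(estimate = rnorm(1), se = runif(1))  # stand-in for the real fit
  }, error = function(e) c(estimate = NA, se = NA))
}
m1 <- do.call(rbind, results)
m1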
2010 Nov 11
2
Kolmogorov Smirnov Test
I'm using ks.test(mydata, dnorm) on my data. I know some of my different variable samples (mydata1, mydata2, etc.) must be normally distributed, but the p-value is always < 2.0^-16 (the 2.0 can change but not the exponent). I want to test mydata against a normal distribution. What could I be doing wrong? I tried instead using rnorm to create a normal distribution: y = rnorm
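Two things commonly go wrong here, shown below as a sketch: ks.test() needs the cumulative distribution function pnorm rather than the density dnorm, and the mean and sd of the reference normal have to be supplied, otherwise a standard normal (mean 0, sd 1) is assumed. Note also that estimating those parameters from the same data makes the resulting p-value only approximate.

set.seed(1)
mydata1 <- rnorm(200, mean = 50, sd = 10)   # stand-in data
ks.test(mydata1, "pnorm")                   # wrong reference: p-value ~ 0
ks.test(mydata1, "pnorm", mean = mean(mydata1), sd = sd(mydata1))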
2013 May 02
0
Data in packages: save or write.table?
Hi all, I am trying to understand Writing R Extensions, Section 1.1.5 (data): I include two datasets in a package, one using 'save', the other using 'write.table': --- 8< ---- myData1 <- data.frame(x=1:10) write.table(myData1,file="myData1.txt") myData2 <- data.frame(x=2:10) save(myData2,file="myData2.Rdata") --- 8< ---- Then R CMD check asks me to
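The check message is cut off, but on the save-versus-write.table question itself, a sketch of the layout usually recommended: an .rda produced by save() (conventionally one object per file, named after the object) is the most compact option for data/; plain-text files written with write.table() are also accepted there, but they are read back with read.table(header = TRUE), so they must round-trip cleanly.

myData1 <- data.frame(x = 1:10)
myData2 <- data.frame(x = 2:10)
dir.create("data", showWarnings = FALSE)
save(myData1, file = "data/myData1.rda", compress = "xz")
write.table(myData2, file = "data/myData2.txt")
# In the installed package, both become available via data(myData1) / data(myData2)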
2003 Jul 17
2
i need help in cluster analyse
Hello, my name is Rodrigo. I am using R and I have a problem. I am trying to build a dendrogram from genetic information. Let me explain: the similarity matrix has already been computed, and from this matrix I want to construct a dendrogram. So the distance part is done. I need to transform this matrix (which I have) into a dendrogram; I would be very grateful if someone could help me. PS: I am sending
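A hedged sketch of the usual route: hclust() expects dissimilarities, so a similarity matrix (a made-up symmetric one here, with values in [0, 1]) is first converted, e.g. as 1 - similarity, and wrapped in as.dist().

set.seed(7)
sim <- matrix(runif(25, 0.2, 1), 5, 5)
sim <- (sim + t(sim)) / 2                  # make it symmetric
diag(sim) <- 1
rownames(sim) <- colnames(sim) <- paste0("ind", 1:5)
d  <- as.dist(1 - sim)                     # similarity -> dissimilarity
hc <- hclust(d, method = "average")
plot(hc, main = "Dendrogram from a similarity matrix")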
2012 Mar 16
1
multivariate regression and lm()
Hello, I would like to perform a multivariate regression analysis to model the relationship between m responses Y1, ... Ym and a single set of predictor variables X1, ..., Xr. Each response is assumed to follow its own regression model, and the error terms in each model can be correlated. Based on my readings of the R help archives and R documentation, the function lm() should be able to
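It can: lm() accepts a matrix response built with cbind(), fitting one regression per column against the shared predictors. A small sketch with simulated data (the names are placeholders, not the poster's):

set.seed(3)
n  <- 100
X1 <- rnorm(n); X2 <- rnorm(n)
Y1 <- 1 + 2 * X1 + rnorm(n)
Y2 <- -1 + 0.5 * X1 + 3 * X2 + rnorm(n)
fit <- lm(cbind(Y1, Y2) ~ X1 + X2)
coef(fit)        # one column of coefficients per response
summary(fit)     # a separate summary per response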
2011 Feb 13
2
creating NAs for some values only
Hello, I have a data file, say, mydata: 1,2,3,4,5,6,7 3,3,4,4,w,w,1 w,3,6,5,7,8,9 4,4,w,5,3,3,0 I want to replace some percentage of the values in "mydata" with NAs, but only for the values that are NOT "w". I know how to apply the percentage part here, but I don't know how to select the values that are not "w". So far, I was able to do it, but the result replaces the w's
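One hedged way to finish the job: restrict attention to the positions that are not "w", sample the requested percentage of those positions, and set only them to NA.

mydata <- as.matrix(read.table(text = "1,2,3,4,5,6,7
3,3,4,4,w,w,1
w,3,6,5,7,8,9
4,4,w,5,3,3,0", sep = ","))
pct  <- 0.20
idx  <- which(mydata != "w")                        # candidate cells only
kill <- sample(idx, size = round(pct * length(idx)))
mydata[kill] <- NA
mydata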
2012 Apr 05
0
Multi part problem...array manipulation and sampling
Ok, I have a new, multi-part problem that I need help figuring out. Part 1. I have a three-dimensional array (species, sites, repeat counts within sites). Sampling effort per site varies, so the array should be ragged. Maximum number of visits at any site = 22; number of species = 161; number of sites = 56. I generated the array first by:
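How the array was generated is cut off above; for what it is worth, R arrays cannot literally be ragged, so the usual workaround is to pad to the maximum number of visits with NA and keep the real effort per site in a separate vector. A sketch with the stated dimensions and simulated counts:

n_species <- 161; n_sites <- 56; max_visits <- 22
counts <- array(NA_real_, dim = c(n_species, n_sites, max_visits))
set.seed(9)
visits_per_site <- sample(3:max_visits, n_sites, replace = TRUE)
for (s in seq_len(n_sites)) {
  counts[, s, seq_len(visits_per_site[s])] <-
    rpois(n_species * visits_per_site[s], lambda = 2)
}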
2008 Aug 11
1
help on model selection - step()
Dear R-users, I'm interested in a model selection problem and have run into some issues that I would like to ask for help with. This is a very small example with 4 variables (just one variable, z, is the response) and 100 individuals. I would like to do a stepwise search for the "best" model, using the BIC criterion. I know when I have a lot of variables, let's say 120, I know,
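On the BIC point: step() uses AIC by default, and setting k = log(n) switches the penalty to BIC. A small sketch with simulated data (z as the response, as in the question; the predictors are made up):

set.seed(5)
n   <- 100
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
dat$z <- 1 + 2 * dat$x1 - dat$x3 + rnorm(n)
full  <- lm(z ~ x1 + x2 + x3, data = dat)
best_bic <- step(full, direction = "both", k = log(n), trace = FALSE)
summary(best_bic)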