thr3ads.net - similar to: "Loading only particular columns from csv file..."

Displaying 20 results from an estimated 10000 matches similar to: "Loading only particular columns from csv file..."

2008 May 13

Max consecutive increase in sequence

Hi all R helpers, I'm trying to come up with a nice and elegant way of "detecting" consecutive increases/decreases in a sequence of numbers. I'm trying a combination of the which() and diff() functions, but unsuccessfully. For example: sq <- c(1, 2, 3, 4, 4, 4, 5, 6, 5, 4, 3, 2, 1, 1, 1, 1, 1); I'd like to find a way to calculate a) maximum consecutive increase = 3 (from 1
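One possible approach, sketched here as an assumption rather than the answer given in the thread, is to run-length encode the sign of diff(). The names runs, max_increase and max_decrease are illustrative only.

```r
sq <- c(1, 2, 3, 4, 4, 4, 5, 6, 5, 4, 3, 2, 1, 1, 1, 1, 1)

# sign(diff(sq)) is 1 for a step up, -1 for a step down, 0 for no change;
# rle() then collapses it into runs of identical signs.
runs <- rle(sign(diff(sq)))

# Longest run of consecutive steps up / down (0 if there is no such run).
max_increase <- max(runs$lengths[runs$values == 1], 0)
max_decrease <- max(runs$lengths[runs$values == -1], 0)

max_increase  # 3, matching the value stated in the question
max_decrease  # 5
```

Note that this counts steps, not elements: the run 1, 2, 3, 4 is three consecutive increases.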

Subsetting data frame problem....

2008 Jan 02

Dear R users, I'm a new but already fascinated R user, so please forgive my ignorance. I have a problem: I have read most of the help pages but couldn't find the solution. The problem follows.... I have a large data set with 10,000 rows and more than 100 columns... Say something like var1,var2,var2,var4.......var120 ------------------------------------------- 12,12,345,657,67,8.....

Stepwise logistic regression....take too long...

2008 Apr 20

Dear R helpers, I'm trying to build a logistic regression model on a large dataset with 360 factors and 850 observations. All 360 factors are known to be good predictors of the outcome variable, but I have to find the best model with a maximum of 10 factors. I tried to fit the full model and use the stepAIC function to get the best model but, unfortunately, the process takes too long to complete (more than 4 hours)... Is it

Discretize continous variables....

2008 Jul 19

Hi R helpers, I'm preparing a dataset to fit a logistic regression model with lrm(). I have various continuous and discrete variables and I would like to: 1. Optimally discretize continuous variables (optimally means maximizing information value, IV, for example) 2. Regroup discrete variables to achieve perhaps a smaller number of levels and better information value... Please suggest if there is

How to see source code for na.omit?

2008 Oct 27

Hi R helpers, I'd like to see the source code for some of the built-in R functions... for example, I would like to see how "na.omit" was implemented. Thanks.
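A short sketch of how this is commonly done for an S3 generic such as na.omit: typing the name prints only the generic, so the methods have to be looked up separately. The variable names generic_src, available and df_method are illustrative.

```r
# Printing the function itself only shows the generic, which dispatches
# via UseMethod("na.omit").
generic_src <- na.omit
print(generic_src)

# List the S3 methods registered for the generic.
available <- methods(na.omit)
print(available)

# Retrieve a specific method, even though it is not exported from 'stats'.
df_method <- getS3method("na.omit", "data.frame")
print(df_method)

# getAnywhere() finds the same object by its full name.
getAnywhere("na.omit.data.frame")
```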

One rather theoretical question about fitting algorithm

2009 Jun 07

Hi, What I'm trying to achieve is a very fast algorithm for fitting a logistic regression model. I have to estimate regression coefficients using about 10k observations. Once I have the coefficients estimated, 100 new rows of data become available.... Now I need to re-estimate the coefficients using the 100 newly arrived observations and removing the 100 oldest observations. So, my question is would it be

Reading in a value of .Random.seed in .Rprofile

2008 Aug 20

For reasons that are best known to myself [ ;-) ] I have a value of .Random.seed saved (via dput()) in a file ``.Random.seed.save''. In my .Rprofile I have the lines: .Random.seed <- dget(".Random.seed.save") Junk <- dget(".Random.seed.save") print(all.equal(.Random.seed,dget(".Random.seed.save")))

subscripting a one column matrix drops dimension

2008 Oct 21

Hi all, Why does subscripting a one-column matrix drop one dimension? > x<- matrix(rnorm(100), ncol=1) > str(x) num [1:100, 1] -0.413 -0.845 -1.625 -1.393 0.507 ... > str(x[20:30,]) num [1:11] -0.315 -0.693 -0.771 0.448 0.204 ... > str(x[20:30]) num [1:11] -0.315 -0.693 -0.771 0.448 0.204 ... This breaks: > cov(x) [,1] [1,] 0.9600812 >
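The behaviour described comes from the default drop = TRUE of the `[` operator, which discards dimensions of extent one. A minimal sketch of keeping the matrix structure (the name sub is illustrative):

```r
x <- matrix(rnorm(100), ncol = 1)

# drop = FALSE keeps the result a matrix even though it has one column.
sub <- x[20:30, , drop = FALSE]

str(sub)   # an 11 x 1 numeric matrix
cov(sub)   # a 1 x 1 covariance matrix, as with cov(x)
```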

power of a multiway ANOVA

2008 Jun 05

Dear all, in the package pwr, there is the function power.anova.test, which gives the power for a one-way ANOVA... but I'm looking for a way to compute the power of a multiway ANOVA (find the 1-beta). Is it possible? Do you have some ideas? Regards

Confidence limits for the parameter of the Poisson distribution

2008 Nov 06

Hi all, So far, the only way I know to get the confidence limits for the Poisson distribution is to use the look-up table indexed by two parameters (the number of observations x and the confidence level, e.g. 95%), and the table is limited by the maximum number of observations (x <= 50). I know the formula to compute the CI; however, mathematically it is not easy to do. So, does anyone know an R
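As a sketch, assuming the "exact" interval based on the chi-squared relationship is what is wanted, the limits can be computed directly with qchisq(), with no upper bound on x. The helper name poisson_ci is hypothetical; poisson.test() is the base R function that reports the same interval.

```r
# Exact two-sided confidence interval for a Poisson mean, given an
# observed count x (hypothetical helper, not from the thread).
poisson_ci <- function(x, conf.level = 0.95) {
  alpha <- 1 - conf.level
  lower <- if (x == 0) 0 else qchisq(alpha / 2, 2 * x) / 2
  upper <- qchisq(1 - alpha / 2, 2 * (x + 1)) / 2
  c(lower = lower, upper = upper)
}

ci <- poisson_ci(10)
ci

# The same interval as reported by stats::poisson.test().
builtin <- poisson.test(10, conf.level = 0.95)$conf.int
builtin
```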

Duplicates among columns of a data frame

2008 Dec 15

Dear list, I have a data frame of survey respondents, a little like this: set.seed(20081215) n <- 100 dat <- data.frame(id=1:100, addr1=sample(LETTERS, n, replace=TRUE), addr2=sample(LETTERS, n, replace=TRUE), addr3=sample(LETTERS, n, replace=TRUE)) head(dat) id addr1 addr2 addr3 1 1 R H Q 2 2 H C K 3 3

rowMean, specify subset of columns within Dataframe?

2007 Nov 25

I would like to calculate the mean of tree leader increment growth over 5 years (I1 through I5) where each tree is a row and each row has 5 columns. So far I have achieved this using rowMeans when all columns are numeric type and used in the calculation: Data1 <- data.frame(cbind(I1 = 3, I2 = c(0,3:1, 2:5,NA), I3 =c(1:4,NA,5:2),I4=2,I5=3)) Data1 Data1$mean_5 <- rowMeans(Data1, na.rm =T)
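A minimal sketch of restricting rowMeans() to named columns, which is what the thread title asks about. The Species column and the growth_cols vector are assumptions added here to show a non-numeric column being ignored.

```r
Data1 <- data.frame(cbind(I1 = 3, I2 = c(0, 3:1, 2:5, NA),
                          I3 = c(1:4, NA, 5:2), I4 = 2, I5 = 3))

# Hypothetical non-numeric column that must stay out of the mean.
Data1$Species <- "spruce"

# Select the increment columns by name, then average across each row.
growth_cols <- c("I1", "I2", "I3", "I4", "I5")
Data1$mean_5 <- rowMeans(Data1[, growth_cols], na.rm = TRUE)

Data1
```

With na.rm = TRUE, a row containing an NA is averaged over its remaining values rather than over all five years.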

Partially reading a file (particularly)

2007 May 29

Hello, I am trying to figure out if there exists some R command that allows one to be particularly selective when reading a file. I'm dealing with large fixed-width data sets that look like 539001.. 639001.. 639001.. ... 539002.. 639002.. ... Presently, I am using read.fwf to read an entire file, but I am interested only in reading those records beginning with 5. I have been unable to
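One way to sketch this, assuming the file fits in memory as raw text, is to filter the lines first and parse only the survivors with read.fwf(). The sample records, field widths and column names below are invented for illustration; the real layout is not given in the post.

```r
# Hypothetical fixed-width file: record type (1), id (5), payload (2).
path <- tempfile()
writeLines(c("539001AB", "639001CD", "639001EF", "539002GH", "639002IJ"), path)

# Read raw lines and keep only the records that begin with "5".
all_lines <- readLines(path)
keep <- all_lines[startsWith(all_lines, "5")]

# Parse just the kept lines as fixed-width fields.
recs <- read.fwf(textConnection(keep), widths = c(1, 5, 2),
                 col.names = c("type", "id", "payload"),
                 colClasses = c("integer", "character", "character"))
recs
```

readLines() is much cheaper than read.fwf() on the full file, so the expensive parsing step only touches the records of interest.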

Outputting csv file from dataframe with columns in a particular order

2011 Jan 12

I have a dataframe with columns "ID","date","estimate","actual" (but not necessarily in that order - I do a merge somewhere and that somehow messes up the order of the columns). How can I output it to a csv file with the columns in the order that I want? Thanks.
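A minimal sketch: index the data frame by a character vector of column names in the desired order before writing. The sample values and the out_file path are made up for illustration.

```r
# Example data frame whose columns are in the "wrong" order.
df <- data.frame(estimate = c(1.5, 2.5), ID = 1:2,
                 actual = c(1.4, 2.7),
                 date = c("2011-01-10", "2011-01-11"))

# Reorder by name, then write without the row-name column.
wanted <- c("ID", "date", "estimate", "actual")
out_file <- tempfile(fileext = ".csv")
write.csv(df[, wanted], out_file, row.names = FALSE)

readLines(out_file)[1]  # header line shows the new column order
```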

Calculating R2 for a unit slope regression

2008 Nov 03

Does anyone know of a literature reference, or a piece of code, that can help me calculate the amount of variation explained (R2 value) in a regression constrained to have a slope of 1 and an intercept of 0? Thanks! Sebastian J. Sebastián Tello

Merge or combine data frames with missing columns

2008 Dec 29

Hi R-experts, suppose I have a list containing data frame elements: [[1]] (Intercept) y1 y2 y3 y4 -6.64 0.761 0.383 0.775 0.163 [[2]] (Intercept) y2 y3 -3.858 0.854 0.834 Now I want to put them into ONE dataframe like this: (Intercept) y1

Matching on multiple columns

2007 Jan 11

Am I correct in believing that one cannot match on multiple columns? One can indeed subset on multiple criteria from different variables (or columns) but not from unique combinations thereof. I need to exclude about 10000 rows from 108000 rows of data based on several unique combinations of identifiers in two columns. Only merge() seems to be able to do that. Merge would allow me to positively

lapply() reccursively

2009 Oct 13

Hi all, I was wondering whether it is possible to use the lapply() function to alter the value of the input, something in the spirit of : a1<-runif(100) a2<-function(i){ a1[i]<-a1[i-1]*a1[i];a1[i] } a3<-lapply(2:100,a2) Something akin to a for() loop, but using the lapply() infrastructure. I haven't been able to get rapply() to do this. The reason is that the "real"
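lapply() calls its function on each element independently, so an assignment to a1 inside a2 only changes a local copy. If the intent is a running product in which each step uses the previous result (an assumption, since the post is cut off), Reduce() with accumulate = TRUE is one base R option:

```r
set.seed(1)
a1 <- runif(100)

# Fold left to right, keeping every intermediate result:
# a3[i] is a1[1] * a1[2] * ... * a1[i].
a3 <- Reduce(`*`, a1, accumulate = TRUE)

# For this particular operation the vectorised cumprod() gives the same values.
all.equal(a3, cumprod(a1))
```

Reduce() accepts any two-argument function, so the same pattern applies when the "real" step is more involved than a multiplication.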

sample "n" random positions from a matrix

2006 Dec 10

Hi there, I have a binary matrix (dim 100x100) filled with values 0 and 1. I need to select and record "n" positions of that matrix where the values are 1. How can I do that? Thanks for all, Miltinho Brazil
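A minimal sketch using which() with arr.ind = TRUE to list the coordinates of the 1s and sample() to pick n of them. The random example matrix, the seed and n = 10 are assumptions for illustration.

```r
set.seed(42)
m <- matrix(rbinom(100 * 100, 1, 0.3), nrow = 100)
n <- 10

# Two-column matrix (row, col) with one line per cell equal to 1.
ones <- which(m == 1, arr.ind = TRUE)

# Draw n of those cells without replacement.
picked <- ones[sample(nrow(ones), n), , drop = FALSE]
picked
```

This requires n to be no larger than the number of 1s in the matrix, otherwise sample() stops with an error.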

Loading data into a list of environments

2008 May 31

Dear All, Thanks to an answer which I received from a previous post, I'm now able to create a series of environments using the following: nmes <- c("en1", "en2", "en3") for(i in nmes) assign(i, new.env(parent = .GlobalEnv)) My next question is how, using "load", can I automatically place data into each of these newly created environments. The
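load() takes an envir argument, so one sketch is to pass each environment to it by name with get(). The temporary .RData files and the object name dat are invented here so that the example is self-contained.

```r
nmes <- c("en1", "en2", "en3")
for (i in nmes) assign(i, new.env(parent = .GlobalEnv))

# Hypothetical input: one saved file per environment, each holding 'dat'.
files <- character(0)
for (i in nmes) {
  f <- tempfile(fileext = ".RData")
  dat <- data.frame(src = i)
  save(dat, file = f)
  files[i] <- f
}
rm(dat)

# Load each file into the environment of the same name.
for (i in nmes) load(files[[i]], envir = get(i))

ls(en1)       # "dat"
en2$dat$src   # "en2"
```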
