similar to: Handling large dataset & dataframe [Broadcast]

2007 Aug 16
4
Linear models over large datasets
I'd like to fit linear models on very large datasets. My data frames are about 2000000 rows x 200 columns of doubles and I am using a 64-bit build of R. I've googled this extensively and gone over the "R Data Import/Export" guide. My primary issue is that although my data represented in ASCII form is 4 GB in size (and therefore much smaller in binary), R consumes about
2009 Mar 31
1
error during DPpackage compilation
Dear All, I've had trouble compiling DPpackage as a user on one system. It works fine as root on other machines. I can't see any clues in the error messages. My guess is that it is a permissions matter. Any help is appreciated. OS: Linux Kernel: 2.6.27 SMP Arch: Intel 64 bits gfortran not available Thank you. ----------------------><8------------------------------------- g77 -fpic -g
2006 Apr 24
6
Handling large dataset & dataframe
Hi, I have a dataset consisting of 350,000 rows and 266 columns. Of the 266 columns, 250 are dummy variable columns. I am trying to read this dataset into an R data frame object but am unable to do so due to memory size limitations (the object created is too large to handle in R). Is there a way to handle such a large dataset in R? My PC has 1 GB of RAM and 55 GB of hard disk space, running
2012 Apr 23
0
linear model benchmarking
I cleaned up my old benchmarking code and added checks for missing data to compare various ways of finding OLS regression coefficients. I thought I would share this for others. The long and short of it is that I would recommend ols.crossprod = function (y, x) { x <- as.matrix(x) ok <- (!is.na(y))&(!is.na(rowSums(x))) y <- y[ok]; x
2008 Jun 12
0
[LLVMdev] code generation order revisited.
On Jun 12, 2008, at 11:38, Hendrik Boom wrote: > On Tue, 06 May 2008 16:06:35 -0400, Gordon Henriksen wrote: > >> On 2008-05-06, at 13:42, Hendrik Boom wrote: >> >>> One more question. I hope you're not getting tired of me already. >>> Does generating LLVM code have to proceed in any particular order? >>> >>> Of course, if I am writing
2009 Mar 12
4
stats lm() function
Hi, I'm using the lm() function where the formula is quite big (300 arguments) and the data is a frame of 3000 values. This is running in a loop where in each step the formula is reduced by one argument, and the lm command is called again (to check which arguments are useful). This takes 1-2 minutes. Is there a way to speed this up? I checked the code of the lm function and it seems that its
2006 Nov 05
2
solution to a regression with multiple independent variable
Please forgive a statistics question. I know that a simple bivariate linear regression, y=f(x) or in R parlance lm(y~x) can be solved using the variance-covariance matrix: beta(x)=covariance(x,y)/variance(x). I also know that a linear regression with multiple independent variables, for example y=f(x,z) can also be solved using the variance-covariance matrix, but I don't know how to do this.
2008 Mar 24
6
vlookup in R
Hi, Is there a function similar to Excel's VLOOKUP in R? Please let me know. Thanks, Sachin
2006 May 23
5
conditional replacement
Hi, How can I do this in R? > df 48 1 35 32 80 If df < 30 then replace it with 30, else if df > 60 replace it with 60. I have a large dataset so I can't afford to identify indexes and then replace. Desired output: 48 30 35 32 60 Thanks in advance. Sachin
2008 May 13
2
Plotting Frequency Distribution in R
Hi, How can I plot a frequency distribution curve for the following data?    V1      V2 1   1 160.54% 2   1 201.59% 3   1  18.45% 4   1 179.03% 5   1 274.37% 6   1   0.00% 7   1  24.52% 8   1  39.17% 9   3  43.72% 10  1  53.06% 11  1  64.97% 12  1  79.84% 13  1  98.08% 14  1 115.32% 15  1 127.96% 16  1 155.38% 17  1 157.25% 18  1 193.17% 19  1  51.53% 20 15  99.32% 21  1 106.86% 22  1 219.44%
2007 Oct 03
1
inverse of matrix made by low.tri function
Hi all, I am using R to try to get the inverse of (X^T)X, but I keep getting an error message like: no b argument and no default value for sprintf(gettext(fmt, domain = domain), ...). # my code X<-Matrix(rep(1,500),100,5) X[lower.tri(X)]<-1-10^-7 XtX<- t(X)%*% X XtXu<-lu(XtX)
2011 Aug 16
2
generalized inverse using matinv (Design)
I am trying to use matinv from the Design package to compute the generalized inverse of the normal equations of a 3x3 design via the sweep operator. That is, for the linear model y = μ + x1 + x2 + x1*x2, where x1, x2 are 3-level factors and dummy coding is being used, the matrix to be inverted is X'X = 9 3 3 3 3 3 3 1 1 1 1 1 1 1 1 1 3 3 0 0 1 1 1 1 0 0 1 0 0 1 0 0 3 0 3 0 1 1 1 0 1 0 0 1
2006 Jul 14
2
Recreate new dataframe based on condition
Hi, How can I achieve this in R? The dataset is as follows: > df x 1 2 2 4 3 1 4 3 5 3 6 2 structure(list(x = c(2, 4, 1, 3, 3, 2)), .Names = "x", row.names = c("1", "2", "3", "4", "5", "6"), class = "data.frame") I want to create a new data frame whose rows are the sums of rows (1&2, 3&4, 5&6)
2006 Apr 21
3
Creat new column based on condition
Hi, How can I accomplish this task in R? V1 10 20 30 10 10 20 Create a new column V2 such that: If V1 = 10 then V2 = 4 If V1 = 20 then V2 = 6 If V1 = 30 then V2 = 10 So the output looks like this: V1 V2 10 4 20 6 30 10 10 4 10 4 20 6 Thanks in advance. Sachin
2005 Aug 24
1
lm.ridge
Hello, I posted this mail a few days ago but I did it wrong; I hope it is right now. I have the following doubts related to lm.ridge, from the MASS package, which I will illustrate using the Longley example. First: I think the coefficients from lm(Employed~.,data=longley) should be equal to the coefficients from lm.ridge(Employed~.,data=longley, lambda=0). Why does this not happen?
2012 Mar 22
3
Recommendations regarding textbooks
Hello, I was hoping to get some advice regarding teaching R in an academic environment. What are the best choices with respect to textbooks? When this question was asked a few years back, people were primarily recommending "Modern Applied Statistics with S" and "Introductory Statistics with R" as two good choices. I've also heard some good things regarding "An R Companion to Applied
2010 Feb 22
9
Couldn't find Order with ID=pending_orders
I have a controller named Orders which has a pending_orders method that is expected to fetch some records from the database. If I don't write a route for this method, I get the following error when I call it: Couldn't find Order with ID=pending_orders I am using Rails 2.3.5. In previous versions I used to get this; I am not sure whether it is a requirement of the new version... Help
2006 Apr 20
2
Conditional Row Sum
Hi, How can I accomplish this in R? Example: R1 R2 3 101 4 102 3 102 18 102 11 101 I want to find Sum(101) = 14, i.e. SUM(R1) where R2 = 101, and Sum(102) = 25, i.e. SUM(R1) where R2 = 102. TIA Sachin
2006 Jun 26
2
write.table & csv help
Hi, How can I produce the following output in .csv format using the write.table function? for(i in seq(1:2)) { df <- rnorm(4, mean=0, sd=1) write.table(df,"C:/output.csv", append = TRUE, quote = FALSE, sep = ",", row.names = FALSE, col.names = TRUE) } Current output: x 0.287816 -0.81803 -0.15231 -0.25849 x 2.26831 0.863174
2009 Mar 25
3
very fast OLS regression?
Dear R experts: I just tried a simple test that told me that hand-computing the OLS coefficients is about 3-10 times as fast as using the built-in lm() function. (Code included below.) Most of the time, I do not care, because I like the convenience, and I presume some of the time goes into saving a lot of stuff that I may or may not need. But when I do want to learn the properties of an
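
Several of the threads listed above compare hand-computed OLS coefficients with lm(). As a minimal sketch of that idea in R, the block below solves the normal equations with crossprod() and solve() and checks the result against lm(). The function name ols_crossprod and the simulated data are illustrative assumptions, not code taken from any of the posts, and the truncated ols.crossprod excerpt in the 2012 thread is not reconstructed here.

```r
# Hypothetical helper (name is an assumption): OLS via the normal equations.
ols_crossprod <- function(y, x) {
  x <- cbind(Intercept = 1, as.matrix(x))   # prepend an intercept column
  ok <- !is.na(y) & !is.na(rowSums(x))      # keep rows with no missing values
  y <- y[ok]
  x <- x[ok, , drop = FALSE]
  # Solve (X'X) b = X'y directly instead of forming an explicit inverse
  drop(solve(crossprod(x), crossprod(x, y)))
}

# Simulated data, purely for illustration
set.seed(1)
n <- 1000
x <- matrix(rnorm(n * 3), n, 3, dimnames = list(NULL, c("x1", "x2", "x3")))
y <- drop(1 + x %*% c(2, -1, 0.5) + rnorm(n))

b_fast <- ols_crossprod(y, x)   # hand-computed coefficients
b_lm   <- coef(lm(y ~ x))       # reference coefficients from lm()

# The two sets of coefficients should agree up to numerical tolerance
print(all.equal(unname(b_fast), unname(b_lm)))
```

Solving the normal equations skips the model frame construction and QR decomposition that lm() performs, which is where any speedup would come from. The trade-off is numerical stability: when columns are nearly collinear, the QR route used by lm() is the safer choice.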