similar to: VIF's in R using BIGLM

Displaying 18 results from an estimated 700 matches similar to: "VIF's in R using BIGLM"

2009 Feb 19
1
Questions about biglm
Hello folks, I am very excited to have discovered R and have been exploring its capabilities. R's regression models are of great interest to me, as my company is in the business of running thousands of linear regressions on large datasets. I am using biglm to run linear regressions on datasets as large as several GB. I have been pleasantly surprised that biglm runs the
2011 Jul 25
1
biglm() and NeweyWest()
Dear all, I am working on a large dataset and need to use biglm() to perform OLS regressions. I have detected significant ARCH effects which I try to account for using the Newey-West correction. So far, I have worked with NeweyWest() in the sandwich package. NeweyWest() however seems to be unable to handle an object of class "biglm". Looking into the code, I figured out that
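A hedged sketch of a partial workaround rather than the Newey-West correction itself: biglm() can store a Huber/White sandwich estimator via its sandwich argument, which is heteroskedasticity-robust but not autocorrelation-robust (built-in mtcars data used as a stand-in for the poster's dataset).
library(biglm)
fit <- biglm(mpg ~ wt + hp, data = mtcars, sandwich = TRUE)
summary(fit)   # with sandwich = TRUE, biglm stores the robust (sandwich) variance estimate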
2010 Jun 15
1
help biglm.big.matrix; problem with weights
Hello colleagues, I have tried to use the package biglm. I want to specify a multivariate regression with a weight. I have imported a large dataset with library(bigmemory). I then loaded library(biglm) and specified a regression with a weight. But every time I get an error message like "object not found", "`weights' must be a formula", or "error in eval(expr, envir, enclos)". I
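For what it's worth, the "`weights' must be a formula" message usually comes from passing a numeric vector; a minimal sketch with plain biglm() and a hypothetical weight column w (biglm.big.matrix() from biganalytics may take its arguments differently):
library(biglm)
d <- data.frame(y = rnorm(100), x = rnorm(100), w = runif(100))   # hypothetical data
fit <- biglm(y ~ x, data = d, weights = ~ w)                      # one-sided formula, not d$w
summary(fit)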
2007 Feb 12
0
predict on biglm class
Hi Everyone, I often use the 'safe prediction' feature available through glm(). Now I'm in a situation where I must use biglm:::bigglm. ## begin example library(splines) library(biglm) ff <- log(Volume)~ns(log(Girth), df=5) fit.glm <- glm(ff, data=trees) fit.biglm <- bigglm(ff, data=trees) predict(fit.glm, newdata=data.frame(Girth=2:5)) ## -1.3161465 -0.2975659
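One hedged workaround sketch, not the poster's answer: fixing the ns() knots explicitly makes the spline basis reproducible on new data without relying on glm()'s safe prediction. The quantile choice below approximately mimics what ns() picks for df = 5 and is an assumption.
library(splines)
library(biglm)
data(trees)
lg <- log(trees$Girth)
kn <- quantile(lg, probs = c(0.2, 0.4, 0.6, 0.8))   # interior knots, roughly as ns() would choose for df = 5
bk <- range(lg)                                     # boundary knots fixed to the training range
ff2 <- log(Volume) ~ ns(log(Girth), knots = kn, Boundary.knots = bk)
fit2 <- bigglm(ff2, data = trees)
# predict(fit2, newdata = data.frame(Girth = 2:5))  # needs a biglm version that provides predict()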
2010 Oct 31
1
biglm: how it handles large data sets?
I am trying to figure out why 'biglm' can handle large data sets... According to the R documentation, "biglm creates a linear model object that uses only p^2 memory for p variables. It can be updated with more data using update. This allows linear regression on data sets larger than memory." After reading the source code below, I still could not figure out how 'update'
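A minimal sketch of the update() cycle the documentation alludes to, with built-in data split into artificial chunks standing in for a file read in pieces; only the p x p cross-product pieces are kept between calls, which is why memory stays bounded.
library(biglm)
chunks <- split(mtcars, rep(1:4, length.out = nrow(mtcars)))   # stand-in for reading a file chunk by chunk
fit <- biglm(mpg ~ wt + hp, data = chunks[[1]])
for (ch in chunks[-1]) fit <- update(fit, ch)                  # fold in each further chunk
coef(fit)                                                      # same coefficients as lm() on all rows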
2010 Jun 16
0
biglm.big.matrix: Problem with weighting
Hello colleagues, I have tried to use the packages bigmemory, biganalytics and biglm. I want to specify a multivariate regression with a weight. I have imported a large dataset with library(bigmemory). I then loaded library(biglm) and specified a regression with a weight. But every time I get an error message like "object not found" or "`weights' must be a
2012 Jan 03
0
Biglm source code alternatives (E.g. Call to Fortran)
Hi everyone, I have been looking at the bigglm command (which fits generalised linear models for big data, in the biglm package). I have done some profiling on this code and found that a GLM on a 100 MB file (a 9 million row by 5 column matrix; most of the numbers were randomly generated 0s, 1s or 2s) took about 2 minutes on a Linux machine with 8 GB of RAM and 4 cores.
2011 Nov 03
0
anova or likelihood ratio test from biglm output
(Sorry if this is a repost, I got a bounce reply from the r-help server) Hi, I’m using the biglm() function to create some linear models for a very large data set that lm() can’t fit due to memory issues (the problem is with the number of interactions; I can fit the main-effects model). I need to determine whether the 2-way interactions are necessary or not. Ideally I’d like to use anova() to
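A hedged sketch of one comparison that works without anova(): newer biglm versions expose deviance() (the RSS for a Gaussian fit) and AIC() (see the leaps/biglm announcement further down this list), so an approximate partial F test for the extra interaction term can be assembled by hand. Built-in data and a single interaction are used for illustration only.
library(biglm)
f0 <- biglm(mpg ~ wt + hp, data = mtcars)        # main effects only
f1 <- biglm(mpg ~ wt * hp, data = mtcars)        # adds the two-way interaction
rss0 <- deviance(f0); rss1 <- deviance(f1)
n <- nrow(mtcars); p1 <- length(coef(f1))
Fstat <- (rss0 - rss1) / (rss1 / (n - p1))                # one extra parameter in f1
pf(Fstat, df1 = 1, df2 = n - p1, lower.tail = FALSE)      # p-value for dropping the interaction
AIC(f0); AIC(f1)                                          # or simply compare on AIC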
2009 Feb 25
0
leaps and biglm
New versions of leaps and biglm are percolating through CRAN. The new version of biglm fixes a bug in sandwich standard errors with weights, and adds predict(), deviance() and AIC() methods [based on code from Christophe Dutang]. The new version of leaps adds a regsubsets() method for biglm objects, so that the subset selection algorithms can be run efficiently on large data sets. -thomas
2007 Oct 23
0
Residuals from biglm package
Hi all, first of all, I'm not an expert in R, I'm still learning, so sorry if this is a stupid question... I have a large dataset that is too big for my computer's memory, and I found the package biglm quite useful. Now everything is working perfectly. But if I want the residuals, how can I do it? Let's say that we are running the example: > data(trees)>
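A minimal sketch of recovering residuals by hand, since a biglm fit stores coefficients but not residuals: rebuild the fitted values from the model matrix and subtract (done chunk by chunk when the data don't fit in memory). The simple trees model below is an illustration, not the poster's exact formula.
library(biglm)
data(trees)
fit <- biglm(Volume ~ Girth + Height, data = trees)
X   <- model.matrix(~ Girth + Height, data = trees)    # rebuild the design matrix
res <- trees$Volume - as.vector(X %*% coef(fit))       # residuals = observed - fitted
head(res)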
2009 Apr 20
1
R-Squared with biglm?
I've been working with a rather large data set (~10M rows), and while biglm works beautifully for generating coefficients, it does not report an R-squared. It does report the RSS. Any idea how one could coax an R-squared out of biglm? Thanks in advance for any help with this! Bryan Lim Lecturer Department of Finance University of Melbourne
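A hedged sketch of coaxing out an R-squared from what biglm does report: R^2 = 1 - RSS/TSS, where the RSS is available (deviance() for a Gaussian fit) and the total sum of squares can be accumulated from the data, chunk by chunk if necessary. Built-in data is used as a stand-in.
library(biglm)
fit <- biglm(mpg ~ wt + hp, data = mtcars)
rss <- deviance(fit)                                   # residual sum of squares
tss <- sum((mtcars$mpg - mean(mtcars$mpg))^2)          # total sum of squares
1 - rss / tss                                          # matches summary(lm(mpg ~ wt + hp, mtcars))$r.squared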
2009 Mar 20
1
Using predict on a biglm object returns NA
Hi R experts, I used biglm to construct a model (which has categorical variables). When I run predict on the model output on new data (for testing) or on the same data, I get only NA's. I'm able to run predict with some other models constructed with biglm. One reason I suspect is that the model itself has a few undefined terms (NA's). I'm wondering if there's any way to
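A hedged diagnostic sketch rather than a fix: listing the undefined terms is the quickest check, since predict() propagates any NA coefficient into the predictions. The model and data names below are hypothetical stand-ins for the poster's objects.
library(biglm)
# fit <- biglm(y ~ x + someFactor, data = bigdat)    # hypothetical fit with categorical variables
# names(coef(fit))[is.na(coef(fit))]                 # which terms are undefined (NA)
# Refitting after dropping or collapsing the offending factor levels (or removing
# collinear columns) is the usual way to get numeric predictions back.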
2007 Jan 22
1
Example function for bigglm (biglm) data input from file
This is to submit a commented example function for use as the data argument to the bigglm() function (biglm package), for when you want to read the data from a file (instead of a URL), or rescale or modify the data before fitting the model. In the hope that this may be of help to someone out there. make.data <- function (filename, chunksize, ...) { conn<-NULL; function (reset=FALSE) { if
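A hedged, self-contained sketch of the same chunk-reader idea (not the poster's exact code, which is truncated above): a closure over a file connection that bigglm() can call repeatedly, rereading the file from the start when reset = TRUE. The file name and column names in the usage line are placeholders.
library(biglm)
make.data <- function(filename, chunksize, ...) {
  conn <- NULL
  function(reset = FALSE) {
    if (reset) {
      if (!is.null(conn)) close(conn)
      conn <<- file(filename, open = "r")
      return(invisible(NULL))
    }
    dat <- try(read.table(conn, nrows = chunksize, ...), silent = TRUE)
    if (inherits(dat, "try-error") || nrow(dat) == 0) {
      close(conn); conn <<- NULL     # end of file: tell bigglm this pass is done
      return(NULL)
    }
    dat
  }
}
## usage with a hypothetical headerless file holding columns y, x1, x2:
# fit <- bigglm(y ~ x1 + x2, data = make.data("big.dat", 10000, col.names = c("y", "x1", "x2")))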
2020 Oct 18
1
Console output as a tibble
Hello, Well, you can do the calculation in a much more compact and faster way. This approach is especially recommended when you have many columns and many rows. > library(data.table) > myDT <- as.data.table(mtcars) > myDTlong <- melt(myDT, measure.vars=1:ncol(myDT)) > myDTlong[ , list(p_value = shapiro.test(value)$p.value, v_stat = shapiro.test(value)$statistic) , by
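A hedged follow-up sketch connecting the answer back to the subject line: the grouped data.table result is a data frame underneath, so tibble::as_tibble() turns it into the tibble asked about. The by = variable grouping completes the truncated line above and is an assumption.
library(data.table)
library(tibble)
myDT <- as.data.table(mtcars)
myDTlong <- melt(myDT, measure.vars = 1:ncol(myDT))
res <- myDTlong[, list(p_value = shapiro.test(value)$p.value,
                       v_stat  = shapiro.test(value)$statistic), by = variable]
as_tibble(res)   # one row per column of mtcars, as a tibble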
2011 Nov 15
1
getting R2 (goodness of fit) result after using biglm()
Hello. I had been struggling with running linear regression using lm(), primarily because my data has a few categorical variables with at least a thousand levels. I tried the biglm() function and it worked. My problem now is that I don't know how to get the R2 results. Could someone help? Thanks, sean
2009 Feb 21
1
variable/model selection (step/stepAIC) for biglm?
Hello dear R mailing list members. I have recently become curious about the possibility of applying model selection algorithms (even ones as simple as AIC) to regressions on large datasets. I searched as best I could, but couldn't find any reference or wrapper for using step or stepAIC with packages such as biglm. Any ideas or directions on how to implement such a concept? Best, Tal --
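A hedged sketch of one direction, leaning on the leaps/biglm announcement further up this list: regsubsets() gained a method for biglm objects, and AIC() on candidate biglm fits gives a simple manual comparison. The exact arguments of the regsubsets() biglm method are an assumption here, hence commented out.
library(biglm)
library(leaps)
fit <- biglm(mpg ~ wt + hp + disp + drat + qsec, data = mtcars)
# sub <- regsubsets(fit)       # subset selection on the stored cross-products (biglm method)
# summary(sub)
AIC(biglm(mpg ~ wt + hp, data = mtcars))           # manual comparison of candidate models
AIC(biglm(mpg ~ wt + hp + disp, data = mtcars))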
2006 Aug 25
0
biglm 0.4
biglm fits linear and generalized linear models to large data sets, using bounded memory. What's New: generalized linear models. -thomas Thomas Lumley Assoc. Professor, Biostatistics tlumley at u.washington.edu University of Washington, Seattle
2016 Apr 14
0
Bug in by() function which works for some FUN argument and does not work for others
I don't think you are using the best function for your purpose. Try: > by(data=mtcars, INDICES=list(as.factor(mtcars$am)), FUN=colMeans) : 0 mpg cyl disp hp drat wt qsec vs 17.1473684 6.9473684 290.3789474 160.2631579 3.2863158 3.7688947 18.1831579 0.3684211 am gear carb 0.0000000