thr3ads.net - similar to: "Can we do GLM on 2GB data set with R?"

Displaying 20 results from an estimated 1000 matches similar to: "Can we do GLM on 2GB data set with R?"

2007 Feb 06

glm gamma scale parameter

I would like the option to specify alternative scale parameters when using the gamma family, log link glm. In particular, I would like the option to specify any of the following: 1. maximum likelihood estimate; 2. moment estimator/Pearson's; 3. total deviance estimator. Is this easy? Possible? In addition, I would like to know what estimation process (maximum likelihood?) R is using to

2007 Jan 25

Size of data vs. needed memory...rule of thumb?

I have been searching all day and most of last night, but can't find any benchmarking or recommendations regarding R system requirements for very large (2-5GB) data sets to help guide our hardware configuration. If anybody has experience with this that they're willing to share, or could point me in a direction that might be productive to research, it would be much appreciated.

2007 Jan 26

FW: reducing RODBC odbcQuery memory use?

New to R; sorry if either of these is an inappropriate list for a question like the one below. Please let me know if this is a general help question. Jill Willie, Open Seas, Safeco Insurance, jilwil at safeco.com -----Original Message----- From: WILLIE, JILL Sent: Thursday, January 25, 2007 2:27 PM To: r-help at stat.math.ethz.ch Subject: reducing RODBC odbcQuery memory use? Basic

2008 Mar 03

reducing RODBC odbcQuery memory use?

1. Can I avoid having RODBC use so much memory (35 times the data size or more) when making a data.frame and then an .rda file via sqlQuery/save? 2. If not, is there some more appropriate way from within R to pull large data sets (2-5GB) into .rda files from SQL? [R] reducing RODBC odbcQuery memory use? From: WILLIE, JILL <JILWIL_at_SAFECO.com> Date: Thu 25 Jan 2007 - 22:27:02 GMT

2008 Mar 14

Forward Selection with regsubsets

Hi, I would like to perform a forward selection procedure on a data set with 6 observations and 10 predictors. I tried to run it with regsubsets (I set nvmax=number of observations) but I keep getting these warning messages: Warning messages: 1: 5 linear dependencies found in: leaps.setup(x, y, wt = weights, nbest = nbest, nvmax = nvmax, 2: nvmax reduced to 5 in: leaps.setup(x, y, wt =

2009 Dec 04

Logistic geographical weighted regression

Dear all, is it possible to perform a logistic type of geographically weighted regression in R? Thanks in advance. Robert.

2007 Jun 06

list

Hello, I want to know how to create a list of lists, if that is possible, and if it isn't, how to do without one. Thanks.

2009 Jan 13

Bar plot between two different linear models

Hello, I have a problem: I can't make a bar plot like the one I have tried to illustrate below (made in Paint): http://www.nabble.com/file/p21437080/LG5%2Bgraf%2Bredigeret.jpg http://www.nabble.com/file/p21437080/LG5%2Bgraf%2Bredigeret.JPG LG5+graf+redigeret.JPG Each line represents a model: model1 = 0.58*x+12.65, model2 = 1.16*x+12.65. But I only want the bars and with y-values above

2010 Aug 24

save() object w/o all of the loaded environment

I have two packages, one that does the actual work (SC) and the other a Tcl/Tk UI (SCUI) that invokes methods within the former. Within the SCUI's invocation method, I save an object returned from SC, the results of a long-running method. Now the object is completely described by the SC package. Unfortunately, any attempt to load the object (in a fresh R session) fails as below. R>

2010 Nov 20

how to store package options over sessions?

Hi, I posted this a week ago on r-help but did not get an answer, so I hope that someone here can help me. I want to define some options for my package that the user may change. It would be convenient if the changes could be saved when terminating an R session and recovered automatically on the next package load. Is that possible and, if yes, is there a standard way to implement this? Thanks, Mark

2010 Nov 05

table with values as dots in increasing sizes

I was just thinking of a way to present data and whether it is possible in R. I have a data frame that looks as follows (this is just mockup data).
df
location,"species1","species2","species3","species4","species5"
"loc1",0.44,0.28,0.37,-0.24,0.41
"loc2",0.54,0.62,0.34,0.52,0.71
"loc3",-0.33,0.75,-0.34,0.48,0.61
location

2009 Mar 17

exporting s3 and s4 methods

If a package defined an S3 generic and an S4 generic for the same function (so as to add methods for S4 classes to the existing code), how do I set up the namespace to have them exported? With import(stats) exportMethods(bigglm) importClassesFrom(DBI) useDynLib(biglm) export(biglm) export(bigglm) in NAMESPACE, the S3 generic is not exported. > methods("bigglm") [1] bigglm.RODBC*

2009 Jul 03

bigglm() results different from glm()

Hi Sir, Thanks for making the package available to us. I am facing a few problems and would appreciate some hints. Problem-1: The model summary and residual deviance matched (in the mail below) but I didn't understand why the AIC is still different. > AIC(m1) [1] 532965 > AIC(m1big_longer) [1] 101442.9 Problem-2: the chunksize argument is there in bigglm but not in biglm, consequently,

2011 Feb 08

Fitting a model with an offset in bigglm

Dear all, I have a large data set and would like to fit a logistic regression model using the bigglm function. I need to include an offset in the model, but when I do this the bigglm function seems to ignore it. For example, running the two models below produces the same model and the offset is ignored: bigglm(y~x,offset=z,data=Test,family=binomial(link = "logit"))

2009 Mar 17

bigglm() results different from glm()

Dear all, I am using bigglm() from the biglm package to fit a few GLMs to a large dataset (3 million rows, 6 columns). While trying to fit a Poisson GLM I noticed that the coefficient estimates were very different from what I obtained when estimating the model on a smaller dataset using glm(). I wrote a very basic toy example to compare the results of bigglm() against a glm() call. Consider the

2007 Jun 29

Comparison: glm() vs. bigglm()

Hi, Until now, I thought that the results of glm() and bigglm() would coincide. Probably a naive assumption? Anyways, I've been using bigglm() on some datasets I have available. One of the sets has >15M observations. I have 3 continuous predictors (A, B, C) and a binary outcome (Y). And tried the following: m1 <- bigglm(Y~A+B+C, family=binomial(), data=dataset1, chunksize=10e6)

2012 May 31

bigglm binomial negative fitted value

Hi there, since glm cannot handle factors very well, I tried to use bigglm like this: logit_model <- bigglm(responser~var1+var2+var3, data, chunksize=1000, family=binomial(), weights=~trial, sandwich=FALSE) fitted <- predict(logit_model, data) Only var2 is a factor; var1 and var3 are numeric. I expected fitted to be a vector of values falling in (0,1). However, I get something like this:

2009 Apr 03

bigglm "update" with ff

Hi, since bigglm doesn't have update, I was wondering how to achieve something like (similar to the example in ff package manual using biglm): first <- TRUE ffrowapply ({ if (first) { first <- FALSE fit <- bigglm(eqn, as.data.frame(bigdata[i1:i2,,drop=FALSE]), chunksize = 10000, family = binomial()) } else { fit <- update(fit,

2009 Dec 05

paste adjacent elements matching string

Hi all, I would like to combine elements of a vector: vec <- c("astring", "b", "cstring", "d", "e") > vec [1] "astring" "b" "cstring" "d" "e" such that for every element that contains "string" at the end, it is combined with the next element, so that I get this:

2006 May 03

grouped output

Hello, suppose I have a table that looks like this:
center name email
Health Jon jon@test.com
Health Bob bob@test.com
Admin Jane jan@test.com
Admin Jill jill@test.com
I would like the output to look like this:
Health
Jon jon@test.com
Bob bob@test.com
Admin
Jane jan@test.com
Jill jill@test.com
When I was using ColdFusion, this was easy via a tag called cfoutput. When I was using Java, this was
