Displaying 20 results from an estimated 6000 matches similar to: "Reading large datasets and fitting logistic models in R"
2010 Oct 05
6
SVM functions
Hi !
Right now I am learning to use the svm functions available in R and trying to
use these functions with the given examples. I am stuck on the svmlight function,
which is available in the klaR package. Any help regarding this function would be
appreciated.
1. I am unable to use svmlight( ) which is available in package: klaR.
Although I have downloaded klaR_0.6-3 package from
2011 Feb 08
1
Fitting a model with an offset in bigglm
Dear all,
I have a large data set and would like to fit a logistic regression
model using the bigglm function. I need to include an offset in the
model but when I do this the bigglm function seems to ignore it.
For example, running the two models below produces the same model and
the offset is ignored
bigglm(y~x,offset=z,data=Test,family=binomial(link = "logit"))
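For comparison, here is a minimal base-R sketch using glm() rather than bigglm(), on simulated data; the data and the object names fit_arg, fit_term and fit_none are made up for illustration. In glm() an offset can be given either through the offset argument or as an offset() term in the formula, and the two give the same fit. Whether bigglm honours either form is not shown here and would need to be checked against the biglm documentation.

```r
# Simulated data (hypothetical): binary response y, predictor x, offset z
set.seed(1)
n <- 200
x <- rnorm(n)
z <- rnorm(n, sd = 0.5)
y <- rbinom(n, 1, plogis(-0.5 + x + z))
Test <- data.frame(y = y, x = x, z = z)

# Offset passed as an argument
fit_arg <- glm(y ~ x, offset = z, data = Test,
               family = binomial(link = "logit"))

# Offset written as a term in the formula
fit_term <- glm(y ~ x + offset(z), data = Test,
                family = binomial(link = "logit"))

# No offset at all, for contrast
fit_none <- glm(y ~ x, data = Test,
                family = binomial(link = "logit"))

# The two offset forms agree; the no-offset fit differs
print(all.equal(coef(fit_arg), coef(fit_term)))
print(rbind(with_offset = coef(fit_arg), without = coef(fit_none)))
```

If a fit with the offset has the same coefficients as the fit without it, the offset was not applied.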
2005 Jul 19
1
a possible bug in svmlight (PR#8012)
When I used svmlight, I got the error below:
my command is:
foo <- svmlight(y~., data= myData)
the results:
Error in file(con, "r") : unable to open connection
In addition: Warning messages:
1: svm_learn not found
2: cannot open file '_model_1.txt'
> myData[1:2,]
y X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X15 X16 X17
1 1 63 1 0 0 145 233 1 1 0 150 0 2.3 1
2009 Mar 20
1
Using predict on a biglm object returns NA
Hi R experts,
I used biglm to construct a model (which has categorical variables).
When I run predict on the model output on a new data (for testing) or on the
same data, I get only NA's. I'm able to run predict with some other models
constructed with biglm. One reason I suspect is that the model itself has a
few undefined terms (NA's). I'm wondering if there's any way to
2009 Mar 17
1
exporting s3 and s4 methods
If a package defined an S3 generic and an S4 generic for the same function (so as to add methods for S4 classes to the existing code), how do I set up the namespace to have them exported?
With
import(stats)
exportMethods(bigglm)
importClassesFrom(DBI)
useDynLib(biglm)
export(biglm)
export(bigglm)
in NAMESPACE, the S3 generic is not exported.
> methods("bigglm")
[1] bigglm.RODBC*
2007 Aug 16
4
Linear models over large datasets
I'd like to fit linear models on very large datasets. My data frames
are about 2000000 rows x 200 columns of doubles and I am using a 64-bit
build of R. I've googled about this extensively and went over the
"R Data Import/Export" guide. My primary issue is although my data
represented in ascii form is 4Gb in size (therefore much smaller
considered in binary), R consumes about
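As a rough worked check of the sizes mentioned in this snippet, the sketch below computes the raw storage for a 2000000 x 200 table of doubles, assuming 8 bytes per double and ignoring R's per-object overhead and any temporary copies made while reading or fitting.

```r
# Raw size of the numeric data alone (8 bytes per double)
n_rows <- 2e6
n_cols <- 200
bytes  <- n_rows * n_cols * 8
gib    <- bytes / 2^30

print(bytes)          # total bytes
print(round(gib, 2))  # size in GiB
```

Functions such as read.table and lm make intermediate copies, so peak memory use can be several times this figure.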
2009 Jul 03
2
bigglm() results different from glm()
Hi Sir,
Thanks for making the package available to us. I am facing a few problems and
would appreciate some hints:
Problem-1:
The model summary and residual deviance matched (in the mail below) but
I didn't understand why AIC is still different.
> AIC(m1)
[1] 532965
> AIC(m1big_longer)
[1] 101442.9
Problem-2:
chunksize argument is there in bigglm but not in biglm, consequently,
2010 Sep 29
2
What's the meaning of "Species ~ ." in IRIS data
I am referring to a function call like this:
>data(iris)
>x <- svmlight(Species ~ ., data = iris)
I tried to see the content of it by typing:
> Species ~ .
but it gives nothing. How can I see its content?
- P.Dubois
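A minimal base-R sketch of what the formula stands for, using only the built-in iris data (no svmlight needed): `Species ~ .` is a formula object, and the dot is shorthand for "all other columns of the data". The variable names f and expanded are illustrative.

```r
data(iris)

# A formula is an ordinary R object; it is not evaluated when typed
f <- Species ~ .
print(class(f))     # the class of the object
print(all.vars(f))  # the symbols it contains, including the literal "."

# terms() expands the dot once it knows which data frame is meant
expanded <- terms(f, data = iris)
print(attr(expanded, "term.labels"))
```

So for iris the call is equivalent to modelling Species on Sepal.Length, Sepal.Width, Petal.Length and Petal.Width.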
2009 Mar 17
2
bigglm() results different from glm()
Dear all,
I am using bigglm() from the biglm package to fit a few GLMs to a large dataset (3
million rows, 6 columns). While trying to fit a Poisson GLM I noticed
that the coefficient estimates were very different from what I obtained
when estimating the model on a smaller dataset using glm(). I wrote a
very basic toy example to compare the results of bigglm() against a
glm() call. Consider the
2005 Jan 01
1
Multiple partitions in a Guest OS
Hi,
I was trying to run a guest OS with multiple partitions. I added a line like
disk=['phy:hda6,hda6,w']
in the configuration file. The disk was not mounted in the host OS.
However, the kernel refuses to boot with an error message which says
"kernel not syncing... attempted to kill init... rebooting...".
When I remove that line, everything works. Is there
2009 Apr 03
1
bigglm "update" with ff
Hi, since bigglm doesn't have update, I was wondering how to achieve
something like (similar to the example in ff package manual using biglm):
first <- TRUE
ffrowapply ({
if (first) {
first <- FALSE
fit <- bigglm(eqn, as.data.frame(bigdata[i1:i2,,drop=FALSE]), chunksize =
10000, family = binomial())
} else {
fit <- update(fit,
2008 Oct 22
1
Disabling the auto-complete feature in named list indexing
Is there any way to disable the auto-complete feature when we index a named
list?
E.g: a <- list ('longname'=1, 'anothername'=2)
a$l will return 1 and a$a will return 2
the '[[' operator behaves in the same way, the '[' operator does not do
autocomplete.
Is there any way to disable autocomplete for all the operators?
Thanks
Pradheep
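A minimal sketch of the behaviour in question, as it works in recent versions of R (older versions may differ): `$` partially matches names, while `[[` matches exactly unless `exact = FALSE` is given. The warnPartialMatchDollar option does not disable partial matching for `$`, but it makes R warn whenever it happens.

```r
a <- list(longname = 1, anothername = 2)

# `$` performs partial matching on list names
print(a$l)

# `[[` requires an exact match by default, so an abbreviation returns NULL
print(a[["l"]])

# Partial matching with `[[` has to be requested explicitly
print(a[["l", exact = FALSE]])

# Ask R to warn whenever `$` falls back on a partial match
old <- options(warnPartialMatchDollar = TRUE)
options(old)  # restore the previous setting
```

Using `[[` with full names throughout avoids partial matching entirely.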
2011 Jan 10
1
debug biglm response error on bigglm model
G'morning
What does the error message "Error in x %*% coef(object) : non-conformable
arguments" indicate when calculating the response values for
newdata with a model from bigglm (in package biglm), and how can I
debug it? I am attempting to do Monte Carlo simulations, which may
explain the loop in the code that follows. After the code I
have included the output, which shows that
2010 Jul 02
2
unable to get bigglm working, ATTN: Thomas Lumley
I am using an example posted in this help forum to work with a file. The head
of the file looks like:
988887 2007-03-05 2007-06-01 90 3 5.450 205500.00 999.00 999.000 0.000 0 0
988887 2007-03-06 2007-06-01 90 3 5.450 205500.00 999.00 999.000 0.000 1 0
988887 2007-03-07 2007-06-01 90 3 5.450 205500.00 999.00 999.000 -0.100 2 0
988887 2007-03-08 2007-06-01 90 3 5.450 205500.00 999.00 999.000 -0.100
2007 Jan 21
1
Can we do GLM on 2GB data set with R?
We want to use R instead of/in addition to our existing stats
package because of its huge assortment of stat functions. But, we
routinely need to fit GLM models to files that are approximately 2-4GB
(as SQL tables, un-indexed, w/tinyint-sized fields except for the
response & weight variables). Is this feasible, does anybody know,
given sufficient hardware, using R? It appears to
2012 May 31
2
bigglm binomial negative fitted value
Hi, there
Since glm cannot handle factors very well, I tried to use bigglm like this:
logit_model <- bigglm(responser~var1+var2+var3, data, chunksize=1000,
family=binomial(), weights=~trial, sandwich=FALSE)
fitted <- predict(logit_model, data)
Only var2 is a factor; var1 and var3 are numeric.
I expected fitted to be a vector of values that fall in (0,1).
However, I get something like this:
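One possible explanation, sketched here with base glm() on simulated data rather than bigglm(): predict() on a GLM returns the linear predictor (log-odds) by default, which can be negative or greater than 1. Probabilities come from type = "response". That predict() on a bigglm object follows the same convention is an assumption to verify in the biglm help pages. The data below are made up, and only the variable names echo the post.

```r
# Simulated data (hypothetical values)
set.seed(42)
d <- data.frame(var1 = rnorm(100), var3 = rnorm(100))
d$responser <- rbinom(100, 1, plogis(0.3 * d$var1 - 0.2 * d$var3))

m <- glm(responser ~ var1 + var3, data = d, family = binomial())

eta <- predict(m, d)                     # default type = "link": log-odds
p   <- predict(m, d, type = "response")  # fitted probabilities

print(range(eta))  # not restricted to (0, 1)
print(range(p))    # always inside (0, 1)

# The inverse logit maps one scale to the other
print(all.equal(unname(plogis(eta)), unname(p)))
```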
2012 Mar 30
3
ff usage for glm
Greetings useRs,
Can anyone provide an example how to use ff to feed a very large data frame to glm?
The data.frame cannot be loaded in R using conventional read.csv as it is too big.
glm(...,data=ff.file) ??
Thank you
Stephen B
2007 Jan 22
1
Example function for bigglm (biglm) data input from file
This is to submit a commented example function for use in the data
argument to the bigglm(biglm) function, when you want to read the data
from a file (instead of a URL), or rescale or modify the data before
fitting the model. In the hope that this may be of help to someone out
there.
make.data <- function (filename, chunksize, ...) {
conn<-NULL;
function (reset=FALSE) {
if
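The function in the post is cut off here. As a separate, minimal sketch of the same idea, and not a reconstruction of the author's code, the closure below reads a CSV file in chunks. It follows the convention bigglm documents for a function-valued data argument: called with reset = TRUE it rewinds to the start, otherwise it returns the next chunk as a data frame, or NULL when the data are exhausted. The name make_chunk_reader and its col_names argument are hypothetical, and the file is assumed to have a single header line.

```r
make_chunk_reader <- function(filename, chunksize, col_names) {
  conn <- NULL
  function(reset = FALSE) {
    if (reset || is.null(conn)) {
      # (Re)open the file and skip the header line
      if (!is.null(conn)) close(conn)
      conn <<- file(filename, open = "r")
      readLines(conn, n = 1)
      if (reset) return(invisible(NULL))
    }
    # Read the next block of rows; read.csv signals an error at end of file,
    # which is treated here as "no more data" (this also hides other errors)
    chunk <- tryCatch(
      read.csv(conn, header = FALSE, nrows = chunksize,
               col.names = col_names),
      error = function(e) NULL
    )
    if (is.null(chunk) || nrow(chunk) == 0) return(NULL)
    chunk
  }
}

# Small demonstration on a temporary file
tmp <- tempfile(fileext = ".csv")
write.csv(data.frame(y = c(0, 1, 0, 1, 1), x = 1:5), tmp, row.names = FALSE)

reader <- make_chunk_reader(tmp, chunksize = 2, col_names = c("y", "x"))
reader(reset = TRUE)
chunk_sizes <- c(nrow(reader()), nrow(reader()), nrow(reader()))
print(chunk_sizes)
after_end <- reader()  # NULL once the file is exhausted
```

With biglm installed, such a reader could presumably be passed as the data argument, for example bigglm(y ~ x, data = reader, family = binomial()). The connection is left open after the last chunk and is only closed on the next reset.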
2005 Jul 19
0
svmlight running error
Dear R Users,
When I used svmlight, I got the error below:
my command is:
foo <- svmlight(y~., data= myData)
the results:
Error in file(con, "r") : unable to open connection
In addition: Warning messages:
1: svm_learn not found
2: cannot open file '_model_1.txt'
> myData[1:2,]
y X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X15 X16 X17
1 1 63 1 0 0 145 233 1 1 0
2007 Jun 29
1
Comparison: glm() vs. bigglm()
Hi,
Until now, I thought that the results of glm() and bigglm() would
coincide. Probably a naive assumption?
Anyways, I've been using bigglm() on some datasets I have available.
One of the sets has >15M observations.
I have 3 continuous predictors (A, B, C) and a binary outcome (Y).
And tried the following:
m1 <- bigglm(Y~A+B+C, family=binomial(), data=dataset1, chunksize=10e6)