thr3ads.net - similar to: "Residuals from biglm package"

Displaying 20 results from an estimated 300 matches similar to: "Residuals from biglm package"

2006 May 17

Re : Large database help

Thanks for doing this Thomas, I have been thinking about what it would take to do this, but if it were left to me, it would have taken a lot longer. Back in the 80's there was a statistical package called RUMMAGE that did all computations based on sufficient statistics and did not keep the actual data in memory. Memory for computers became cheap before datasets turned huge so there
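The sufficient-statistics idea mentioned in this excerpt can be illustrated with a minimal sketch. It is written in Python rather than R, and it is not the algorithm used by RUMMAGE or biglm. The function name `fit_line_streaming` and the chunk layout (an iterable of lists of `(x, y)` pairs) are assumptions made for illustration. For a one-predictor least-squares line, five running sums are enough, so each chunk can be discarded as soon as it has been added in.

```python
def fit_line_streaming(chunks):
    """Fit y = intercept + slope * x from chunks of (x, y) pairs.

    Only five running sums (the sufficient statistics for simple
    least squares) are kept; no chunk is retained after it is read.
    """
    n = 0
    sum_x = sum_y = sum_xx = sum_xy = 0.0
    for chunk in chunks:
        for x, y in chunk:
            n += 1
            sum_x += x
            sum_y += y
            sum_xx += x * x
            sum_xy += x * y

    denom = n * sum_xx - sum_x * sum_x
    if n < 2 or denom == 0:
        raise ValueError("need at least two distinct x values")

    slope = (n * sum_xy - sum_x * sum_y) / denom
    intercept = (sum_y - slope * sum_x) / n
    return intercept, slope


# Hypothetical data: two chunks drawn from the exact line y = 1 + 2x.
example_chunks = [[(0, 1), (1, 3)], [(2, 5), (3, 7)]]
intercept, slope = fit_line_streaming(example_chunks)
```

Because `chunks` can be any iterable, a generator that reads a large file piece by piece would work the same way as the in-memory lists used here. Accumulating raw sums like this can lose precision on badly scaled data, so it should be read as an illustration of the idea rather than a production method.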

(S|odf)weave : how to intersperse (\LaTeX{}|odf) comments in source code ? Delayed R evaluation ?

2010 Dec 11

Dear list, Inspired by the original Knuth tools, and for pedagogical reasons, I wish to produce a document presenting some source code with interspersed comments in the source (see Knuth's books rendering TeX and metafont sources to see what I mean). I seemed to remember that a code chunk could be defined piecewise, like in Comments... <<Chunk1, eval=FALSE, echo=TRUE>>=

Sweave question: prevent expansion of unevaluated reused code chunk

2007 Mar 13

Hi, Consider the following (much simplified) Sweave example: -------------- First, we set the value of $x$: <<chunk1,eval=FALSE>>= x <- 1 @ Then we set the value of $y$: <<chunk2,eval=FALSE>>= y <- 2 @ Thus, the overall algorithm has this structure: <<combined,eval=FALSE>>= <<chunk1>> <<chunk2>> @

Sweave bug using 'FDR' in chunk label (PR#9567)

2007 Mar 15

Full_Name: Kevin Coombes Version: 2.4.0 OS: Windows XP Submission from: (NULL) (143.111.22.24) I'm running R 2.4.0 on a Windows XP machine, with only the default packages loaded. Running Sweave or Stangle on the following Rnw file: -------------- % bug.Rnw \begin{document} Demonstrate an Sweave/Stangle bug. <<info>>= sessionInfo() @ <<getFDR>>= x <- 1 @

scan(..., skip=1e11): infinite loop; cannot interrupt

2023 Feb 11

On Fri, 10 Feb 2023 23:38:55 -0600 Spencer Graves <spencer.graves at prodsyse.com> wrote: > I have a 4.54 GB file that I'm trying to read in chunks using > "scan(..., skip=__)". It works as expected for small values of > "skip" but goes into an infinite loop for "skip=1e11" and similar > large values of skip: I cannot even interrupt it; I

Buffer flushing

2008 Feb 07

Short question: is there a way to tell EM to actually send data after a send_data call? I'm building a file transferring app. I send Marshal.dump'ed metadata first, and then the file contents (chunked). I found a silly bug: receive_data() gets the marshalled metadata and the first chunk of the file in a single variable. Like that: c1.send_data("meta")

Looping over files

2011 Dec 21

Hi, I have a list of files in one of my working directories: "chr17.chunk1.dose.fvd" "chr17.chunk1.dose.fvi" "chr17.chunk1.prob.fvd" "chr17.chunk1.prob.fvi" ........... ......... ........ "chr17.chunk10.dose.fvd" "chr17.chunk10.dose.fvi" "chr17.chunk10.prob.fvd" "chr17.chunk10.prob.fvi" And I am

Maximum number of patterns and speed in grep

2012 Jul 06

Hi, I am using R's grep function to find patterns in vectors of strings. The number of patterns I would like to match is 7,700 (of different sizes). I noticed that I get an error message when I do the following: data <- array() for (j in 1:length(x)) { array[j] <- length(grep(paste(patterns[1:7700], collapse = "|"), x[j], value = T)) } When I break this up into 4 chunks of

gsub help

2011 Nov 15

Hi, I am working with the following list of files: [1] "study_chr1.one.phased.impute2.chunk1" [2] "study_chr1.one.phased.impute2.chunk1_info" [3] "study_chr1.one.phased.impute2.chunk1_info_by_sample" [4] "study_chr1.one.phased.impute2.chunk1_summary" [5] "study_chr1.one.phased.impute2.chunk1_warnings" The

predict on biglm class

2007 Feb 12

Hi Everyone, I often use the 'safe prediction' feature available through glm(). Now, I'm in a situation where I must use biglm:::bigglm. ## begin example library(splines) library(biglm) ff <- log(Volume)~ns(log(Girth), df=5) fit.glm <- glm(ff, data=trees) fit.biglm <- bigglm(ff, data=trees) predict(fit.glm, newdata=data.frame(Girth=2:5)) ## -1.3161465 -0.2975659

Regarding the memory allocation problem

2012 Oct 25

Dear All, My main objective was to compute the distance of 100000 vectors from a set having 900 other vectors. I've a file named "seq_vec" containing 100000 records and 256 columns. While computing, the memory was not sufficient and resulted in error "cannot allocate vector of size 152.1Mb" So I've approached the problem in the following: Rather than reading the data

scan(..., skip=1e11): infinite loop; cannot interrupt

2023 Feb 11

Hello, All: I have a 4.54 GB file that I'm trying to read in chunks using "scan(..., skip=__)". It works as expected for small values of "skip" but goes into an infinite loop for "skip=1e11" and similar large values of skip: I cannot even interrupt it; I must kill R. Below please find sessionInfo() with a toy example. My real problem is a large

Best practices?

2012 Jan 22

Suppose I start building nodes with (say) 24 drives each in them. Would the standard/recommended approach be to make each drive its own filesystem, and export 24 separate bricks, server1:/data1 .. server1:/data24 ? Making a distributed replicated volume between this and another server would then have to list all 48 drives individually. At the other extreme, I could put all 24 drives into some

possible rails -> postgresql bug

2006 Feb 19

variable/model selction (step/stepAIC) for biglm ?

2009 Feb 21

Hello dear R mailing list members. I have recently become curious about the possibility of applying model selection algorithms (even as simple as AIC) to regressions on large datasets. I searched as best I could, but couldn't find any reference or wrapper for using step or stepAIC with packages such as biglm. Any ideas or directions on how to implement such a concept? Best, Tal --

anova or liklihood ratio test from biglm output

2011 Nov 03

(Sorry if this is a repost, I got a bounce reply from the r-help server) Hi, I’m using the biglm() function to create some linear models for a very large data set that lm() can’t fit due to memory issues (the problem is with the number of interactions; I can fit the main effects model). I need to determine if the 2-way interactions are necessary or not. Ideally I’d like to use anova() to

Using predict on a biglm object returns NA

2009 Mar 20

Hi R experts, I used biglm to construct a model (which has categorical variables). When I run predict on the model output with new data (for testing) or with the same data, I get only NAs. I'm able to run predict with some other models constructed with biglm. One reason I suspect is that the model itself has a few undefined terms (NAs). I'm wondering if there's any way to

leaps and biglm

2009 Feb 25

New versions of leaps and biglm are percolating through CRAN. The new version of biglm fixes a bug in sandwich standard errors with weights, and adds predict(), deviance() and AIC() methods [based on code from Christophe Dutang]. The new version of leaps adds a regsubsets() method for biglm objects, so that the subset selection algorithms can be run efficiently on large data sets. -thomas

biglm and epicalc ROC curves

2010 Nov 10

Hello list, I am trying to avoid "Rifying" some of my SAS code to generate ROC plots, and the logistic.display() and lroc() functions in the epicalc package do what I want. However, I must generate my logistic model with bigglm because I have 1) limited hardware, 2) ~2.5 million rows, and 4 categorical and 2 continuous independent variables. When I attempt to invoke epicalc's
