similar to: Re : Large database help

Displaying 20 results from an estimated 400 matches similar to: "Re : Large database help"

2007 Oct 23
0
Residuals from biglm package
Hi all, first of all, I'm not an expert on R, I'm still learning, so sorry if this is a stupid question... I have a large dataset that is too big for my computer's memory, and I found the package biglm quite useful. Now everything is working perfectly. But if I want the residuals, how can I do it? Let's say that we are running the example: > data(trees)>
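A minimal sketch of one way to get residuals, using the package's trees example (biglm keeps only the coefficient summaries, so fitted values and residuals have to be recomputed in a second pass over the data; with a genuinely large dataset this pass would be done chunk by chunk):

library(biglm)
data(trees)

ff  <- log(Volume) ~ log(Girth) + log(Height)
fit <- biglm(ff, data = trees)

## rebuild the design matrix, then compute fitted values and residuals by hand
X   <- model.matrix(ff, data = trees)
res <- log(trees$Volume) - drop(X %*% coef(fit))
head(res)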
2010 Dec 11
5
(S|odf)weave : how to intersperse (\LaTeX{}|odf) comments in source code ? Delayed R evaluation ?
Dear list, Inspired by the original Knuth tools, and for pedagogical reasons, I wish to produce a document presenting some source code with interspersed comments in the source (see Knuth's books rendering the TeX and Metafont sources to see what I mean). I seemed to remember that a code chunk could be defined piecewise, like in Comments... <<Chunk1, eval=FALSE, echo=TRUE>>=
2007 Mar 13
2
Sweave question: prevent expansion of unevaluated reused code chunk
Hi, Consider the following (much simplified) Sweave example: -------------- First, we set the value of $x$: <<chunk1,eval=FALSE>>= x <- 1 @ Then we set the value of $y$: <<chunk2,eval=FALSE>>= y <- 2 @ Thus, the overall algorithm has this structure: <<combined,eval=FALSE>>= <<chunk1>> <<chunk2>> @
2007 Mar 15
1
Sweave bug using 'FDR' in chunk label (PR#9567)
Full_Name: Kevin Coombes Version: 2.4.0 OS: Windows XP Submission from: (NULL) (143.111.22.24) I'm running R 2.4.0 on a Windows XP machine, with only the default packages loaded. Running Sweave or Stangle on the following Rnw file: -------------- % bug.Rnw \begin{document} Demonstrate an Sweave/Stangle bug. <<info>>= sessionInfo() @ <<getFDR>>= x <- 1 @
2023 Feb 11
1
scan(..., skip=1e11): infinite loop; cannot interrupt
On Fri, 10 Feb 2023 23:38:55 -0600 Spencer Graves <spencer.graves at prodsyse.com> wrote: > I have a 4.54 GB file that I'm trying to read in chunks using > "scan(..., skip=__)". It works as expected for small values of > "skip" but goes into an infinite loop for "skip=1e11" and similar > large values of skip: I cannot even interrupt it; I
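One possible workaround (a sketch, not from the thread; the file name and chunk size are placeholders) is to read from a single open connection, so each call picks up where the previous one stopped and nothing ever has to be skipped:

con <- file("big_file.txt", open = "r")
repeat {
  chunk <- readLines(con, n = 1e6)   # next 1e6 lines, or fewer at end of file
  if (length(chunk) == 0) break
  ## ... process `chunk` here ...
}
close(con)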
2008 Feb 07
6
Buffer flushing
Short question: is there a way to tell EM to actually send data after a send_data call? I'm building a file transferring app. I send Marshal.dump'ed metadata first, and then the file contents (chunked). I found a silly bug: receive_data() gets the marshalled metadata and the first chunk of the file in a single variable. Like that: c1.send_data("meta")
2011 Dec 21
1
Looping over files
Hi, I have a list of files in one of my working directories: "chr17.chunk1.dose.fvd" "chr17.chunk1.dose.fvi" "chr17.chunk1.prob.fvd" "chr17.chunk1.prob.fvi" ........... ......... ........ "chr17.chunk10.dose.fvd" "chr17.chunk10.dose.fvi" "chr17.chunk10.prob.fvd" "chr17.chunk10.prob.fvi" And I am
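A common pattern for this kind of task (a sketch; the processing step is a placeholder) is to let list.files() collect the matching names and then loop over them:

dose_files <- list.files(pattern = "^chr17\\.chunk[0-9]+\\.dose\\.fvd$")
for (f in dose_files) {
  cat("processing", f, "\n")
  ## ... read and process the file here ...
}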
2012 Jul 06
2
Maximum number of patterns and speed in grep
Hi, I am using R's grep function to find patterns in vectors of strings. The number of patterns I would like to match is 7,700 (of different sizes). I noticed that I get an error message when I do the following: data <- array() for (j in 1:length(x)) { data[j] <- length(grep(paste(patterns[1:7700], collapse = "|"), x[j], value = T)) } When I break this up into 4 chunks of
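A sketch of the chunking idea mentioned in the post (`patterns` and `x` are assumed to exist as in the question): grep each group of patterns against the whole vector and combine the matching indices, instead of collapsing all 7,700 patterns into one regular expression:

groups  <- split(patterns, ceiling(seq_along(patterns) / 1000))
matched <- sort(unique(unlist(
  lapply(groups, function(p) grep(paste(p, collapse = "|"), x))
)))
length(matched)   # how many elements of x match at least one pattern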
2011 Nov 15
1
gsub help
Hi, I am working with the following list of files: [1] "study_chr1.one.phased.impute2.chunk1" [2] "study_chr1.one.phased.impute2.chunk1_info" [3] "study_chr1.one.phased.impute2.chunk1_info_by_sample" [4] "study_chr1.one.phased.impute2.chunk1_summary" [5] "study_chr1.one.phased.impute2.chunk1_warnings" The
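A small sketch of what sub() can do with these names (the exact goal is only guessed from the preview): pull out the chunk label with a back-reference, or strip everything after it:

files <- c("study_chr1.one.phased.impute2.chunk1",
           "study_chr1.one.phased.impute2.chunk1_info",
           "study_chr1.one.phased.impute2.chunk1_info_by_sample")

sub(".*\\.(chunk[0-9]+).*$", "\\1", files)   # "chunk1" "chunk1" "chunk1"
sub("(chunk[0-9]+).*$", "\\1", files)        # drop the _info / _info_by_sample suffixes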
2008 Aug 17
1
package building problem on windows
Hi, I'm trying to compile the package biglm, but when I build it with R CMD build biglm, it failed: C:\LOCAL\c-dutang\code\R\biglm2>R CMD build biglm * checking for file 'biglm/DESCRIPTION' ... OK * preparing 'biglm': * checking DESCRIPTION meta-information ...C:/DOCUME~1/c-dutang/Local: Can't open C:/DOCUME~1/c-dutang/Local: No such file or directory
2010 Oct 31
1
biglm: how it handles large data set?
I am trying to figure out why 'biglm' can handle large data sets... According to the R documentation - "biglm creates a linear model object that uses only p^2 memory for p variables. It can be updated with more data using update. This allows linear regression on data sets larger than memory." After reading the source code below, I still could not figure out how 'update'
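The sentence quoted from the documentation corresponds to the following runnable sketch, in the spirit of the package's own trees example: fit on one chunk, then fold in further chunks with update(), which only ever touches one chunk at a time:

library(biglm)
data(trees)

ff  <- log(Volume) ~ log(Girth) + log(Height)
fit <- biglm(ff, data = trees[1:10, ])   # first chunk
fit <- update(fit, trees[11:20, ])       # add the next chunk
fit <- update(fit, trees[21:31, ])       # and the last one
coef(fit)                                # matches a single in-memory fit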
2011 Jul 25
1
biglm() and NeweyWest()
Dear all, I am working on a large dataset and need to use biglm() to perform OLS regressions. I have detected significant ARCH effects which I try to account for using the Newey-West correction. So far, I have worked with NeweyWest() in the sandwich package. NeweyWest() however seems to be unable to handle an object of class "biglm". Looking into the code, I figured out that
2009 Feb 19
1
Questions about biglm
Hello folks, I am very excited to have discovered R and have been exploring its capabilities. R's regression models are of great interest to me as my company is in the business of running thousands of linear regressions on large datasets. I am using biglm to run linear regressions on datasets that are as large as several GB's. I have been pleasantly surprised that biglm runs the
2009 Apr 27
0
VIF's in R using BIGLM
Dear R-help This is a follow-up to my previous post here: http://groups.google.com/group/r-help-archive/browse_thread/thread/d9b6f87ce06a9fb7/e9be30a4688f239c?lnk=gst&q=dobomode#e9be30a4688f239c I am working on developing an open-source automated system for running batch-regressions on very large datasets. In my previous post, I posed the question of obtaining VIF's from the output of
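One possible route (a sketch, not the solution from the thread): VIFs only need the predictors' correlation matrix, which can be accumulated chunk by chunk alongside the regression. Here read_chunk() and n_chunks are hypothetical placeholders for whatever chunked reader the batch system uses:

xtx <- NULL; xsum <- NULL; n <- 0
for (i in seq_len(n_chunks)) {
  X    <- as.matrix(read_chunk(i))                  # numeric predictors only
  xtx  <- if (is.null(xtx)) crossprod(X) else xtx + crossprod(X)
  xsum <- if (is.null(xsum)) colSums(X)  else xsum + colSums(X)
  n    <- n + nrow(X)
}
xbar   <- xsum / n
covmat <- (xtx - n * tcrossprod(xbar)) / (n - 1)    # sample covariance of predictors
vifs   <- diag(solve(cov2cor(covmat)))              # VIF_j = j-th diagonal of R^{-1}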
2010 Jun 15
1
help biglm.big.matrix; problem with weights
Hello colleagues, I have tried to use the package biglm. I want to specify a multivariate regression with a weight. I have imported a large dataset with library(bigmemory). I load the library (biglm) and specify a regression with a weight. But every time I get an error message like "object not found" or "`weights' must be a formula" or "error in eval(expr, envir, enclos)". I
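The "`weights' must be a formula" message is the main hint: biglm expects the weights as a one-sided formula naming a column of the data, not as a vector. A small sketch with made-up weights on the trees data (plain biglm; the big.matrix case is assumed to follow the same convention):

library(biglm)
data(trees)
trees$w <- runif(nrow(trees))                       # hypothetical weight column

fit <- biglm(log(Volume) ~ log(Girth) + log(Height),
             data = trees, weights = ~ w)           # note: ~ w, not trees$w
coef(fit)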
2007 Feb 12
0
predict on biglm class
Hi Everyone, I often use the 'safe prediction' feature available through glm(). Now I'm in a situation where I must use biglm:::bigglm. ## begin example library(splines) library(biglm) ff <- log(Volume)~ns(log(Girth), df=5) fit.glm <- glm(ff, data=trees) fit.biglm <- bigglm(ff, data=trees) predict(fit.glm, newdata=data.frame(Girth=2:5)) ## -1.3161465 -0.2975659
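A hedged sketch of one workaround (not necessarily the answer given in the thread): make the spline basis reproducible by fixing its knots explicitly, so the same basis can be rebuilt for new data and multiplied by coef():

library(splines)
library(biglm)
data(trees)

kn <- attr(ns(log(trees$Girth), df = 5), "knots")            # knots chosen on the training data
bk <- attr(ns(log(trees$Girth), df = 5), "Boundary.knots")

ff  <- log(Volume) ~ ns(log(Girth), knots = kn, Boundary.knots = bk)
fit <- bigglm(ff, data = trees)

newd <- data.frame(Girth = 2:5)
Xnew <- model.matrix(delete.response(terms(ff)), data = newd)
drop(Xnew %*% coef(fit))   # predictions on the new Girth values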
2010 Jun 16
0
biglm.big.matrix: Problem with weighting
Hello colleagues, I have tried to use the packages bigmemory, biganalytics and biglm. I want to specify a multivariate regression with a weight. I have imported a large dataset with library(bigmemory). I load the library (biglm) and specify a regression with a weight. But every time I get an error message like "object not found" or "`weights' must be a
2012 Jan 03
0
Biglm source code alternatives (E.g. Call to Fortran)
Hi everyone, I have been looking at the bigglm() command (it fits generalised linear models for big data, in the biglm package) and I have done some profiling of this code. I found that a GLM on a 100 MB file (a 9 million row by 5 column matrix; most of the numbers were 0, 1 or 2, randomly generated) took about 2 minutes on a Linux machine with 8 GB of RAM and 4 cores.
2011 Nov 03
0
anova or liklihood ratio test from biglm output
(Sorry if this is a repost, I got a bounce reply from the r-help server) Hi, I’m using the biglm() function to create some linear models for a very large data set that lm() can’t fit due to memory issues (the problem is with the number of interactions; I can fit the main-effects model). I need to determine whether the 2-way interactions are necessary or not. Ideally I’d like to use anova() to
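A hedged sketch of one way to compare nested fits without anova(), assuming the newer biglm where a deviance() method is available and returns the residual sum of squares for a linear fit; `full`, `reduced` and the row count `n` are placeholders for the poster's own objects:

rss_full <- deviance(full)
rss_red  <- deviance(reduced)
df_diff  <- length(coef(full)) - length(coef(reduced))
df_resid <- n - length(coef(full))

Fstat <- ((rss_red - rss_full) / df_diff) / (rss_full / df_resid)
pf(Fstat, df_diff, df_resid, lower.tail = FALSE)   # approximate p-value for the interactions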
2009 Feb 25
0
leaps and biglm
New versions of leaps and biglm are percolating through CRAN. The new version of biglm fixes a bug in sandwich standard errors with weights, and adds predict(), deviance() and AIC() methods [based on code from Christophe Dutang]. The new version of leaps adds a regsubsets() method for biglm objects, so that the subset selection algorithms can be run efficiently on large data sets. -thomas