thr3ads.net - similar to: "File too big for filehash?"

Displaying 20 results from an estimated 100 matches similar to: "File too big for filehash?"

2010 Apr 09

Read data in sequences

Dear R users, I tried to find a solution in the search list, but I cannot find it. I would like to read a .txt file with, let's say, three variables, two of which have repeated values in a number of columns. An example: The variables: Treat, x1, x2. The values: A 2.5 3.4 2.7 5.6 5.7 5.4 10.1 9.4 ... B 5.3 5.4 6.5 7.5 1.3 4.5 10.5 4.1 ... ... In the first column, the letters represent the

2009 Jan 23

Appending objects created using filehash package

Hi, I am working with a very large dataset, and am using the 'filehash' package to manage such a large file. While I have no problem accessing objects that I load into a database, I was hoping there is a better way to append to objects already in the database. The only way I know of to append to an object basically requires rewriting the entire object. Sample code:

2009 Oct 01

Huge matrix: allocation works but assignment fails

Hello everyone, I am working with one big matrix: w=matrix(0,18000,18000) on a Linux computer with 16 GB of RAM. I can actually create the matrix, and even access elements: w[10,10] 0 but if I try to change one element, it fails: w[10,10]=1 Erreur : impossible d'allouer un vecteur de taille 2531250 Ko (Failed to allocate a vector of size...) What can I do? And, maybe even more important,

2012 Aug 29

spatial correlation in lme and huge correlation matrix (memory limit)

Hi, I'm trying to introduce a (spatial) exponential correlation structure (with range=200 and nugget.effet of 0.3) in a lme model of this form: lme(ARBUS~YEAR, random=~1|IDSOUS). The structure of the data is "IDSOUS" "XMIN" "YMAX" "YEAR" "ARBUS" with 2 years of data and 5600 points for each year. I do:

2011 Apr 13

Dump the "source code" of data frame

Dear R experts, I remember that a similar function existed and has been mentioned on R-help before. I tried my best to search but I really can't find it. Suppose I have a data frame like this: > somedata <- data.frame(age.min = 1, age.max = 1.5, male = TRUE, l = -1.013, m=16.133, s=0.07656) In order to back up the data, and since I don't want to use write.table(), I would like to back

2011 Aug 02

Memory limit in Aggregate()

Dear all, I am trying to aggregate a table (divided in two lists here), but get a memory error. Here is the code I'm running : sessionInfo() print(paste("memory.limit() ", memory.limit())) print(paste("memory.size() ", memory.size())) print(paste("memory.size(TRUE) ", memory.size(TRUE))) print(paste("size listX ", object.size(listX)))

2007 Jul 04

Kmeans performance difference

Hi All, A question from a newbie using R 2-5-0 on Windows XP. Why is it that kmeans clustering with apparently the exact same parameters behaves so differently between the two following examples: > cl1 <- kmeans(subset(pointsUXO15555, select = c(2:4)), 10) Takes about 2 seconds to deliver a result > cl1 <- clust(subset(pointsUXO15555, select = c(2:4)), k=10,

2012 Mar 01

Memory problem in R

Hi all, I am running the MNP multinomial probit model package in R. It gives me the following error instead of giving me the results: Erreur : impossible d'allouer un vecteur de taille 137.9 Mo (in English: cannot allocate a vector of size 137.9 Mb). I have already increased the memory size up to 2047 Mb. This problem was discussed in 2008 (archives) but no profitable answers were

2011 Mar 10

Avoiding choosing parameters with mix[mixdist]

Hi, I am working on a population of an invasive clam. The data are the size of each clam per station (2mm on average). Each station is found at a different distance from a nuclear power station, so at different water temperatures. The first step I want to do is to identify cohort size at each station (or zone of water temperature). The second step will be to see whether the size or number of

2012 Jul 23

Large data set

Hi all, Have a problem. Trying to read in a data set that has about 112,000,000 rows and 8 columns and obviously enough it was too big for R to handle. The columns are made up of 2 integer columns and 6 logical columns. The text file is about 4.2 GB in size. Also I have 4 GB of RAM and 218 GB of available space on the hard drive. I tried the dumpDF function but it was too big. Also tried bring in

2012 May 04

Can't import this 4GB DATASET

Dear Experienced R Practitioners, I have a 4GB .txt data file called "dataset.txt" and have attempted to use the ff, bigmemory, filehash and sqldf packages to import it, but have had no success. The readLines output of this data is: readLines("dataset.txt",n=20) [1] " "

2008 Aug 28

Can the file locking in filehash be reused? (Was: Re: [R] [R-pkgs] filehash 2.0)

Hi (Roger), I saw the announcement of filehash v2.0 and the sentence "This development has lead to better file locking for concurrent access and faster reading and writing of data in general" caught my attention. What kind of file locking do you refer to here? I am looking for a mechanism that can be used to lock files for reading and/or writing, and I'd love to have a cross

2008 Aug 28

filehash 2.0

I have just uploaded to CRAN version 2.0 of the 'filehash' package. This version contains a major rewriting of many of the internals (much rewritten in C) for the DB1 format, which is the default. This development has led to better file locking for concurrent access and faster reading and writing of data in general. In addition to rewriting the internals, I have added two modules for a

2010 Jan 02

filehash - multiple indices via '[' not allowed when using RDS format

Hi, I have been using filehash for a while. It has performed very well. However, recently I found filehash gives an error when I need to do something like db[c("a", "b")] when the db is in RDS format. Does anyone know a way to get around that? The code below reproduces the error. Thanks, Jeff filehashOption(defaultType = "DB1") dbCreate("mydb3", type =

2008 Mar 15

filehash

Hello, I'm using filehash on Windows XP and it has been working fine with the newest R version, 2.6.2. However, on Windows Vista, when I ran the same code, I got the following error: > dbCreate("simdb") #create simdb database [1] TRUE > db<-dbInit("simdb") #initiate an object of database Error in sprintf(gettext(fmt, domain = domain), ...) : object

2010 Jan 21

filehash does not install on FreeBSD

Trying to install package 'filehash' I get the following error on FreeBSD 9.0-CURRENT (amd64) with R version 2.11.0 (2010-01-15 r50990): ----------------------------------- R CMD INSTALL filehash_2.0-1.tar.gz * installing to library '/usr/local/lib/R/library' * installing *source* package 'filehash' ... ** libs gcc -std=gnu99 -I/usr/local/lib/R/include

2011 Jan 02

filehash for big data

Hi all, I am trying to use the filehash library to analyze a 5M by 20 matrix with both double and string data types. After consulting a few tutorials online, it seems as though one needs to first read the data into R; then create an R object; and then assign that object a location in my computer via filehash. It seems like the benefit of this is minimizing memory allocation when running

2010 Feb 22

big panel: filehash, bigmemory or other

Dear R-list, I'm about to start a new project on a rather big panel, consisting of approximately 8 million observations in 30 waves of data and about 15 variables. I have a similar data set that is approximately 7 gigabytes in size. Until now I have done my data management in SAS and Stata, mostly identifying spells, counting events in intervals, and the like, but I would like to

2001 Nov 05

Problem to transfer Splus functions

Hello, I would like to transfer some Splus functions to R. But first I have a problem with this assignment in Splus: xnom <- deparse(substitute(x)) I am a bad programmer: I don't understand the R help. How should I modify these functions? Thank you very much for your help. Here are the four functions and a data test
