Displaying 20 results from an estimated 47 matches for "ffdf".
2012 Jul 25
3
ff package: reading selected columns from csv
...--------------------------------------------------------------------
## *I want to read the second column only:*
x.class <- c('NULL', 'numeric','NULL','NULL','NULL', 'NULL', 'NULL')
##* The following command works fine:*
> read.csv.ffdf(file=csvfile, header=FALSE, skip=100,
> colClasses=x.class, nrows=1e3)
ffdf (all open) dim=c(1000,1), dimorder=c(1,2) row.names=NULL
ffdf virtual mapping
PhysicalName VirtualVmode PhysicalVmode AsIs VirtualIsMatrix
V2 V2 double double FALSE FALSE
PhysicalI...
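The colClasses trick in this snippet can be tried without ff: an entry of "NULL" (the string) tells the reader to skip that column. What follows is only a minimal sketch using base read.csv on a throwaway file; the file contents and the three-column layout are made up for illustration. The output quoted above suggests read.csv.ffdf hands colClasses to the same underlying reader.

```r
# Hypothetical 3-column CSV written to a temp file, for illustration only.
csvfile <- tempfile(fileext = ".csv")
writeLines(c("1,2.5,a", "2,3.5,b", "3,4.5,c"), csvfile)

# "NULL" (as a string) skips a column; only the second column is kept.
x.class <- c("NULL", "numeric", "NULL")
second <- read.csv(csvfile, header = FALSE, colClasses = x.class)

# Inspect the result: a data frame with the single column V2.
str(second)

# Clean up the temp file.
unlink(csvfile)
```

The kept column retains its positional name (V2), which matches the virtual mapping shown in the snippet.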
2012 Sep 14
1
Any way to get read.table.ffdf() (in the ff package) to pass colClasses or comment.char parameters through to read.fwf() ?
Hi everyone, my apologies if I'm overlooking something obvious in the
documentation. I'm relatively inexperienced with the (awesome) ff package.
My goal is to use the read.table.ffdf() function to call the read.fwf()
function and pass through the colClasses and comment.char arguments. The
code below shows exactly what doesn't work for me.
If the colClasses and comment.char parameters cannot be passed to
read.fwf() through read.table.ffdf(), I'd love to hear any ideas...
2010 Dec 24
1
How to specify ff object filepaths when reading a CSV file into a ff data frame.
Hi,
The read.csv.ffdf function in package ff will create the ff object
physical file in the default directories. I am trying to have the files
created in paths that users specify; I think the point is to make use
of the asffdf_args parameter.
I have a test CSV file named D:\rtemp\fftest.csv, the content of the
file is as...
2013 Nov 18
1
Reading in csv data with ff package
...e("Integer"=round(100000*runif(size)),"Character"=sample(LETTERS,size,replace=T),"Logical"=sample(c(T,F),size,replace=T))
#Write to csv
write.csv(fake.data,"data.csv",row.names=F)
-------------------------------------------------
Now to read it in as a 'ffdf' class, I can do the following:
-------------------------------------------------
data = read.csv.ffdf(x=NULL,file="data.csv",nrows=1001,first.rows = 500,
next.rows = 1005,sep=",")
-------------------------------------------------
That works. But with my current large dat...
2013 May 07
1
how to read numeric vector as factors using read.table.ffdf
...d linear models. However, big.matrix will convert all
character vectors to factors and the character labels will be lost, so I
decided to create a lookup table outside of R for my character columns and
use numbers to represent the different levels in R. However, I do not know how
to tell read.table.ffdf that these columns should be considered factors instead
of numerics. Please help. Thanks.
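One way to approach the question above, sketched with base read.table rather than ff: a colClasses entry of "factor" makes the reader treat numeric codes as factor levels, and an external lookup table can then restore the labels. The data, the column names and the lookup table below are hypothetical. Whether read.table.ffdf handles this identically is an assumption here, although another snippet in this listing does pass a classes vector containing "factor" to it.

```r
# Hypothetical data: group codes stored as numbers in the file.
txt <- "grp value\n2 0.5\n1 1.5\n3 2.5\n2 3.5"

# colClasses = "factor" makes the reader treat the numeric codes as levels.
df <- read.table(text = txt, header = TRUE,
                 colClasses = c("factor", "numeric"))

# Hypothetical lookup table maintained outside the big file.
lookup <- data.frame(code  = c("1", "2", "3"),
                     label = c("low", "mid", "high"),
                     stringsAsFactors = FALSE)

# Relabel the levels by matching codes against the lookup table;
# the row-to-level assignment is unchanged, only the level names are.
levels(df$grp) <- lookup$label[match(levels(df$grp), lookup$code)]

str(df)
```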
2013 Feb 27
0
How to specify ff object filepaths when reading a CSV file into a ff data frame.
Really old subject, so my apologies for digging it up,
but since I also ran into this, maybe this hack can be useful to someone.
I propose monkey patching here:
library(ff)
my.as.ffdf.data.frame <- function (x, vmode = NULL, col_args = list(), ...)
{
rnam <- attr(x, "row.names")
if (is.integer(rnam)) {
if (all(rnam == seq_along(rnam)))
rnam <- NULL
else rnam <- as.character(rnam)
}
x <- as.list(x)
vmodes <- vector("list", l...
2011 Jan 18
2
help with read.table.ffdf parameters
...V21 V22 V23
V24
"integer" "numeric" "numeric" "numeric" "numeric" "factor" "numeric"
"numeric"
V25 V26
"numeric" "numeric"
> library(ff)
> results.ff <- read.table.ffdf(file = "./results/results.txt",
+ header = F,
+ colClasses = classes,
+ first.rows = 1000,
+ next.rows = 1000,
+...
2011 Dec 22
1
ff object in lapply function
Hello. I'm using as.ffdf(mydataframe) to create ffdf objects inside an lapply
loop and returning that. I then use crbind to combine the lapply results
into allData.
So...simplified flow looks like this.
res <- lapply(1:nchunks, function(n)
{
blah blah with nth chunk
mydatafr...
2010 Apr 13
2
how to work with big matrices and the ff-package?
...,i]+a[,j]
namb[x] <- paste(i, "_", j, sep="")
x <- x+1
}
}
dimnames(b)[[2]] <- namb
After the above step I need to convert my ff_matrix to a data.frame to discretize the whole matrix and calculate the mutual information. The calculated result should be saved as an ffdf-object or something similar.
require(infotheo)
disc <- as.ffdf(discretize(as.data.frame(as.ffdf(cc)), disc="equalwidth", nbins=5))
This won't work. After this step it somehow loses the path to the working directory. As soon as I try to discretize the next data.frame I get the fol...
2010 Jun 11
1
ff package when reading .csv files
Hi
My aim is to read a large .csv file into R. I ran the following code and am
using R version 10.1 on Windows.
>library(ff)
> read.csv.ffdf(x=NULL,"file.csv",fileEncoding="",nrows=-1,first.rows=NULL,next.rows=NULL,levels=NULL,appendLevels=TRUE,FUN="read.table",transFUN=NULL,asffdf_args=list(),BATCHBYTES=getOption("ffbatchbytes"),VERBOSE=FALSE)
Error in read.table.ffdf(FUN = "read.csv",...
2013 Sep 30
4
read.table() with quoted integers
...ex or (depending on 'as.is')
factor as appropriate. Quotes are (by default) interpreted in all
fields, so a column of values like '"42"' will result in an
integer column.
Should the former behavior be considered a bug?
This creates problems when combined with read.table.ffdf from package
ff, since this function tries to guess the column classes by reading the
first rows of the file, and then passes colClasses to read.table to read
the remaining rows by chunks. A column of quoted integers is correctly
detected as integer in the first read, but read.table() fails in
subs...
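The mismatch described in this snippet can be explored in base R alone. The sketch below uses made-up two-row input. It shows the type-guessing read, traps whatever happens when colClasses = "integer" is forced on the same quoted values, and then uses a defensive alternative (read as character, convert afterwards). No particular error message is assumed, since the behaviour may differ between R versions.

```r
# Two rows, each holding one quoted integer (hypothetical data).
txt <- '"42"\n"7"'

# With type guessing, quotes are interpreted and the column becomes integer.
guessed <- read.table(text = txt)
class(guessed$V1)

# Forcing colClasses = "integer" may fail on quoted values, so trap any error
# and keep its message instead of stopping.
forced <- tryCatch(read.table(text = txt, colClasses = "integer"),
                   error = function(e) conditionMessage(e))

# Defensive alternative: read as character, then convert explicitly.
safe <- read.table(text = txt, colClasses = "character")
safe$V1 <- as.integer(safe$V1)
```

For a chunked reader that guesses classes from the first rows, a similar effect could presumably be had by supplying explicit character classes and converting afterwards, at the cost of an extra pass over those columns.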
2012 May 04
2
Can't import this 4GB DATASET
...total allocation of 4078Mb: see help(memory.size)
#This occurs after a 30 minute wait.
###/*MY ATTEMPT USING FF*/###
#First, try with the 29 row "dataset2.txt",
# open a connection to the file
con <- file('dataset2.txt', 'rt')
# read the remainder using read.table.ffdf
ffdf <- read.table.ffdf(file=con)
# close connection
close(con)
ffdf
#ffdf (all open) dim=c(29,9), dimorder=c(1,2) row.names=NULL
#ffdf virtual mapping
# PhysicalName VirtualVmode PhysicalVmode AsIs VirtualIsMatrix
PhysicalIsMatrix PhysicalElementNo PhysicalFirstCol PhysicalLastCol
Physical...
2010 Jan 07
1
A question about the ff package
Hi,
I am using version 2.1-1 of the ff package.
I have a data set with 80 million rows and I need to create a new ffdf
object, subseting by values in one of the original ffdf's columns. Here is
my code:
bigData <- read.table.ffdf(file="/data/demodata/data/smallData.txt",
next.rows=1e5, head=TRUE, sep="|")
dim(bigData)
N <- nrow(bigData);N
select <- ff( vmode='logical', leng...
2010 Feb 11
0
ff package: How to save and open ff(df) files.
Hello to everyone,
I'm a newbie with the ff package and I'm starting to use it. I've been reading
the ff.pdf guide and other documents and questions, but I'm really
confused about some procedures I can't see how to do. I'd like to know if
it's possible (and how) to "save" ffdf file(s) and open them in another
session, by saving them in a permanent location.
Let's suppose we're reading from a text file into an ffdf object with
read.table.ffdf and we want to save the files and information in a permanent
path and file. First of all I read about getOption("fftempdir") b...
2012 Apr 15
0
Specifying splits - in read.csv.ffdf
Hi All,
I am currently trying to familiarize myself with the "ff" package, which allows me to store R objects on the hard drive. One of the things I notice when reading in a text file with the "read.csv.ffdf" function is that, in the R temp folder, 1000+ small files get created, each with a name like "ffdf1bcd8b4aa0.ff". Each file is about 5KB in size.
My understanding is that the whole file has been split into many small pieces and stored on the hard drive. What I am trying to...
2012 Oct 31
1
ffdfindexget from package ff
I'm having trouble getting ffdfindexget to work right in Windows. Even the
most trivial of examples gives me problems.
> myVec = ff(1:5)
> another = ff(10:14)
> littleFrame = ffdf(myVec, another)
> posVec = ff(c(2, 4), vmode = 'integer')
> ffdfindexget(littleFrame, posVec)
Error in if (any(B < 1)) sto...
2009 Nov 09
3
Hand-crafting an .RData file
Hello,
I frequently have to export a large quantity of data from some
source (for example, a database, or a hand-written perl script) and then
read it into R. This occasionally takes a lot of time; I'm usually using
read.table("filename",comment.char="",quote="") to read the data once it is
written to disk.
However, I *know* that the program that generates
2012 Nov 13
5
Getting information encoded in a SAS, SPSS or Stata command file into R.
...allow me to
extract the necessary information from these command files. Does anyone know
of any r package or other non-proprietary tools that would allow me to get
this data set from its current form into any of the following formats:
SAS, SPSS or Stata binary files read by R.
A MySQL data base
An ffdf object readable using the ff package.
My ultimate goal is to get the data into an ffdf object so that I can
manipulate it in R, perhaps by way of a database. In allocation I will
probably be using no more than 20 variables at a time, probably a bit under
a gig. I am working on a machine with three...
2009 Nov 06
0
New version of package ff
...now supports large data.frames,
csv import/export, packed atomic datatypes and bit filtering from package
'bit', on which it now depends.
Some performance results in seconds from test data with 78 mio rows and 7 columns on a 3 GB notebook:
sequential reading 1 mio rows: csv = 32.7 ffdf = 1.3
sequential writing 1 mio rows: csv = 35.5 ffdf = 1.5
Examples of things you can do with ff and bit:
- direct random access to rows of large data-frame instead of talking to SQL database (?ffdf)
- store 4-level factor like A,T,G,C with 2bit instead of 32bit (?vmode)
- fast chunked iteration...