Hi all,
I am trying to read a large CSV file (~11 GB; ~900,000 columns, 3,000
rows) with read.big.matrix() from the bigmemory package, using the
following call:
x <- read.big.matrix('data.csv', sep = ',', header = TRUE,
                     type = 'char',
                     backingfile = 'data.bin', descriptorfile = 'data.desc')
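In case it helps, here is a minimal, self-contained sketch of the same
call pattern on a tiny synthetic CSV (the dimensions and file names here
are just placeholders, not my real data):

library(bigmemory)

# Write a small test CSV: 10 rows x 5 columns of single-digit integers
m <- matrix(sample(0:9, 50, replace = TRUE), nrow = 10,
            dimnames = list(NULL, paste0("col", 1:5)))
write.table(m, 'test.csv', sep = ',', row.names = FALSE, quote = FALSE)

# Same call pattern as above, just pointed at the test file
y <- read.big.matrix('test.csv', sep = ',', header = TRUE,
                     type = 'char',
                     backingfile = 'test.bin', descriptorfile = 'test.desc')
dim(y)  # expect 10 x 5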
When the command starts, everything seems fine: R uses 100% of a CPU on
my Linux system. Within about 10 minutes the data.bin file is created,
but then CPU use drops to 0% and stays there for at least 24 hours (I
have let it run for up to 48 hours before killing the command or the R
session). The data.desc file never seems to be created.
Are there any estimates on how long this process should take?
Unfortunately, I am not getting any error messages, so I am not sure
whether the command is still working (despite the 0% CPU usage), whether
it has crashed, or whether I am just being too impatient!
I am running R 2.10.0 with bigmemory 3.12 on a Linux server with 8
cores and 24 GB of RAM.
Any help/advice would be greatly appreciated!
Thanks,
Eric Claus