thr3ads.net - R help - [R] Read in from multiple Excel wksheets [Jun 2011]

If this information is useful, please help other people find it:
Share via:

dM/

2011-Jun-14 16:06 UTC

[R] Read in from multiple Excel wksheets

I?ve got an Excel workbook with about 30 worksheets.  Each worksheet
has 10000 rows of data over 30 columns.

I?d like to read the data from each worksheet into a dataframe or
matrix in R for processing.  Normally, I use read.csv when interacting
with Excel but I?d rather manipulate a multisheet workbook directly
than set about splitting the original workbook and saving down each
part as a csv.

So far, I?ve tried using read.xlsx from the xlsx package.  This works
fine for small test files ? e.g. suppose I?m trying to read from the
test_file workbook on my desktop.  The following code extracts rows 1
and 2 from worksheet = ?johnny?.

setwd("C:\\Documents and Settings\\dmenezes\\Desktop")
info<-
read.xlsx("test_file.xlsx",sheetName="johnny",rowIndex=1:2,header=FALSE)
info

However, when I try to apply this to my real, large workbook, things
go wrong, with the following error message.  Any ideas/workarounds?

Error in .jcall("RJavaTools", "Ljava/lang/Object;",
"invokeMethod",
cl,  :
  java.lang.OutOfMemoryError: Java heap space

Jonathan Daily

2011-Jun-14 18:28 UTC

head link

[R] Read in from multiple Excel wksheets

Java uses "heap space" when creating new objects. My guess is that
since the default size is 128 Mb iirc, you are reading in an object
larger than this. I don't know the guts of the xlsx package or if
there is a way to increase the heap, but you may get by if you can
divide up your data import.

HTH,
Jon

On Tue, Jun 14, 2011 at 12:06 PM, dM/ <david.n.menezes at gmail.com>
wrote:> I?ve got an Excel workbook with about 30 worksheets. ?Each worksheet
> has 10000 rows of data over 30 columns.
>
> I?d like to read the data from each worksheet into a dataframe or
> matrix in R for processing. ?Normally, I use read.csv when interacting
> with Excel but I?d rather manipulate a multisheet workbook directly
> than set about splitting the original workbook and saving down each
> part as a csv.
>
> So far, I?ve tried using read.xlsx from the xlsx package. ?This works
> fine for small test files ? e.g. suppose I?m trying to read from the
> test_file workbook on my desktop. ?The following code extracts rows 1
> and 2 from worksheet = ?johnny?.
>
> setwd("C:\\Documents and Settings\\dmenezes\\Desktop")
> info<-
>
read.xlsx("test_file.xlsx",sheetName="johnny",rowIndex=1:2,header=FALSE)
> info
>
> However, when I try to apply this to my real, large workbook, things
> go wrong, with the following error message. ?Any ideas/workarounds?
>
> Error in .jcall("RJavaTools", "Ljava/lang/Object;",
"invokeMethod",
> cl, ?:
> ?java.lang.OutOfMemoryError: Java heap space
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>


-- 
==============================================Jon Daily
Technician
==============================================#!/usr/bin/env outside
# It's great, trust me.

Ben Bolker

2011-Jun-14 18:38 UTC

head link

[R] Read in from multiple Excel wksheets

dM/ <david.n.menezes <at> gmail.com> writes:
> 
> I?ve got an Excel workbook with about 30 worksheets.  Each worksheet
> has 10000 rows of data over 30 columns.
> 
> I?d like to read the data from each worksheet into a dataframe or
> matrix in R for processing.  Normally, I use read.csv when interacting
> with Excel but I?d rather manipulate a multisheet workbook directly
> than set about splitting the original workbook and saving down each
> part as a csv.
> 
[snip]> However, when I try to apply this to my real, large workbook, things
> go wrong, with the following error message.  Any ideas/workarounds?
> 
  Try read.xls from the gdata package?
  Or see
http://rwiki.sciviews.org/doku.php?id=tips:data-io:ms_windows&s=excel
for more suggestions on I/O between R and Excel.

Gabor Grothendieck

2011-Jun-14 18:44 UTC

head link

[R] Read in from multiple Excel wksheets

On Tue, Jun 14, 2011 at 12:06 PM, dM/ <david.n.menezes at gmail.com>
wrote:> I?ve got an Excel workbook with about 30 worksheets. ?Each worksheet
> has 10000 rows of data over 30 columns.
>
> I?d like to read the data from each worksheet into a dataframe or
> matrix in R for processing. ?Normally, I use read.csv when interacting
> with Excel but I?d rather manipulate a multisheet workbook directly
> than set about splitting the original workbook and saving down each
> part as a csv.
>
> So far, I?ve tried using read.xlsx from the xlsx package. ?This works
> fine for small test files ? e.g. suppose I?m trying to read from the
> test_file workbook on my desktop. ?The following code extracts rows 1
> and 2 from worksheet = ?johnny?.
>
> setwd("C:\\Documents and Settings\\dmenezes\\Desktop")
> info<-
>
read.xlsx("test_file.xlsx",sheetName="johnny",rowIndex=1:2,header=FALSE)
> info
>
> However, when I try to apply this to my real, large workbook, things
> go wrong, with the following error message. ?Any ideas/workarounds?
>
> Error in .jcall("RJavaTools", "Ljava/lang/Object;",
"invokeMethod",
> cl, ?:
> ?java.lang.OutOfMemoryError: Java heap space
Try increasing the heap size.  Restart R and:

options(java.parameters="-Xmx1024m")
library(...)



-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com

Seemingly Similar Threads

Search for more seemingly similar threads

R help - Jun 2011 - Read in from multiple Excel wksheets

[R] Read in from multiple Excel wksheets

[R] Read in from multiple Excel wksheets

[R] Read in from multiple Excel wksheets

[R] Read in from multiple Excel wksheets

Seemingly Similar Threads