Trying to read and clean up the FERC data on Advanced Metering infrastructure. Of course it is in XLS for the first two survey years and then converts to XLSX for the final two. Bad enough that it is all in excel, they had to change the survey design and data format as well. Still, I?m sorting through it. However, when I try and read in the 2008 data, I?m getting this error: ### Wide character in print at /Library/Frameworks/R.framework/Versions/3.1/Resources/library/gdata/perl/xls2csv.pl line 270. Warning message: In scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,? : ? EOF within quoted string ### Here is the code I?m running to get the data: ### install.packages("gdata") library("gdata") fileUrl <- "http://www.ferc.gov/industries/electric/indus-act/demand-response/2008/survey/ami_survey_responses.xls" download.file(fileUrl, destfile="./ami.data/ami-data2008.xls") list.files("ami.data") dateDown.2008 <- date() ami.data2008 <- read.xls("./ami.data/ami-data2008.xls", sheet=1, header=TRUE) ### Reviewed the data in the XLS file, and both ?? and # are present within it. Don?t know how to get the read.xls to ignore them so I can read all the data into my data frame. Tried : ### ami.data2008 <- read.xls("./ami.data/ami-data2008.xls", sheet=1, quote="", header=TRUE) ### And it spits out ?More columns than column names? output. Been searching this, and I can find some ?solutions? for read.table, but nothing specific to read.xls Many thanks, Benjamin Baker ? Sent from Mailbox [[alternative HTML version deleted]]
maybe library(xlsx) tf <- tempfile() ami <- " http://www.ferc.gov/industries/electric/indus-act/demand-response/2008/survey/ami_survey_responses.xls " download.file( ami , tf , mode = 'wb' ) ami.data2008 <- read.xlsx( tf , sheetIndex = 1 ) On Wed, Mar 25, 2015 at 5:01 PM, Benjamin Baker <bbaker at reed.edu> wrote:> Trying to read and clean up the FERC data on Advanced Metering > infrastructure. Of course it is in XLS for the first two survey years and > then converts to XLSX for the final two. Bad enough that it is all in > excel, they had to change the survey design and data format as well. Still, > I?m sorting through it. However, when I try and read in the 2008 data, I?m > getting this error: > ### > Wide character in print at > /Library/Frameworks/R.framework/Versions/3.1/Resources/library/gdata/perl/ > xls2csv.pl line 270. > Warning message: > In scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, : > EOF within quoted string > ### > > > > Here is the code I?m running to get the data: > ### > install.packages("gdata") > library("gdata") > fileUrl <- " > http://www.ferc.gov/industries/electric/indus-act/demand-response/2008/survey/ami_survey_responses.xls > " > download.file(fileUrl, destfile="./ami.data/ami-data2008.xls") > list.files("ami.data") > dateDown.2008 <- date() > ami.data2008 <- read.xls("./ami.data/ami-data2008.xls", sheet=1, > header=TRUE) > ### > > > Reviewed the data in the XLS file, and both ?? and # are present within > it. Don?t know how to get the read.xls to ignore them so I can read all the > data into my data frame. Tried : > ### > ami.data2008 <- read.xls("./ami.data/ami-data2008.xls", sheet=1, quote="", > header=TRUE) > ### > > > And it spits out ?More columns than column names? output. > > > Been searching this, and I can find some ?solutions? for read.table, but > nothing specific to read.xls > > > Many thanks, > > > Benjamin Baker > > > > ? > Sent from Mailbox > [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code.[[alternative HTML version deleted]]
My suggestion is to use XLConnect to read the file:> x <-"C:\\Users\\jh52822\\AppData\\Local\\Temp\\Rtmp6nVgFC\\file385c632aba3.xls"> require(XLConnect)Loading required package: XLConnect Loading required package: XLConnectJars XLConnect 0.2-10 by Mirai Solutions GmbH [aut], Martin Studer [cre], The Apache Software Foundation [ctb, cph] (Apache POI, Apache Commons Codec), Stephen Colebourne [ctb, cph] (Joda-Time Java library) http://www.mirai-solutions.com , http://miraisolutions.wordpress.com> input <- f.readXLSheet(x, 1) > > str(input)'data.frame': 2266 obs. of 51 variables: $ EIA : num 34 59 87 97 108 118 123 149 150 157 ... $ Entity.Name : chr "City of Abbeville" "City of Abbeville" "City of Ada" "Adams Electric Cooperative" ... $ State : chr "SC" "LA" "MN" "IL" ... $ NERC.Region : chr "SERC" "SPP" "MRO" "SERC" ... $ Filing.Order : num 12 11 1237 392 252 ... $ Q5.MultRegion : chr "" "" "" "" ... $ Q6.OwnMeters. : chr "Yes" "Yes" "Yes" "Yes" ... $ Q7.ResMeters : num 3051 4253 857 8154 33670 ... $ Q7.ComMeters : num 531 972 132 155 1719 ... $ Q7.IntMeters : num 0 19 32 NA 626 NA 29 0 2 NA ... $ Q7.TransMeters : num 0 NA NA NA NA NA NA 0 0 NA ... $ Q7.OtherMeters : num 0 NA NA 57 NA NA NA 0 0 NA ... $ Q7...total.meters : num 3582 5244 1021 8366 36015 ... $ Q8.15Min.ResAMI : num 0 NA NA NA NA NA NA NA NA NA ... $ Q8.15Min.ComAMI : num 0 NA NA 155 NA NA NA NA NA NA ... $ Q8.15Min.IndAMI : num 0 NA NA NA NA NA NA NA NA NA ... $ Q8.15Min.TransAMI : num 0 NA NA NA NA NA NA NA NA NA ... $ Q8.15Min.OtherAMI : num 0 NA NA NA NA NA NA NA NA NA ... $ Q8.15Min.TotalAMI : num 0 0 0 155 0 0 0 0 0 0 ... $ Q8.Hourly.ResAMI : num 0 NA NA NA 16100 NA NA NA NA NA ... $ Q8.Hourly.ComAMI : num 0 NA NA NA 1600 NA NA NA NA NA ... .... Jim Holtman Data Munger Guru What is the problem that you are trying to solve? Tell me what you want to do, not how you want to do it. On Wed, Mar 25, 2015 at 5:01 PM, Benjamin Baker <bbaker at reed.edu> wrote:> Trying to read and clean up the FERC data on Advanced Metering > infrastructure. Of course it is in XLS for the first two survey years and > then converts to XLSX for the final two. Bad enough that it is all in > excel, they had to change the survey design and data format as well. Still, > I?m sorting through it. However, when I try and read in the 2008 data, I?m > getting this error: > ### > Wide character in print at > /Library/Frameworks/R.framework/Versions/3.1/Resources/library/gdata/perl/ > xls2csv.pl line 270. > Warning message: > In scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, : > EOF within quoted string > ### > > > > Here is the code I?m running to get the data: > ### > install.packages("gdata") > library("gdata") > fileUrl <- " > http://www.ferc.gov/industries/electric/indus-act/demand-response/2008/survey/ami_survey_responses.xls > " > download.file(fileUrl, destfile="./ami.data/ami-data2008.xls") > list.files("ami.data") > dateDown.2008 <- date() > ami.data2008 <- read.xls("./ami.data/ami-data2008.xls", sheet=1, > header=TRUE) > ### > > > Reviewed the data in the XLS file, and both ?? and # are present within > it. Don?t know how to get the read.xls to ignore them so I can read all the > data into my data frame. Tried : > ### > ami.data2008 <- read.xls("./ami.data/ami-data2008.xls", sheet=1, quote="", > header=TRUE) > ### > > > And it spits out ?More columns than column names? output. > > > Been searching this, and I can find some ?solutions? for read.table, but > nothing specific to read.xls > > > Many thanks, > > > Benjamin Baker > > > > ? > Sent from Mailbox > [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code.[[alternative HTML version deleted]]
Jim, I?m not seeing the command f.readXLSheet in the documentation, nor is it executing in my code. ? Sent from Mailbox On Thursday, Mar 26, 2015 at 5:15 AM, jim holtman <jholtman at gmail.com>, wrote: My suggestion is to use XLConnect to read the file:> x <- "C:\\Users\\jh52822\\AppData\\Local\\Temp\\Rtmp6nVgFC\\file385c632aba3.xls"> require(XLConnect)Loading required package: XLConnect Loading required package: XLConnectJars XLConnect 0.2-10 by Mirai Solutions GmbH [aut], ? Martin Studer [cre], ? The Apache Software Foundation [ctb, cph] (Apache POI, Apache Commons ? ? Codec), ? Stephen Colebourne [ctb, cph] (Joda-Time Java library) http://www.mirai-solutions.com , http://miraisolutions.wordpress.com> input <- f.readXLSheet(x, 1)>?> str(input)'data.frame': ? 2266 obs. of ?51 variables: ?$ EIA ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?: num ?34 59 87 97 108 118 123 149 150 157 ... ?$ Entity.Name ? ? ? ? ? ? ? ? ? ? ? ? ?: chr ?"City of Abbeville" "City of Abbeville" "City of Ada" "Adams Electric Cooperative" ... ?$ State ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?: chr ?"SC" "LA" "MN" "IL" ... ?$ NERC.Region ? ? ? ? ? ? ? ? ? ? ? ? ?: chr ?"SERC" "SPP" "MRO" "SERC" ... ?$ Filing.Order ? ? ? ? ? ? ? ? ? ? ? ? : num ?12 11 1237 392 252 ... ?$ Q5.MultRegion ? ? ? ? ? ? ? ? ? ? ? ?: chr ?"" "" "" "" ... ?$ Q6.OwnMeters. ? ? ? ? ? ? ? ? ? ? ? ?: chr ?"Yes" "Yes" "Yes" "Yes" ... ?$ Q7.ResMeters ? ? ? ? ? ? ? ? ? ? ? ? : num ?3051 4253 857 8154 33670 ... ?$ Q7.ComMeters ? ? ? ? ? ? ? ? ? ? ? ? : num ?531 972 132 155 1719 ... ?$ Q7.IntMeters ? ? ? ? ? ? ? ? ? ? ? ? : num ?0 19 32 NA 626 NA 29 0 2 NA ... ?$ Q7.TransMeters ? ? ? ? ? ? ? ? ? ? ? : num ?0 NA NA NA NA NA NA 0 0 NA ... ?$ Q7.OtherMeters ? ? ? ? ? ? ? ? ? ? ? : num ?0 NA NA 57 NA NA NA 0 0 NA ... ?$ Q7...total.meters ? ? ? ? ? ? ? ? ? ?: num ?3582 5244 1021 8366 36015 ... ?$ Q8.15Min.ResAMI ? ? ? ? ? ? ? ? ? ? ?: num ?0 NA NA NA NA NA NA NA NA NA ... ?$ Q8.15Min.ComAMI ? ? ? ? ? ? ? ? ? ? ?: num ?0 NA NA 155 NA NA NA NA NA NA ... ?$ Q8.15Min.IndAMI ? ? ? ? ? ? ? ? ? ? ?: num ?0 NA NA NA NA NA NA NA NA NA ... ?$ Q8.15Min.TransAMI ? ? ? ? ? ? ? ? ? ?: num ?0 NA NA NA NA NA NA NA NA NA ... ?$ Q8.15Min.OtherAMI ? ? ? ? ? ? ? ? ? ?: num ?0 NA NA NA NA NA NA NA NA NA ... ?$ Q8.15Min.TotalAMI ? ? ? ? ? ? ? ? ? ?: num ?0 0 0 155 0 0 0 0 0 0 ... ?$ Q8.Hourly.ResAMI ? ? ? ? ? ? ? ? ? ? : num ?0 NA NA NA 16100 NA NA NA NA NA ... ?$ Q8.Hourly.ComAMI ? ? ? ? ? ? ? ? ? ? : num ?0 NA NA NA 1600 NA NA NA NA NA ... .... Jim Holtman Data Munger Guru ? What is the problem that you are trying to solve? Tell me what you want to do, not how you want to do it. On Wed, Mar 25, 2015 at 5:01 PM, Benjamin Baker <bbaker at reed.edu> wrote: Trying to read and clean up the FERC data on Advanced Metering infrastructure. Of course it is in XLS for the first two survey years and then converts to XLSX for the final two. Bad enough that it is all in excel, they had to change the survey design and data format as well. Still, I?m sorting through it. However, when I try and read in the 2008 data, I?m getting this error: ### Wide character in print at /Library/Frameworks/R.framework/Versions/3.1/Resources/library/gdata/perl/xls2csv.pl line 270. Warning message: In scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,? : ? EOF within quoted string ### Here is the code I?m running to get the data: ### install.packages("gdata") library("gdata") fileUrl <- "http://www.ferc.gov/industries/electric/indus-act/demand-response/2008/survey/ami_survey_responses.xls" download.file(fileUrl, destfile="./ami.data/ami-data2008.xls") list.files("ami.data") dateDown.2008 <- date() ami.data2008 <- read.xls("./ami.data/ami-data2008.xls", sheet=1, header=TRUE) ### Reviewed the data in the XLS file, and both ?? and # are present within it. Don?t know how to get the read.xls to ignore them so I can read all the data into my data frame. Tried : ### ami.data2008 <- read.xls("./ami.data/ami-data2008.xls", sheet=1, quote="", header=TRUE) ### And it spits out ?More columns than column names? output. Been searching this, and I can find some ?solutions? for read.table, but nothing specific to read.xls Many thanks, Benjamin Baker ? Sent from Mailbox ? ? ? ? [[alternative HTML version deleted]] ______________________________________________ R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]]
Anthony, XLSX won?t read an XLS file. Additionally, the legacy Java that is required for the xlsx package really effs up my computer. Have to reinstall my OS to fix it. ? Sent from Mailbox On Wed, Mar 25, 2015 at 3:51 PM, Anthony Damico <ajdamico at gmail.com> wrote:> maybe > library(xlsx) > tf <- tempfile() > ami <- " > http://www.ferc.gov/industries/electric/indus-act/demand-response/2008/survey/ami_survey_responses.xls > " > download.file( ami , tf , mode = 'wb' ) > ami.data2008 <- read.xlsx( tf , sheetIndex = 1 ) > On Wed, Mar 25, 2015 at 5:01 PM, Benjamin Baker <bbaker at reed.edu> wrote: >> Trying to read and clean up the FERC data on Advanced Metering >> infrastructure. Of course it is in XLS for the first two survey years and >> then converts to XLSX for the final two. Bad enough that it is all in >> excel, they had to change the survey design and data format as well. Still, >> I?m sorting through it. However, when I try and read in the 2008 data, I?m >> getting this error: >> ### >> Wide character in print at >> /Library/Frameworks/R.framework/Versions/3.1/Resources/library/gdata/perl/ >> xls2csv.pl line 270. >> Warning message: >> In scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, : >> EOF within quoted string >> ### >> >> >> >> Here is the code I?m running to get the data: >> ### >> install.packages("gdata") >> library("gdata") >> fileUrl <- " >> http://www.ferc.gov/industries/electric/indus-act/demand-response/2008/survey/ami_survey_responses.xls >> " >> download.file(fileUrl, destfile="./ami.data/ami-data2008.xls") >> list.files("ami.data") >> dateDown.2008 <- date() >> ami.data2008 <- read.xls("./ami.data/ami-data2008.xls", sheet=1, >> header=TRUE) >> ### >> >> >> Reviewed the data in the XLS file, and both ?? and # are present within >> it. Don?t know how to get the read.xls to ignore them so I can read all the >> data into my data frame. Tried : >> ### >> ami.data2008 <- read.xls("./ami.data/ami-data2008.xls", sheet=1, quote="", >> header=TRUE) >> ### >> >> >> And it spits out ?More columns than column names? output. >> >> >> Been searching this, and I can find some ?solutions? for read.table, but >> nothing specific to read.xls >> >> >> Many thanks, >> >> >> Benjamin Baker >> >> >> >> ? >> Sent from Mailbox >> [[alternative HTML version deleted]] >> >> ______________________________________________ >> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide >> http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code.[[alternative HTML version deleted]]
Jim, Thanks, XLConnect with proper syntax works great for both types of files. ? Sent from Mailbox On Thu, Mar 26, 2015 at 5:15 AM, jim holtman <jholtman at gmail.com> wrote:> My suggestion is to use XLConnect to read the file: >> x <- > "C:\\Users\\jh52822\\AppData\\Local\\Temp\\Rtmp6nVgFC\\file385c632aba3.xls" >> require(XLConnect) > Loading required package: XLConnect > Loading required package: XLConnectJars > XLConnect 0.2-10 by Mirai Solutions GmbH [aut], > Martin Studer [cre], > The Apache Software Foundation [ctb, cph] (Apache POI, Apache Commons > Codec), > Stephen Colebourne [ctb, cph] (Joda-Time Java library) > http://www.mirai-solutions.com , > http://miraisolutions.wordpress.com >> input <- f.readXLSheet(x, 1) >> >> str(input) > 'data.frame': 2266 obs. of 51 variables: > $ EIA : num 34 59 87 97 108 118 123 149 > 150 157 ... > $ Entity.Name : chr "City of Abbeville" "City of > Abbeville" "City of Ada" "Adams Electric Cooperative" ... > $ State : chr "SC" "LA" "MN" "IL" ... > $ NERC.Region : chr "SERC" "SPP" "MRO" "SERC" ... > $ Filing.Order : num 12 11 1237 392 252 ... > $ Q5.MultRegion : chr "" "" "" "" ... > $ Q6.OwnMeters. : chr "Yes" "Yes" "Yes" "Yes" ... > $ Q7.ResMeters : num 3051 4253 857 8154 33670 ... > $ Q7.ComMeters : num 531 972 132 155 1719 ... > $ Q7.IntMeters : num 0 19 32 NA 626 NA 29 0 2 NA > ... > $ Q7.TransMeters : num 0 NA NA NA NA NA NA 0 0 NA > ... > $ Q7.OtherMeters : num 0 NA NA 57 NA NA NA 0 0 NA > ... > $ Q7...total.meters : num 3582 5244 1021 8366 36015 ... > $ Q8.15Min.ResAMI : num 0 NA NA NA NA NA NA NA NA NA > ... > $ Q8.15Min.ComAMI : num 0 NA NA 155 NA NA NA NA NA > NA ... > $ Q8.15Min.IndAMI : num 0 NA NA NA NA NA NA NA NA NA > ... > $ Q8.15Min.TransAMI : num 0 NA NA NA NA NA NA NA NA NA > ... > $ Q8.15Min.OtherAMI : num 0 NA NA NA NA NA NA NA NA NA > ... > $ Q8.15Min.TotalAMI : num 0 0 0 155 0 0 0 0 0 0 ... > $ Q8.Hourly.ResAMI : num 0 NA NA NA 16100 NA NA NA NA > NA ... > $ Q8.Hourly.ComAMI : num 0 NA NA NA 1600 NA NA NA NA > NA ... > .... > Jim Holtman > Data Munger Guru > What is the problem that you are trying to solve? > Tell me what you want to do, not how you want to do it. > On Wed, Mar 25, 2015 at 5:01 PM, Benjamin Baker <bbaker at reed.edu> wrote: >> Trying to read and clean up the FERC data on Advanced Metering >> infrastructure. Of course it is in XLS for the first two survey years and >> then converts to XLSX for the final two. Bad enough that it is all in >> excel, they had to change the survey design and data format as well. Still, >> I?m sorting through it. However, when I try and read in the 2008 data, I?m >> getting this error: >> ### >> Wide character in print at >> /Library/Frameworks/R.framework/Versions/3.1/Resources/library/gdata/perl/ >> xls2csv.pl line 270. >> Warning message: >> In scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, : >> EOF within quoted string >> ### >> >> >> >> Here is the code I?m running to get the data: >> ### >> install.packages("gdata") >> library("gdata") >> fileUrl <- " >> http://www.ferc.gov/industries/electric/indus-act/demand-response/2008/survey/ami_survey_responses.xls >> " >> download.file(fileUrl, destfile="./ami.data/ami-data2008.xls") >> list.files("ami.data") >> dateDown.2008 <- date() >> ami.data2008 <- read.xls("./ami.data/ami-data2008.xls", sheet=1, >> header=TRUE) >> ### >> >> >> Reviewed the data in the XLS file, and both ?? and # are present within >> it. Don?t know how to get the read.xls to ignore them so I can read all the >> data into my data frame. Tried : >> ### >> ami.data2008 <- read.xls("./ami.data/ami-data2008.xls", sheet=1, quote="", >> header=TRUE) >> ### >> >> >> And it spits out ?More columns than column names? output. >> >> >> Been searching this, and I can find some ?solutions? for read.table, but >> nothing specific to read.xls >> >> >> Many thanks, >> >> >> Benjamin Baker >> >> >> >> ? >> Sent from Mailbox >> [[alternative HTML version deleted]] >> >> ______________________________________________ >> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide >> http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code.[[alternative HTML version deleted]]