Hi all, Does anybody know the easiest way to import excel data into R? I copied and pasted the excel data into a txt file, and tried read.table, but R reported that Error in read.table("data_support.txt", sep = " ", header = T) : more columns than column names Thanks! Ling
Your error message tells me that you have different numbers of fields in different lines. You say you, "copied and pasted the excel data into a txt file". I usually copy what I want into a clean sheet then File -> Save, then File -> "Save As" -> "Save as type" = "CSV (Comma delimited) (*.csv)" or "Text (Tab delimited) (*.txt)". Excel will ask if I'm sure a couple of times, and I say yes. If that's what you've done and still have a problem, then I have other tools: First, I'll assign the file name to something like "File". Then, 'readLines(File, n=9)' tells me if the file starts as I think it does. If I've got extra headers, it will tell me that. Then, I do something like the following: n.flds <- count.fields(File, sep="\t") plot(n.flds) sd(n.flds) Then I play with the arguments to "count.fields" until 'sd(n.flds)' is 0. Then I use "read.table" with arguments as I used to get everything right in 'count.fields'. If I can't get sd(n.flds) to 0, you can try read.table with 'fill=TRUE'. However, when you do that, you need to check to make sure all the columns line up correctly with the shorter lines. Also, this issue has been discussed many times. 'RSiteSearch("read excel")' just produced 1196 hits for me. If the above doesn't work, you might try skimming a few from that list. hope this helps. spencer graves Ling Jin wrote:> Hi all, > > Does anybody know the easiest way to import excel data into R? I copied > and pasted the excel data into a txt file, and tried read.table, but R > reported that > > Error in read.table("data_support.txt", sep = " ", header = T) : > more columns than column names > > Thanks! > > Ling > > ______________________________________________ > R-help at stat.math.ethz.ch mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html-- Spencer Graves, PhD Senior Development Engineer PDF Solutions, Inc. 333 West San Carlos Street Suite 700 San Jose, CA 95110, USA spencer.graves at pdf.com www.pdf.com <http://www.pdf.com> Tel: 408-938-4420 Fax: 408-280-7915
Ling, You might take a look at the function read.xls() in gdata library. HTH. On 6/22/05, Ling Jin <ljin at lbl.gov> wrote:> Hi all, > > Does anybody know the easiest way to import excel data into R? I copied > and pasted the excel data into a txt file, and tried read.table, but R > reported that > > Error in read.table("data_support.txt", sep = " ", header = T) : > more columns than column names > > Thanks! > > Ling > > ______________________________________________ > R-help at stat.math.ethz.ch mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html >-- WenSui Liu, MS MA Senior Decision Support Analyst Division of Health Policy and Clinical Effectiveness Cincinnati Children Hospital Medical Center
Ling, If any column has text with spaces between words, this will lead to the "more columns ..." problem. Delete the spaces and try again. e.g., if the Excel file is Var1 Var2 Var3 text 1 2 more text 3 4 yet more 5 6 and more 7 8 blahblah 9 10 On a Mac, this will lead to the error message "Error in scan(file = file, what = what, sep = sep, quote = quote, dec = dec, : line 1 did not have 4 elements" (which I believe is the equivalent message to what you are getting on a PC) But, if your remove the blanks in column 1, this reads as> x <- read.table("test.txt",header=T) > xVar1 Var2 Var3 1 text 1 2 2 moretext 3 4 3 yetmore 5 6 4 andmore 7 8 5 blahblah 9 10 with no error message. Alternatively, for small files, if using a PC try copying the Excel spreadsheet to your clipboard and x <- read.table(file("clipboard"), header = TRUE) or, if using a Mac x <- read.table(pipe("pbpaste"), header = TRUE) Bill At 8:38 PM -0400 6/22/05, Wensui Liu wrote:>Ling, > >You might take a look at the function read.xls() in gdata library. > >HTH. > > >On 6/22/05, Ling Jin <ljin at lbl.gov> wrote: >> Hi all, >> >> Does anybody know the easiest way to import excel data into R? I copied >> and pasted the excel data into a txt file, and tried read.table, but R >> reported that >> >> Error in read.table("data_support.txt", sep = " ", header = T) : >> more columns than column names >> >> Thanks! >> > > Ling >>-- William Revelle http://pmc.psych.northwestern.edu/revelle.html Professor http://personality-project.org/personality.html Department of Psychology http://www.wcas.northwestern.edu/psych/ Northwestern University http://www.northwestern.edu/
Hi, you can use the library "RODBC" to import Excel-Files. This works for me: #~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ library(RODBC) setwd("C:\\R1B2") channel <- odbcConnectExcel("pk2003.xls") tab <- sqlFetch(channel, "Tabelle3") #~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ HTH Patrick
This is really great. I use odbc for sql all the time, but I never needed to read in excel files before. I needed to yesterday and I looked at read.xls() from library(gdata) and it took 5-10 minutes to read in the file and odbc did it in 5 seconds! I guess that is the good thing about having duplication in function in R, we can try several methods and choose which one is best/fastest. Thanks for the example. On 6/23/05, Patrick Hausmann <c18g at zfn.uni-bremen.de> wrote:> Hi, > > you can use the library "RODBC" to import Excel-Files. This works for me: > > #~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > library(RODBC) > setwd("C:\\R1B2") > channel <- odbcConnectExcel("pk2003.xls") > tab <- sqlFetch(channel, "Tabelle3") > #~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > > HTH > Patrick > > ______________________________________________ > R-help at stat.math.ethz.ch mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html >
Hi ling, save your file as 'csv'. I always use the following script: NAME <- read.table("C:LOCATION/FILE NAME.csv", header = TRUE, sep ",", na.string=".") ________________________________________ Paulo M. Brando Instituto de Pesquisa Ambiental da Amazonia (IPAM) Santarem, PA, Brasil. Av. Rui Barbosa, 136. Fone: + 55 93 522 55 38 www.ipam.org.br E-mail: pmbrando at ipam.org.br -----Mensagem original----- De: r-help-bounces at stat.math.ethz.ch [mailto:r-help-bounces at stat.math.ethz.ch] Em nome de Ling Jin Enviada em: Wednesday, June 22, 2005 4:47 PM Para: r-help at stat.math.ethz.ch Assunto: [R] How to read an excel data into R? Hi all, Does anybody know the easiest way to import excel data into R? I copied and pasted the excel data into a txt file, and tried read.table, but R reported that Error in read.table("data_support.txt", sep = " ", header = T) : more columns than column names Thanks! Ling ______________________________________________ R-help at stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
On 22 Jun 2005 at 16:46, Ling Jin wrote:> Hi all, > > Does anybody know the easiest way to import excel data into R? I > copied and pasted the excel data into a txt file, and tried > read.table, but R reported that > > Error in read.table("data_support.txt", sep = " ", header = T) : > more columns than column names >Or simply Open Excell file Decide what you want to copy and put it to clipboard by Ctrl-C In R issue temp1<-read.delim("clipboard") and you have your data in temp1 HTH Petr> Thanks! > > Ling > > ______________________________________________ > R-help at stat.math.ethz.ch mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide! > http://www.R-project.org/posting-guide.htmlPetr Pikal petr.pikal at precheza.cz
Ling Jin a ??crit :> Hi all, > > Does anybody know the easiest way to import excel data into R? I copied > and pasted the excel data into a txt file, and tried read.table, but R > reported that > > Error in read.table("data_support.txt", sep = " ", header = T) : > more columns than column names > > Thanks! > > Ling > > ______________________________________________ > R-help at stat.math.ethz.ch mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html >Here is a function from a not-yet-released package written by a colleague and I, based on package RODBC written by Pr Ripley. The idea is to wrap - in the same function, GUI (suite of pop-up windows) and command-line facilities. It is a preliminary, unoptimized version. Suggestions for improvements and bug reports are welcome. Let me know if you want the packaged version. Best, Renaud #### query <- function(tab = NULL, db = NULL, query = "all"){ # load the RODBC package and stops the program if not available if(!require(RODBC)) stop("This function requires the RODBC package.\n") # close all databases in case of error on.exit(odbcCloseAll()) ## name of the database is not provided if(is.null(db)){ Databases <- matrix(c("MS Access database (*.mdb)", "*.mdb", "MS Excel file (*.xls)", "*.xls", "dBase-like file (*.dbf)", "*.dbf"), nrow = 3, byrow = TRUE) File <- choose.files(filters = Databases, multi = FALSE, caption = "Select a database") sop <- match(".", rev(strsplit(File, NULL)[[1]]))[1] ext <- tolower(substring(File, nchar(File) - sop + 2, nchar(File))) channel <- switch(EXPR = ext, xls = odbcConnectExcel(File), mdb = odbcConnectAccess(File), dbf = odbcConnectDbase(File)) # For Excel and Access cases, need to select a particular sheet or table if(ext != "dbf"){ # sheet or table name is not provided if(is.null(tab)){ tabdat <- sqlTables(channel) names(tabdat) <- tolower(names(tabdat)) if(ext == "mdb") tabdat <- tabdat[tabdat$table_type == "TABLE", 3] if(ext == "xls"){ tabname <- tabdat$table_name namfil <- tabdat[substring(tabname, nchar(tabname), nchar(tabname)) == "$", 3] tabdat <- substring(namfil, 1, nchar(namfil) - 1) } fil <- select.list(sort(tabdat)) if(length(fil) == 0) stop("No file was selected.") if(ext == "xls") fil <- paste("[", fil, "$]", sep = "") } else # sheet or table name is provided fil <- if(ext != "xls") tab else paste("[", tab, "$]", sep = "") } else{ # dBase file sop <- match(".", rev(strsplit(File, NULL)[[1]]))[1] root <- tolower(substring(File, 1, nchar(File) - sop)) revstr <- rev(strsplit(root, NULL)[[1]]) sop <- if(is.na(match(c("/", "\\"), revstr)[1])) length(revstr) else match(c("/", "\\"), revstr)[1] - 1 toor <- revstr[seq(sop)] fil <- paste(rev(toor), collapse = "") } } ## name of the database is provided else{ sop <- match(".", rev(strsplit(db, NULL)[[1]]))[1] if(is.na(sop)) stop("You must provide the full path and the extension for the database.\n") else{ ext <- tolower(substring(db, nchar(db) - sop + 2, nchar(db))) channel <- switch(EXPR = ext, xls = odbcConnectExcel(db), mdb = odbcConnectAccess(db), dbf = odbcConnectDbase(db), stop("query not yet implemented for databases of format .", ext, "\n")) # dBase file if(ext == "dbf"){ sop <- match(".", rev(strsplit(db, NULL)[[1]]))[1] root <- tolower(substring(db, 1, nchar(db) - sop)) revstr <- rev(strsplit(root, NULL)[[1]]) sop <- if(is.na(match(c("/", "\\"), revstr)[1])) length(revstr) else match(c("/", "\\"), revstr)[1] - 1 toor <- revstr[seq(sop)] fil <- paste(rev(toor), collapse = "") } else{ # name of the table is not provided (Excel or Access) if(is.null(tab)){ tabdat <- sqlTables(channel) names(tabdat) <- tolower(names(tabdat)) if(ext == "mdb") tabdat <- tabdat[tabdat$table_type == "TABLE", 3] if(ext == "xls"){ tabname <- tabdat$table_name namfil <- tabdat[substring(tabname, nchar(tabname), nchar(tabname)) == "$", 3] tabdat <- substring(namfil, 1, nchar(namfil) - 1) } fil <- select.list(sort(tabdat)) if(length(fil) == 0) stop("No file was selected.") if(ext == "xls") fil <- paste("[", fil, "$]", sep = "") } else fil <- if(ext != "xls") tab else paste("[", tab, "$]", sep = "") } } } # retrieve the data if(query == "all") dat <- sqlQuery(channel = channel, query = paste("select * from", fil)) else dat <- sqlQuery(channel = channel, query = query) odbcCloseAll() dat } -- Dr Renaud Lancelot, v??t??rinaire Projet FSP r??gional ??pid??miologie v??t??rinaire C/0 Ambassade de France - SCAC BP 834 Antananarivo 101 - Madagascar e-mail: renaud.lancelot at cirad.fr tel.: +261 32 40 165 53 (cell) +261 20 22 665 36 ext. 225 (work) +261 20 22 494 37 (home)
Ling Jin a ??crit :> Hi all, > > Does anybody know the easiest way to import excel data into R? I copied > and pasted the excel data into a txt file, and tried read.table, but R > reported that > > Error in read.table("data_support.txt", sep = " ", header = T) : > more columns than column names > > Thanks! > > Ling > > ______________________________________________ > R-help at stat.math.ethz.ch mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html >Here is a function from a not-yet-released package written by a colleague and I, based on package RODBC written by Pr Ripley. The idea is to wrap - in the same function, GUI (suite of pop-up windows) and command-line facilities. It allows the importation of MS Excel and MS Access sheet or tables (within databases), and dBase-like files. It is a preliminary, unoptimized version. Suggestions for improvements and bug reports are welcome. Let me know if you want the packaged version (with help file). Best, Renaud #### query <- function(tab = NULL, db = NULL, query = "all"){ # load the RODBC package and stops the program if not available if(!require(RODBC)) stop("This function requires the RODBC package.\n") # close all databases in case of error on.exit(odbcCloseAll()) ## name of the database is not provided if(is.null(db)){ Databases <- matrix(c("MS Access database (*.mdb)", "*.mdb", "MS Excel file (*.xls)", "*.xls", "dBase-like file (*.dbf)", "*.dbf"), nrow = 3, byrow = TRUE) File <- choose.files(filters = Databases, multi = FALSE, caption = "Select a database") sop <- match(".", rev(strsplit(File, NULL)[[1]]))[1] ext <- tolower(substring(File, nchar(File) - sop + 2, nchar(File))) channel <- switch(EXPR = ext, xls = odbcConnectExcel(File), mdb = odbcConnectAccess(File), dbf = odbcConnectDbase(File)) # For Excel and Access cases, need to select a particular sheet or table if(ext != "dbf"){ # sheet or table name is not provided if(is.null(tab)){ tabdat <- sqlTables(channel) names(tabdat) <- tolower(names(tabdat)) if(ext == "mdb") tabdat <- tabdat[tabdat$table_type == "TABLE", 3] if(ext == "xls"){ tabname <- tabdat$table_name namfil <- tabdat[substring(tabname, nchar(tabname), nchar(tabname)) == "$", 3] tabdat <- substring(namfil, 1, nchar(namfil) - 1) } fil <- select.list(sort(tabdat)) if(length(fil) == 0) stop("No file was selected.") if(ext == "xls") fil <- paste("[", fil, "$]", sep = "") } else # sheet or table name is provided fil <- if(ext != "xls") tab else paste("[", tab, "$]", sep = "") } else{ # dBase file sop <- match(".", rev(strsplit(File, NULL)[[1]]))[1] root <- tolower(substring(File, 1, nchar(File) - sop)) revstr <- rev(strsplit(root, NULL)[[1]]) sop <- if(is.na(match(c("/", "\\"), revstr)[1])) length(revstr) else match(c("/", "\\"), revstr)[1] - 1 toor <- revstr[seq(sop)] fil <- paste(rev(toor), collapse = "") } } ## name of the database is provided else{ sop <- match(".", rev(strsplit(db, NULL)[[1]]))[1] if(is.na(sop)) stop("You must provide the full path and the extension for the database.\n") else{ ext <- tolower(substring(db, nchar(db) - sop + 2, nchar(db))) channel <- switch(EXPR = ext, xls = odbcConnectExcel(db), mdb = odbcConnectAccess(db), dbf = odbcConnectDbase(db), stop("query not yet implemented for databases of format .", ext, "\n")) # dBase file if(ext == "dbf"){ sop <- match(".", rev(strsplit(db, NULL)[[1]]))[1] root <- tolower(substring(db, 1, nchar(db) - sop)) revstr <- rev(strsplit(root, NULL)[[1]]) sop <- if(is.na(match(c("/", "\\"), revstr)[1])) length(revstr) else match(c("/", "\\"), revstr)[1] - 1 toor <- revstr[seq(sop)] fil <- paste(rev(toor), collapse = "") } else{ # name of the table is not provided (Excel or Access) if(is.null(tab)){ tabdat <- sqlTables(channel) names(tabdat) <- tolower(names(tabdat)) if(ext == "mdb") tabdat <- tabdat[tabdat$table_type == "TABLE", 3] if(ext == "xls"){ tabname <- tabdat$table_name namfil <- tabdat[substring(tabname, nchar(tabname), nchar(tabname)) == "$", 3] tabdat <- substring(namfil, 1, nchar(namfil) - 1) } fil <- select.list(sort(tabdat)) if(length(fil) == 0) stop("No file was selected.") if(ext == "xls") fil <- paste("[", fil, "$]", sep = "") } else fil <- if(ext != "xls") tab else paste("[", tab, "$]", sep = "") } } } # retrieve the data if(query == "all") dat <- sqlQuery(channel = channel, query = paste("select * from", fil)) else dat <- sqlQuery(channel = channel, query = query) odbcCloseAll() dat } -- Dr Renaud Lancelot, v??t??rinaire Projet FSP r??gional ??pid??miologie v??t??rinaire C/0 Ambassade de France - SCAC BP 834 Antananarivo 101 - Madagascar e-mail: renaud.lancelot at cirad.fr tel.: +261 32 40 165 53 (cell) +261 20 22 665 36 ext. 225 (work) +261 20 22 494 37 (home)
'RSiteSearch' is an R command new with R 2.0.0 or 2.1.0, I believe. It essentially passes the argument string to "www.r-project.org" -> Search -> "R site search". Consequently, it requires internet access to work. When I have an R (or S-Plus) question for which I do not already know where to find the answer, "R site search" has been my primary search tool for some time. spencer graves Ling Jin wrote: > Could you be more specific about RSiteSearch("read excel")? I think it > must be useful. > ################################ Your error message tells me that you have different numbers of fields in different lines. You say you, "copied and pasted the excel data into a txt file". I usually copy what I want into a clean sheet then File -> Save, then File -> "Save As" -> "Save as type" = "CSV (Comma delimited) (*.csv)" or "Text (Tab delimited) (*.txt)". Excel will ask if I'm sure a couple of times, and I say yes. If that's what you've done and still have a problem, then I have other tools: First, I'll assign the file name to something like "File". Then, 'readLines(File, n=9)' tells me if the file starts as I think it does. If I've got extra headers, it will tell me that. Then, I do something like the following: n.flds <- count.fields(File, sep="\t") plot(n.flds) sd(n.flds) Then I play with the arguments to "count.fields" until 'sd(n.flds)' is 0. Then I use "read.table" with arguments as I used to get everything right in 'count.fields'. If I can't get sd(n.flds) to 0, you can try read.table with 'fill=TRUE'. However, when you do that, you need to check to make sure all the columns line up correctly with the shorter lines. Also, this issue has been discussed many times. 'RSiteSearch("read excel")' just produced 1196 hits for me. If the above doesn't work, you might try skimming a few from that list. hope this helps. spencer graves Ling Jin wrote:> Hi all, > > Does anybody know the easiest way to import excel data into R? I copied > and pasted the excel data into a txt file, and tried read.table, but R > reported that > > Error in read.table("data_support.txt", sep = " ", header = T) : > more columns than column names > > Thanks! > > Ling > > ______________________________________________ > R-help at stat.math.ethz.ch mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html-- Spencer Graves, PhD Senior Development Engineer PDF Solutions, Inc. 333 West San Carlos Street Suite 700 San Jose, CA 95110, USA spencer.graves at pdf.com www.pdf.com <http://www.pdf.com> Tel: 408-938-4420 Fax: 408-280-7915