Colleagues, I'm trying to extract a cell from all Excel files in a directory. library(readxl) files <- list.files(pattern="*.xls", full.names = FALSE) date <- lapply(files, read_excel, sheet="Sheet1", range=("B5")) date_df <- as.data.frame(date) trans_date <-t(date_df) mydates <- list(trans_date) write.table(mydates,"mydates.txt",sep="\t") Looking at mydates.txt shows: "" "Saturday..June.09..2018" "Saturday..June.09..2018.1" "Saturday..June.09..2018.2" But the original Excel contents are: Saturday, June 09, 2018 Saturday, June 09, 2018 Saturday, June 09, 2018 I get a similar problem with my serial numbers serial <-lapply(files, read_excel, sheet="Sheet1", range=("B9")) serial_df <- as.data.frame(serial) trans_serial <-t(serial_df) myserials <- list(trans_serial) write.table(myserials,"myserials.txt",sep="\t") R Output "" "X96739.0027.1" "X96739.0041.1" "X96739.0044.1" Original Excel Content 96739-0027/1 96739-0041/1 96739-0044/1 How can I amend my script so that the output matches the original Excel content? Thomas Subia Statistician / Senior Quality Engineer IMG Precision
On Thu, 5 Dec 2019 15:39:56 +0000 Thomas Subia <tsubia at imgprecision.com> wrote:> date <- lapply(files, read_excel, sheet="Sheet1", range=("B5")) > date_df <- as.data.frame(date) > trans_date <-t(date_df) > mydates <- list(trans_date)This feels a bit excessive for what looks like a one-dimensional string vector. Why is it needed? Can you get better results with sapply or vapply (which return vectors, not lists)? In particular, as.data.frame might be responsible for the name mangling. Also, your data seems to end up inside the row names. Try using str() on every step of the transformation to check if that is the case. Also check out the .name_repair argument of the read_excel function, but I think that as.data.frame is part of the problem. -- Best regards, Ivan
The best advice that anyone could give: See fortunes::fortune("Friends") . cheers, Rolf Turner On 6/12/19 4:39 am, Thomas Subia wrote:> Colleagues, > > I'm trying to extract a cell from all Excel files in a directory. > > library(readxl) > files <- list.files(pattern="*.xls", full.names = FALSE) > > date <- lapply(files, read_excel, sheet="Sheet1", range=("B5")) > > date_df <- as.data.frame(date) > trans_date <-t(date_df) > mydates <- list(trans_date) > write.table(mydates,"mydates.txt",sep="\t") > > Looking at mydates.txt shows: > > "" > "Saturday..June.09..2018" > "Saturday..June.09..2018.1" > "Saturday..June.09..2018.2" > > But the original Excel contents are: > > Saturday, June 09, 2018 > Saturday, June 09, 2018 > Saturday, June 09, 2018 > > I get a similar problem with my serial numbers > > serial <-lapply(files, read_excel, sheet="Sheet1", range=("B9")) > > serial_df <- as.data.frame(serial) > trans_serial <-t(serial_df) > myserials <- list(trans_serial) > write.table(myserials,"myserials.txt",sep="\t") > > R Output > > "" > "X96739.0027.1" > "X96739.0041.1" > "X96739.0044.1" > > Original Excel Content > 96739-0027/1 > 96739-0041/1 > 96739-0044/1 > > How can I amend my script so that the output matches the original Excel content?