Hi All,

I am using the xlsx package to extract and clean data from an Excel workbook. I ran into a strange behavior that I don't understand: gsub doesn't work inside the loop but does work outside the loop, as shown in my code. I tried to Google for help but nothing came up.

My code loads and reads data from the sheets in the workbook as a list of data frames and assigns them names. I want to replace the numbers with spaces inside each part of the description column of each data frame using gsub.

Example data:

Date      description   number
12/12/12  AAAA234BBB    1
1/3/12    cccc65bb35ff  2
2/7/13    234abababab   3

I want the description column to look like this:

AAAA BBB
Cccc bb ff
abababab

My code:

MyFile <- "C:/Users/name/Documents/Testing2.xlsx"
MyWBook <- loadWorkbook(MyFile)
MySNames <- list(names(getSheets (MyWBook)))
NumSheets <- length(getSheets(MyWBook))

for (i in 1:NumSheets) {
  MySNames[[i]] <-read.xlsx(MyFile,i,as.data.frame=TRUE,header=TRUE,keepFormulas=FALSE,stringsAsFactors=FALSE)
  gsub("'|-|[0-9]","",MySNames[[i]]$Description)
}

The gsub function above doesn't work, but when I tried the function outside the loop, as shown below, it worked.

gsub("'|-|[0-9]","",MySNames[[2]]$Description)

Thanks in advance--EK
peter dalgaard
2019-Mar-05 16:36 UTC
[R] Function doesn't work inside loop but works outside
You need a print() around the gsub(...) when inside a loop.

-pd

> On 5 Mar 2019, at 17:18, Ek Esawi <esawiek at gmail.com> wrote:
>
> for (i in 1:NumSheets) {
> MySNames[[i]] <-read.xlsx(MyFile,i,as.data.frame=TRUE,header=TRUE,keepFormulas=FALSE,stringsAsFactors=FALSE)
> gsub("'|-|[0-9]","",MySNames[[i]]$Description)
> }
>
> The gsub function above doesn't work, but when I tried the function
> outside the loops, as shown below, it worked.
> gsub("'|-|[0-9]","",MySNames[[2]]$Description)
>
> ______________________________________________
> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

--
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com
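A small illustration of the print() point may help here. R auto-prints the value of an expression only when it is evaluated at top level; inside a for loop the value is computed and then silently discarded. The sketch below uses a toy list of data frames (the name `sheets` and its contents are made up for illustration, standing in for what read.xlsx would return) so that it runs without the xlsx package or the workbook.

```r
# Toy stand-in for the list of sheets read from the workbook (hypothetical data).
sheets <- list(
  data.frame(Description = c("AAAA234BBB", "cccc65bb35ff"), stringsAsFactors = FALSE),
  data.frame(Description = "234abababab", stringsAsFactors = FALSE)
)

for (i in seq_along(sheets)) {
  # Evaluated, but the value is discarded: nothing is shown on the console.
  gsub("'|-|[0-9]", "", sheets[[i]]$Description)

  # An explicit print() displays the cleaned strings on every iteration.
  print(gsub("'|-|[0-9]", "", sheets[[i]]$Description))
}
# [1] "AAAABBB"  "ccccbbff"
# [1] "abababab"
```

Note that print() only displays the result; the data frames in `sheets` are left unchanged.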
Thank you Peter. That's a dumb question on my part! At least I should have known that I need an assignment statement. Thanks again--EK
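For anyone finding this thread later, the assignment mentioned above could look roughly like the sketch below. gsub() returns a new character vector and never modifies its argument, so the result has to be stored back into the column. Several things here are assumptions that were not in the original code: the toy `sheets` list stands in for the data frames returned by read.xlsx, the replacement is a single space (the original used "", which deletes the digits rather than leaving a gap), the pattern is wrapped in `( )+` so a run of digits becomes one space, and trimws() drops the leading space left where a description starts with digits.

```r
# Toy stand-in for the sheets read from the workbook (hypothetical data).
sheets <- list(
  Sheet1 = data.frame(Description = c("AAAA234BBB", "cccc65bb35ff"),
                      number = 1:2, stringsAsFactors = FALSE),
  Sheet2 = data.frame(Description = "234abababab",
                      number = 3L, stringsAsFactors = FALSE)
)

for (i in seq_along(sheets)) {
  # Replace each run of quotes, hyphens, or digits with one space,
  # then assign the cleaned vector back into the data frame.
  cleaned <- gsub("('|-|[0-9])+", " ", sheets[[i]]$Description)
  sheets[[i]]$Description <- trimws(cleaned)
}

sheets$Sheet1$Description
# [1] "AAAA BBB"   "cccc bb ff"
sheets$Sheet2$Description
# [1] "abababab"
```

In the original loop the equivalent line would be an assignment to MySNames[[i]]$Description placed right after the read.xlsx call.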