Dear R community members,

I have not found a good way to merge my 200 text data files into a single data file with one added column that indicates the source file.

    # the file names are K1cd.txt ... K200cd.txt
    filelist = list.files(pattern = "K*cd.txt")
    data_list <- lapply(filelist, read.table, header=T, comment=";", fill=T)

This creates a list, but that is not what I want. I want a single data frame (all the separate data frames have the same variable headings) with an additional column. Just as an example, I created two small datasets; my real component datasets are huge, so I need automation.

    ;read from file K1cd.txt
    var1 var2 var3 var4
    1    6    0.3  8
    3    4    0.4  9
    2    3    0.4  6
    1    0.4  0.9  3

    ;read from file K2cd.txt
    var1 var2 var3 var4
    1    16   0.6  7
    3    14   0.4  6
    2    13   0.4  5
    1    0.6  0.9  2

The output data frame should look like this:

    Fileno var1 var2 var3 var4
    1      1    6    0.3  8
    1      3    4    0.4  9
    1      2    3    0.4  6
    1      1    0.4  0.9  3
    2      1    16   0.6  7
    2      3    14   0.4  6
    2      2    13   0.4  5
    2      1    0.6  0.9  2

Please note that a new file-number column (Fileno) is added.

Thank you for the help.

Umesh R
On 04.04.2011 16:41, Umesh Rosyara wrote:
> filelist = list.files(pattern = "K*cd.txt")

I doubt you meant "K*cd.txt" but "^K[[:digit:]]*cd\\.txt$".

> # the file names are K1cd.txt ... K200cd.txt
> data_list<-lapply(filelist, read.table, header=T, comment=";", fill=T)

Replace by:

    data_list <- lapply(filelist, function(x)
        cbind(Filename = x,
              read.table(x, header=TRUE, comment.char=";", fill=TRUE)))

And then:

    result <- do.call("rbind", data_list)

Uwe Ligges

______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
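The two points in this reply can be tried out on throwaway files. The following is only a sketch under stated assumptions: the temporary directory, the decoy file Xcd_txt.log, and the step that extracts a numeric Fileno from the file name are made up for illustration and are not part of the reply above. It relies on the fact that list.files() treats pattern as a regular expression rather than a shell glob, so "K*cd.txt" also matches names that do not start with K.

```r
# Hypothetical setup: two small data files plus one decoy in a temp directory
tmp <- tempfile("kfiles")
dir.create(tmp)
writeLines(c(";read from file K1cd.txt", "var1 var2 var3 var4",
             "1 6 0.3 8", "3 4 0.4 9"), file.path(tmp, "K1cd.txt"))
writeLines(c(";read from file K2cd.txt", "var1 var2 var3 var4",
             "1 16 0.6 7", "3 14 0.4 6"), file.path(tmp, "K2cd.txt"))
writeLines("not a data file", file.path(tmp, "Xcd_txt.log"))

# As a regex, "K*cd.txt" means: zero or more K, then "cd", any character, "txt"
loose  <- list.files(tmp, pattern = "K*cd.txt")
# Anchored pattern from the reply: K, digits, then the literal "cd.txt"
strict <- list.files(tmp, pattern = "^K[[:digit:]]*cd\\.txt$")

# Read each file and prepend a column identifying its source.
# Assumption: the digits in the file name are the wanted file number.
data_list <- lapply(strict, function(f) {
  dat <- read.table(file.path(tmp, f), header = TRUE,
                    comment.char = ";", fill = TRUE)
  cbind(Fileno = as.integer(gsub("[^0-9]", "", f)), dat)
})

# Stack the per-file data frames into one data frame
result <- do.call("rbind", data_list)
print(loose)
print(result)
```

Because list.files() returns names in alphabetical order (K1, K10, K100, ...), taking the number from the file name rather than from the position in the list keeps Fileno correct for all 200 files.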
Umesh Rosyara
2011-Apr-04 16:37 UTC
[R] Questions remaining: define any character as na.string RE: merging data list in to single data frame
Dear Uwe and R community members,

Thank you, Uwe, for the help.

I still have one remaining question, which I have been trying to answer for a long time. When my data are exported, some stray characters get mixed into them. I would like to define any such character as an na.string, so that it is read as NA. Is it possible to do so?

Thanks,

Umesh
David Winsemius
2011-Apr-04 17:22 UTC
[R] Questions remaining: define any character as na.string RE: merging data list in to single data frame
On Apr 4, 2011, at 12:37 PM, Umesh Rosyara wrote:
> While exporting my data, I have some characters mixed into it. I want to
> define any characters as na.string? Is it possible to do so?

Option 1: do it in an editor that is regex aware.

Option 2: input the file with readLines, use gsub to remove the unwanted characters, read.table(textConnection(obj)) on the resulting object. [There are many worked examples in the archives. Search on "read.table(textConnection(" .]

-- 
David.

David Winsemius, MD
West Hartford, CT
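Option 2 could look roughly like the sketch below. It is one possible reading of the suggestion, not David's own code: the sample lines are invented, the header line is assumed to be the first line and is left untouched, and instead of deleting the unwanted characters it replaces every whitespace-delimited field containing anything other than digits, a dot, or a minus sign with NA. Numbers in scientific notation (such as 1e5) would also be turned into NA by this pattern.

```r
# Invented sample input; in practice: raw <- readLines("K1cd.txt")
raw <- c("var1 var2 var3 var4",
         "1 6 0.3 8",
         "3 x 0.4 9",
         "2 3 ? 6",
         "1 4a 0.9 3")

header <- raw[1]     # assumption: first line holds the column names
body   <- raw[-1]

# Replace any field that contains a non-numeric character with "NA"
body <- gsub("\\S*[^0-9.[:space:]-]\\S*", "NA", body, perl = TRUE)

# Parse the cleaned text; read.table() reads the string "NA" as missing
con <- textConnection(c(header, body))
dat <- read.table(con, header = TRUE)
close(con)
print(dat)
```

Replacing whole fields rather than single characters keeps the columns numeric and avoids silently turning a damaged value such as 4a into 4.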
Hi:

Here's an alternative using ldply() from the plyr package. The idea is to read the data frames into a list, name them accordingly and then call ldply().

    # Read in the test data frames (you want to use list.files() instead
    # to input the data per Uwe's guidelines)
    df1 <- read.table(textConnection("
    var1 var2 var3 var4
    1 6 0.3 8
    3 4 0.4 9
    2 3 0.4 6
    1 0.4 0.9 3"), header = TRUE)
    df2 <- read.table(textConnection("
    var1 var2 var3 var4
    1 16 0.6 7
    3 14 0.4 6
    2 13 0.4 5
    1 0.6 0.9 2"), header = TRUE)
    closeAllConnections()

    # generate the list
    dl <- list(df1, df2)

    # Name the list components by number and then call ldply():
    names(dl) <- 1:2    # more generally, names(dl) <- 1:length(dl)
    library("plyr")
    ldply(dl, rbind)
      .id var1 var2 var3 var4
    1   1    1  6.0  0.3    8
    2   1    3  4.0  0.4    9
    3   1    2  3.0  0.4    6
    4   1    1  0.4  0.9    3
    5   2    1 16.0  0.6    7
    6   2    3 14.0  0.4    6
    7   2    2 13.0  0.4    5
    8   2    1  0.6  0.9    2

You can always change .id to fileno afterwards.

HTH,
Dennis
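For readers without plyr installed, the same "name the list, then stack it" idea can be approximated in base R. This is a sketch of an alternative, not what ldply() does internally; the column name Fileno and the use of Map() are choices made here for illustration.

```r
# Same test data as in the message above, built directly as data frames
df1 <- data.frame(var1 = c(1, 3, 2, 1), var2 = c(6, 4, 3, 0.4),
                  var3 = c(0.3, 0.4, 0.4, 0.9), var4 = c(8, 9, 6, 3))
df2 <- data.frame(var1 = c(1, 3, 2, 1), var2 = c(16, 14, 13, 0.6),
                  var3 = c(0.6, 0.4, 0.4, 0.9), var4 = c(7, 6, 5, 2))

dl <- list(df1, df2)
names(dl) <- seq_along(dl)   # list names "1", "2" serve as file numbers

# Attach each list name to its data frame as a Fileno column, then stack
pieces  <- Map(function(id, d) cbind(Fileno = as.integer(id), d),
               names(dl), dl)
stacked <- do.call(rbind, pieces)
rownames(stacked) <- NULL    # drop the "1.1", "1.2", ... row names
print(stacked)
```

Here the identifier column is created under its final name, so no renaming of .id is needed afterwards.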