Dear R community members
I did not find a good way to merge my 200 text data files into a single data
file, with one added column that indicates the source file.
filelist = list.files(pattern = "K*cd.txt")  # the file names are K1cd.txt ... to K200cd.txt
data_list <- lapply(filelist, read.table, header=T, comment=";", fill=T)
This creates a list, but that is not what I want.
I want a single data frame (all the separate data frames have the same variable
headings) with an additional column. For example:
; just for example, two small datasets are shown here, but my component datasets
are huge, so I need automation
;read from file K1cd.txt
var1 var2 var3 var4
1 6 0.3 8
3 4 0.4 9
2 3 0.4 6
1 0.4 0.9 3
;read from file K2cd.txt
var1 var2 var3 var4
1 16 0.6 7
3 14 0.4 6
2 13 0.4 5
1 0.6 0.9 2
the output dataframe should look like
Fileno var1 var2 var3 var4
1 1 6 0.3 8
1 3 4 0.4 9
1 2 3 0.4 6
1 1 0.4 0.9 3
2 1 16 0.6 7
2 3 14 0.4 6
2 2 13 0.4 5
2 1 0.6 0.9 2
Please note that a new file-number column (Fileno) is added.
Thank you for the help.
Umesh R
On 04.04.2011 16:41, Umesh Rosyara wrote:
> filelist = list.files(pattern = "K*cd.txt")

I doubt you meant "K*cd.txt" but "^K[[:digit:]]*cd\\.txt$".

> # the file names are K1cd.txt .................to K200cd.txt
>
> data_list<-lapply(filelist, read.table, header=T, comment=";", fill=T)

Replace by:

data_list <- lapply(filelist, function(x)
    cbind(Filename = x, read.table(x, header=T, comment=";", fill=TRUE)))

And then:

result <- do.call("rbind", data_list)

Uwe Ligges
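The two steps suggested above (bind an identifier column onto each file, then rbind the pieces) can be tried end to end with the minimal sketch below. It rests on several assumptions that are not in the thread: the two files are tiny stand-ins written to a temporary directory, the helper name read_one is hypothetical, the pattern uses "+" rather than "*" so that at least one digit is required, and the numeric Fileno is extracted from the file name instead of storing the whole name.

```r
# Write two small stand-in files to a temporary directory
# (assumption: the real files K1cd.txt ... K200cd.txt share this layout).
tmp <- tempfile("kfiles")
dir.create(tmp)
writeLines(c(";read from file K1cd.txt",
             "var1 var2 var3 var4",
             "1 6 0.3 8",
             "3 4 0.4 9"), file.path(tmp, "K1cd.txt"))
writeLines(c(";read from file K2cd.txt",
             "var1 var2 var3 var4",
             "1 16 0.6 7",
             "3 14 0.4 6"), file.path(tmp, "K2cd.txt"))

# 'pattern' is a regular expression, not a shell glob:
# K, one or more digits, then the literal "cd.txt".
filelist <- list.files(tmp, pattern = "^K[[:digit:]]+cd\\.txt$",
                       full.names = TRUE)

# Hypothetical helper: read one file and prepend its number as Fileno.
read_one <- function(path) {
  # "K12cd.txt" -> 12
  fileno <- as.integer(sub("^K([[:digit:]]+)cd\\.txt$", "\\1", basename(path)))
  dat <- read.table(path, header = TRUE, comment.char = ";", fill = TRUE)
  cbind(Fileno = fileno, dat)
}

# Stack all per-file data frames into one data frame.
result <- do.call("rbind", lapply(filelist, read_one))
```

list.files() returns names in alphabetical order, so with 200 files K10cd.txt sorts before K2cd.txt; because Fileno is taken from the name itself, the rows can be re-sorted afterwards with result[order(result$Fileno), ].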
Umesh Rosyara
2011-Apr-04 16:37 UTC
[R] Questions remaining: define any character as na.string RE: merging data list in to single data frame
Dear Uwe and R community members,

Thank you, Uwe, for the help.

I still have one question that I have been trying to answer for a long time.

While exporting my data, I have some characters mixed into it. I want to define any characters as na.string. Is it possible to do so?

Thanks;

Umesh

-----Original Message-----
From: Uwe Ligges [mailto:ligges at statistik.tu-dortmund.de]
Sent: Monday, April 04, 2011 12:22 PM
To: Umesh Rosyara
Cc: r-help at r-project.org; rosyaraur at gmail.com
Subject: Re: [R] merging data list in to single data frame
David Winsemius
2011-Apr-04 17:22 UTC
[R] Questions remaining: define any character as na.string RE: merging data list in to single data frame
On Apr 4, 2011, at 12:37 PM, Umesh Rosyara wrote:
> While exporting my data, I have some characters mixed into it. I want to
> define any characters as na.string? Is it possible to do so?

Option 1: do it in an editor that is regex aware.

Option 2: input the file with readLines, use gsub to remove the unwanted characters, then call read.table(textConnection(obj)) on the resulting object. [There are many worked examples in the archives. Search on "read.table(textConnection(" .]

--
David.

David Winsemius, MD
West Hartford, CT
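Option 2 might look roughly like the sketch below. It makes several assumptions: the character vector raw is made-up sample text standing in for the result of readLines() on a real file, the first line is the header, and "unwanted" means any whitespace-delimited field containing a character other than a digit, a dot, or a minus sign. Instead of deleting such fields it replaces them with the string NA, which read.table() treats as missing by default. It also uses the text= argument of read.table(), which in current R versions does the same job as wrapping the object in textConnection().

```r
# Made-up sample standing in for: raw <- readLines("K1cd.txt")
raw <- c("var1 var2 var3 var4",
         "1 6 0.3 8",
         "3 x 0.4 9",
         "2 3 n/a 6")

header <- raw[1]   # keep the header line untouched (it is all letters)
body   <- raw[-1]

# Replace every field that contains at least one character other than
# a digit, dot, or minus sign with the literal string "NA".
cleaned <- gsub("[^[:space:]]*[^-0-9.[:space:]][^[:space:]]*", "NA", body)

# Parse the cleaned lines; "NA" is the default na.strings value.
df <- read.table(text = c(header, cleaned), header = TRUE)
```

One limitation of this pattern is that it would also blank out values written in scientific notation such as 1e5, so the character class would need adjusting for data that uses that format.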
Hi:
Here's an alternative using ldply() from the plyr package. The idea is to
read the data frames into a list, name them accordingly and then call
ldply().
# Read in the test data frames (you will want to use list.files() instead to
# input the data, per Uwe's guidelines)
df1 <- read.table(textConnection("
var1 var2 var3 var4
1 6 0.3 8
3 4 0.4 9
2 3 0.4 6
1 0.4 0.9 3"), header = TRUE)
df2 <- read.table(textConnection("
var1 var2 var3 var4
1 16 0.6 7
3 14 0.4 6
2 13 0.4 5
1 0.6 0.9 2"), header = TRUE)
closeAllConnections()

# generate the list
dl <- list(df1, df2)

# Name the list components by number and then call ldply():
names(dl) <- 1:2 # more generally, names(dl) <- seq_along(dl)
library("plyr")
ldply(dl, rbind)
  .id var1 var2 var3 var4
1   1    1  6.0  0.3    8
2   1    3  4.0  0.4    9
3   1    2  3.0  0.4    6
4   1    1  0.4  0.9    3
5   2    1 16.0  0.6    7
6   2    3 14.0  0.4    6
7   2    2 13.0  0.4    5
8   2    1  0.6  0.9    2
You can always change .id to fileno afterwards.
HTH,
Dennis
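Renaming the .id column afterwards needs only base R. In the sketch below, the data frame combined is a hand-built stand-in for the ldply() result (an assumption, so that the snippet runs without plyr), and the conversion to integer is an optional extra in case the identifier comes back as text.

```r
# Hand-built stand-in for the data frame returned by ldply(dl, rbind)
combined <- data.frame(.id  = c("1", "1", "2", "2"),
                       var1 = c(1, 3, 1, 3),
                       var2 = c(6, 4, 16, 14),
                       stringsAsFactors = FALSE)

# Rename the identifier column from .id to Fileno
names(combined)[names(combined) == ".id"] <- "Fileno"

# Optional: store the file number as an integer rather than as text
combined$Fileno <- as.integer(combined$Fileno)
```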