John McKown
2014-Jul-01 12:06 UTC
[R] combining data from multiple read.delim() invocations.
Is there a better way to do the following? I have data in a number of tab-delimited files. I am using read.delim() to read them, in a loop. I am invoking my code on Linux Fedora 20, from the BASH command line, using Rscript. The code I'm using looks like:

    arguments <- commandArgs(trailingOnly=TRUE);
    # initialize the capped_data data.frame
    capped_data <- data.frame(lpar="NULL",
                              started=Sys.time(),
                              ended=Sys.time(),
                              stringsAsFactors=FALSE);
    # and empty it.
    capped_data <- capped_data[-1,];
    #
    # Read in the data from the files listed
    for (file in arguments) {
        data <- read.delim(file,
                           header=FALSE,
                           col.names=c("lpar","started","ended"),
                           as.is=TRUE,
                           na.strings='\\N',
                           colClasses=c("character","POSIXct","POSIXct"));
        capped_data <- rbind(capped_data,data)
    }
    #

I.e. is there an easier way than doing a read.delim/rbind in a loop?

--
There is nothing more pleasant than traveling and meeting new people!
Genghis Khan

Maranatha! <><
John McKown
David L Carlson
2014-Jul-01 16:31 UTC
[R] combining data from multiple read.delim() invocations.
There is a better way. First we need some data. This creates three files in your current working directory, each with five rows:

    write.table(data.frame(rep("A", 5), Sys.time(), Sys.time()),
                "A.tab", sep="\t", row.names=FALSE, col.names=FALSE)
    write.table(data.frame(rep("B", 5), Sys.time(), Sys.time()),
                "B.tab", sep="\t", row.names=FALSE, col.names=FALSE)
    write.table(data.frame(rep("C", 5), Sys.time(), Sys.time()),
                "C.tab", sep="\t", row.names=FALSE, col.names=FALSE)

Now to read and combine them into a single data.frame:

    fls <- c("A.tab", "B.tab", "C.tab")
    df.list <- lapply(fls, read.delim, header=FALSE,
                      col.names=c("lpar","started","ended"),
                      as.is=TRUE, na.strings='\\N',
                      colClasses=c("character","POSIXct","POSIXct"))
    df.all <- do.call(rbind, df.list)

    > str(df.all)
    'data.frame':   15 obs. of  3 variables:
     $ lpar   : chr  "A" "A" "A" "A" ...
     $ started: POSIXct, format: "2014-07-01 11:25:05" "2014-07-01 11:25:05" ...
     $ ended  : POSIXct, format: "2014-07-01 11:25:05" "2014-07-01 11:25:05" ...

-------------------------------------
David L Carlson
Department of Anthropology
Texas A&M University
College Station, TX 77840-4352
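To tie the reply back to the original Rscript setup, here is a minimal sketch that wraps the lapply()/do.call(rbind, ...) approach in a function. The helper name read_capped, the temporary files, and the sample rows are assumptions made up for illustration; they do not appear in the thread. The read.delim() arguments are the ones from the original post.

```r
# Hypothetical helper, not from the thread: read every file in `files`
# with the same read.delim() settings and stack the rows.
read_capped <- function(files) {
  df.list <- lapply(files, read.delim, header = FALSE,
                    col.names = c("lpar", "started", "ended"),
                    as.is = TRUE, na.strings = "\\N",
                    colClasses = c("character", "POSIXct", "POSIXct"))
  # do.call() hands the whole list to a single rbind() call.
  do.call(rbind, df.list)
}

# Made-up sample data so the sketch runs on its own.
# One "ended" field holds \N, which na.strings turns into NA.
f1 <- tempfile(fileext = ".tab")
f2 <- tempfile(fileext = ".tab")
writeLines(c("A\t2014-07-01 11:25:05\t2014-07-01 11:30:00",
             "A\t2014-07-01 12:00:00\t\\N"), f1)
writeLines(c("B\t2014-07-01 13:00:00\t2014-07-01 13:05:00",
             "B\t2014-07-01 14:00:00\t2014-07-01 14:10:00"), f2)

combined <- read_capped(c(f1, f2))
str(combined)
```

In a script run with Rscript, the call would presumably become read_capped(commandArgs(trailingOnly=TRUE)), with a check that at least one file name was supplied. Because the result is built from the list in one step, the empty placeholder data.frame from the original loop is no longer needed.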