Data processing?

I have a large number of csv files from animal tracks that look like this:

Date_       Time_     Speed  Course  Type_  Distance
30/03/2012  11:15:05  108    121     -2     0
30/03/2012  11:15:06  0      79      0      0
30/03/2012  11:15:07  0      76      0      1
30/03/2012  11:15:08  0      86      0      2
30/03/2012  11:15:09  0      77      0      3

Each file has a name like this:
"G7_pig328_unit328_Site141_30MAR2012_RNo4_SitNo1.csv"

To automate the processing I would like to:

1. Add various columns calculated from within the data frame, e.g. cumulative distance traveled (cDistance), by summing the Distance column from [1:n] for each row.
2. Add columns derived from the file name, so that when I merge all the files together I know which observation corresponds to which group, bird, etc. For example, G7 stands for group 7 and pig328 is pigeon 328.

The files would look the same but with these columns (plus others) added:

cDistance  Group  BIRD
0          7      328
0          7      328
1          7      328
3          7      328
6          7      328

I was thinking of a function like this for cDistance (if I can get it to work):

    cdistamce <-funtion(x){
      i = 1
      j=nrow(temp1.df)
      while(i<=j,ifelse(i=1,"Distance[i]",Sum("Distance"))
      i=i+1
    }

But I hit a brick wall, and I have no idea about adding columns from the name. Am I on the right track with the first one, and any ideas? Coz I can't brain today, I have the dumb!

Cheers
Josh

--
View this message in context: r.789695.n4.nabble.com/data-frame-adding-columns-from-data-and-file-title-tp4651099.html
Sent from the R help mailing list archive at Nabble.com.
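For reference, the attempt above fails for a few reasons: `function` is misspelled, `while` takes a condition followed by a body rather than two arguments, `i=1` inside `ifelse` is an assignment rather than the comparison `i == 1`, and quoting "Distance[i]" makes it a string instead of a value. A minimal sketch of what the loop seems to be aiming for is below. The name `cum_distance` is made up for illustration, and base R's `cumsum()` gives the same result in one call.

```r
# Running total of a numeric vector, written as an explicit loop.
# `cum_distance` is a hypothetical name, not something from the original post.
cum_distance <- function(distance) {
  total <- numeric(length(distance))   # pre-allocate the result vector
  running <- 0
  for (i in seq_along(distance)) {
    running <- running + distance[i]   # add the current step to the running sum
    total[i] <- running
  }
  total
}

distance <- c(0, 0, 1, 2, 3)           # Distance column from the sample rows
cum_distance(distance)                 # 0 0 1 3 6, the cDistance values asked for
all.equal(cum_distance(distance), cumsum(distance))
```

The loop is shown only to connect with the original attempt; for real data the vectorised `cumsum()` is shorter and faster.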
Rui Barradas
2012-Nov-28 11:59 UTC
[R] data frame: adding columns from data and file title
Hello,

First of all, the best way of posting data examples is ?dput.
Anyway, try the following.

    dat <- read.table(text="
    Date_ Time_ Speed Course Type_ Distance
    30/03/2012 11:15:05 108 121 -2 0
    30/03/2012 11:15:06 0 79 0 0
    30/03/2012 11:15:07 0 76 0 1
    30/03/2012 11:15:08 0 86 0 2
    30/03/2012 11:15:09 0 77 0 3
    ", header = TRUE, stringsAsFactors = FALSE)
    dat
    str(dat)

    filename <- "G7_pig328_unit328_Site141_30MAR2012_RNo4_SitNo1.csv"

    # running total of the Distance column
    dat$cDistance <- cumsum(dat$Distance)

    # first two "_"-separated pieces of the file name, letters stripped
    x <- unlist(strsplit(filename, "_"))[1:2]
    x <- as.integer(sub("[[:alpha:]]+", "", x))
    dat$Group <- x[1]
    dat$BIRD <- x[2]
    dat

Hope this helps,

Rui Barradas

On 28-11-2012 09:33, jgui001 wrote:
> Data processing?
> [...]

______________________________________________
R-help at r-project.org mailing list
stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
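Since the question mentions a large number of files that are later merged, the steps in this reply could be wrapped in a function and applied to every file in a folder. The following is only a sketch under some assumptions: the files are comma-separated with a header row, the group and bird are always the first two "_"-separated pieces of the name, and the helper name `add_track_columns`, the demo folder, and the second file name are all made up for the demonstration.

```r
# Hypothetical helper: read one track file, add cDistance, Group and BIRD.
add_track_columns <- function(path) {
  dat <- read.csv(path, stringsAsFactors = FALSE)    # assumes comma-separated files
  dat$cDistance <- cumsum(dat$Distance)
  parts <- strsplit(basename(path), "_")[[1]][1:2]   # e.g. "G7", "pig328"
  ids <- as.integer(sub("[[:alpha:]]+", "", parts))  # strip the leading letters
  dat$Group <- ids[1]
  dat$BIRD <- ids[2]
  dat
}

# Self-contained demo: write two small files into a temporary folder.
demo_dir <- file.path(tempdir(), "tracks_demo")
dir.create(demo_dir, showWarnings = FALSE)
sample_rows <- data.frame(Speed = c(108, 0, 0),
                          Course = c(121, 79, 76),
                          Distance = c(0, 0, 1))
write.csv(sample_rows, row.names = FALSE,
          file.path(demo_dir, "G7_pig328_unit328_Site141_30MAR2012_RNo4_SitNo1.csv"))
# The second file name is invented for this example only.
write.csv(sample_rows, row.names = FALSE,
          file.path(demo_dir, "G2_pig15_unit15_Site9_01APR2012_RNo1_SitNo1.csv"))

# Process every csv in the folder and stack the results into one data frame.
files <- list.files(demo_dir, pattern = "\\.csv$", full.names = TRUE)
all_tracks <- do.call(rbind, lapply(files, add_track_columns))
all_tracks
```

Because cDistance is computed inside the function before the files are stacked, each track's running total starts again from zero.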
Hi,

Try this:

    dat1 <- read.table(text="
    Date_ Time_ Speed Course Type_ Distance
    30/03/2012 11:15:05 108 121 -2 0
    30/03/2012 11:15:06 0 79 0 0
    30/03/2012 11:15:07 0 76 0 1
    30/03/2012 11:15:08 0 86 0 2
    30/03/2012 11:15:09 0 77 0 3
    ", header = TRUE, stringsAsFactors = FALSE)

    fileN <- "G7_pig328_unit328_Site141_30MAR2012_RNo4_SitNo1.csv"

    dat1$cDistance<-cumsum(dat1$Distance)
    # keep the first two runs of digits in the file name: "7 328"
    dat1$Group<-as.numeric(unlist(strsplit(gsub("^\\D+(\\d+)\\D+(\\d+).*","\\1 \\2",fileN)," ")))[1]
    dat1$BIRD<-as.numeric(unlist(strsplit(gsub("^\\D+(\\d+)\\D+(\\d+).*","\\1 \\2",fileN)," ")))[2]
    dat1

A.K.

----- Original Message -----
From: jgui001 <j.guilbert at auckland.ac.nz>
To: r-help at r-project.org
Sent: Wednesday, November 28, 2012 4:33 AM
Subject: [R] data frame: adding columns from data and file title
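If more fields from the file name are wanted later (the question mentions "plus others"), the capture-group idea in this reply can be extended so that each field is matched by its own label. This is a rough sketch, not the method used above: `parse_track_name` is a hypothetical helper, the names "Unit" and "Site" are guesses based on the `unit328` and `Site141` pieces of the example file name, and it assumes every file follows the same naming pattern.

```r
# Hypothetical helper: pull the numeric fields out of a track file name.
parse_track_name <- function(file_name) {
  pattern <- "^G([0-9]+)_pig([0-9]+)_unit([0-9]+)_Site([0-9]+)_"
  m <- regmatches(file_name, regexec(pattern, file_name))[[1]]
  if (length(m) == 0) {
    stop("file name does not follow the expected pattern: ", file_name)
  }
  ids <- as.integer(m[-1])             # drop the full match, keep the groups
  names(ids) <- c("Group", "BIRD", "Unit", "Site")
  ids
}

fileN <- "G7_pig328_unit328_Site141_30MAR2012_RNo4_SitNo1.csv"
ids <- parse_track_name(fileN)
ids

# The values can then be added as columns, for example:
# dat1$Group <- ids["Group"]; dat1$BIRD <- ids["BIRD"]
```

Matching the labels explicitly means a misnamed file stops with an error instead of quietly filling the columns with the wrong numbers.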