Data processing?
I have a large number of csv files from animal tracks that look like this:
Date_       Time_     Speed  Course  Type_  Distance
30/03/2012  11:15:05  108    121     -2     0
30/03/2012  11:15:06  0      79      0      0
30/03/2012  11:15:07  0      76      0      1
30/03/2012  11:15:08  0      86      0      2
30/03/2012  11:15:09  0      77      0      3
Each file has a name like this:
"G7_pig328_unit328_Site141_30MAR2012_RNo4_SitNo1.csv"
To automate the processing I would like to:
1. Add various columns calculated from within the data frame, e.g.
cumulative distance traveled (cDistance), by summing the Distance column from
[1:n] for each row.
2. Add columns derived from the file name, so that when I merge all the files
together I know which observation corresponds to which group, bird, etc.
For example, G7 stands for group 7 and pig328 is pigeon 328.
The files would look the same, but with these columns (plus others) added:
cDistance  Group  BIRD
0          7      328
0          7      328
1          7      328
3          7      328
6          7      328
I was thinking a function like this for cDistance (if I can get it to work)
cdistamce <-funtion(x){
i = 1
j=nrow(temp1.df)
while(i<=j,ifelse(i=1,"Distance[i]",Sum("Distance"))
i=i+1
}
But I hit a brick wall, and I have no idea about adding columns from the file
name. Am I on the right track with the first one? Any ideas? I can't brain
today, I have the dumb!
Cheers
Josh
--
View this message in context:
http://r.789695.n4.nabble.com/data-frame-adding-columns-from-data-and-file-title-tp4651099.html
Sent from the R help mailing list archive at Nabble.com.
Rui Barradas
2012-Nov-28 11:59 UTC
[R] data frame: adding columns from data and file title
Hello,
First of all, the best way of posting data examples is ?dput. Anyway,
try the following.
dat <- read.table(text="
Date_ Time_ Speed Course Type_ Distance
30/03/2012 11:15:05 108 121 -2 0
30/03/2012 11:15:06 0 79 0 0
30/03/2012 11:15:07 0 76 0 1
30/03/2012 11:15:08 0 86 0 2
30/03/2012 11:15:09 0 77 0 3
", header = TRUE, stringsAsFactors = FALSE)
dat
str(dat)
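filename <- "G7_pig328_unit328_Site141_30MAR2012_RNo4_SitNo1.csv"
# Running total of the Distance column
dat$cDistance <- cumsum(dat$Distance)
# Keep the first two "_"-separated parts of the file name: "G7" and "pig328"
x <- unlist(strsplit(filename, "_"))[1:2]
# Drop the leading letters and convert what is left to integers: 7 and 328
x <- as.integer(sub("[[:alpha:]]+", "", x))
dat$Group <- x[1]
dat$BIRD <- x[2]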
dat
Hope this helps,
Rui Barradas
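
On the question of whether the loop in the original post was on the right track: the idea of walking down the rows and keeping a running total is sound, but the syntax is not. `while` takes a single condition in its parentheses, `i=1` inside `ifelse` passes a named argument rather than testing equality (`==`), and R is case-sensitive, so `Sum` is not the base function `sum`. The following is only a sketch of what a working loop could look like, next to `cumsum()` for comparison. The function name `cdistance_loop` is made up for illustration and does not come from the thread.

```r
# Hypothetical loop-based equivalent of cumsum(), for illustration only.
cdistance_loop <- function(distance) {
  out <- numeric(length(distance))   # pre-allocate the result vector
  total <- 0
  for (i in seq_along(distance)) {
    total <- total + distance[i]     # running sum up to row i
    out[i] <- total
  }
  out
}

distance <- c(0, 0, 1, 2, 3)         # Distance column from the example data
loop_result <- cdistance_loop(distance)
builtin_result <- cumsum(distance)

# Both approaches give the same cumulative distances.
stopifnot(isTRUE(all.equal(loop_result, builtin_result)))
print(builtin_result)                # [1] 0 0 1 3 6
```

In practice `cumsum()` is shorter and vectorised, so the loop is mainly useful for seeing what the built-in does.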
On 28-11-2012 09:33, jgui001 wrote:
> Data processing?
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
Hi,
Try this:
dat1 <- read.table(text="
Date_ Time_ Speed Course Type_ Distance
30/03/2012 11:15:05 108 121 -2 0
30/03/2012 11:15:06 0 79 0 0
30/03/2012 11:15:07 0 76 0 1
30/03/2012 11:15:08 0 86 0 2
30/03/2012 11:15:09 0 77 0 3
", header = TRUE, stringsAsFactors = FALSE)
fileN <- "G7_pig328_unit328_Site141_30MAR2012_RNo4_SitNo1.csv"
dat1$cDistance <- cumsum(dat1$Distance)
# Capture the first two runs of digits in the file name as "7 328", then split
ids <- as.numeric(unlist(strsplit(gsub("^\\D+(\\d+)\\D+(\\d+).*", "\\1 \\2", fileN), " ")))
dat1$Group <- ids[1]
dat1$BIRD <- ids[2]
dat1
A.K.
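
Since the original question mentions a large number of files that will later be merged, here is a rough sketch of how the same steps might be wrapped in a function and applied to every file in a folder. The helper names `parse_track_name` and `process_track_file`, the demo folder `tracks_demo`, and the assumption that the files are comma-separated with a header row containing a `Distance` column are all illustrative and not taken from the thread. The file-name pattern assumes names that start with `G<group>_pig<bird>_`, as in the example.

```r
# Hypothetical helper: pull the group and bird numbers out of a file name
# such as "G7_pig328_unit328_Site141_30MAR2012_RNo4_SitNo1.csv".
parse_track_name <- function(path) {
  name <- basename(path)
  m <- regmatches(name, regexec("^G([0-9]+)_pig([0-9]+)_", name))[[1]]
  if (length(m) != 3) stop("unexpected file name: ", path)
  list(Group = as.integer(m[2]), BIRD = as.integer(m[3]))
}

# Hypothetical helper: read one file and add the derived columns.
# Assumes a comma-separated file with a header row and a Distance column.
process_track_file <- function(path) {
  dat <- read.csv(path, stringsAsFactors = FALSE)
  ids <- parse_track_name(path)
  dat$cDistance <- cumsum(dat$Distance)
  dat$Group <- ids$Group
  dat$BIRD <- ids$BIRD
  dat
}

# Demo on a temporary folder holding one small example file.
track_dir <- file.path(tempdir(), "tracks_demo")
dir.create(track_dir, showWarnings = FALSE)
example <- data.frame(Speed = c(108, 0, 0, 0, 0),
                      Distance = c(0, 0, 1, 2, 3))
write.csv(example,
          file.path(track_dir,
                    "G7_pig328_unit328_Site141_30MAR2012_RNo4_SitNo1.csv"),
          row.names = FALSE)

# Process every CSV in the folder and stack the results into one data frame.
files <- list.files(track_dir, pattern = "\\.csv$", full.names = TRUE)
all_tracks <- do.call(rbind, lapply(files, process_track_file))
print(all_tracks)
```

Stopping on an unexpected file name is a deliberate choice here, so that a misnamed file does not silently end up with `NA` in the Group or BIRD column.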
----- Original Message -----
From: jgui001 <j.guilbert at auckland.ac.nz>
To: r-help at r-project.org
Cc:
Sent: Wednesday, November 28, 2012 4:33 AM
Subject: [R] data frame: adding columns from data and file title
Data processing?