Hi Santiago,
Please keep the conversation on the list. Others may have better ideas.
I am still struggling with the reasoning.
Merge seems to me to be the solution, but I am lost in your reasoning about what to keep and
what to discard from the resulting object.
After the merge I have this:
result <- structure(list(Ring = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L), .Label = c("6106933", "6134701", "6140497",
"6140719", "6140756",
"6140855", "6143070", "6143090",
"6143093", "6175711", "6175726",
"6175730", "6175769", "6175776",
"6175784", "6188609", "6188705",
"6195159", "6195171", "6198153",
"6198154", "6198156", "6198157",
"6198172"), class = "factor"), jul = c(15135, 15135, 15135,
15135,
15135, 15135, 15135, 15135, 15135, 15135, 15135, 15135, 15135,
15135, 15135, 15135, 15135, 15135, 15135, 15135, 15135, 15135,
15135, 15135), timepos = structure(c(1307680575, 1307680740,
1307681040, 1307681340, 1307681640, 1307681940, 1307682240, 1307682540,
1307682780, 1307683080, 1307683380, 1307683680, 1307683980, 1307684280,
1307684397, 1307684424, 1307684484, 1307684490, 1307684580, 1307684880,
1307685180, 1307685243, 1307685321, 1307685336), class = c("POSIXct",
"POSIXt"), tzone = "GMT"), act = c(3822L, NA, NA, NA, NA,
NA,
NA, NA, NA, NA, NA, NA, NA, NA, 27L, 60L, 6L, 753L, NA, NA, NA,
78L, 15L, 18L), wd = c("dry", NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, "wet", "dry", "wet",
"dry", NA, NA, NA, "wet",
"dry", "wet")), .Names = c("Ring",
"jul", "timepos", "act", "wd"
), row.names = c(NA, -24L), class = "data.frame")
> result
Ring jul timepos act wd
1 6106933 15135 2011-06-10 04:36:15 3822 dry
2 6106933 15135 2011-06-10 04:39:00 NA <NA>
3 6106933 15135 2011-06-10 04:44:00 NA <NA>
4 6106933 15135 2011-06-10 04:49:00 NA <NA>
5 6106933 15135 2011-06-10 04:54:00 NA <NA>
6 6106933 15135 2011-06-10 04:59:00 NA <NA>
7 6106933 15135 2011-06-10 05:04:00 NA <NA>
8 6106933 15135 2011-06-10 05:09:00 NA <NA>
9 6106933 15135 2011-06-10 05:13:00 NA <NA>
10 6106933 15135 2011-06-10 05:18:00 NA <NA>
11 6106933 15135 2011-06-10 05:23:00 NA <NA>
12 6106933 15135 2011-06-10 05:28:00 NA <NA>
13 6106933 15135 2011-06-10 05:33:00 NA <NA>
14 6106933 15135 2011-06-10 05:38:00 NA <NA>
15 6106933 15135 2011-06-10 05:39:57 27 wet
16 6106933 15135 2011-06-10 05:40:24 60 dry
17 6106933 15135 2011-06-10 05:41:24 6 wet
18 6106933 15135 2011-06-10 05:41:30 753 dry
19 6106933 15135 2011-06-10 05:43:00 NA <NA>
20 6106933 15135 2011-06-10 05:48:00 NA <NA>
21 6106933 15135 2011-06-10 05:53:00 NA <NA>
22 6106933 15135 2011-06-10 05:54:03 78 wet
23 6106933 15135 2011-06-10 05:55:21 15 dry
24 6106933 15135 2011-06-10 05:55:36 18 wet
I understand you want to keep only the time values from the GPS data.frame. OK, this can
be done in the last step. But I am a bit lost in the logic for discarding lines
15-18. Anyway, this may be what you want:
library(zoo)
result$wd<-na.locf(result$wd)
final<-result[is.na(result$act),]
> final
Ring jul timepos act wd
2 6106933 15135 2011-06-10 04:39:00 NA dry
3 6106933 15135 2011-06-10 04:44:00 NA dry
4 6106933 15135 2011-06-10 04:49:00 NA dry
5 6106933 15135 2011-06-10 04:54:00 NA dry
6 6106933 15135 2011-06-10 04:59:00 NA dry
7 6106933 15135 2011-06-10 05:04:00 NA dry
8 6106933 15135 2011-06-10 05:09:00 NA dry
9 6106933 15135 2011-06-10 05:13:00 NA dry
10 6106933 15135 2011-06-10 05:18:00 NA dry
11 6106933 15135 2011-06-10 05:23:00 NA dry
12 6106933 15135 2011-06-10 05:28:00 NA dry
13 6106933 15135 2011-06-10 05:33:00 NA dry
14 6106933 15135 2011-06-10 05:38:00 NA dry
19 6106933 15135 2011-06-10 05:43:00 NA dry
20 6106933 15135 2011-06-10 05:48:00 NA dry
21 6106933 15135 2011-06-10 05:53:00 NA dry
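One caveat with the fill step when there are several rings: the merged data are sorted by Ring and then by time, so a plain fill can carry the last value of one ring into the first rows of the next. Below is a minimal sketch of a per-Ring fill in base R. The helper name `locf` and the toy values are my assumptions, not your data.

```r
# Base-R "last observation carried forward"; leading NAs stay NA.
# 'locf' is a hypothetical helper name, not a function from zoo.
locf <- function(x) {
  idx <- cummax(ifelse(is.na(x), 0L, seq_along(x)))  # index of last non-NA so far
  idx[idx == 0L] <- NA                               # nothing observed yet
  x[idx]
}

# Toy merged data (made-up values): two rings, already sorted by Ring and time.
result <- data.frame(
  Ring = c("A", "A", "A", "B", "B", "B"),
  act  = c(10L, NA, NA, NA, 5L, NA),
  wd   = c("wet", NA, NA, NA, "dry", NA),
  stringsAsFactors = FALSE
)

# Fill within each Ring so values never leak from one ring to the next.
result$wd <- ave(result$wd, result$Ring, FUN = locf)

# Keep only the rows that came from the GPS data (they have no 'act' value).
final <- result[is.na(result$act), ]
final
```

The first row of ring "B" stays NA because nothing was observed for that ring before it. If you stay with zoo, note that na.locf drops leading NAs by default (na.rm = TRUE), so the result can be shorter than the column; na.rm = FALSE keeps the length.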
Regards
Petr
From: Santiago Guallar [mailto:sguallar@yahoo.com]
Sent: Tuesday, July 09, 2013 10:02 PM
To: PIKAL Petr
Subject: Re: [R] spped up a function
Dear Petr,
I wanted the two data sets merged in such a way that the values of the
'wd' vector (from the intervals t of 'xact') are assigned to the
corresponding intervals of 'GPS'. If there is more than one value (i.e.
if there is more than one interval of 'xact' for the corresponding
interval of 'GPS'), then take the maximum (i.e. the value of the
interval of 'xact' closest to the corresponding interval of
'GPS'). This is why the output of the particular sequence of the result
I copied in the previous message contains only 'dry'.
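As a sketch of this rule (for each 'GPS' time, take 'wd' from the last 'xact' record at or before it), base R's findInterval can do the lookup without looping over rows. The helper name assign_wd is hypothetical, and the example uses only a few of the values posted in this thread:

```r
# Hypothetical helper: for every GPS fix, pick 'wd' from the last xact record
# of the same Ring whose timepos is <= the GPS timepos (NA if there is none).
assign_wd <- function(xact, GPS) {
  out <- rep(NA_character_, nrow(GPS))
  for (r in unique(as.character(GPS$Ring))) {       # loop over rings, not rows
    g <- which(as.character(GPS$Ring) == r)
    x <- xact[as.character(xact$Ring) == r, ]
    x <- x[order(x$timepos), ]                       # findInterval needs sorted breaks
    idx <- findInterval(as.numeric(GPS$timepos[g]), as.numeric(x$timepos))
    idx[idx == 0L] <- NA                             # GPS fix precedes all xact records
    out[g] <- as.character(x$wd)[idx]
  }
  out
}

xact <- data.frame(
  Ring    = "6106933",
  timepos = as.POSIXct(c("2011-06-10 04:36:15", "2011-06-10 05:39:57",
                         "2011-06-10 05:40:24", "2011-06-10 05:41:24",
                         "2011-06-10 05:41:30", "2011-06-10 05:54:03"),
                       tz = "GMT"),
  wd      = c("dry", "wet", "dry", "wet", "dry", "wet"),
  stringsAsFactors = FALSE
)
GPS <- data.frame(
  Ring    = "6106933",
  timepos = as.POSIXct(c("2011-06-10 04:39:00", "2011-06-10 05:38:00",
                         "2011-06-10 05:43:00", "2011-06-10 05:53:00"),
                       tz = "GMT"),
  stringsAsFactors = FALSE
)

GPS$wd <- assign_wd(xact, GPS)
GPS
```

All four fixes come out as "dry", because the short 'wet' records between 05:39:57 and 05:41:30 end before the next GPS time.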
Santi
From: PIKAL Petr <petr.pikal@precheza.cz>
To: Santiago Guallar <sguallar@yahoo.com>; r-help <r-help@r-project.org>
Sent: Tuesday, July 9, 2013 11:19 AM
Subject: RE: [R] spped up a function
Hi Santiago
I am a bit confused about how your result is organised and why there are only "dry" values
regardless of the timepos values.
It is not necessary to attach files resulting from dput. Just copy the output into your
mail and anybody can paste it directly into R.
Ring is a factor in xact but numeric in GPS:
> str(xact)
'data.frame': 8 obs. of 5 variables:
 $ Ring   : Factor w/ 24 levels "6106933","6134701",..: 1 1 1 1 1 1 1 1
 $ jul    : num 15135 15135 15135 15135 15135 ...
 $ timepos: POSIXct, format: "2011-06-10 04:36:15" "2011-06-10 05:39:57" ...
 $ act    : int 3822 27 60 6 753 78 15 18
 $ wd     : chr "dry" "wet" "dry" "wet" ...
> str(GPS)
'data.frame': 16 obs. of 3 variables:
 $ Ring   : int 6106933 6106933 6106933 6106933 6106933 6106933 6106933 6106933 6106933 6106933 ...
 $ jul    : num 15135 15135 15135 15135 15135 ...
 $ timepos: POSIXct, format: "2011-06-10 04:39:00" "2011-06-10 04:44:00" ...
So I first changed it to a factor in both.
GPS$Ring<-factor(GPS$Ring)
After that I merged both data frames
result<-merge(xact, GPS, all=TRUE)
and here is the result:
dput(result)
structure(list(Ring = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L), .Label = c("6106933", "6134701", "6140497",
"6140719", "6140756",
"6140855", "6143070", "6143090",
"6143093", "6175711", "6175726",
"6175730", "6175769", "6175776",
"6175784", "6188609", "6188705",
"6195159", "6195171", "6198153",
"6198154", "6198156", "6198157",
"6198172"), class = "factor"), jul = c(15135, 15135, 15135,
15135,
15135, 15135, 15135, 15135, 15135, 15135, 15135, 15135, 15135,
15135, 15135, 15135, 15135, 15135, 15135, 15135, 15135, 15135,
15135, 15135), timepos = structure(c(1307680575, 1307680740,
1307681040, 1307681340, 1307681640, 1307681940, 1307682240, 1307682540,
1307682780, 1307683080, 1307683380, 1307683680, 1307683980, 1307684280,
1307684397, 1307684424, 1307684484, 1307684490, 1307684580, 1307684880,
1307685180, 1307685243, 1307685321, 1307685336), class = c("POSIXct",
"POSIXt"), tzone = "GMT"), act = c(3822L, NA, NA, NA, NA,
NA,
NA, NA, NA, NA, NA, NA, NA, NA, 27L, 60L, 6L, 753L, NA, NA, NA,
78L, 15L, 18L), wd = c("dry", NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, "wet", "dry", "wet",
"dry", NA, NA, NA, "wet",
"dry", "wet")), .Names = c("Ring",
"jul", "timepos", "act", "wd"
), row.names = c(NA, -24L), class = "data.frame")
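The two steps above (giving Ring the same type in both data frames, then an outer merge) can be tried on a toy example. This is only a sketch: the values are made up, timepos is a plain number here instead of POSIXct, and I convert Ring to character rather than factor, which is an alternative to what I did above.

```r
# Pitfall when aligning key types: as.integer() on a factor returns the
# level codes, not the ring numbers.
f <- factor(c("6106933", "6134701"))
codes <- as.integer(f)                  # 1 2
rings <- as.integer(as.character(f))    # 6106933 6134701

# Toy data (assumed values).
xact <- data.frame(Ring = factor(c("6106933", "6106933")),
                   timepos = c(1, 5),
                   wd = c("dry", "wet"),
                   stringsAsFactors = FALSE)
GPS <- data.frame(Ring = c(6106933L, 6106933L, 6106933L),
                  timepos = c(2, 4, 6))

# Give the key the same type on both sides before merging.
xact$Ring <- as.character(xact$Ring)
GPS$Ring  <- as.character(GPS$Ring)

# Outer merge on the common columns (Ring, timepos): rows from both inputs
# are kept and interleaved by time; GPS rows get NA in 'wd'.
result <- merge(xact, GPS, all = TRUE)
result
```

The rows that come only from GPS are the ones with NA in wd, which is why a fill step is needed afterwards.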
There are empty values in the act and wd columns. You can fill them, e.g. with the "na.locf"
function from the "zoo" package.
> result$wd
 [1] "dry" NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA
[13] NA    NA    "wet" "dry" "wet" "dry" NA    NA    NA    "wet" "dry" "wet"
> na.locf(result$wd)
 [1] "dry" "dry" "dry" "dry" "dry" "dry" "dry" "dry" "dry" "dry" "dry" "dry"
[13] "dry" "dry" "wet" "dry" "wet" "dry" "dry" "dry" "dry" "wet" "dry" "wet"
Is this what you want?
Regards
Petr
From: Santiago Guallar [mailto:sguallar@yahoo.com]
Sent: Tuesday, July 09, 2013 8:53 AM
To: PIKAL Petr; r-help
Subject: Re: [R] spped up a function
Hi Petr, yes, the function basically consists of merging two time series with
different time intervals: one regular ('GPS') and one irregular
('xact'), the latter containing the binomial variable 'wd' that I
want to add to 'GPS'.
Apparently my attachments did not go through. Here are the dputs you
requested plus the desired result based on them:
head(xact)
Ring jul timepos act wd
6106933 15135 2011-06-10 04:36:15 3822 dry
6106933 15135 2011-06-10 05:39:57 27 wet
6106933 15135 2011-06-10 05:40:24 60 dry
6106933 15135 2011-06-10 05:41:24 6 wet
6106933 15135 2011-06-10 05:41:30 753 dry
6106933 15135 2011-06-10 05:54:03 78 wet
6106933 15135 2011-06-10 05:55:21 15 dry
6106933 15135 2011-06-10 05:55:36 18 wet
head(GPS1, 16) and desired result (added column wd)
Ring jul timepos wd
5 6106933 15135 2011-06-10 04:39:00 dry
6 6106933 15135 2011-06-10 04:44:00 dry
7 6106933 15135 2011-06-10 04:49:00 dry
8 6106933 15135 2011-06-10 04:54:00 dry
9 6106933 15135 2011-06-10 04:59:00 dry
10 6106933 15135 2011-06-10 05:04:00 dry
11 6106933 15135 2011-06-10 05:09:00 dry
12 6106933 15135 2011-06-10 05:13:00 dry
13 6106933 15135 2011-06-10 05:18:00 dry
14 6106933 15135 2011-06-10 05:23:00 dry
15 6106933 15135 2011-06-10 05:28:00 dry
16 6106933 15135 2011-06-10 05:33:00 dry
17 6106933 15135 2011-06-10 05:38:00 dry
18 6106933 15135 2011-06-10 05:43:00 dry
19 6106933 15135 2011-06-10 05:48:00 dry
20 6106933 15135 2011-06-10 05:53:00 dry
Santi
________________________________
From: PIKAL Petr <petr.pikal@precheza.cz>
To: Santiago Guallar <sguallar@yahoo.com>; r-help <r-help@r-project.org>
Sent: Monday, July 8, 2013 11:34 AM
Subject: RE: [R] spped up a function
Hi
It seems to me that you basically want merge, but I may be missing the point. Try
posting
dput(head(xact))
dput(head(GPS))
and what the desired result should be based on those 2 datasets.
Regards
Petr
> -----Original Message-----
> From: r-help-bounces@r-project.org [mailto:r-help-bounces@r-project.org]
> On Behalf Of Santiago Guallar
> Sent: Tuesday, July 02, 2013 7:47 PM
> To: r-help
> Subject: [R] spped up a function
>
> Hi,
>
> I have written a function to assign the values of a certain variable
> 'wd' from a dataset to another dataset. Both contain data from the
> same time period but differ in the length of their time intervals:
> 'GPS' has regular 10-minute intervals whereas 'xact' has irregular
> intervals. I attached simplified text versions from write.table. You
> can also get a dput of 'xact' at this address:
> http://www.megafileupload.com/en/file/431569/xact-dput.html
> The original objects are large and the function takes almost one hour
> to finish.
> Here's the function:
>
> fxG <- function(xact, GPS) {
>   l <- rep('A', nrow(GPS))
>   # the process is carried out for several individuals identified by 'Ring'
>   v <- unique(GPS$Ring)
>   for (k in 1:length(v)) {
>     I <- v[k]
>     df <- xact[xact$Ring == I, ]
>     for (i in 1:nrow(GPS)) {
>       # the code runs along the whole data.frame for each i; it'd save
>       # time to make it stop with the last record of each i instead
>       if (GPS[i, ]$Ring == I) {
>         u <- df$timepos <= GPS[i, ]$timepos
>         # fill vector l for each interval t from xact <= each interval
>         # from GPS (take the max if there's > 1 interval)
>         l[i] <- df[max(which(u == TRUE)), ]$wd
>       }
>     }
>   }
>   return(l)
> }
>
> vwd <- fxG(xact, GPS)
>
>
> My question is: how can I speed up (optimize) this function?
>
> Thank you for your help