Jared,
I am not sure how you converted your 'time' variable from a factor to
numeric, but you probably want to convert it to one of R's date-time
classes instead. To learn more about them, see ?DateTimeClasses.

A nice feature of these classes is that they hold year, month, day,
and time in a single column, so you only need to sort by two columns
(ID and time). See also ?strptime for details on converting character
strings into date-time objects. An example using your data follows my
signature.
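As an aside on the numeric conversion: calling as.numeric() on a factor returns the integer level codes rather than the clock times, which may be where values such as 1132 in your 'test' column come from (I am guessing, since I have not seen your full data). Here is a minimal sketch with three made-up values. The names 'tm' and 'parsed' are only illustrative, and the fixed date and the UTC time zone are assumptions.

```r
# Hypothetical toy data: clock times stored as a factor, like your 'time' column
tm <- factor(c("3:00:50", "21:01:13", "0:01:35"))

levels(tm)        # the distinct labels, sorted as character strings
as.numeric(tm)    # integer level codes (positions in levels(tm)), not times
as.character(tm)  # the original strings, which can be parsed as times

# Parse the strings on an arbitrary fixed date (assumed here) in UTC (assumed)
parsed <- as.POSIXct(paste("2010-07-01", as.character(tm)),
                     format = "%Y-%m-%d %H:%M:%S", tz = "UTC")

order(parsed)     # row indices in chronological order
```

Sorting on the factor or on its codes follows the level order, while sorting on 'parsed' follows the clock.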
Best regards,
Josh
samp.dat <- structure(list(ID = c(2836L, 2836L, 2836L, 2836L, 2836L, 2836L,
2836L, 2836L, 2836L, 2836L, 2836L, 2836L, 2836L, 2836L, 2836L,
2836L), year = c(2010L, 2010L, 2010L, 2010L, 2010L, 2010L, 2010L,
2010L, 2010L, 2010L, 2010L, 2010L, 2010L, 2010L, 2010L, 2010L
), month = c(7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L,
7L, 7L, 7L, 7L), day = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L), time = structure(c(12L, 13L, 14L,
15L, 16L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 1L), .Label =
c("0:01:35",
"10:00:15", "11:00:44", "12:00:17",
"13:00:38", "14:00:25", "15:00:53",
"16:00:11", "17:00:23", "18:00:47",
"21:01:13", "3:00:50", "6:00:20",
"7:00:42", "8:00:42", "9:00:12"), class =
"factor"), Lat = c(-1.2402597,
-1.2397508, -1.2431248, -1.2396636, -1.2304111, -1.2255532, -1.2248113,
-1.2251362, -1.2246384, -1.2245949, -1.2269631, -1.2264911, -1.2251153,
-1.2315372, -1.2578944, -1.242075), Long = c(35.5405911, 35.5406318,
35.5388285, 35.5285848, 35.5139149, 35.5162895, 35.5147305, 35.491731,
35.4918846, 35.4918647, 35.4880909, 35.4837137, 35.4817967, 35.4806165,
35.4670629, 35.5449559), test = c(77L, 120L, 214L, 300L, 345L,
436L, 528L, 585L, 665L, 727L, 813L, 846L, 928L, 1027L, 1093L,
1132L)), .Names = c("ID", "year", "month",
"day", "time", "Lat",
"Long", "test"), class = "data.frame", row.names =
c(NA, -16L
))
str(samp.dat)
# first combine all the date and time columns using paste(),
# then convert to POSIXlt with strptime()
samp.dat$time2 <- strptime(x = paste(samp.dat$year, "-",
                                     samp.dat$month, "-",
                                     samp.dat$day, " ",
                                     samp.dat$time,
                                     sep = ""),
                           format = "%Y-%m-%d %H:%M:%S")
str(samp.dat) #note how 'time2' is actually a time class now
#ordering becomes easier
temp.or <- order(samp.dat$ID, samp.dat$time2, decreasing=FALSE)
samp.dat <- samp.dat[temp.or, ]
samp.dat #print to screen
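If you later want the elapsed time between consecutive locations, it may be more convenient to store the result as POSIXct (one number per row) with an explicit time zone. The following is only a sketch on made-up rows, not something I have run on your full data. The data frame 'dat', the column 'step.hrs', and tz = "UTC" are all assumptions on my part.

```r
# Hypothetical toy rows standing in for a few records of one animal
dat <- data.frame(ID    = c(2836L, 2836L, 2836L),
                  year  = 2010L, month = 7L, day = 1L,
                  time  = c("21:01:13", "0:01:35", "3:00:50"),
                  stringsAsFactors = TRUE)  # 'time' is a factor, as in your data

# Build one date-time string per row and parse it as POSIXct.
# tz = "UTC" is an assumption; use the zone your data were recorded in.
stamp <- paste(dat$year, "-", dat$month, "-", dat$day, " ", dat$time, sep = "")
dat$time2 <- as.POSIXct(stamp, format = "%Y-%m-%d %H:%M:%S", tz = "UTC")

# Sort by animal, then chronologically
dat <- dat[order(dat$ID, dat$time2), ]

# Elapsed hours since the previous row (NA for the first row)
dat$step.hrs <- c(NA, as.numeric(diff(dat$time2), units = "hours"))
dat
```

With more than one ID in the data, the differences would need to be taken within each ID rather than down the whole column.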
On Thu, Jul 8, 2010 at 4:28 PM, Jared Stabach
<jstabach at rams.colostate.edu> wrote:
> I have a dataframe of animal locations that I need to have in incremental
> order so that I can calculate the distance traveled between each time step.
> However, I have identified a few values that don't seem to sort properly.
> For instance, the last value in the table below should be the first value
> after sorting, since its time value is '00:01:35'. But, for some reason, it
> seems to be recognized after the '21:01:13' value. I also defined the time
> column as a numeric value (originally a factor) with the result shown in the
> 'test' column. As the value is reported as '1132', it seems there is an
> issue with the time value listed.
>
> ID    year  month  day  time      Lat         Long        test
> 2836  2010  7      1    03:00:50  -1.2402597  35.5405911  77
> 2836  2010  7      1    06:00:20  -1.2397508  35.5406318  120
> 2836  2010  7      1    07:00:42  -1.2431248  35.5388285  214
> 2836  2010  7      1    08:00:42  -1.2396636  35.5285848  300
> 2836  2010  7      1    09:00:12  -1.2304111  35.5139149  345
> 2836  2010  7      1    10:00:15  -1.2255532  35.5162895  436
> 2836  2010  7      1    11:00:44  -1.2248113  35.5147305  528
> 2836  2010  7      1    12:00:17  -1.2251362  35.4917310  585
> 2836  2010  7      1    13:00:38  -1.2246384  35.4918846  665
> 2836  2010  7      1    14:00:25  -1.2245949  35.4918647  727
> 2836  2010  7      1    15:00:53  -1.2269631  35.4880909  813
> 2836  2010  7      1    16:00:11  -1.2264911  35.4837137  846
> 2836  2010  7      1    17:00:23  -1.2251153  35.4817967  928
> 2836  2010  7      1    18:00:47  -1.2315372  35.4806165  1027
> 2836  2010  7      1    21:01:13  -1.2578944  35.4670629  1093
> 2836  2010  7      1    00:01:35  -1.2420750  35.5449559  1132
>
> The code I used to sort the dataframe is:
>
> # Sort dataset so values are in incremental order
> temp.or <- order(wildebeest$ID, wildebeest$year, wildebeest$month,
>                  wildebeest$day, wildebeest$time, decreasing=FALSE)
> wildebeest <- wildebeest[temp.or,]
>
> Eventually, I will have around 400,000 records, so my script is designed at
> problem solving these errors. Is there something that I am missing or is
> there something in this field that could possibly be hidden? Any
> suggestions?
>
> Thanks in advance for any help.
>
> Jared
--
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/