I am new to using R for data analysis. I have an incomplete time series dataset in daily format, and I want to extract only the Friday data from it. However, there are two problems.

First, if the Friday data is missing in a given week, I need to extract the data from the day before that Friday (e.g. Thursday).

Second, there are sometimes duplicate Friday records (say, Friday morning and Friday afternoon), and I only need the latest one (Friday afternoon).

My question is how to extract only the Friday data into a new dataset, so that I have one record for every single week for the convenience of data analysis.

Your help and time will be appreciated. Thanks.

Kevin

Below is what my dataset looks like:

   views  number  timestamp day            time
1  views  910401 1246192687 Sun 6/28/2009 12:38
2  views  921537 1246278917 Mon 6/29/2009 12:35
3  views  934280 1246365403 Tue 6/30/2009 12:36
4  views  986463 1246888699 Mon  7/6/2009 13:58
5  views  995002 1246970243 Tue  7/7/2009 12:37
6  views 1005211 1247079398 Wed  7/8/2009 18:56
7  views 1011144 1247135553 Thu  7/9/2009 10:32
8  views 1026765 1247308591 Sat 7/11/2009 10:36
9  views 1036856 1247436951 Sun 7/12/2009 22:15
10 views 1040909 1247481564 Mon 7/13/2009 10:39
11 views 1057337 1247568387 Tue 7/14/2009 10:46
12 views 1066999 1247665787 Wed 7/15/2009 13:49
13 views 1077726 1247778752 Thu 7/16/2009 21:12
14 views 1083059 1247845413 Fri 7/17/2009 15:43
15 views 1083059 1247845824 Fri 7/17/2009 18:45
16 views 1089529 1247914194 Sat 7/18/2009 10:49

--
View this message in context: http://r.789695.n4.nabble.com/How-to-extract-Friday-data-from-daily-data-tp3029050p3029050.html
Sent from the R help mailing list archive at Nabble.com.
Hey,

This should work, or something like it. After you read in your data, make sure that the day, date and time are in separate columns.

> testdata
   views  number  timestamp day      date  time
1  views  910401 1246192687 Sun 6/28/2009 12:38
2  views  921537 1246278917 Mon 6/29/2009 12:35
3  views  934280 1246365403 Tue 6/30/2009 12:36
4  views  986463 1246888699 Mon  7/6/2009 13:58
5  views  995002 1246970243 Tue  7/7/2009 12:37
6  views 1005211 1247079398 Wed  7/8/2009 18:56
7  views 1011144 1247135553 Thu  7/9/2009 10:32
8  views 1026765 1247308591 Sat 7/11/2009 10:36
9  views 1036856 1247436951 Sun 7/12/2009 22:15
10 views 1040909 1247481564 Mon 7/13/2009 10:39
11 views 1057337 1247568387 Tue 7/14/2009 10:46
12 views 1066999 1247665787 Wed 7/15/2009 13:49
13 views 1077726 1247778752 Thu 7/16/2009 21:12
14 views 1083059 1247845413 Fri 7/17/2009 15:43
15 views 1083059 1247845824 Fri 7/17/2009 18:45
16 views 1089529 1247914194 Sat 7/18/2009 10:49

testdata$date = as.Date(testdata$date, "%m/%d/%Y")

Thudat = subset(testdata, day == "Thu")
Fridat = subset(testdata, day == "Fri")

# the Friday that follows each Thursday in the data
# (a week that has a Friday but no Thursday is not covered by this loop)
Friday_dates = Thudat$date + 1

Friday_info = NULL

for (i in seq_along(Friday_dates)) {
  # select the rows of Fridat that fall on this Friday
  temp = subset(Fridat, date == Friday_dates[i])
  if (nrow(temp) > 0) {
    # this Friday exists in Fridat: keep its last row. The data are already
    # in chronological order, so temp[nrow(temp), ] is the latest measurement
    # and no extra if statement is needed for multiple measurements in a day.
    Friday_info = rbind(Friday_info, temp[nrow(temp), ])
  } else {
    # this Friday doesn't exist in Fridat: use the row from Thudat instead
    Friday_info = rbind(Friday_info, Thudat[i, ])
  }
}

Friday_info
   views  number  timestamp day       date  time
7  views 1011144 1247135553 Thu 2009-07-09 10:32
15 views 1083059 1247845824 Fri 2009-07-17 18:45

Also, for other tasks that involve pulling data out at monthly or weekly intervals, you might want to try some functions from the chron package. Things like seq.dates can give you the dates of a specific day of the week for every week that you want. Something like this, for instance:

library(chron)
as.Date(seq.dates("7/3/2009", "7/24/2009", by = "weeks"), "%m/%d/%Y")

gives the Fridays from July 3 to July 24, 2009.

Hope this helps!

A

--
Adrienne Wootten
Graduate Research Assistant
State Climate Office of North Carolina
Department of Marine, Earth and Atmospheric Sciences
North Carolina State University
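For reference, a weekly sequence of Fridays can also be generated without chron. The sketch below uses only base R's seq() on Date objects. The end date of 31 July is my own choice, so that all five Fridays of July 2009 are covered, and the variable name fridays is illustrative only.

```r
# Weekly sequence starting on Friday 3 July 2009 (base R only, no chron)
fridays <- seq(as.Date("2009-07-03"), as.Date("2009-07-31"), by = "week")

# "%w" gives the weekday as a number (0 = Sunday, ..., 5 = Friday),
# which avoids locale-dependent day names such as "Fri"
stopifnot(all(format(fridays, "%w") == "5"))

print(fridays)
```

Such a vector could serve as the set of target dates when checking which weeks in a dataset have no Friday record.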
On Fri, Nov 5, 2010 at 1:22 PM, thornbird <huachang396 at gmail.com> wrote:
>
> I am new to Using R for data analysis. I have an incomplete time series
> dataset that is in daily format. I want to extract only Friday data from it.
> However, there are two problems with it.
>
> First, if Friday data is missing in that week, I need to extract the data of
> the day prior to that Friday (e.g. Thursday).
>
> Second, sometimes there are duplicate Friday data (say Friday morning and
> afternoon), but I only need the latest one (Friday afternoon).
>
> My question is how I can only extract the Friday data and make it a new
> dataset so that I have data for every single week for the convenience of
> data analysis.
>

There are several approaches depending on exactly what is to be
produced.  We show two of them here using zoo.

# read in data

Lines <- "  views  number  timestamp day            time
1  views  910401 1246192687 Sun 6/28/2009 12:38
2  views  921537 1246278917 Mon 6/29/2009 12:35
3  views  934280 1246365403 Tue 6/30/2009 12:36
4  views  986463 1246888699 Mon  7/6/2009 13:58
5  views  995002 1246970243 Tue  7/7/2009 12:37
6  views 1005211 1247079398 Wed  7/8/2009 18:56
7  views 1011144 1247135553 Thu  7/9/2009 10:32
8  views 1026765 1247308591 Sat 7/11/2009 10:36
9  views 1036856 1247436951 Sun 7/12/2009 22:15
10 views 1040909 1247481564 Mon 7/13/2009 10:39
11 views 1057337 1247568387 Tue 7/14/2009 10:46
12 views 1066999 1247665787 Wed 7/15/2009 13:49
13 views 1077726 1247778752 Thu 7/16/2009 21:12
14 views 1083059 1247845413 Fri 7/17/2009 15:43
15 views 1083059 1247845824 Fri 7/17/2009 18:45
16 views 1089529 1247914194 Sat 7/18/2009 10:49"

library(zoo)

# read in and create a zoo series
# - skip= over the header
# - index=. the time index is the third non-removed column.
# - format=. convert the index to Date class using the indicated format
# - col.names= as specified
# - aggregate= over duplicate dates keeping last
# - colClasses= specifies "NULL" for columns we want to remove

colClasses <-
 c("NULL", "NULL", "numeric", "numeric", "NULL", "character", "NULL")

col.names <- c(NA, NA, "views", "number", NA, NA, NA)

# z <- read.zoo("myfile.dat", skip = 1, index = 3,
z <- read.zoo(textConnection(Lines), skip = 1, index = 3,
        format = "%m/%d/%Y", col.names = col.names,
        aggregate = function(x) tail(x, 1), colClasses = colClasses)

## Now that we have read it in, let's process it

## 1.

# extract all Thursdays and Fridays
z45 <- z[format(time(z), "%w") %in% 4:5,]

# keep last entry in each week
# and show result on R console
z45[!duplicated(format(time(z45), "%U"), fromLast = TRUE), ]


# 2. alternative approach
# above approach labels each point as it was originally labelled
# so if Thursday is used it gets the date of that Thursday
# Another approach is to always label the resulting point as Friday
# and also use the last available value even if it's not Thursday

# create daily grid
g <- seq(start(z), end(z), by = "day")

# fill in daily grid so Friday is filled in with prior value
# if Friday is NA
z.filled <- na.locf(z, xout = g)

# extract Fridays (including those filled in from previous)
# and show result on R console
z.filled[format(time(z.filled), "%w") == "5", ]

--
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com
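For readers who do not have zoo installed, the first approach (keep the last Thursday or Friday record in each week) can be approximated with a plain data frame. This is only a sketch under my own assumptions: the rows are a subset of the sample data with the date and time already split, and the names dat, thu_fri, week_id and weekly are illustrative, not part of the original solution.

```r
# A few sample rows from the thread, with the date kept as its own column
dat <- data.frame(
  number    = c(1005211, 1011144, 1026765, 1077726, 1083059, 1083059, 1089529),
  timestamp = c(1247079398, 1247135553, 1247308591, 1247778752,
                1247845413, 1247845824, 1247914194),
  date      = c("7/8/2009", "7/9/2009", "7/11/2009", "7/16/2009",
                "7/17/2009", "7/17/2009", "7/18/2009"),
  stringsAsFactors = FALSE
)

dat$date <- as.Date(dat$date, format = "%m/%d/%Y")

# Sort chronologically so that "last" means "latest"
dat <- dat[order(dat$timestamp), ]

# Keep Thursdays (4) and Fridays (5) only
thu_fri <- dat[format(dat$date, "%w") %in% c("4", "5"), ]

# One label per week; the year prefix keeps different years apart
week_id <- format(thu_fri$date, "%Y-%U")

# Keep the last record within each week
weekly <- thu_fri[!duplicated(week_id, fromLast = TRUE), ]
print(weekly)
```

On these rows the result keeps the Thursday 9 July record (no Friday that week) and the later of the two Friday 17 July records.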
On Fri, Nov 5, 2010 at 8:24 PM, Gabor Grothendieck <ggrothendieck at gmail.com> wrote:
> [...]
> # keep last entry in each week
> # and show result on R console
> z45[!duplicated(format(time(z45), "%U"), fromLast = TRUE), ]
> [...]

Note that if the data can span more than one year then "%U" above
should be replaced with "%Y-%U" so that weeks in one year are not
lumped with weeks in other years.

--
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com
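The effect of that change can be seen on two dates a year apart. This is a small base R illustration; the two dates are my own picks (both are Fridays) and the variable names are hypothetical.

```r
# Two Fridays one year apart
d <- as.Date(c("2009-07-17", "2010-07-16"))

week_only <- format(d, "%U")      # week of the year, Sunday as first day
year_week <- format(d, "%Y-%U")   # year prefix keeps the years apart

# With "%U" alone the second date looks like a duplicate of the first
print(duplicated(week_only))

# With "%Y-%U" both dates are kept as separate weeks
print(duplicated(year_week))
```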
Thank you very much. I learned a lot through your help. It worked great for the sample data. But when I tried to apply the commands to my own dataset, I ran into two more problems.

First, the dataset is huge: it has thousands of lines. I can read it into R, but using Lines <- " data " may not work for such a huge dataset. Is there a way to use the data name in the commands?

Second, I have two more variables, webpage and item. Using your commands, I can extract the data for one item (say "fans") with no problem. But under "item" there is also "views", which may have the same date and time as "fans". How can I sort it out without replacing records that share the same date? Also, under "webpage" I have two actors/actresses, which raises a similar issue. How can I sort it out without replacing records that share the same date?

Thanks for the time and help.

Below is the sample:

> testdata <- read.csv("C:\\Users\\Kevin\\Desktop\\testdata.csv", header=TRUE)
> testdata
         webpage  item  value day      date     time
1      MattDamon  fans 613031 Wed 10-Jun-09  9:40:53
2      MattDamon  fans 630242 Thu 11-Jun-09  5:27:47
3      MattDamon  fans 631966 Thu 11-Jun-09  9:23:23
4      MattDamon  fans 642045 Thu 11-Jun-09 22:11:33
5      MattDamon  fans 669791 Sat 13-Jun-09 13:07:53
6      MattDamon  fans 700180 Mon 15-Jun-09  5:07:06
7      MattDamon  fans 702949 Mon 15-Jun-09 13:09:43
8      MattDamon  fans 726624 Tue 16-Jun-09 22:45:27
9      MattDamon  fans 734412 Wed 17-Jun-09 13:08:19
10     MattDamon  fans 765057 Fri 19-Jun-09 12:37:09
11     MattDamon  fans 782741 Sat 20-Jun-09 12:38:36
12     MattDamon  fans 796054 Sun 21-Jun-09 12:36:25
13     MattDamon  fans 809816 Mon 22-Jun-09 12:39:36
14     MattDamon  fans 833996 Tue 23-Jun-09 12:40:25
15     MattDamon  fans 871237 Thu 25-Jun-09 12:38:27
16     MattDamon  fans 887175 Fri 26-Jun-09 12:36:12
17     MattDamon  fans 887195 Fri 26-Jun-09 13:36:12
18     MattDamon  fans 899900 Sat 27-Jun-09 12:40:36
19     MattDamon  fans 910401 Sun 28-Jun-09 12:38:07
20     MattDamon  fans 921537 Mon 29-Jun-09 12:35:17
21     MattDamon  fans 934280 Tue 30-Jun-09 12:36:43
22     MattDamon  fans 986463 Mon  6-Jul-09 13:58:19
23     MattDamon views 613031 Wed 10-Jun-09  9:40:53
24     MattDamon views 630242 Thu 11-Jun-09  5:27:47
25     MattDamon views 631966 Thu 11-Jun-09  9:23:23
26     MattDamon views 642045 Thu 11-Jun-09 22:11:33
27     MattDamon views 669791 Sat 13-Jun-09 13:07:53
28     MattDamon views 700180 Mon 15-Jun-09  5:07:06
29     MattDamon views 702949 Mon 15-Jun-09 13:09:43
30     MattDamon views 726624 Tue 16-Jun-09 22:45:27
31     MattDamon views 734412 Wed 17-Jun-09 13:08:19
32     MattDamon views 765057 Fri 19-Jun-09 12:37:09
33     MattDamon views 782741 Sat 20-Jun-09 12:38:36
34     MattDamon views 796054 Sun 21-Jun-09 12:36:25
35     MattDamon views 809816 Mon 22-Jun-09 12:39:36
36     MattDamon views 833996 Tue 23-Jun-09 12:40:25
37     MattDamon views 871237 Thu 25-Jun-09 12:38:27
38     MattDamon views 887175 Fri 26-Jun-09 12:36:12
39     MattDamon views 887195 Fri 26-Jun-09 13:36:12
40     MattDamon views 899900 Sat 27-Jun-09 12:40:36
41     MattDamon views 910401 Sun 28-Jun-09 12:38:07
42     MattDamon views 921537 Mon 29-Jun-09 12:35:17
43     MattDamon views 934280 Tue 30-Jun-09 12:36:43
44     MattDamon views 986463 Mon  6-Jul-09 13:58:19
45 AngelinaJolie  fans 613031 Wed 10-Jun-09  9:40:53
46 AngelinaJolie  fans 630242 Thu 11-Jun-09  5:27:47
47 AngelinaJolie  fans 631966 Thu 11-Jun-09  9:23:23
48 AngelinaJolie  fans 642045 Thu 11-Jun-09 22:11:33
49 AngelinaJolie  fans 669791 Sat 13-Jun-09 13:07:53
50 AngelinaJolie  fans 700180 Mon 15-Jun-09  5:07:06
51 AngelinaJolie  fans 702949 Mon 15-Jun-09 13:09:43
52 AngelinaJolie  fans 726624 Tue 16-Jun-09 22:45:27
53 AngelinaJolie  fans 734412 Wed 17-Jun-09 13:08:19
54 AngelinaJolie  fans 765057 Fri 19-Jun-09 12:37:09
55 AngelinaJolie  fans 782741 Sat 20-Jun-09 12:38:36
56 AngelinaJolie  fans 796054 Sun 21-Jun-09 12:36:25
57 AngelinaJolie  fans 809816 Mon 22-Jun-09 12:39:36
58 AngelinaJolie  fans 833996 Tue 23-Jun-09 12:40:25
59 AngelinaJolie  fans 871237 Thu 25-Jun-09 12:38:27
60 AngelinaJolie  fans 887175 Fri 26-Jun-09 12:36:12
61 AngelinaJolie  fans 887195 Fri 26-Jun-09 13:36:12
62 AngelinaJolie  fans 899900 Sat 27-Jun-09 12:40:36
63 AngelinaJolie  fans 910401 Sun 28-Jun-09 12:38:07
64 AngelinaJolie  fans 921537 Mon 29-Jun-09 12:35:17
65 AngelinaJolie  fans 934280 Tue 30-Jun-09 12:36:43
66 AngelinaJolie  fans 986463 Mon  6-Jul-09 13:58:19
67 AngelinaJolie views 613031 Wed 10-Jun-09  9:40:53
68 AngelinaJolie views 630242 Thu 11-Jun-09  5:27:47
69 AngelinaJolie views 631966 Thu 11-Jun-09  9:23:23
70 AngelinaJolie views 642045 Thu 11-Jun-09 22:11:33
71 AngelinaJolie views 669791 Sat 13-Jun-09 13:07:53
72 AngelinaJolie views 700180 Mon 15-Jun-09  5:07:06
73 AngelinaJolie views 702949 Mon 15-Jun-09 13:09:43
74 AngelinaJolie views 726624 Tue 16-Jun-09 22:45:27
75 AngelinaJolie views 734412 Wed 17-Jun-09 13:08:19
76 AngelinaJolie views 765057 Fri 19-Jun-09 12:37:09
77 AngelinaJolie views 782741 Sat 20-Jun-09 12:38:36
78 AngelinaJolie views 796054 Sun 21-Jun-09 12:36:25
79 AngelinaJolie views 809816 Mon 22-Jun-09 12:39:36
80 AngelinaJolie views 833996 Tue 23-Jun-09 12:40:25
81 AngelinaJolie views 871237 Thu 25-Jun-09 12:38:27
82 AngelinaJolie views 887175 Fri 26-Jun-09 12:36:12
83 AngelinaJolie views 887195 Fri 26-Jun-09 13:36:12
84 AngelinaJolie views 899900 Sat 27-Jun-09 12:40:36
85 AngelinaJolie views 910401 Sun 28-Jun-09 12:38:07
86 AngelinaJolie views 921537 Mon 29-Jun-09 12:35:17
87 AngelinaJolie views 934280 Tue 30-Jun-09 12:36:43
88 AngelinaJolie views 986463 Mon  6-Jul-09 13:58:19
On Sat, Nov 6, 2010 at 11:05 PM, thornbird <huachang396 at gmail.com> wrote:
>
> Thank you very much. I learned a lot through your help. It worked great for
> the sample data. But when I try to apply the command to my dataset, I ran
> into two more problems.
>
> First, the dataset is huge, it has thousands of lines. I can read it in R.
> Using Lines <- " data " may not work such a huge dataset. Is there a way
> to use data name in the commands.

That was just to keep the example self-contained.  The commented-out
line before the read.zoo line shows how it would be done with a file.

>
> Second, I have two more variables, webpage and item. Using your command, I
> can extract data for one item (say "fans") with no problem. But under
> "item", there is also "views", they may have same time and date as "fans",
> how can I sort it out without replacing the same date? Also, under
> "webpage", I have two actors/actresses, it has the similar issue. How can I
> sort it out without replacing the same date?

Please provide a reproducible example illustrating what is to be
produced and include code to do as much of it as you can.

>
> Thanks for the time and help.
>
> [...]

--
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com
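As a rough illustration of working from a file name rather than an inline string, the sketch below uses base R's read.csv instead of read.zoo. It writes three rows to a temporary file only so that it can run on its own; the path variable and the date format string are my assumptions about the attached file, not something confirmed in the thread.

```r
# Write a tiny CSV to a temporary file so the sketch is self-contained;
# with real data, `path` would simply point at the existing file.
path <- tempfile(fileext = ".csv")
writeLines(c(
  "webpage,item,value,day,date,time",
  "MattDamon,fans,765057,Fri,19-Jun-09,12:37:09",
  "MattDamon,fans,887175,Fri,26-Jun-09,12:36:12",
  "MattDamon,views,887195,Fri,26-Jun-09,13:36:12"
), path)

# Read by file name; there is no need to paste the data into the script
testdata <- read.csv(path, header = TRUE, stringsAsFactors = FALSE)

# "%d-%b-%y" matches dates such as 19-Jun-09; "%b" relies on English
# month abbreviations, so this assumes an English locale
testdata$date <- as.Date(testdata$date, format = "%d-%b-%y")

str(testdata)
```

The size of the file does not change the call: thousands of lines are read the same way as three.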
Hi, thanks for the quick reply. I am new to using R and am still trying to figure out how to use the zoo package. Here is the code I have so far:

library(zoo)

colClasses <-
 c("NULL", "character", "character", "numeric", "character", "character", "NULL")

col.names <- c(NA, "webpage", "item", "value", "day", "date", NA)

# z <- read.zoo("myfile.dat", skip = 1, index = as.list(1:6),
z <- read.zoo("C:\\Users\\Kevin\\Desktop\\testdata.csv", sep = ",", skip = 1, index = as.list(1:6),
        format = "%d/%m/%Y", col.names = col.names,
        aggregate = function(x) tail(x, 1), colClasses = colClasses)

# extract all Thursdays and Fridays
z45 <- z[format(time(z), "%w") %in% 4:5,]

# keep last entry in each week
# and show result on R console
z45[!duplicated(format(time(z45), "%U"), fromLast = TRUE), ]

I attached a reproducible dataset:

http://r.789695.n4.nabble.com/file/n3031420/testdata.csv testdata.csv

I hope to get the results shown below. It would be great if every record could be labelled as a Friday, as in the second approach you suggested the first time. Again, your time and help are appreciated!

      webpage  item  value day      date     time
    MattDamon  fans 642045 Thu 11-Jun-09 22:11:33
    MattDamon  fans 765057 Fri 19-Jun-09 12:37:09
    MattDamon  fans 899900 Sat 27-Jun-09 12:40:36  (no Fri or Thu, so I chose Sat)
    MattDamon views 642045 Thu 11-Jun-09 22:11:33
    MattDamon views 765057 Fri 19-Jun-09 12:37:09
    MattDamon views 887195 Fri 26-Jun-09 13:36:12
AngelinaJolie  fans 642045 Thu 11-Jun-09 22:11:33
AngelinaJolie  fans 765057 Fri 19-Jun-09 12:37:09
AngelinaJolie  fans 887195 Fri 26-Jun-09 13:36:12
AngelinaJolie views 642045 Thu 11-Jun-09 22:11:33
AngelinaJolie views 765057 Fri 19-Jun-09 12:37:09
AngelinaJolie views 887195 Fri 26-Jun-09 13:36:12
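One possible reading of the request (the latest Thursday or Friday record per week, kept separately for every webpage and item) can be sketched in base R as below. Several things here are assumptions on my part: the rows are illustrative and only loosely based on the sample, the fans series deliberately has no record on Friday 26 June so that the Thursday fallback is visible, date and time are merged into one ISO-style stamp column, and the names when, key and weekly are hypothetical. It does not cover the "otherwise take Saturday" case mentioned in the expected output.

```r
# Illustrative rows: one webpage, two items
dat <- data.frame(
  webpage = "MattDamon",
  item    = rep(c("fans", "views"), each = 5),
  value   = c(630242, 642045, 765057, 871237, 899900,
              630242, 642045, 765057, 887175, 887195),
  stamp   = c("2009-06-11 05:27:47", "2009-06-11 22:11:33",
              "2009-06-19 12:37:09", "2009-06-25 12:38:27",
              "2009-06-27 12:40:36",
              "2009-06-11 05:27:47", "2009-06-11 22:11:33",
              "2009-06-19 12:37:09", "2009-06-26 12:36:12",
              "2009-06-26 13:36:12"),
  stringsAsFactors = FALSE
)

dat$when <- as.POSIXct(dat$stamp, tz = "UTC")
dat$date <- as.Date(substr(dat$stamp, 1, 10))

# Sort within each webpage/item so that "last" means "latest"
dat <- dat[order(dat$webpage, dat$item, dat$when), ]

# Keep Thursdays (4) and Fridays (5) only
keep <- dat[format(dat$date, "%w") %in% c("4", "5"), ]

# The key combines group and week, so identical dates in different
# groups no longer replace one another
key <- paste(keep$webpage, keep$item, format(keep$date, "%Y-%U"))

weekly <- keep[!duplicated(key, fromLast = TRUE), ]
print(weekly[, c("webpage", "item", "value", "stamp")])
```

Putting webpage and item into the duplicate key is the design choice that matters: it plays the role that aggregating over duplicate dates played in the single-series example, but within each group instead of across the whole file.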