Hi,

I have a small test sample with lab reports (PAP smears) from a number of different providers. These have Collection Dates, and the relevant columns glimpse() something like this:

$ Provider       <chr> "Dr C", "Dr D", "Dr C", "Dr D"
$ CollectionDate <chr> "2016-11-03", "2016-11-02", "2016-11-03", "2016-11-03"

I am looking to find (filter) the reports which were collected in the time period common to all providers.

Something like

    the largest First Common CollectionDate
    and
    the smallest Last Common CollectionDate

How would I do that?

I can of course do this "manually", i.e. collect all Providers and their first and last Collection dates and then find the Common First and Last one, but wonder if there is an elegant way of doing this :-)-O

greetings, el

-- 
If you want to email me, replace nospam with el
Dr. Eberhard W. Lisse, Obstetrician & Gynaecologist
el at lisse.NA
Telephone: +264 81 124 6733 (cell)
PO Box 8421 Bachbrecht, 10007, Namibia
If this email is signed with GPG/PGP, Sect 20 of Act No. 4 of 2019 may apply
On 2020-08-21 09:03 +0200, Dr Eberhard Lisse wrote:
> Hi,
>
> I have a small test sample with lab
> reports (PAP smears) from a number of
> different providers. These have
> Collection Dates and the relevant
> columns glimpse() something like
> this:
>
> $ Provider <chr> "Dr C", "Dr D", "Dr C", "Dr D"
> $ CollectionDate <chr> "2016-11-03", "2016-11-02", "2016-11-03", "2016-11-03"
>
> I am looking to find (filter) the
> reports which were collected in the
> time period common to all providers?
>
> Something like
>
> the largest First Common CollectionDate
> and
> the smallest Last Common CollectionDate
>
> How would I do that?
>
> I can of course do this "manually", ie
> collect all Providers and their first
> and last Collection dates and then
> find the Common First and Last one,
> but wonder if there is an elegant way
> of doing this :-)-O

Dear Eberhard,

Is each report in a csv file with those two columns, and you want to unify them into a dataframe with CollectionDate along the rows, and other details for each provider along the columns? This can be done with various apply calls and reshape. Can you please subset some more example data here using dput. It makes it so much easier.

/Rasmus
Hi Eberhard,

Here is one possibility using dplyr.

library(dplyr)
set.seed(3)

## set up some fake data
dtV <- as.Date("2020-08-01") + 0:4
x <- sample(dtV, 20, replace = TRUE)
provider <- sample(LETTERS[1:3], 20, replace = TRUE)
lDf <- data.frame(Provider = provider, CollectionDate = x, stringsAsFactors = FALSE)

## get min/max date for each provider
a <- lDf %>%
  dplyr::group_by(Provider) %>%
  dplyr::mutate(minDt = min(CollectionDate), maxDt = max(CollectionDate)) %>%
  dplyr::summarize(u = min(minDt), v = max(maxDt))

## get the common interval
c(max(a$u), min(a$v))
# [1] "2020-08-02" "2020-08-04"

HTH,
Eric

On Fri, Aug 21, 2020 at 12:34 PM Rasmus Liland <jral at posteo.no> wrote:
> On 2020-08-21 09:03 +0200, Dr Eberhard Lisse wrote:
> > [...]
>
> Dear Eberhard,
>
> Is each report in a csv file with those
> two columns, and you want to unify them
> into a dataframe with CollectionDate
> along the rows, and other details for
> each provider along the columns? This
> can be done with various apply calls and
> reshape. Can you please subset some
> more example data here using dput. It
> makes it so much easier.
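The original question also asked to filter the reports down to the common period, and showed CollectionDate as <chr> rather than Date. A minimal base R sketch of that last step follows. The data frame `reports` and its values are made up for illustration (they are not the poster's data), and the names `common_start`, `common_end` and `common_reports` are hypothetical. It assumes every date string is in "YYYY-MM-DD" form.

```r
# Hypothetical sample data, shaped like the glimpse() output in the question
reports <- data.frame(
  Provider = c("Dr C", "Dr D", "Dr C", "Dr D", "Dr C", "Dr D"),
  CollectionDate = c("2016-11-03", "2016-11-02", "2016-11-05",
                     "2016-11-04", "2016-11-01", "2016-11-06"),
  stringsAsFactors = FALSE
)

# CollectionDate is character in the question, so convert it before comparing
reports$CollectionDate <- as.Date(reports$CollectionDate)

# First and last collection date for each provider
by_provider <- split(reports$CollectionDate, reports$Provider)
first_dates <- do.call(c, unname(lapply(by_provider, min)))
last_dates  <- do.call(c, unname(lapply(by_provider, max)))

# Common window: latest first date up to earliest last date
common_start <- max(first_dates)
common_end   <- min(last_dates)

# Keep only the reports collected inside the common window.
# If the providers do not overlap (common_start > common_end),
# this simply returns zero rows.
common_reports <- reports[reports$CollectionDate >= common_start &
                          reports$CollectionDate <= common_end, ]
print(common_reports)
```

With real data it is probably worth checking `common_start <= common_end` before filtering, since a single provider with a short date range can make the common window empty.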